Inconsistent IO character reading when converting encoding

From: "Xiao B." <lists@...>
Date: 2013-06-10 17:31:18 UTC
List: ruby-talk #408116
In Ruby 1.9.3-429, I am trying to parse plain text files with various
encodings that will ultimately be converted to UTF-8 strings. Non-ASCII
characters work fine with a file encoded as UTF-8, but problems come up
with non-UTF-8 files.

Simplified example:

File.open(file) do |io|
  io.set_encoding("#{charset.upcase}:#{Encoding::UTF_8}")
  line, char = "", nil

  until io.eof? || char == ?\n || char == ?\r
    char = io.readchar
    puts "Character #{char} has #{char.each_codepoint.count} codepoints"
    puts "SLICE FAIL" unless char == char.slice(0,1)

    line << char
  end
  line
end

Both files contain just the single string áÁð, encoded appropriately. I
have checked that the files are encoded correctly via
"$ file -i <file_name>".
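
For anyone who wants to reproduce the setup, the two input files could be generated along these lines. This is only a sketch: the helper name write_samples and the file names are made up for illustration and are not from the original script. It writes the same three characters once as UTF-8 and once as ISO-8859-1, then prints the raw bytes so the difference is visible.

```ruby
require "tmpdir"

# Hypothetical helper (not from the original post): writes the sample
# string in each requested encoding and returns { encoding name => path }.
def write_samples(dir, encodings = ["UTF-8", "ISO-8859-1"])
  sample = "\u00E1\u00C1\u00F0" # a-acute, A-acute, eth (the string from the post)
  encodings.each_with_object({}) do |name, paths|
    path = File.join(dir, "sample-#{name.downcase}.txt")
    # binwrite stores the encoded bytes exactly, with no further conversion.
    File.binwrite(path, sample.encode(name))
    paths[name] = path
  end
end

Dir.mktmpdir do |dir|
  write_samples(dir).each do |name, path|
    hex = File.binread(path).bytes.map { |b| format("%02X", b) }.join(" ")
    puts "#{name}: #{hex}"
  end
end
# prints:
#   UTF-8: C3 A1 C3 81 C3 B0
#   ISO-8859-1: E1 C1 F0
```

The UTF-8 file is six bytes long and the ISO-8859-1 file is three, which is a quick sanity check alongside "file -i".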

With a UTF-8 file, I get back:
Character á has 1 codepoints
Character Á has 1 codepoints
Character ð has 1 codepoints

With an ISO-8859-1 file:
Character á has 2 codepoints
SLICE FAIL
Character Á has 2 codepoints
SLICE FAIL
Character ð has 2 codepoints
SLICE FAIL

My interpretation is that readchar is returning a string whose encoding
was converted incorrectly, which in turn causes slice to return the
wrong result.
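
One way to test that interpretation is to look at the encoding label and the raw bytes of each character, not only the codepoint count. The sketch below is a diagnostic aid, not a confirmed explanation: describe_char is a hypothetical helper, and the second call merely shows one way a single character can report two codepoints (UTF-8 bytes carrying an ISO-8859-1 label). Whether that is what happens inside readchar on 1.9.3 is an assumption that the real output would have to confirm.

```ruby
# Hypothetical diagnostic helper (not part of the original script).
# Reports how Ruby currently sees a string: label, raw bytes, codepoints.
def describe_char(char)
  valid = char.valid_encoding?
  {
    encoding:   char.encoding.name,
    bytes:      char.bytes.map { |b| format("%02X", b) }.join(" "),
    # each_codepoint raises on invalid byte sequences, so guard it.
    codepoints: valid ? char.each_codepoint.count : nil,
    valid:      valid
  }
end

utf8_char = "\u00E1" # a-acute, two bytes in UTF-8

# Correctly labelled: one codepoint.
p describe_char(utf8_char)

# Same two bytes relabelled as ISO-8859-1: every byte is now one character,
# so the count becomes two and slice(0, 1) no longer equals the whole string.
mislabelled = utf8_char.dup.force_encoding("ISO-8859-1")
p describe_char(mislabelled)
puts "SLICE FAIL" unless mislabelled == mislabelled.slice(0, 1)
```

Calling describe_char(char) inside the loop, next to the existing puts, would show whether the characters come back labelled as UTF-8 or as the external charset, and whether the bytes are those of a single or a double conversion.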

Is this behavior correct? Or am I specifying the file external encoding
incorrectly? I would rather not rewrite this process so I am hoping I am
making a mistake somewhere. There are reasons why I am parsing files
this way, but I don't think those are relevant to my question.
Specifying the internal and external encoding as an option in File.open
yielded the same results.
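
For reference, the two usual ways of passing the encodings to File.open look roughly like this. It is a sketch under assumptions: the helper names are invented, the input file is generated on the spot, and each_char is used in place of the readchar loop purely to keep the example short. It has not been run against 1.9.3-429, so it shows the call forms and does not settle how that version behaves.

```ruby
require "tmpdir"

# Hypothetical helper: encodings given in the mode string ("r:EXTERNAL:INTERNAL").
def read_chars_transcoded(path, charset)
  File.open(path, "r:#{charset}:UTF-8") do |io|
    io.each_char.to_a
  end
end

# Hypothetical helper: the same request expressed with keyword options.
def read_chars_with_options(path, charset)
  File.open(path, "r", external_encoding: charset, internal_encoding: "UTF-8") do |io|
    io.each_char.to_a
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "latin1.txt")
  # Three bytes: E1 C1 F0 (the sample string encoded as ISO-8859-1).
  File.binwrite(path, "\u00E1\u00C1\u00F0".encode("ISO-8859-1"))

  read_chars_transcoded(path, "ISO-8859-1").each do |c|
    puts "#{c.encoding}: #{c.each_codepoint.count} codepoint(s)"
  end
end
```

If both forms and set_encoding all give the same result on a given Ruby, the way the encodings are specified is probably not the cause.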

--
Posted via http://www.ruby-forum.com/.
