[#399421] Encoding question — Thomas Bednarz <lists@...>

I am new to ruby and play around with it a little bit at the moment. I

17 messages 2012/09/17

Re: Encoding question

From: Brian Candler <lists@...>
Date: 2012-09-22 09:57:32 UTC
List: ruby-talk #399619
Nathan Beyer wrote in post #1077055:
> I'm aware of the implementation details of Ruby 1.9's String. What
> I've been trying to figure out for a bit now is all of the
> idiosyncrasies of the standard library APIs.

Ah, well that's an open-ended question. Ruby's standard library is very
large, and none of the encoding-related behaviour is documented. But
File.open / getc are pretty fundamental to encoding behaviour.

> As such, I'm very curious
> about these details, so I performed a few more experiments.
> Interestingly I'm still not seeing this behavior. Could this have
> changed at some point between 1.9.0 and 1.9.3-p194? Am I running into
> something on OS-specific

I don't think so, unless it's behaviour of irb. I think you are just
misinterpreting the results, and being confused by String#inspect.

Look at
    c.bytes.to_a
and
    c.unpack("H*")
to see what's really in the String.
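
As a minimal sketch of that inspection, assuming a stand-in string built
in memory rather than read from a file (the variable `c` and the way it
is constructed here are illustrative only):

```ruby
# Hypothetical stand-in for the character returned by f.getc:
# the single byte 0x80, tagged as Windows-1252.
c = [0x80].pack("C*").force_encoding("Windows-1252")

p c.bytes.to_a     # => [128]   the raw byte values as integers
p c.unpack("H*")   # => ["80"]  the same bytes as one hex string
p c.encoding       # the encoding tag attached to the String
```

Neither call depends on how String#inspect chooses to display the
string, which is why they are useful here.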

> I ran two tests. One with the inverted exclamation character, which is
> code point U+00A1 and the euro sign, which is code point U+20AC. I
> used these two characters, as the inverted exclamation has the same
> code point value in Unicode, ISO-8859-1 and Windows-1252, but the byte
> value is two bytes in UTF-8 and one byte in Windows-1252; the euro
> sign is in Unicode and Windows-1252, but at different code points.
>
> For the euro sign, i create a windows-1252 text file with a single
> byte of 0x80 (the code point value) and then opened up IRB and ran the
> following.
>
> 1.9.3-p194 :001 > f = File.open('euro_win1252.txt', 'r:windows-1252')
>  => #<File:euro_win1252.txt>
> 1.9.3-p194 :002 > c = f.getc
>  => "\x80"

That's the single byte you expected. However String#inspect has some
hard-coded behaviour which treats bytes in the range 0x80-0x9f (I think)
as unprintable, and therefore substitutes hex representation. "puts c"
will squirt the string directly at the terminal, and because your
terminal is UTF-8 but the string is invalid UTF-8, it will be
unprintable. Your terminal will probably substitute some special
character.
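
The "invalid UTF-8" part can be checked directly. A small sketch, with
names (`raw`, `as_utf8`, `as_1252`) that are my own and not from the
original post:

```ruby
# One raw byte, 0x80, with no encoding assumption yet (binary string).
raw = [0x80].pack("C*")

# Tag copies of the same byte with two different encodings.
as_utf8 = raw.dup.force_encoding("UTF-8")
as_1252 = raw.dup.force_encoding("Windows-1252")

puts as_utf8.valid_encoding?   # false: a lone 0x80 is not valid UTF-8
puts as_1252.valid_encoding?   # true: 0x80 is the euro sign in Windows-1252
```

force_encoding only changes the tag; the byte itself is identical in
both strings.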

> 1.9.3-p194 :003 > c.encoding
>  => #<Encoding:Windows-1252>
> 1.9.3-p194 :004 > ct = c.encode('utf-8')
>  => "€"

You've transcoded it. Now ct contains three bytes (E2 82 AC), which is
the UTF-8 representation of that character. Then you've sent it to the
screen.

By default ruby does *no* transcoding on output (i.e. it does not take
into account the encoding of your terminal). Your terminal is in fact
UTF-8, and so those three bytes get displayed as the one character
you're sending.

(Because you're running OSX, your terminal is almost certainly UTF-8;
mine is anyway)
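
To see the transcoding step in terms of bytes rather than what the
terminal draws, here is a rough sketch; the variable names are
hypothetical and the input is built in memory instead of being read
from euro_win1252.txt:

```ruby
# The euro sign as a single Windows-1252 byte (0x80).
euro_1252 = [0x80].pack("C*").force_encoding("Windows-1252")

# Transcode: same character, different byte sequence.
euro_utf8 = euro_1252.encode("UTF-8")

p euro_1252.bytes.to_a   # => [128]
p euro_utf8.bytes.to_a   # => [226, 130, 172]
p euro_utf8.length       # => 1  (still one character)
p euro_utf8.bytesize     # => 3  (but three bytes)
```

encode changes the bytes and the tag together, which is the difference
from force_encoding.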

> For the inverted exclamation point, i created a windows-1252 text file
> with a single byte of 0xA1 (the code point value) and then opened up
> IRB and ran the following.
>
>  1.9.3-p194 :001 > f = File.open('inverted_win1252.txt', 'r:windows-1252')
>   => #<File:inverted_win1252.txt>
>  1.9.3-p194 :002 > c = f.getc
>   => "\xA1"

Again that's one byte; for some reason String#inspect or irb is showing
it in hex representation and I don't know why in this case. But if it
didn't, it would be unprintable on a UTF-8 terminal.

>  1.9.3-p194 :003 > c.encoding
>   => #<Encoding:Windows-1252>
>  1.9.3-p194 :004 > ct = c.encode('utf-8')
>   => "¡"

Now ct contains 2 bytes, the UTF-8 representation of that character, and
your terminal displays it properly.

Anyway, try running something like this from the command line and see if
it's any clearer, because it eliminates any possible interaction with
irb.

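# Write the single byte 0xA1 (inverted exclamation mark in Windows-1252)
File.open("inverted_win1252.txt","wb") do |f|
  f.write "\xA1"
end
File.open("inverted_win1252.txt","r:windows-1252") do |f|
  # The character as read: one byte, tagged Windows-1252
  c = f.getc
  puts c.bytes.to_a
  puts c.unpack("H*")
  puts c.encoding
  puts c.inspect
  puts c
  # The same character transcoded to UTF-8: two bytes
  ct = c.encode("utf-8")
  puts ct.bytes.to_a
  puts ct.unpack("H*")
  puts ct.encoding
  puts ct.inspect
  puts ct
end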

Regards,

Brian.

-- 
Posted via http://www.ruby-forum.com/.
