Re: Encoding question

From: Brian Candler <lists@...>
Date: 2012-09-22 09:57:32 UTC
List: ruby-talk #399619
Nathan Beyer wrote in post #1077055:
> I'm aware of the implementation details of Ruby 1.9's String. What
> I've been trying to figure out for a bit now is all of the
> idiosyncrasies of the standard library APIs.

Ah, well that's an open-ended question. Ruby's standard library is very 
large, and none of the encoding-related behaviour is documented. But 
File.open / getc are pretty fundamental to encoding behaviour.

> As such, I'm very curious
> about these details, so I performed a few more experiments.
> Interestingly I'm still not seeing this behavior. Could this have
> changed at some point between 1.9.0 and 1.9.3-p194? Am I running into
> something OS-specific?

I don't think so, unless it's behaviour of irb. I think you are just 
misinterpreting the results, and being confused by String#inspect.

Look at
    c.bytes.to_a
and
    c.unpack("H*")
to see what's really in the String.
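As a rough sketch of those two calls (the string here is built in memory as a hypothetical stand-in for the character returned by f.getc, so no file is needed):

```ruby
# Hypothetical stand-in for the result of f.getc in the quoted session:
# a single byte 0x80, labelled as Windows-1252.
c = [0x80].pack("C").force_encoding("Windows-1252")

p c.bytes.to_a     # the byte values as integers
p c.unpack("H*")   # the same bytes as one hex string
p c.encoding       # the encoding label attached to those bytes
```

Neither call depends on how irb or String#inspect chooses to render the string, which is why they are a safer way to check the contents.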

> I ran two tests. One with the inverted exclamation character, which is
> code point U+00A1 and the euro sign, which is code point U+20AC. I
> used these two characters, as the inverted exclamation has the same
> code point value in Unicode, ISO-8859-1 and Windows-1252, but the byte
> value is two bytes in UTF-8 and one byte in Windows-1252; the euro
> sign is in Unicode and Windows-1252, but at different code points.
>
> For the euro sign, I created a windows-1252 text file with a single
> byte of 0x80 (the code point value) and then opened up IRB and ran the
> following.
>
> 1.9.3-p194 :001 > f = File.open('euro_win1252.txt', 'r:windows-1252')
>  => #<File:euro_win1252.txt>
> 1.9.3-p194 :002 > c = f.getc
>  => "\x80"

That's the single byte you expected. However String#inspect has some 
hard-coded behaviour which treats bytes in the range 0x80-0x9f (I think) 
as unprintable, and therefore substitutes hex representation. "puts c" 
will squirt the string directly at the terminal, and because your 
terminal is UTF-8 but the string is invalid UTF-8, it will be 
unprintable. Your terminal will probably substitute some special 
character.
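To illustrate why the terminal cannot show that byte, here is a small sketch (my own example, not part of the quoted session) that labels the same byte two ways. The variable names are made up for illustration.

```ruby
raw = [0x80].pack("C")   # one byte with no text encoding attached (ASCII-8BIT)

# Same byte, two different labels. force_encoding never changes the bytes.
as_cp1252 = raw.dup.force_encoding("Windows-1252")
as_utf8   = raw.dup.force_encoding("UTF-8")

p as_cp1252.valid_encoding?                    # true: a legal Windows-1252 string
p as_utf8.valid_encoding?                      # false: a lone 0x80 is not legal UTF-8
p as_cp1252.bytes.to_a == as_utf8.bytes.to_a   # true: identical bytes either way
```

A UTF-8 terminal only ever sees the bytes, not Ruby's label, so it is in the position of the second string.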

> 1.9.3-p194 :003 > c.encoding
>  => #<Encoding:Windows-1252>
> 1.9.3-p194 :004 > ct = c.encode('utf-8')
>  => "€"

You've transcoded it. Now ct contains three bytes (E2 82 AC), which is 
the UTF-8 representation of that character. Then you've sent it to the 
screen.

By default ruby does *no* transcoding on output (i.e. it does not take 
into account the encoding of your terminal). Your terminal is in fact 
UTF-8, and so those three bytes get displayed as the one character you're 
sending.

(Because you're running OSX, your terminal is almost certainly UTF-8; 
mine is anyway)
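A minimal sketch of what the transcoding step does to the bytes, assuming the same single-byte inputs as in the two tests (the strings are built in memory here instead of being read from the files, and the variable names are mine):

```ruby
# Windows-1252 inputs: 0x80 is the euro sign, 0xA1 the inverted exclamation mark.
euro_cp1252 = [0x80].pack("C").force_encoding("Windows-1252")
excl_cp1252 = [0xA1].pack("C").force_encoding("Windows-1252")

# encode produces a new string with different bytes for the same characters.
euro_utf8 = euro_cp1252.encode("UTF-8")
excl_utf8 = excl_cp1252.encode("UTF-8")

p euro_cp1252.unpack("H*")   # ["80"]
p euro_utf8.unpack("H*")     # ["e282ac"]
p excl_cp1252.unpack("H*")   # ["a1"]
p excl_utf8.unpack("H*")     # ["c2a1"]
p euro_utf8 == "\u20AC"      # true
p excl_utf8 == "\u00A1"      # true
```

The transcoded strings are what a UTF-8 terminal can display, because the bytes sent to it are now valid UTF-8.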

> For the inverted exclamation point, I created a windows-1252 text file
> with a single byte of 0xA1 (the code point value) and then opened up
> IRB and ran the following.
>
>  1.9.3-p194 :001 > f = File.open('inverted_win1252.txt',
> 'r:windows-1252')
>   => #<File:inverted_win1252.txt>
>  1.9.3-p194 :002 > c = f.getc
>   => "\xA1"

Again that's one byte; for some reason String#inspect or irb is showing 
it in hex representation and I don't know why in this case. But if it 
didn't, it would be unprintable on a UTF-8 terminal.

>  1.9.3-p194 :003 > c.encoding
>   => #<Encoding:Windows-1252>
>  1.9.3-p194 :004 > ct = c.encode('utf-8')
>   => "¡"

Now ct contains 2 bytes, the UTF-8 representation of that character, and 
your terminal displays it properly.

Anyway, try running something like this from the command line and see if 
it's any clearer, because it eliminates any possible interaction with 
irb.

File.open("inverted_win1252.txt","wb") do |f|
  f.write "\xA1"
end
File.open("inverted_win1252.txt","r:windows-1252") do |f|
  c = f.getc
  puts c.bytes.to_a
  puts c.unpack("H*")
  puts c.encoding
  puts c.inspect
  puts c
  ct = c.encode("utf-8")
  puts ct.bytes.to_a
  puts ct.unpack("H*")
  puts ct.encoding
  puts ct.inspect
  puts ct
end

Regards,

Brian.

-- 
Posted via http://www.ruby-forum.com/.
