From: Michael Selig
Date: 2008-10-27T19:17:45+09:00
Subject: [ruby-core:19540] Re: String literal encoding (Was: Default source encoding (Was: [Bug #680] csv.rb: CSV.parse is toolate when encoding is mismatch))

On Mon, 27 Oct 2008 20:55:32 +1100, Nobuyoshi Nakada wrote:

> Hi,
>
> At Mon, 27 Oct 2008 15:57:03 +0900,
> Michael Selig wrote in [ruby-core:19535]:
>> > Even in 1.8 or prior, -Ks has been mandatory for Shift_JIS
>> > sources, so they have had -K in the shebang lines already.
>>
>> Why then can I write a ruby 1.8 script which does a "puts" of a
>> Shift_JIS string (no shebang or magic comment), and have it run
>> fine without -Ks?
>
> Because you are avoiding troublesome chars. Without such
> chars, we can't write the words "display", "table", "software"
> and "ruby".

OK, I'm sure you know more about Japanese encodings than I do. But my
original point is that 1.8 scripts exist which contain multibyte
characters (e.g. UTF-8) and which work fine under 1.8 without -K, but
will fail under 1.9 unless a magic comment or -K is provided.

> But it's very ambiguous and dangerous to imply encodings. We
> can't trust locale for this purpose, at least.

It's a trade-off between that and backward compatibility. I think the
"danger" is not high and it gives backward compatibility, so my vote
would be to use it.

> You can use BOM to mean that the source is written in UTF-8.

BOM? Byte order mark? How does that help with backward compatibility?
Doesn't it still mean modifying the 1.8 script to work under 1.9?

Cheers
Mike
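
A note on the "troublesome chars" in the quoted reply: some common Shift_JIS characters have 0x5C (the ASCII backslash) as their second byte, so a parser that reads the source byte by byte can mistake that byte for an escape character. The sketch below illustrates this using String#encode from Ruby 1.9 or later. The helper name `backslash_trail_chars` is hypothetical and does not come from this thread, and the code assumes every character in the input can be represented in Shift_JIS.

```ruby
# Hypothetical helper (not from the thread): returns the characters whose
# Shift_JIS encoding has 0x5C (ASCII backslash) as a trailing byte.
# Assumes every character in +text+ is representable in Shift_JIS;
# String#encode raises Encoding::UndefinedConversionError otherwise.
def backslash_trail_chars(text)
  text.each_char.select do |ch|
    # drop(1) skips the lead byte, so a literal "\\" is not reported
    ch.encode("Shift_JIS").bytes.drop(1).include?(0x5C)
  end
end

# "\u8868" is the first character of the Japanese words for "display"
# and "table"; "\u30BD" is the first character of "software".
sample = "\u8868\u793A \u30BD"
troublesome = backslash_trail_chars(sample)
```

Here `troublesome` contains only "\u8868" and "\u30BD". A script that happens to avoid such characters can run without -Ks, which matches the explanation given in the quoted reply.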