[#395238] rubygem: ispunity (unite all your internet connections) — Arun Tomar <tomar.arun@...>

Dear friends,

12 messages 2012/05/01

[#395250] Overwriting one Ruby array or arrays with another — Craig Law <lists@...>

Hi

14 messages 2012/05/02

[#395258] array of strings - finding letter combinations — "Sebastjan H." <lists@...>

Hi All,

16 messages 2012/05/02

[#395357] Why Enumerator#next does not return more than one value? — Földes László <lists@...>

If I have an Enumerator which yields elements of a mathematical series

10 messages 2012/05/07

[#395373] How to use Data_Wrap_Struct to assign the DATA VALUE to an exsiting Ruby object? — Iñaki Baz Castillo <ibc@...>

Hi, my code receives an arbitrary klass name (provided by the user)

8 messages 2012/05/07

[#395429] passing via instance variable or regular () — sam jam <lists@...>

def first

10 messages 2012/05/10

[#395463] I'm looking for a Metaprogramming Project — Phil Stone <lists@...>

Hello,

19 messages 2012/05/11

[#395548] A million reasons why Encoding was a mistake — Marc Heiler <lists@...>

Newcomer wants to try Ruby.

15 messages 2012/05/15
[#395561] Re: A million reasons why Encoding was a mistake — Ryan Davis <ryand-ruby@...> 2012/05/15

[#395595] Re: A million reasons why Encoding was a mistake — Brian Candler <lists@...> 2012/05/16

I will add that the OP is not entirely alone in his opinion.

[#395551] How to ensure that a block runs entirely after other threads? (Thread.exclusive does not "work") — Iñaki Baz Castillo <ibc@...>

Hi, I expected that in the following example code, thread t1 would not

8 messages 2012/05/15

[#395575] GUI with ruby on windows — David Acosta <lists@...>

hello friends, i am a begginer and i have a litlle question, how can i

17 messages 2012/05/16

[#395604] what is going wrong here? — roob noob <lists@...>

Notice the initialization of both classes in each of the examples, if

20 messages 2012/05/16

[#395646] rb_gc_register_address() or rb_gc_mark()? — Iñaki Baz Castillo <ibc@...>

Hi, I've bad experiences with rb_gc_register_address(), it does never

16 messages 2012/05/17

[#395686] reading from and writing to a Unicode encoded file — "Sebastjan H." <lists@...>

Hi,

19 messages 2012/05/18
[#395694] Re: reading from and writing to a Unicode encoded file — Regis d'Aubarede <lists@...> 2012/05/18

Hello,

[#395697] Re: reading from and writing to a Unicode encoded file — "Sebastjan H." <lists@...> 2012/05/18

Regis d'Aubarede wrote in post #1061272:

[#395698] Re: reading from and writing to a Unicode encoded file — Regis d'Aubarede <lists@...> 2012/05/18

Sebastjan H. wrote in post #1061276:

[#395699] Re: reading from and writing to a Unicode encoded file — "Sebastjan H." <lists@...> 2012/05/18

Regis d'Aubarede wrote in post #1061277:

[#395750] Re: reading from and writing to a Unicode encoded file - issues when using Shoes — "Sebastjan H." <lists@...> 2012/05/21

Hi,

[#395754] Re: reading from and writing to a Unicode encoded file - issues when using Shoes — "Sebastjan H." <lists@...> 2012/05/21

Sebastjan H. wrote in post #1061483:

[#395740] ? Ruby through CGI and Rails — Shaun Lloyd <list@...>

Hi everybody,

22 messages 2012/05/21
[#395764] Re: Ruby through CGI and Rails — Brian Candler <lists@...> 2012/05/21

Shaun Lloyd wrote in post #1061455:

[#395786] Re: Ruby through CGI and Rails — Shaun Lloyd <list@...> 2012/05/22

On 22/05/12 03:37, Brian Candler wrote:

[#395838] Re: Ruby through CGI and Rails — Brian Candler <lists@...> 2012/05/23

Shaun Lloyd wrote in post #1061602:

[#395787] Changing self class from inside a method?? — David Madison <lists@...>

Let's start off with the assumption I want a method that allows an

10 messages 2012/05/22

[#395841] Memory-efficient set of Fixnums — George Dupre <lists@...>

Hi,

25 messages 2012/05/23

[#395883] looking for a ruby idiom : r=foo; return r if r — botp <botpena@...>

Hi All,

11 messages 2012/05/24

[#395966] Am I justified to use a global variable if it must be used in all scopes? — Phil Stone <lists@...>

Hello,

12 messages 2012/05/27

[#396010] does this leak more than the size of the string via timing side channels — rooby shoez <lists@...>

string1 = "string"

16 messages 2012/05/29

[#396038] Is it possible to avoid longjmp in exceptions, Thread#kill, exit(), signals? — Iñaki Baz Castillo <ibc@...>

Hi, my Ruby C extension runs a C loop (libuv) without GVL. At some

8 messages 2012/05/29

Re: A million reasons why Encoding was a mistake

From: Austin Ziegler <halostatue@...>
Date: 2012-05-20 20:37:12 UTC
List: ruby-talk #395737
On Wed, May 16, 2012 at 9:02 AM, Brian Candler <lists@ruby-forum.com> wrote:
> I will add that the OP is not entirely alone in his opinion.

The OP may not be alone in his opinion, but that's because encodings
are broken in general.

This is *not* a Ruby problem, this is a *data* problem.

C gets it wrong because it assumes that characters, code points, and
bytes are the same (but it gets a pass because it was created in a
time when this was true).

Java gets it wrong because it uses a nominally-UTF-16 character width
(it's actually UCS-2) which doesn't allow for UTF-16 surrogates.

Python and Java get it wrong because they always assume that Unicode
is a safe, reliable, and reversible transformation (and they don't
work well with non-Unicode encoding).

The problem the OP had? Partially a library problem for not shifting
to Ruby 1.9 assumptions (I've slowly been moving my libraries to state
their encodings up front, but it's a pain because I just don't have
the time).

Matz and others have worked very hard to make sure that Ruby 1.9 works
well if you follow certain rules regarding your inputs and outputs.
These rules, by the way, are more or less what Joel Spolsky wrote
almost nine years ago:
http://www.joelonsoftware.com/articles/Unicode.html

If you don't respect your encodings, they will bite you. They may not
bite you up front (as they do with Ruby, because it exposes these
things which are painful), but they *will* bite you.

Ruby got it right, because it acknowledges that (a) this is hard and
(b) gives you the tools you need in order to make this less painful.
It also doesn't (c) incorrectly assume that everything is or can be
expressed safely in Unicode. (Shift-JIS will not roundtrip to Unicode
and back for some characters.)

-a
-- 
Austin Ziegler halostatue@gmail.com austin@halostatue.ca
http://www.halostatue.ca/ http://twitter.com/halostatue

In This Thread