[#31647] [Backport #3666] Backport of r26311 (Bug #2587) — Luis Lavena <redmine@...>

Backport #3666: Backport of r26311 (Bug #2587)

13 messages 2010/08/07

[#31666] [Bug #3677] unable to run certain gem binaries in windows 7 — Roger Pack <redmine@...>

Bug #3677: unable to run certain gem binaries in windows 7

10 messages 2010/08/10

[#31676] [Backport #3680] Splatting calls to_ary instead of to_a in some cases — Tomas Matousek <redmine@...>

Backport #3680: Splatting calls to_ary instead of to_a in some cases

10 messages 2010/08/11

[#31681] [Bug #3683] getgrnam on computer with NIS group (+)? — Rocky Bernstein <redmine@...>

Bug #3683: getgrnam on computer with NIS group (+)?

13 messages 2010/08/11

[#31843] Garbage Collection Question — Asher <asher@...>

This question is no doubt a function of my own lack of understanding, but I think that asking it will at least help some other folks see what's going on with the internals during garbage collection.

17 messages 2010/08/25
[#31861] Re: Garbage Collection Question — Roger Pack <rogerdpack2@...> 2010/08/26

> The question in short: when an object goes out of scope and has no

[#31862] Re: Garbage Collection Question — Asher <asher@...> 2010/08/26

Right - so how does a pointer ever get off the stack?

[#31873] Re: Garbage Collection Question — Kurt Stephens <ks@...> 2010/08/27

On 8/26/10 11:51 AM, Asher wrote:

[#31894] Re: Garbage Collection Question — Asher <asher@...> 2010/08/27

I very much appreciate the response, and it is helpful in describing the narrative, but it's still a few steps behind my question. It may well have clarified some points that help us get there, though.

[#31896] Re: Garbage Collection Question — Evan Phoenix <evan@...> 2010/08/27

You have introduced something called a "root node" without defining it. What do you mean by this?

[#31885] Avoiding $LOAD_PATH pollution — Eric Hodel <drbrain@...7.net>

Last year Nobu asked me to propose an API for adding an object to

21 messages 2010/08/27

[#31947] not use system for default encoding — Roger Pack <rogerdpack2@...>

It strikes me as a bit "scary" to use system locale settings to

19 messages 2010/08/30

[#31971] Change Ruby's License to BSDL + Ruby's dual license — "NARUSE, Yui" <naruse@...>

Ruby's License will change to BSDL + Ruby's dual license

16 messages 2010/08/31

[ruby-core:31960] Re: not use system for default encoding

From: Run Paint Run Run <runrun@...>
Date: 2010-08-30 20:55:35 UTC
List: ruby-core #31960
> It strikes me as a bit "scary" to use system locale settings to *arbitrarily*
> set Encoding.default_external

What do you mean by “arbitrarily”? The algorithm used
(i.e. http://goo.gl/soW7) is pretty straightforward. Presumably
a user’s locale encoding reflects that in which he prefers to work.
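
To see what the locale actually yields, here is a quick sketch; the values
shown assume a hypothetical en_US.UTF-8 locale:

  # Inspect the locale-derived defaults (output assumes a
  # hypothetical en_US.UTF-8 locale).
  p Encoding.locale_charmap    #=> "UTF-8"
  p Encoding.find("locale")    #=> #<Encoding:UTF-8>
  p Encoding.default_external  #=> #<Encoding:UTF-8>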

> For example, I develop on windows (def: IBM437).  This means that if I want
> this to work cross platform I have to specify IBM437 for every File.read (et
> al) that I use in my library.  So it is a bit scary.

Well, firstly, that’s a pretty odd encoding to use. In general, you’ll have a
far easier time if you use UTF-8 for everything, falling back to legacy
encodings only when necessary or, possibly, when writing in a CJK script. In
any case, even if you
continue using that encoding, the external encoding only needs to be specified
explicitly if the files contain non-ASCII characters. And if they do, and
interoperability is your goal, then why are you using IBM437 in the first
place?
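
And if you do need IBM437 for particular files, you can state it per call
rather than globally; a sketch, where data.txt is a hypothetical file:

  # Override Encoding.default_external for a single read, then
  # transcode to UTF-8 for interoperability (data.txt is hypothetical).
  s = File.read("data.txt", encoding: "IBM437")
  p s.encoding           #=> #<Encoding:IBM437>
  u = s.encode("UTF-8")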

> Suggestion: default to UTF-8 *no matter where* then allow the user to change
> it if they want something else.

That stands in opposition to the design goals of Ruby’s M17N. See Naruse’s
article at http://goo.gl/Xy20 for the background. Or, to put it another way,
Ruby’s system was designed by encoding experts, including Unicode Consortium
member Martin J. Dürst, and still explicitly rejects defaulting to UTF-8.
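
Besides, a user who wants UTF-8 everywhere can already opt in today; a sketch:

  # Opt in to UTF-8 globally instead of changing the default for
  # everyone; equivalent to invoking ruby with -E UTF-8.
  Encoding.default_external = Encoding::UTF_8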

> Or even default to BINARY (ASCII-8BIT) unless they specify.  Most users don't
> want/need encoding until they run into it--they can handle it then.

That was the philosophy of English-centric software development for many years,
but global distribution and the web discredited it. Users don’t think they care
about encoding, but then they process input containing non-ASCII byte
sequences, and everything blows up. For example, consider the following “binary”
representations of e acute:

  Encoding.list.map{|n| "é".encode(n).dump rescue nil}.compact.uniq
  #=> ["\"\u{e9}\"", "\"\x88m\"", "\"\xA0\xC1\"", "\"\x8F\xAB\xB1\"",
  "\"\xA8\xA6\"", "\"\xE9\"", "\"\x00\xE9\".force_encoding(\"UTF-16BE\")",
  "\"\xE9\x00\".force_encoding(\"UTF-16LE\")",
  "\"\x00\x00\x00\xE9\".force_encoding(\"UTF-32BE\")",
  "\"\xE9\x00\x00\x00\".force_encoding(\"UTF-32LE\")", "\"\x82\"", "\"\x8E\"",
  "\"e\xCC\x81\"", "\"\xC3\xA9\""]

If you store text as byte sequences without associated encodings, how will you
display it? How do you pattern match against it? You can’t, because the strategy
is "\110\97\105\118\130", as IBM437 users would say, or, if you speak
GB2312, "\110\97\105\118\168\166". Ultimately, the closest you can get to
ignoring encodings while at the same time remaining interoperable, is by
storing data in a Unicode-compatible encoding—UTF-8 being the obvious general
choice—then transcoding your input into UTF-8. Even this requires that either
the input is tagged with an encoding, or you’re willing to use imperfect
heuristic algorithms to detect it.
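
In code, that last strategy is roughly the following sketch, where input is
an assumed string carrying a correct encoding tag:

  # Normalize tagged input to UTF-8, replacing malformed byte
  # sequences rather than raising (input is assumed correctly tagged).
  utf8 = input.encode("UTF-8", invalid: :replace, undef: :replace)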
