[#13161] hacking on the "heap" implementation in gc.c — Lloyd Hilaiel <lloyd@...>

Hi all,

16 messages 2007/11/01

[#13182] Thinking of dropping YAML from 1.8 — Urabe Shyouhei <shyouhei@...>

Hello all.

14 messages 2007/11/03

[#13315] primary encoding and source encoding — David Flanagan <david@...>

I've got a couple of questions about the handling of primary encoding.

29 messages 2007/11/08
[#13331] Re: primary encoding and source encoding — Yukihiro Matsumoto <matz@...> 2007/11/09

Hi,

[#13368] method names in 1.9 — "David A. Black" <dblack@...>

Hi --

61 messages 2007/11/10
[#13369] Re: method names in 1.9 — Yukihiro Matsumoto <matz@...> 2007/11/10

Hi,

[#13388] Re: method names in 1.9 — Charles Oliver Nutter <charles.nutter@...> 2007/11/11

Yukihiro Matsumoto wrote:

[#13403] Re: method names in 1.9 — "Austin Ziegler" <halostatue@...> 2007/11/11

On 11/11/07, Charles Oliver Nutter <charles.nutter@sun.com> wrote:

[#13410] Re: method names in 1.9 — David Flanagan <david@...> 2007/11/11

Austin Ziegler wrote:

[#13413] Re: method names in 1.9 — Charles Oliver Nutter <charles.nutter@...> 2007/11/11

David Flanagan wrote:

[#13423] Re: method names in 1.9 — Jordi <mumismo@...> 2007/11/12

Summing it up:

[#13386] Re: method names in 1.9 — Trans <transfire@...> 2007/11/11

[#13391] Re: method names in 1.9 — Matthew Boeh <mboeh@...> 2007/11/11

On Sun, Nov 11, 2007 at 05:50:18PM +0900, Trans wrote:

[#13457] mingw rename — "Roger Pack" <rogerpack2005@...>

Currently for different windows' builds, the names for RUBY_PLATFORM

13 messages 2007/11/13

[#13485] Proposal: Array#walker — Wolfgang Nádasi-Donner <ed.odanow@...>

Good morning all together!

23 messages 2007/11/14
[#13486] Re: Proposal: Array#walker — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/11/14

A nicer version may be...

[#13488] Re: Proposal: Array#walker — Trans <transfire@...> 2007/11/14

[#13495] Re: Proposal: Array#walker — Trans <transfire@...> 2007/11/14

[#13498] state of threads in 1.9 — Jordi <mumismo@...>

Are Threads mapped to threads on the underlying operating system in

30 messages 2007/11/14
[#13519] Re: state of threads in 1.9 — "Bill Kelly" <billk@...> 2007/11/14

[#13526] Re: state of threads in 1.9 — Eric Hodel <drbrain@...7.net> 2007/11/14

On Nov 14, 2007, at 11:18 , Bill Kelly wrote:

[#13528] test/unit and miniunit — Ryan Davis <ryand-ruby@...>

When is the 1.9 freeze?

17 messages 2007/11/14

[#13564] Thoughts about Array#compact!, Array#flatten!, Array#reject!, String#strip!, String#capitalize!, String#gsub!, etc. — Wolfgang Nádasi-Donner <ed.odanow@...>

Good evening all together!

53 messages 2007/11/15
[#13575] Re: Thoughts about Array#compact!, Array#flatten!, Array#reject!, String#strip!, String#capitalize!, String#gsub!, etc. — "Nikolai Weibull" <now@...> 2007/11/15

On Nov 15, 2007 8:14 PM, Wolfgang N=E1dasi-Donner <ed.odanow@wonado.de> wro=

[#13578] Re: Thoughts about Array#compact!, Array#flatten!, Array#reject!, String#strip!, String#capitalize!, String#gsub!, etc. — Michael Neumann <mneumann@...> 2007/11/16

Nikolai Weibull schrieb:

[#13598] wondering about #tap (was: Re: Thoughts about Array#compact!, Array#flatten!, Array#reject!, String#strip!, String#capitalize!, String#gsub!, etc.) — "David A. Black" <dblack@...> 2007/11/16

Hi --

[#13605] Re: wondering about #tap (was: Re: Thoughts about Array#compact!, Array#flatten!, Array#reject!, String#strip!, String#capitalize!, String#gsub!, etc.) — Trans <transfire@...> 2007/11/16

[#13612] Re: wondering about #tap (was: Re: Thoughts about Array#compact!, Array#flatten!, Array#reject!, String#strip!, String#capitalize!, String#gsub!, etc.) — "David A. Black" <dblack@...> 2007/11/16

Hi --

[#13624] Re: wondering about #tap (was: Re: Thoughts about Array#compact!, Array#flatten!, Array#reject!, String#strip!, String#capitalize!, String#gsub!, etc.) — "Nikolai Weibull" <now@...> 2007/11/16

On Nov 16, 2007 12:40 PM, David A. Black <dblack@rubypal.com> wrote:

[#13632] Re: wondering about #tap — David Flanagan <david@...> 2007/11/16

David A. Black wrote:

[#13634] Re: wondering about #tap — "David A. Black" <dblack@...> 2007/11/16

Hi --

[#13636] Re: wondering about #tap — "Rick DeNatale" <rick.denatale@...> 2007/11/16

On Nov 16, 2007 12:40 PM, David A. Black <dblack@rubypal.com> wrote:

[#13637] Re: wondering about #tap — murphy <murphy@...> 2007/11/16

Rick DeNatale wrote:

[#13640] Re: wondering about #tap — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/11/16

murphy schrieb:

[#13614] Suggestion for native thread tests — "Eust痃uio Rangel" <eustaquiorangel@...>

Hi!

12 messages 2007/11/16

[#13685] Problems with \M-x in utf-8 encoded strings — Wolfgang Nádasi-Donner <ed.odanow@...>

Hi!

11 messages 2007/11/18

[#13741] retry semantics changed — Dave Thomas <dave@...>

In 1.8, I could write:

46 messages 2007/11/23
[#13742] Re: retry semantics changed — "Brian Mitchell" <binary42@...> 2007/11/23

On Nov 23, 2007 12:06 PM, Dave Thomas <dave@pragprog.com> wrote:

[#13743] Re: retry semantics changed — Dave Thomas <dave@...> 2007/11/23

[#13746] Re: retry semantics changed — Yukihiro Matsumoto <matz@...> 2007/11/23

Hi,

[#13747] Re: retry semantics changed — Dave Thomas <dave@...> 2007/11/23

[#13748] Re: retry semantics changed — Yukihiro Matsumoto <matz@...> 2007/11/23

Hi,

[#13749] Re: retry semantics changed — Dave Thomas <dave@...> 2007/11/23

Re: \u escapes in string literals: proof of concept implementation

From: David Flanagan <david@...>
Date: 2007-11-06 05:14:34 UTC
List: ruby-core #13233
Nobuyoshi Nakada wrote:
> Hi,
> 
> At Tue, 30 Oct 2007 15:30:30 +0900,
> Martin Duerst wrote in [ruby-core:13082]:
>> Please don't. If you really want, you might use \x{...} for a big-
>> endian representation of the underlying byte sequence for all encodings,
>> including UTF-8. This would mean e.g. the following:
> 
> In [ruby-dev:16603], Matz said that `codepoint isn't a byte
> representation but is a "number"'.
> 
>> Directly encoded string: "中田 伸悦"
>>
>> Using \x for UTF-8: "\xE4\xB8\xAD\xE7\x94\xB0 \xE4\xBC\xB8\xE6\x82\xA6"
>> Using \x for Shift_JIS: "\x92\x86\x93\x63 \x90\x4c\x89\x78"
>>
>> Using \x{...} for UTF-8: "\x{E4B8AD}\x{E794B0} \x{E4BCB8}\x{E682A6}"
>> Using \x{...} for Shift_JIS: "\x{9286}\x{9363} \x{904c}\x{8978}"
>>
>> Using \u (currently only UTF-8): "\u4E2D\u7530 \u4F38\u60A6"
>> Using \u (in the future potentially for Shift_JIS and others):
>>                                  "\u4E2D\u7530 \u4F38\u60A6"
> 
> Rather, "\x{4366 4544} \x{3f2d 3159}" for both of Shift_JIS and
> EUC-JP which are based on JIS0212, and "\x{4E2D 7530} \x{4F38
> 60A6}" for UTF-8, I'd expect.

So the \x escape identifies the codepoint, but not the encoding. Its
interpretation as a sequence of bytes must therefore depend on the
current primary encoding (or perhaps on the script encoding). So string
literals can have different meanings when run in different locales (or
when cut-and-pasted between programs).  The \u escape is very different:
it specifies the encoding, so the encoding of the string literal always
comes out right even if the script is run in a different locale or the
encoding of the script itself is changed.

If I'm understanding this correctly, it seems like these \x{} escapes
would be very dangerous.  (Though I suppose you could argue that they
are no more dangerous or non-portable than regular \x byte escapes.)

I think I can understand how this \x proposal would be useful for
JIS0201 codepoints: you could create string literals that would be
portable to both EUC and SJIS encodings.  Is that the use case that is
driving this?  Would Japanese programmers actually write their string
literals that way?  If so, why not call this \j (for JIS) or \k (for
Kanji) instead of \x?  And if renaming it to \j, it should probably
cause an error when used with a primary encoding that is not based on
JIS0201.

	David

In This Thread