[#12372] Release compatibility/train — Prashant Srinivasan <Prashant.Srinivasan@...>

Hello all,

28 messages 2007/10/03
[#12373] Re: Release compatibility/train — Yukihiro Matsumoto <matz@...> 2007/10/03

Hi,

[#12374] Re: Release compatibility/train — David Flanagan <david@...> 2007/10/03

Yukihiro Matsumoto wrote:

[#12376] Re: Release compatibility/train — Prashant Srinivasan <Prashant.Srinivasan@...> 2007/10/03

[#12377] Re: Release compatibility/train — Yukihiro Matsumoto <matz@...> 2007/10/03

Hi,

[#12382] Re: Release compatibility/train — Charles Oliver Nutter <charles.nutter@...> 2007/10/03

Yukihiro Matsumoto wrote:

[#12385] Re: Release compatibility/train — Yukihiro Matsumoto <matz@...> 2007/10/03

Hi,

[#12388] Re: Release compatibility/train — Charles Oliver Nutter <charles.nutter@...> 2007/10/03

Yukihiro Matsumoto wrote:

[#12389] Re: Release compatibility/train — Yukihiro Matsumoto <matz@...> 2007/10/03

Hi,

[#12406] Re: Release compatibility/train — "David A. Black" <dblack@...> 2007/10/03

Hi --

[#12383] Include Rake in Ruby 1.9 — "NAKAMURA, Hiroshi" <nakahiro@...>

-----BEGIN PGP SIGNED MESSAGE-----

20 messages 2007/10/03

[#12539] Ordered Hashes in 1.9? — Michael Neumann <mneumann@...>

Hi all,

17 messages 2007/10/08
[#12542] Re: Ordered Hashes in 1.9? — Yukihiro Matsumoto <matz@...> 2007/10/08

Hi,

[#12681] Unicode: Progress? — murphy <murphy@...>

Hello!

17 messages 2007/10/15

[#12693] retry: revised 1.9 http patch — Hugh Sasse <hgs@...>

I'm reposting this because I've had little response to this version

11 messages 2007/10/15

[#12697] Range.first is incompatible with Enumerable.first — David Flanagan <david@...>

The new Enumerable.first method is a generalization of Array.first to

11 messages 2007/10/16

[#12754] Improving 'syntax error, unexpected $end, expecting kEND'? — Hugh Sasse <hgs@...>

I've had a look at this, but can't see how to do it: When I get

17 messages 2007/10/18
[#12886] Re: Improving 'syntax error, unexpected $end, expecting kEND'? — David Flanagan <david@...> 2007/10/23

The patch below changes this message to:

[#12758] Encoding::primary_encoding — David Flanagan <david@...>

Hi,

25 messages 2007/10/18
[#12763] Re: Encoding::primary_encoding — Nobuyoshi Nakada <nobu@...> 2007/10/19

Hi,

[#12802] Re: Encoding::primary_encoding — Wolfgang N疆asi-Donner <ed.odanow@...> 2007/10/21

Nobuyoshi Nakada schrieb:

[#12803] Re: Encoding::primary_encoding — Nobuyoshi Nakada <nobu@...> 2007/10/21

Hi,

[#12804] Re: Encoding::primary_encoding — Wolfgang N疆asi-Donner <ed.odanow@...> 2007/10/21

Nobuyoshi Nakada schrieb:

[#12808] Re: Encoding::primary_encoding — Nobuyoshi Nakada <nobu@...> 2007/10/22

Hi,

[#12818] Re: Encoding::primary_encoding — Wolfgang N疆asi-Donner <ed.odanow@...> 2007/10/22

Nobuyoshi Nakada schrieb:

[#12820] Re: Encoding::primary_encoding — "Michal Suchanek" <hramrach@...> 2007/10/22

T24gMjIvMTAvMjAwNywgV29sZmdhbmcgTsOhZGFzaS1Eb25uZXIgPGVkLm9kYW5vd0B3b25hZG8u

[#12823] Re: Encoding::primary_encoding — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/10/22

Michal Suchanek schrieb:

[#12824] Re: Encoding::primary_encoding — Nobuyoshi Nakada <nobu@...> 2007/10/22

Hi,

[#12767] \u escapes in string literals: proof of concept implementation — David Flanagan <david@...>

Back at the end of August, Matz wrote (see

45 messages 2007/10/19
[#12769] Re: \u escapes in string literals: proof of concept implementation — "Nobuyoshi Nakada" <nobu@...> 2007/10/19

Hi,

[#12782] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/20

Nobuyoshi Nakada wrote:

[#12831] Re: \u escapes in string literals: proof of concept implementation — Yukihiro Matsumoto <matz@...> 2007/10/22

Hi,

[#12841] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/22

Yukihiro Matsumoto wrote:

[#12862] Re: \u escapes in string literals: proof of concept implementation — Martin Duerst <duerst@...> 2007/10/23

At 04:19 07/10/23, David Flanagan wrote:

[#12864] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/23

Martin Duerst wrote:

[#12870] Re: \u escapes in string literals: proof of concept implementation — Martin Duerst <duerst@...> 2007/10/23

At 13:10 07/10/23, David Flanagan wrote:

[#12872] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/23

Martin Duerst wrote:

[#12936] Re: \u escapes in string literals: proof of concept implementation — Yukihiro Matsumoto <matz@...> 2007/10/25

Hi,

[#12980] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/26

Yukihiro Matsumoto wrote:

[#13028] Re: \u escapes in string literals: proof of concept implementation — Nobuyoshi Nakada <nobu@...> 2007/10/29

Hi,

[#13032] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/29

Nobuyoshi Nakada wrote:

[#13034] Re: \u escapes in string literals: proof of concept implementation — Nobuyoshi Nakada <nobu@...> 2007/10/29

Hi,

[#13082] Re: \u escapes in string literals: proof of concept implementation — Martin Duerst <duerst@...> 2007/10/30

At 16:46 07/10/29, Nobuyoshi Nakada wrote:

[#13231] Re: \u escapes in string literals: proof of concept implementation — Nobuyoshi Nakada <nobu@...> 2007/11/06

Hi,

[#13234] Re: \u escapes in string literals: proof of concept implementation — Martin Duerst <duerst@...> 2007/11/06

At 11:29 07/11/06, Nobuyoshi Nakada wrote:

[#12825] clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...>

Hi,

53 messages 2007/10/22
[#12830] Re: clarification of ruby libraries installation paths? — Ben Bleything <ben@...> 2007/10/22

On Mon, Oct 22, 2007, Lucas Nussbaum wrote:

[#12833] Re: clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...> 2007/10/22

On 23/10/07 at 00:13 +0900, Ben Bleything wrote:

[#12835] Re: clarification of ruby libraries installation paths? — "Austin Ziegler" <halostatue@...> 2007/10/22

On 10/22/07, Lucas Nussbaum <lucas@lucas-nussbaum.net> wrote:

[#12836] Re: clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...> 2007/10/22

On 23/10/07 at 01:55 +0900, Austin Ziegler wrote:

[#12888] Re: clarification of ruby libraries installation paths? — Gonzalo Garramu <ggarra@...> 2007/10/23

Lucas Nussbaum wrote:

[#12894] Re: clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...> 2007/10/24

On 24/10/07 at 05:14 +0900, Gonzalo Garramu wrote:

[#13057] Re: clarification of ruby libraries installation paths? — Gonzalo Garramu <ggarra@...> 2007/10/29

Lucas Nussbaum wrote:

[#13058] Re: clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...> 2007/10/29

On 30/10/07 at 07:28 +0900, Gonzalo Garramu wrote:

[#12848] Re: clarification of ruby libraries installation paths? — Sam Roberts <sroberts@...> 2007/10/22

On Tue, Oct 23, 2007 at 01:55:29AM +0900, Austin Ziegler wrote:

[#12855] Re: clarification of ruby libraries installation paths? — "Austin Ziegler" <halostatue@...> 2007/10/23

On 10/22/07, Sam Roberts <sroberts@uniserve.com> wrote:

[#13016] Re: clarification of ruby libraries installation paths? — bob@... (Bob Proulx) 2007/10/28

Austin Ziegler wrote:

[#13029] Re: clarification of ruby libraries installation paths? — "Austin Ziegler" <halostatue@...> 2007/10/29

On 10/28/07, Bob Proulx <bob@proulx.com> wrote:

[#13054] Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — Lucas Nussbaum <lucas@...> 2007/10/29

Austin,

[#13055] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Luis Lavena" <luislavena@...> 2007/10/29

On 10/29/07, Lucas Nussbaum <lucas@lucas-nussbaum.net> wrote:

[#13064] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Austin Ziegler" <halostatue@...> 2007/10/30

On 10/29/07, Luis Lavena <luislavena@gmail.com> wrote:

[#13066] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Luis Lavena" <luislavena@...> 2007/10/30

On 10/30/07, Austin Ziegler <halostatue@gmail.com> wrote:

[#13094] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Rick Bradley" <rick@...> 2007/10/30

Do we think that maybe, just maybe, things went off the rails when the

[#13095] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Luis Lavena" <luislavena@...> 2007/10/30

On 10/30/07, Rick Bradley <rick@rickbradley.com> wrote:

[#12900] Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...>

Dear Ruby 1.9 architects, developers, and testers!

31 messages 2007/10/24
[#12905] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Yukihiro Matsumoto <matz@...> 2007/10/24

Hi,

[#12907] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/10/24

Yukihiro Matsumoto schrieb:

[#12909] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Yukihiro Matsumoto <matz@...> 2007/10/24

Hi,

[#12940] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/10/25
[#12942] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/10/25

I have a (hopefully) final question before testing all

[#12948] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Nobuyoshi Nakada <nobu@...> 2007/10/26

Hi,

[#12951] Fluent programming in Ruby — David Flanagan <david@...>

From the ChangeLog:

16 messages 2007/10/26

[#12996] General hash keys for colon notation — murphy <murphy@...>

Dear language designer(s) and parser wizards,

16 messages 2007/10/28

[#13027] Implementation of "guessUTF" method - final questions — Wolfgang Nádasi-Donner <ed.odanow@...>

Dear Ruby designers, developers, and testers!

22 messages 2007/10/29

[#13069] new Enumerable.butfirst method — David Flanagan <david@...>

Matz,

17 messages 2007/10/30

Re: \u escapes in string literals: proof of concept implementation

From: Martin Duerst <duerst@...>
Date: 2007-10-23 03:09:30 UTC
List: ruby-core #12862
At 04:19 07/10/23, David Flanagan wrote:
>Yukihiro Matsumoto wrote:
>> Hi,
>> In message "Re: \u escapes in string literals: proof of concept implementation"
>>     on Sat, 20 Oct 2007 09:50:58 +0900, David Flanagan <david@davidflanagan.com> writes:
>> |> If the current encoding is not based on the Unicode character
>> |> set, I think \u should:
>> |> |> a. raise compile error,
>> |> b. be converted from Unicode to the encoding, or
>> |> c. make the whole string UTF-8 encoding (and raise compile error
>> |>    if other non-ascii characters are there).
>> |
>> |The new patch attached here does c.  Any string with a \u in it is |forced to utf-8 encoding.  Unless the \u encodes a character in the |ASCII range, in which case it leaves the encoding alone.
>> I'd happy to merge the third patch, once I've been convinced by option
>> c.  Originally I was thinking option b, but option c is much simpler.
>
>Are you happy with \u escapes in Regexp literals?  Note that the \u is interpreted by the lexer rather than being passed on to the regexp engine.

This may be a somewhat controversial proposal, but while we are
adding \u to regular expressions, why not also add \x? Anybody know
the reason why currently \x isn't available in regular expressions?

>I don't have an argument against option b.  I didn't even consider it because I don't know how to that kind of conversion.  Are you still planning to add a method for transcoding of strings in 1.9?

I hope yes. Matz asked me and my student to work on it.
It may take a few more days or weeks.

>If we can't to transcoding explicitly, then I'd argue that making \u do transcoding doesn't make sense.  (Is "transcoding" the right word to use here?)

Yes, definitely.

>Option c was relatively simple. Note, though, that I don't currently implement the part in parentheses. That is, there is no check to see if a string has other multi-byte characters in it.  So with -Ke or -Ks you could start a Kanji string in one of those encodings and then throw in a \u escape. You'd probably end up with an string that was not legal utf-8 nor legal euc or sjis.  The \u escape could raise an error.  But what about the reverse case.  Say I start writing a string with a \u escape, then then add a Kanji character at the end of the string.  I don't believe that the lexer can currently detect that case and complain.  As I understand it, we can use \x to mess up any string, and I don't believe that the lexer applies any sanity checks to string literals.  So if we're not doing that, I'm not sure we need to apply those sanity checks to \u.

I think \x and \u are different. With \u, people are looking up a character
number and expect a character. With \x, it's a byte, and people should
get used to the fact that they really have to know what they are doing.


>On a related note, Martin submitted test cases that expected errors to be raised for string like "\uD800" because (I think) that codepoint is supposed to be part of a surrogate pair and it should be illegal to have it by itself in a string.  He's got a point, but I'm not sure that is something we want to enforce at the \u level.  If the lexer is going to complain about unpaired surrogates,

It should also complain about paired surrogates. There are no surrogate
pairs in UTF-8. \u designates a codepoint/character, but there is no
codepoint or character \uD800, so we should complain.

>I think it needs to catch them when encoded with \x or when they're just in the byte stream of the source file as well.

That would be nice, but I think it's on a different level.
As above, \x are bytes, at the user's disgression.

>Anyway, that is my "argument" for option c.

That things are simpler is quite clear. But option a wouldn't be
difficult to implement, either, I guess. My suggestion is to stay
with option a until we have a better idea of which of b and c is
really needed/implementable/... There may even be mixtures:
Transcode to current file encoding or primary encoding,...,
if it works (i.e. the actual characters can be converted),
otherwise transcode characters included literally to UTF-8.
Not that I'm saying that we need to go there, just that there
are other options that we might consider.

Regards,    Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     


In This Thread