[#12372] Release compatibility/train — Prashant Srinivasan <Prashant.Srinivasan@...>

Hello all,

28 messages 2007/10/03
[#12373] Re: Release compatibility/train — Yukihiro Matsumoto <matz@...> 2007/10/03

Hi,

[#12374] Re: Release compatibility/train — David Flanagan <david@...> 2007/10/03

Yukihiro Matsumoto wrote:

[#12376] Re: Release compatibility/train — Prashant Srinivasan <Prashant.Srinivasan@...> 2007/10/03

[#12377] Re: Release compatibility/train — Yukihiro Matsumoto <matz@...> 2007/10/03

Hi,

[#12382] Re: Release compatibility/train — Charles Oliver Nutter <charles.nutter@...> 2007/10/03

Yukihiro Matsumoto wrote:

[#12385] Re: Release compatibility/train — Yukihiro Matsumoto <matz@...> 2007/10/03

Hi,

[#12388] Re: Release compatibility/train — Charles Oliver Nutter <charles.nutter@...> 2007/10/03

Yukihiro Matsumoto wrote:

[#12389] Re: Release compatibility/train — Yukihiro Matsumoto <matz@...> 2007/10/03

Hi,

[#12406] Re: Release compatibility/train — "David A. Black" <dblack@...> 2007/10/03

Hi --

[#12383] Include Rake in Ruby 1.9 — "NAKAMURA, Hiroshi" <nakahiro@...>

-----BEGIN PGP SIGNED MESSAGE-----

20 messages 2007/10/03

[#12539] Ordered Hashes in 1.9? — Michael Neumann <mneumann@...>

Hi all,

17 messages 2007/10/08
[#12542] Re: Ordered Hashes in 1.9? — Yukihiro Matsumoto <matz@...> 2007/10/08

Hi,

[#12681] Unicode: Progress? — murphy <murphy@...>

Hello!

17 messages 2007/10/15

[#12693] retry: revised 1.9 http patch — Hugh Sasse <hgs@...>

I'm reposting this because I've had little response to this version

11 messages 2007/10/15

[#12697] Range.first is incompatible with Enumerable.first — David Flanagan <david@...>

The new Enumerable.first method is a generalization of Array.first to

11 messages 2007/10/16

[#12754] Improving 'syntax error, unexpected $end, expecting kEND'? — Hugh Sasse <hgs@...>

I've had a look at this, but can't see how to do it: When I get

17 messages 2007/10/18
[#12886] Re: Improving 'syntax error, unexpected $end, expecting kEND'? — David Flanagan <david@...> 2007/10/23

The patch below changes this message to:

[#12758] Encoding::primary_encoding — David Flanagan <david@...>

Hi,

25 messages 2007/10/18
[#12763] Re: Encoding::primary_encoding — Nobuyoshi Nakada <nobu@...> 2007/10/19

Hi,

[#12802] Re: Encoding::primary_encoding — Wolfgang N疆asi-Donner <ed.odanow@...> 2007/10/21

Nobuyoshi Nakada schrieb:

[#12803] Re: Encoding::primary_encoding — Nobuyoshi Nakada <nobu@...> 2007/10/21

Hi,

[#12804] Re: Encoding::primary_encoding — Wolfgang N疆asi-Donner <ed.odanow@...> 2007/10/21

Nobuyoshi Nakada schrieb:

[#12808] Re: Encoding::primary_encoding — Nobuyoshi Nakada <nobu@...> 2007/10/22

Hi,

[#12818] Re: Encoding::primary_encoding — Wolfgang N疆asi-Donner <ed.odanow@...> 2007/10/22

Nobuyoshi Nakada schrieb:

[#12820] Re: Encoding::primary_encoding — "Michal Suchanek" <hramrach@...> 2007/10/22

T24gMjIvMTAvMjAwNywgV29sZmdhbmcgTsOhZGFzaS1Eb25uZXIgPGVkLm9kYW5vd0B3b25hZG8u

[#12823] Re: Encoding::primary_encoding — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/10/22

Michal Suchanek schrieb:

[#12824] Re: Encoding::primary_encoding — Nobuyoshi Nakada <nobu@...> 2007/10/22

Hi,

[#12767] \u escapes in string literals: proof of concept implementation — David Flanagan <david@...>

Back at the end of August, Matz wrote (see

45 messages 2007/10/19
[#12769] Re: \u escapes in string literals: proof of concept implementation — "Nobuyoshi Nakada" <nobu@...> 2007/10/19

Hi,

[#12782] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/20

Nobuyoshi Nakada wrote:

[#12831] Re: \u escapes in string literals: proof of concept implementation — Yukihiro Matsumoto <matz@...> 2007/10/22

Hi,

[#12841] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/22

Yukihiro Matsumoto wrote:

[#12862] Re: \u escapes in string literals: proof of concept implementation — Martin Duerst <duerst@...> 2007/10/23

At 04:19 07/10/23, David Flanagan wrote:

[#12864] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/23

Martin Duerst wrote:

[#12870] Re: \u escapes in string literals: proof of concept implementation — Martin Duerst <duerst@...> 2007/10/23

At 13:10 07/10/23, David Flanagan wrote:

[#12872] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/23

Martin Duerst wrote:

[#12936] Re: \u escapes in string literals: proof of concept implementation — Yukihiro Matsumoto <matz@...> 2007/10/25

Hi,

[#12980] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/26

Yukihiro Matsumoto wrote:

[#13028] Re: \u escapes in string literals: proof of concept implementation — Nobuyoshi Nakada <nobu@...> 2007/10/29

Hi,

[#13032] Re: \u escapes in string literals: proof of concept implementation — David Flanagan <david@...> 2007/10/29

Nobuyoshi Nakada wrote:

[#13034] Re: \u escapes in string literals: proof of concept implementation — Nobuyoshi Nakada <nobu@...> 2007/10/29

Hi,

[#13082] Re: \u escapes in string literals: proof of concept implementation — Martin Duerst <duerst@...> 2007/10/30

At 16:46 07/10/29, Nobuyoshi Nakada wrote:

[#13231] Re: \u escapes in string literals: proof of concept implementation — Nobuyoshi Nakada <nobu@...> 2007/11/06

Hi,

[#13234] Re: \u escapes in string literals: proof of concept implementation — Martin Duerst <duerst@...> 2007/11/06

At 11:29 07/11/06, Nobuyoshi Nakada wrote:

[#12825] clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...>

Hi,

53 messages 2007/10/22
[#12830] Re: clarification of ruby libraries installation paths? — Ben Bleything <ben@...> 2007/10/22

On Mon, Oct 22, 2007, Lucas Nussbaum wrote:

[#12833] Re: clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...> 2007/10/22

On 23/10/07 at 00:13 +0900, Ben Bleything wrote:

[#12835] Re: clarification of ruby libraries installation paths? — "Austin Ziegler" <halostatue@...> 2007/10/22

On 10/22/07, Lucas Nussbaum <lucas@lucas-nussbaum.net> wrote:

[#12836] Re: clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...> 2007/10/22

On 23/10/07 at 01:55 +0900, Austin Ziegler wrote:

[#12888] Re: clarification of ruby libraries installation paths? — Gonzalo Garramu <ggarra@...> 2007/10/23

Lucas Nussbaum wrote:

[#12894] Re: clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...> 2007/10/24

On 24/10/07 at 05:14 +0900, Gonzalo Garramu wrote:

[#13057] Re: clarification of ruby libraries installation paths? — Gonzalo Garramu <ggarra@...> 2007/10/29

Lucas Nussbaum wrote:

[#13058] Re: clarification of ruby libraries installation paths? — Lucas Nussbaum <lucas@...> 2007/10/29

On 30/10/07 at 07:28 +0900, Gonzalo Garramu wrote:

[#12848] Re: clarification of ruby libraries installation paths? — Sam Roberts <sroberts@...> 2007/10/22

On Tue, Oct 23, 2007 at 01:55:29AM +0900, Austin Ziegler wrote:

[#12855] Re: clarification of ruby libraries installation paths? — "Austin Ziegler" <halostatue@...> 2007/10/23

On 10/22/07, Sam Roberts <sroberts@uniserve.com> wrote:

[#13016] Re: clarification of ruby libraries installation paths? — bob@... (Bob Proulx) 2007/10/28

Austin Ziegler wrote:

[#13029] Re: clarification of ruby libraries installation paths? — "Austin Ziegler" <halostatue@...> 2007/10/29

On 10/28/07, Bob Proulx <bob@proulx.com> wrote:

[#13054] Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — Lucas Nussbaum <lucas@...> 2007/10/29

Austin,

[#13055] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Luis Lavena" <luislavena@...> 2007/10/29

On 10/29/07, Lucas Nussbaum <lucas@lucas-nussbaum.net> wrote:

[#13064] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Austin Ziegler" <halostatue@...> 2007/10/30

On 10/29/07, Luis Lavena <luislavena@gmail.com> wrote:

[#13066] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Luis Lavena" <luislavena@...> 2007/10/30

On 10/30/07, Austin Ziegler <halostatue@gmail.com> wrote:

[#13094] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Rick Bradley" <rick@...> 2007/10/30

Do we think that maybe, just maybe, things went off the rails when the

[#13095] Re: Austin Ziegler's behaviour (Was: clarification of ruby libraries installation paths?) — "Luis Lavena" <luislavena@...> 2007/10/30

On 10/30/07, Rick Bradley <rick@rickbradley.com> wrote:

[#12900] Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...>

Dear Ruby 1.9 architects, developers, and testers!

31 messages 2007/10/24
[#12905] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Yukihiro Matsumoto <matz@...> 2007/10/24

Hi,

[#12907] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/10/24

Yukihiro Matsumoto schrieb:

[#12909] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Yukihiro Matsumoto <matz@...> 2007/10/24

Hi,

[#12940] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/10/25
[#12942] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Wolfgang Nádasi-Donner <ed.odanow@...> 2007/10/25

I have a (hopefully) final question before testing all

[#12948] Re: Hopefully Complete List of Possible Encoding Specifications - Existing Ones — Nobuyoshi Nakada <nobu@...> 2007/10/26

Hi,

[#12951] Fluent programming in Ruby — David Flanagan <david@...>

From the ChangeLog:

16 messages 2007/10/26

[#12996] General hash keys for colon notation — murphy <murphy@...>

Dear language designer(s) and parser wizards,

16 messages 2007/10/28

[#13027] Implementation of "guessUTF" method - final questions — Wolfgang Nádasi-Donner <ed.odanow@...>

Dear Ruby designers, developers, and testers!

22 messages 2007/10/29

[#13069] new Enumerable.butfirst method — David Flanagan <david@...>

Matz,

17 messages 2007/10/30

Re: \u escapes in string literals: proof of concept implementation

From: Nobuyoshi Nakada <nobu@...>
Date: 2007-10-25 08:05:21 UTC
List: ruby-core #12927
Hi,

At Thu, 25 Oct 2007 15:30:39 +0900,
David Flanagan wrote in [ruby-core:12923]:
> Well if you're writing the patch, we can take the "proof of concept" out 
> of the subject line!

But it didn't work enough.

> Thanks for this!  My recent patch (just posted a bit before yours) 
> includes a version of Martin's test cases in 
> tests/ruby/test_unicode_escape.rb

Thank you, I ran it and found a few bugs, and also found
oniguruma doesn't take care of \u by default.  It may be
reasonable to convert them in the parser, but /\x41/ == /A/ has
returned false...

An additional patch to the previous one.  With this patch,
\x{...} means a codepoint in the given encoding, while \xXX
means a raw byte as well as it's done now.


diff -U2 parse.y parse.y
--- parse.y	(working copy)
+++ parse.y	(working copy)
@@ -4952,4 +4952,5 @@
 	c = scan_hex(lex_p, 8, numlen);
 	if (!*numlen) goto invalid;
+	lex_p += *numlen;
 	if (!peek('}')) {
 	    yyerror("unterminated hex escape");
@@ -4966,4 +4967,5 @@
 	    return 0;
 	}
+	lex_p += *numlen;
     }
     return c;
@@ -4982,5 +4984,5 @@
 	    return 0;
 	}
-	if (codepoint > 0x7fffffff) {
+	if (codepoint > 0x10ffff) {
 	    yyerror("illegal Unicode codepoint (too large)");
 	    return 0;
@@ -5015,5 +5017,7 @@
 {
     int c;
+    int numlen;
 
+    if (mb) *mb = ENC_CODERANGE_SINGLE;
     switch (c = nextc()) {
       case '\\':	/* Backslash */
@@ -5056,22 +5060,19 @@
       case 'x':	/* hex constant */
 	if (flags & (ESCAPE_CONTROL|ESCAPE_META)) goto eof;
-	{
-	    int numlen;
-
-	    c = tok_hex(&numlen);
-	    if (numlen == 0) goto eof;
+	c = tok_hex(&numlen);
+	if (numlen == 0) return 0;
+	if (mb && (c >= 0x80)) {
+	    if (numlen == 2)
+		*mb = ENC_CODERANGE_UNKNOWN;
+	    else
+		*mb = ENC_CODERANGE_MULTI;
 	}
-	if (mb && (c >= 0x80)) *mb = ENC_CODERANGE_UNKNOWN;
 	return c;
 
       case 'u':	/* hex constant */
 	if (flags & (ESCAPE_CONTROL|ESCAPE_META)) goto eof;
-	{
-	    int numlen;
-
-	    c = tok_utf8(&numlen, encp);
-	    if (numlen == 0) goto eof;
-	}
-	if (mb && (c >= 0x80)) *mb = ENC_CODERANGE_UNKNOWN;
+	c = tok_utf8(&numlen, encp);
+	if (numlen == 0) return 0;
+	if (mb && (c >= 0x80)) *mb = ENC_CODERANGE_MULTI;
 	return c;
 
@@ -5125,4 +5126,13 @@
 #define tokcopy(n) memcpy(tokspace(n), lex_p - (n), (n))
 
+static void
+parser_tokaddmbc(struct parser_params *parser, int c, rb_encoding *enc)
+{
+    int len = rb_enc_codelen(c, enc);
+    rb_enc_mbcput(c, tokspace(len), enc);
+}
+
+#define tokaddmbc(c, enc) parser_tokaddmbc(parser, c, enc)
+
 static int
 parser_tokadd_escape(struct parser_params *parser, int term, int *mb, rb_encoding **encp)
@@ -5173,8 +5183,6 @@
 	    uc = tok_utf8(&numlen, encp);
 	    if (numlen == 0) goto eof;
-	    lex_p += numlen;
-	    tokcopy(numlen + 2);
+	    tokaddmbc(uc, *encp);
 	    if (mb && (uc >= 0x80)) *mb = ENC_CODERANGE_MULTI;
-	    if (uc >= 0x80) return 1;
 	}
 	return 0;
@@ -5294,13 +5302,4 @@
 #define tokadd_mbchar(c) parser_tokadd_mbchar(parser, c)
 
-static void
-parser_tokaddmbc(struct parser_params *parser, int c, rb_encoding *enc)
-{
-    int len = rb_enc_codelen(c, enc);
-    rb_enc_mbcput(c, tokspace(len), enc);
-}
-
-#define tokaddmbc(c, enc) parser_tokaddmbc(parser, c, enc)
-
 static int
 parser_tokadd_string(struct parser_params *parser,
@@ -5374,13 +5373,16 @@
 		}
 		else if (func & STR_FUNC_EXPAND) {
+		    int tmb = ENC_CODERANGE_SINGLE;
 		    pushback(c);
 		    if (func & STR_FUNC_ESCAPE) tokadd('\\');
-		    c = read_escape(0, mb, &enc);
-		    if (has_mb && enc != *encp) {
-			mixed_escape(beg, enc, *encp);
-			continue;
-		    }
-		    if (c >= 0x80) {
-			tokaddmbc(c, enc);
+		    c = read_escape(0, &tmb, &enc);
+		    if (mb) *mb = tmb;
+		    if (tmb == ENC_CODERANGE_MULTI) {
+			if (has_mb && enc != *encp) {
+			    mixed_escape(beg, enc, *encp);
+			}
+			else {
+			    tokaddmbc(c, enc);
+			}
 			continue;
 		    }
@@ -6256,7 +6258,11 @@
 	    goto ternary;
 	}
-	else if (c == '\\' && (c = read_escape(0, 0, &enc)) >= 0x80) {
-	    mb = ENC_CODERANGE_UNKNOWN;
-	    tokaddmbc(c, enc);
+	else if (c == '\\' && (c = read_escape(0, &mb, &enc)) >= 0x80) {
+	    if (mb == ENC_CODERANGE_MULTI) {
+		tokaddmbc(c, enc);
+	    }
+	    else {
+		tokadd(c);
+	    }
 	}
 	else {


-- 
Nobu Nakada

In This Thread