\xDE instead of \336 (was: Re: \u escapes in string literals: proof of concept implementation)

From: Martin Duerst <duerst@...>
Date: 2007-11-12 11:10:37 UTC
List: ruby-core #13436
At 05:51 07/11/07, Yukihiro Matsumoto wrote:

>    on Tue, 6 Nov 2007 14:58:24 +0900, Martin Duerst 
><duerst@it.aoyama.ac.jp> writes:

>Your proposal of using hexadecimal instead of octal in
>[ruby-core:13026] sounds interesting.

Thanks. Encouraged by this, I worked on a tiny patch, and a small
test file, which are attached below. Please take a look.

Please note that I have chosen upper-case hex escapes.
The other choice is lower-case hex escapes. I'm sure that
there will be opinions on this point, but I don't really
have one, so please just choose whichever you prefer.
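
For comparison, here is a small sketch of what the two notations look like
for the same byte. The helper names octal_escape and hex_escape are
hypothetical and are not part of the patch; they only mirror the sprintf
format strings that the patch replaces and introduces.

```ruby
# Hypothetical helpers for illustration only; they reuse the format
# strings from the patch but are not code from string.c or re.c.

# Old behaviour: three-digit octal escape, e.g. \336
def octal_escape(byte)
  format("\\%03o", byte & 0377)
end

# Proposed behaviour: two-digit hex escape, upper-case by default
def hex_escape(byte, upper: true)
  format(upper ? "\\x%02X" : "\\x%02x", byte & 0xFF)
end

puts octal_escape(0xDE)              # \336
puts hex_escape(0xDE)                # \xDE
puts hex_escape(0xDE, upper: false)  # \xde
```

Both hex variants are four characters long, one fewer than the octal form,
so switching between upper and lower case should only be a matter of
changing the format string.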

Please also note that I wasn't as successful for regular expressions
as for Strings and Symbols. Regular expressions seem to keep their
literal representation internally, whereas strings get parsed.
Or I just haven't understood how they work at all yet.

I'm not at all sure whether this covers enough cases, or whether
this goes too far. Probably trying it out will show where things
break. I have run 'make test', and it stops at the same place with
and without the patch, which may not say much. For your reference,
the test that breaks is at line 49 of bootstraptest/test_knownbug.rb.
The following code:
    Regexp.union(
      "a",
      Regexp.new("\x80".force_encoding("euc-jp")),
      Regexp.new("\x80".force_encoding("utf-8")))
should obviously raise an exception, but doesn't.

Regards,    Martin.

P.S.: Just for the record.

>|>Rather, "\x{4366 4544} \x{3f2d 3159}" for both of Shift_JIS and
>|>EUC-JP
>|
>|Can you explain how you got these numbers?
>
>Perhaps JIS 0208 KU for higher byte, and TEN for lower byte, in other
>words, EUC-JP code point with each byte MSB masked.

The second half of your description is correct. But in my understanding,
Ku and Ten start at 1, so they would be the EUC-JP code point
with each byte masked *and then 32 subtracted*.
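
As a worked example of that arithmetic, the sketch below takes the first
pair from the quoted text (43 66). It assumes that the corresponding EUC-JP
bytes are 0xC3 0xE6, i.e. the same values with the MSB set; the helper name
euc_jp_to_ku_ten is hypothetical.

```ruby
# Hypothetical helper for illustration: takes the two bytes of an
# EUC-JP sequence in the JIS X 0208 range and returns both the masked
# bytes and the Ku/Ten pair derived from them.
def euc_jp_to_ku_ten(hi, lo)
  masked = [hi & 0x7F, lo & 0x7F]       # clear the MSB of each byte
  ku_ten = masked.map { |b| b - 0x20 }  # subtract 32 so numbering starts at 1
  { masked: masked, ku_ten: ku_ten }
end

# Assumption: 0xC3 0xE6 is the EUC-JP form of the quoted 43 66.
r = euc_jp_to_ku_ten(0xC3, 0xE6)
puts r[:masked].map { |b| format("%02x", b) }.join  # 4366
puts r[:ku_ten].inspect                             # [35, 70]

# The lowest byte in this range, 0xA1, maps to 1 after both steps.
puts euc_jp_to_ku_ten(0xA1, 0xA1)[:ku_ten].inspect  # [1, 1]
```

So 4366 is the masked form, while Ku/Ten in the strict sense would be 35/70.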




#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     

Attachments (2)

patch_hexescapes.txt (866 Bytes, text/x-diff)
Index: string.c
===================================================================
--- string.c	(revision 13898)
+++ string.c	(working copy)
@@ -2946,7 +2946,7 @@
 	    char buf[5];
 	    char *s = buf;
 
-	    sprintf(buf, "\\%03o", c & 0377);
+	    sprintf(buf, "\\x%02X", c & 0xFF);
 	    while (*s) {
 		str_cat_char(result, *s++, enc);
 	    }
@@ -3057,7 +3057,7 @@
 	}
 	else {
 	    *q++ = '\\';
-	    sprintf(q, "%03o", c&0xff);
+	    sprintf(q, "x%02X", c&0xff);
 	    q += 3;
 	}
     }
Index: re.c
===================================================================
--- re.c	(revision 13898)
+++ re.c	(working copy)
@@ -255,7 +255,7 @@
 	    else if (!rb_enc_isspace(*p, enc)) {
 		char b[8];
 
-		sprintf(b, "\\%03o", *p & 0377);
+		sprintf(b, "\\x%02X", *p & 0xFF);
 		rb_str_buf_cat(str, b, 4);
 	    }
 	    else {
testhex.rb (530 Bytes, text/x-ruby)
require 'test/unit'
class TestHexEscape < Test::Unit::TestCase
  def setup
    @s = "abc\336f"
  end
  
  def test_escapes
    assert_equal "\"abc\\xDEf\"", "abc\336f".inspect
    assert_equal "\"abc\\xDEf\"", "abc\336f".dump
    assert_equal "/abc\\xDEf/",   Regexp.new("abc\336f").inspect
    assert_equal ":\"abc\\xDEf\"",  :"abc\336f".inspect
  end
end
  

s = "abc\xdef"
sym = s.to_sym
r = Regexp.new(s)

puts s.inspect
puts sym.inspect
puts r.inspect
puts s.dump


require 'pp'
pp s
pp sym
pp r
