[ruby-core:33014] [Ruby 1.9-Bug#4014] Case-Sensitivity of Property Names Depends on Regexp Encoding

From: Yui NARUSE <redmine@...>
Date: 2010-11-02 18:41:13 UTC
List: ruby-core #33014
Issue #4014 has been updated by Yui NARUSE.


Hmm, it's a difficult problem...

> run@paint:~$ ruby -e 'p /\p{ascii}/u'
> /\p{ascii}/
> run@paint:~$ ruby -e 'p /\p{ascii}/n'
> -e:1: invalid character property name {ascii}: /\p{ascii}/
> run@paint:~$ ruby -e 'p /\p{ASCII}/n'
> /\p{ASCII}/n
> run@paint:~$ ruby -e 'p /\p{ASCII}/u'
> /\p{ASCII}/

One possible spec would reject \p/\P in non-Unicode regexps, but that breaks compatibility for these names:
  Alnum, Alpha, Blank, Cntrl, Digit, Graph, Lower,
  Print, Punct, Space, Upper, XDigit, ASCII, Word
They are case-sensitive, and this applies only to \p/\P, not to [[:alnum:]].
A good side effect is that we could assume /\p{Alpha}/ must be a UTF-8 regexp.

Another possible spec would allow only lower case for non-Unicode regexps, but it seems too late for that.
Martin says Unicode's guideline is wrong, but for compatibility with both Ruby and other languages,
following the guideline seems correct.

RunPaint's suggestion is a reasonable one; the patch follows:

diff --git a/regenc.c b/regenc.c
index b9b03b0..f0ddd2c 100644
--- a/regenc.c
+++ b/regenc.c
@@ -789,20 +789,20 @@ extern int
 onigenc_minimum_property_name_to_ctype(OnigEncoding enc, UChar* p, UChar* end)
 {
   static const PosixBracketEntryType PBS[] = {
-    PosixBracketEntryInit("Alnum",  ONIGENC_CTYPE_ALNUM),
-    PosixBracketEntryInit("Alpha",  ONIGENC_CTYPE_ALPHA),
-    PosixBracketEntryInit("Blank",  ONIGENC_CTYPE_BLANK),
-    PosixBracketEntryInit("Cntrl",  ONIGENC_CTYPE_CNTRL),
-    PosixBracketEntryInit("Digit",  ONIGENC_CTYPE_DIGIT),
-    PosixBracketEntryInit("Graph",  ONIGENC_CTYPE_GRAPH),
-    PosixBracketEntryInit("Lower",  ONIGENC_CTYPE_LOWER),
-    PosixBracketEntryInit("Print",  ONIGENC_CTYPE_PRINT),
-    PosixBracketEntryInit("Punct",  ONIGENC_CTYPE_PUNCT),
-    PosixBracketEntryInit("Space",  ONIGENC_CTYPE_SPACE),
-    PosixBracketEntryInit("Upper",  ONIGENC_CTYPE_UPPER),
-    PosixBracketEntryInit("XDigit", ONIGENC_CTYPE_XDIGIT),
-    PosixBracketEntryInit("ASCII",  ONIGENC_CTYPE_ASCII),
-    PosixBracketEntryInit("Word",   ONIGENC_CTYPE_WORD),
+    PosixBracketEntryInit("alnum",  ONIGENC_CTYPE_ALNUM),
+    PosixBracketEntryInit("alpha",  ONIGENC_CTYPE_ALPHA),
+    PosixBracketEntryInit("blank",  ONIGENC_CTYPE_BLANK),
+    PosixBracketEntryInit("cntrl",  ONIGENC_CTYPE_CNTRL),
+    PosixBracketEntryInit("digit",  ONIGENC_CTYPE_DIGIT),
+    PosixBracketEntryInit("graph",  ONIGENC_CTYPE_GRAPH),
+    PosixBracketEntryInit("lower",  ONIGENC_CTYPE_LOWER),
+    PosixBracketEntryInit("print",  ONIGENC_CTYPE_PRINT),
+    PosixBracketEntryInit("punct",  ONIGENC_CTYPE_PUNCT),
+    PosixBracketEntryInit("space",  ONIGENC_CTYPE_SPACE),
+    PosixBracketEntryInit("upper",  ONIGENC_CTYPE_UPPER),
+    PosixBracketEntryInit("xdigit", ONIGENC_CTYPE_XDIGIT),
+    PosixBracketEntryInit("ascii",  ONIGENC_CTYPE_ASCII),
+    PosixBracketEntryInit("word",   ONIGENC_CTYPE_WORD),
   };

   const PosixBracketEntryType *pb, *pbe;
@@ -811,7 +811,7 @@ onigenc_minimum_property_name_to_ctype(OnigEncoding enc, UChar* p, UChar* end)
   len = onigenc_strlen(enc, p, end);
   for (pbe = (pb = PBS) + sizeof(PBS)/sizeof(PBS[0]); pb < pbe; ++pb) {
     if (len == pb->len &&
-        onigenc_with_ascii_strncmp(enc, p, end, pb->name, pb->len) == 0)
+        STRNCASECMP(p, pb->name, pb->len) == 0)
       return pb->ctype;
   }
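
For readers who want to try the idea outside the Ruby tree, here is a minimal standalone sketch of the lookup the patch aims for: names stored in lower case and matched without regard to case. It is not the regenc.c code. The names prop_entry, PROPS, ascii_ncasecmp, lookup_property and the CT_* values are hypothetical stand-ins, the table is shortened to four entries, and the sketch assumes the property name arrives as plain single-byte ASCII.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ONIGENC_CTYPE_* constants. */
enum ctype { CT_NONE = -1, CT_ALNUM, CT_ALPHA, CT_ASCII, CT_WORD };

/* One table entry: a property name stored in lower case. */
struct prop_entry {
    const char *name;
    size_t len;
    enum ctype ctype;
};

/* Shortened table; the real one in the patch has 14 entries. */
static const struct prop_entry PROPS[] = {
    { "alnum", 5, CT_ALNUM },
    { "alpha", 5, CT_ALPHA },
    { "ascii", 5, CT_ASCII },
    { "word",  4, CT_WORD  },
};

/* Compare n bytes of a and b, ignoring ASCII case.
 * Assumption: both inputs are single-byte ASCII text. */
static int ascii_ncasecmp(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);
        if (ca != cb)
            return ca - cb;
    }
    return 0;
}

/* Return the ctype for a property name of the given length,
 * or CT_NONE when no entry matches. */
static enum ctype lookup_property(const char *name, size_t len)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(PROPS) / sizeof(PROPS[0]); i++) {
        /* Check the length first so "Wor" cannot match "word". */
        if (len == PROPS[i].len &&
            ascii_ncasecmp(name, PROPS[i].name, len) == 0)
            return PROPS[i].ctype;
    }
    return CT_NONE;
}
```

With this sketch, lookup_property("ascii", 5) and lookup_property("ASCII", 5) return the same value, which is the behaviour the report asks for. A byte-wise comparison like this only makes sense when the pattern's encoding is ASCII-compatible.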

----------------------------------------
http://redmine.ruby-lang.org/issues/show/4014

----------------------------------------
http://redmine.ruby-lang.org
