[ruby-core:50969] [ruby-trunk - Bug #4044] Regex matching errors when using \W character class and /i option

From: "ben_h (Ben Hoskings)" <ben@...>
Date: 2012-12-18 23:13:20 UTC
List: ruby-core #50969

Issue #4044 has been updated by ben_h (Ben Hoskings).


Hi all, long time no see :)

naruse (Yui NARUSE) wrote:
> =begin
>  > The current behavior means that \W does not mean [^A-Za-z0-9_] in Ruby 1.9 in some cases.
>  
>  Unicode ignore case breaks it.
>  http://unicode.org/reports/tr21/
>  
>  212A; C; 006B; # KELVIN SIGN
>  00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S
>  http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
>  
>  \W includes U+212A and U+00DF
>  /i adds U+006B (k) and U+0073 (s) to [\W]
>  ^ reverses the class; it doesn't include k & s.

I think I see the misunderstanding: there are multiple characters that render as 'k' and 's'.

K, S, k, s are basic word characters, and so [^\W] should match them (along with all A-Z and a-z):
0x004B (Latin capital letter K)
0x0053 (Latin capital letter S)
0x006B (Latin small letter k)
0x0073 (Latin small letter s)

But I'm not sure how [^\W] should treat these characters:
0x00DF (Latin small letter sharp s)
0x017F (Latin small letter long s)
0x212A (Kelvin sign)


The important thing is that all the characters in A-Z (0x41-0x5A) & a-z (0x61-0x7A) are word characters, so [^\W] should match all of them.
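
To make the distinction concrete, here is a minimal sketch that lists the seven code points above and flags which ones fall inside the plain A-Z / a-z ranges. The names CHARS and ascii_letter? are made up for this illustration and are not part of Ruby; the sketch says nothing about what the regex engine does, only which characters are basic letters.

```ruby
# Characters from the discussion above, written as \u escapes so the
# source itself stays plain ASCII. CHARS is a hypothetical name.
CHARS = {
  "LATIN CAPITAL LETTER K"     => "\u004B",
  "LATIN CAPITAL LETTER S"     => "\u0053",
  "LATIN SMALL LETTER K"       => "\u006B",
  "LATIN SMALL LETTER S"       => "\u0073",
  "LATIN SMALL LETTER SHARP S" => "\u00DF",
  "LATIN SMALL LETTER LONG S"  => "\u017F",
  "KELVIN SIGN"                => "\u212A"
}

# Hypothetical helper: true only for code points in A-Z (0x41-0x5A)
# or a-z (0x61-0x7A).
def ascii_letter?(ch)
  cp = ch.ord
  (0x41..0x5A).cover?(cp) || (0x61..0x7A).cover?(cp)
end

# Print one line per character: code point, name, and the range check.
CHARS.each do |name, ch|
  printf("U+%04X  %-27s ascii_letter=%s\n", ch.ord, name, ascii_letter?(ch))
end
```

Only the first four entries pass the range check; the sharp s, long s and Kelvin sign are the ones whose treatment under /i is in question.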

Cheers,
Ben

----------------------------------------
Bug #4044: Regex matching errors when using \W character class and /i option
https://bugs.ruby-lang.org/issues/4044#change-34835

Author: ben_h (Ben Hoskings)
Status: Feedback
Priority: Normal
Assignee: naruse (Yui NARUSE)
Category: core
Target version: 1.9.2
ruby -v: ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-darwin10.4.0]


=begin
 Hi all,
 
 Josh Bassett and I just discovered an issue with regex matches on ruby-1.9.2p0. (We reduced it while we were hacking on gemcutter.)
 
 The case-insensitive (/i) option, together with the non-word character class (\W), matches inconsistently against the alphabet. Specifically, the regex doesn't match properly against the letters 'k' and 's'.
 
 The following expression demonstrates the problem in irb:
 
     puts ('a'..'z').to_a.map {|c| [c, c.ord, c[/[^\W]/i] ].inspect }
 
 As a reference, the following two expressions are working properly:
 
     puts ('a'..'z').to_a.map {|c| [c, c.ord, c[/[^\W]/] ].inspect }
     puts ('a'..'z').to_a.map {|c| [c, c.ord, c[/[\w]/i] ].inspect }
 
 Cheers
 Ben Hoskings & Josh Bassett
=end
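
The three expressions in the report can be condensed into one check that lists only the letters a pattern fails to match, which is easier to compare across Ruby builds. This is a sketch only: unmatched_letters is a name invented here, and no expected output is shown for the reported patterns because the result depends on the Ruby version in use.

```ruby
# Hypothetical helper: returns the lowercase ASCII letters that the
# given regex does NOT match. An empty array means all 26 matched.
def unmatched_letters(regex)
  ('a'..'z').reject { |c| c =~ regex }
end

# The pattern from the report, followed by the two reference patterns.
# Results are printed rather than asserted, since they vary by version.
p unmatched_letters(/[^\W]/i)
p unmatched_letters(/[^\W]/)
p unmatched_letters(/[\w]/i)
```

On an affected build the first line would be expected to list the letters named in the report, while the other two print an empty array.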



-- 
http://bugs.ruby-lang.org/
