From: "akr (Akira Tanaka)"
Date: 2012-04-05T13:43:21+09:00
Subject: [ruby-core:44141] [ruby-trunk - Bug #6258] String#succ has suprising behavior for "\u1036" (MYANMAR SIGN ANUSVARA), producing "\u1000" instead of "\u1037"

Issue #6258 has been updated by akr (Akira Tanaka).

"\u1036".succ is "\u1000\u1000", not a single character.

  % ruby -ve 'puts "\u1036".succ.dump'
  ruby 2.0.0dev (2012-03-16 trunk 35049) [x86_64-linux]
  "\u{1000}\u{1000}"

This is similar to how "z".succ is "aa".

It happens because U+1000 to U+1036 are alphabetic characters and U+0FFF and U+1037 are not.

  % ruby -e '0xfff.upto(0x1037) {|c| p ["%x" % c, /[[:alpha:]]/ =~ c.chr("UTF-8")] }'
  ["fff", nil]
  ["1000", 0]
  ...
  ["1036", 0]
  ["1037", nil]

What I'm not sure about is whether U+1036 should be alphabetic or not.
I think nurse-san or martin-sensei would be the right person for this matter.

----------------------------------------
Bug #6258: String#succ has suprising behavior for "\u1036" (MYANMAR SIGN ANUSVARA), producing "\u1000" instead of "\u1037"
https://bugs.ruby-lang.org/issues/6258#change-25664

Author: dbenhur (Devin Ben-Hur)
Status: Assigned
Priority: Normal
Assignee: akr (Akira Tanaka)
Category: M17N
Target version:
ruby -v: ruby 1.9.3p125, ruby 1.9.2p180,

"\u1036".succ.ord.to_s(16) # => "1000"

Discovered when investigating StackOverflow question http://stackoverflow.com/questions/10020230/anomalous-behavior-while-comparing-a-unicode-character-to-a-unicode-character-range

Range#=== ultimately invokes String#upto, which uses String#succ.

("\u1036".."\u1037").to_a.map{|c| c.ord.to_s(16)}
=> ["1036"] # expected ["1036","1037"]

Also, once #succ! proceeds past U+1036, it continues to produce U+1000 indefinitely:

irb(main):115:0> c = "\u1035"
=> "ဵ"
irb(main):116:0> c.ord.to_s(16)
=> "1035"
irb(main):117:0> c.succ!.ord.to_s(16)
=> "1036"
irb(main):118:0> c.succ!.ord.to_s(16)
=> "1000"
irb(main):119:0> c.succ!.ord.to_s(16)
=> "1000"

But if one starts naturally at U+1000, #succ! increments as expected:

irb(main):001:0> c = "\u1000"
=> "က"
irb(main):002:0> c.ord.to_s(16)
=> "1000"
irb(main):003:0> c.succ!.ord.to_s(16)
=> "1001"
irb(main):004:0> c.succ!.ord.to_s(16)
=> "1002"

-- 
http://bugs.ruby-lang.org/
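
The report inspects results with String#ord, which shows only the first character and so hides the carry described in the reply. A minimal sketch for checking this locally follows. The helper names hex_codepoints and codepoint_range are hypothetical (they are not part of Ruby), and the output for "\u1036".succ is deliberately not annotated because it may depend on the Ruby version and its Unicode tables.

```ruby
# Hypothetical helper: list every code point of a string in hex, so a
# multi-character result is not hidden the way String#ord hides it.
def hex_codepoints(str)
  str.each_char.map { |ch| ch.ord.to_s(16) }
end

# Hypothetical helper: enumerate characters by code point instead of by
# String#succ, so no alphanumeric carry rule is involved.
def codepoint_range(first, last)
  (first.ord..last.ord).map { |cp| cp.chr(Encoding::UTF_8) }
end

# ASCII carry: "z" rolls over and grows to two characters.
p hex_codepoints("z".succ)

# The case from the report; compare against the transcripts above.
p hex_codepoints("\u1036".succ)

# Code-point enumeration, for comparison with the expected Range result.
p codepoint_range("\u1036", "\u1037").map { |c| c.ord.to_s(16) }
```

Enumerating by code point sidesteps String#succ entirely, which is one possible way to get the ["1036","1037"] result the report expected. It does not change what Range#=== or String#upto do.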