From: duerst@...
Date: 2017-02-27T10:58:58+00:00
Subject: [ruby-core:79798] [Ruby trunk Bug#13220] Enhance support of Unicode strings manipulation

Issue #13220 has been updated by Martin Dürst.

Nobuyoshi Nakada wrote:

> Note that these results are in NFD.
> It seems to result as expected by using NFC.

This is mostly true, but there are 'visual' characters that cannot be expressed in a single code point in Unicode. As an example:

    "q̈".unicode_normalize.gsub("q", "x") # => "ẍ"

(The "q̈" may show with the two dots above the q or after it, depending on the font and rendering engine used by your browser or mailer; in my case, the dots appear after, but the cursor moves across the q and the dots with a single key press.)

For many of the tests, applying them to grapheme clusters might work, but there may be languages where it won't be that easy.

Also, I don't understand why the author expects "a��" for "ä".next, but is happy for "ä".upto("c̈").to_a to cycle through ["ä", "b̈", "c̈"]. Here, the expectations seem to be inconsistent, but it also has to be said that, e.g., Swedes would expect "ä".next to be "ö" (see https://en.wikipedia.org/wiki/Swedish_alphabet).

----------------------------------------
Bug #13220: Enhance support of Unicode strings manipulation
https://bugs.ruby-lang.org/issues/13220#change-63227

* Author: Radovan Smitala
* Status: Feedback
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-darwin16]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Hi,

In the last few days, Starr Horne posted very interesting test results about manipulating Unicode strings in Ruby 2.4, and many methods don't work as expected.

Article: http://blog.honeybadger.io/ruby-s-unicode-support/

--
https://bugs.ruby-lang.org/
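
As a rough illustration of the reply's point about "q" with two dots and about working on grapheme clusters, here is a minimal sketch. It assumes the dots are U+0308 COMBINING DIAERESIS; the helper names `clusters` and `replace_cluster` are hypothetical and do not come from the thread or from Ruby itself. It uses `scan(/\X/)` rather than any newer method so that it should also run on Ruby 2.4.

```ruby
# Assumption: the "two dots" are U+0308 COMBINING DIAERESIS.
Q_DOTS = "q\u0308"
A_DOTS = "a\u0308"

# NFC cannot compose q + U+0308, because Unicode has no precomposed
# code point for it; a + U+0308 does compose to U+00E4.
p Q_DOTS.unicode_normalize(:nfc).codepoints.size # => 2
p A_DOTS.unicode_normalize(:nfc).codepoints.size # => 1

# A plain gsub works on code points, so it swaps the base letter and
# leaves the combining mark attached to the replacement.
p Q_DOTS.gsub("q", "x") == "x\u0308" # => true

# Hypothetical helper: split a string into extended grapheme clusters.
def clusters(str)
  str.scan(/\X/)
end

# Hypothetical helper: replace only clusters that are exactly `from`,
# so a bare "q" is replaced but "q" plus a combining mark is not.
def replace_cluster(str, from, to)
  clusters(str).map { |c| c == from ? to : c }.join
end

p clusters(Q_DOTS + A_DOTS).size                           # => 2
p replace_cluster(Q_DOTS + "q", "q", "x") == Q_DOTS + "x"  # => true
```

Whether cluster-level replacement is the behaviour a user wants is language dependent, as the reply notes; this only shows one possible interpretation.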