From: naruse@...
Date: 2018-03-22T11:18:12+00:00
Subject: [ruby-core:86254] [Ruby trunk Bug#14363] each_grapheme_cluster.size returns the wrong size

Issue #14363 has been updated by naruse (Yui NARUSE).

Backport changed from 2.3: DONTNEED, 2.4: DONTNEED, 2.5: REQUIRED to 2.3: DONTNEED, 2.4: DONTNEED, 2.5: DONE

ruby_2_5 r62896 merged revision(s) 62892,62893.

----------------------------------------
Bug #14363: each_grapheme_cluster.size returns the wrong size
https://bugs.ruby-lang.org/issues/14363#change-71163

* Author: sos4nt (Stefan Schüßler)
* Status: Closed
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [x86_64-darwin15]
* Backport: 2.3: DONTNEED, 2.4: DONTNEED, 2.5: DONE
----------------------------------------
Ruby 2.5 adds `String#each_grapheme_cluster` to enumerate the string's grapheme clusters:

```ruby
str = "a\u0300i\u0301"         #=> "àí"
str.each_grapheme_cluster.to_a #=> ["à", "í"]
```

Unfortunately, the enumerator's `size` doesn't work as expected:

```ruby
str.each_grapheme_cluster.size #=> 4
```

The source code reveals that it invokes `rb_str_each_char_size`, so it is equivalent to `each_char.size`:

```c
static VALUE
rb_str_each_grapheme_cluster(VALUE str)
{
    RETURN_SIZED_ENUMERATOR(str, 0, 0, rb_str_each_char_size);
    return rb_str_enumerate_grapheme_clusters(str, 0);
}
```

If the grapheme enumerator's size cannot be calculated lazily, `each_grapheme_cluster.size` should return `nil` to indicate that.

---Files--------------------------------
each_grapheme_cluster_size_nil.patch (921 Bytes)
each_grapheme_cluster_size_real.patch (3.03 KB)

--
https://bugs.ruby-lang.org/
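
To illustrate the distinction the report draws between the character count and the grapheme cluster count, here is a minimal pure-Ruby sketch of an enumerator whose `size` is computed lazily from the clusters themselves. It is not the change merged in the revisions above; `grapheme_cluster_enum` is a hypothetical helper name, and the sketch assumes that `\X` in Ruby's regexp engine matches one extended grapheme cluster.

```ruby
# Hypothetical helper (not part of Ruby's String API): builds an enumerator
# over grapheme clusters whose size is the cluster count, not the char count.
def grapheme_cluster_enum(str)
  # Enumerator.new accepts a callable as its size argument; it is only
  # invoked when #size is called, so the count is computed lazily.
  lazy_size = -> { str.scan(/\X/).size }

  Enumerator.new(lazy_size) do |yielder|
    # Assumption: \X matches one extended grapheme cluster per match.
    str.scan(/\X/) { |cluster| yielder << cluster }
  end
end

str = "a\u0300i\u0301"            # "a" + combining grave, "i" + combining acute
clusters = grapheme_cluster_enum(str)

puts str.each_char.size           # 4 code points
puts clusters.size                # 2 grapheme clusters
```

Passing `nil` instead of the lambda would give the other behavior suggested in the report: an enumerator whose `size` is `nil` because it is not known up front.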