From: "Dan0042 (Daniel DeLorme)"
Date: 2022-03-17T18:18:21+00:00
Subject: [ruby-core:107958] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases

Issue #18563 has been updated by Dan0042 (Daniel DeLorme).

nobu (Nobuyoshi Nakada) wrote in #note-4:
> How about `letters` and `each_letter`?

I like the general idea, but to me "letters" means \p{L}.

Ideally, what is now a "char" should be called a grapheme (like "a" and "\u0300"), and "grapheme_clusters" should be called chars (like "a" and "a\u0300").

It may sound like a radical idea, but what about having `each_char` output grapheme clusters? The vast majority of the time they are the same thing, and for the few exceptions we probably want `"été".chars` to return 3 characters even if they are encoded as "\u0065\u0301\u0074\u00e9" (i.e. have the "intuitively correct" result even without unicode normalization).

Or how about `characters` and `each_character`?

----------------------------------------
Feature #18563: Add "graphemes" and "each_grapheme" aliases
https://bugs.ruby-lang.org/issues/18563#change-96908

* Author: shan (Shannon Skipper)
* Status: Closed
* Priority: Normal
----------------------------------------
https://bugs.ruby-lang.org/issues/13780#note-10
> grapheme sounds like an element in the grapheme cluster. How about each_grapheme_cluster?
> If everyone gets used to the grapheme as an alias of grapheme cluster, we'd love to add an alias each_grapheme.
> Matz.

Languages that have added grapheme cluster support seem to be almost exclusively opting for the shorter "graphemes" alias as a part that stands for the whole.

* JavaScript/TypeScript grapheme-splitter library: `splitGraphemes`
* PHP: `grapheme_extract`
* Zig ziglyph library: `GraphemeIterator`
* Golang uniseg library: `NewGraphemes`
* Matlab: `splitGraphemes`
* Python grapheme library: `graphemes`
* Elixir: `graphemes`
* Crystal uni_text_seg library: `graphemes`
* Nim nim-graphemes library: `graphemes`
* Rust unicode-segmentation library: `graphemes`

Now that some time has passed and the "graphemes" alias for "grapheme clusters" has been fairly widely adopted by languages and libraries, I'd like to go ahead and propose a `graphemes` alias for `grapheme_clusters` and an `each_grapheme` alias for `each_grapheme_cluster`.

-- 
https://bugs.ruby-lang.org/
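
For reference, a minimal sketch of what the thread is discussing, using the "été" example from the reply above. `String#chars`, `String#grapheme_clusters`, `String#each_grapheme_cluster` and `String#unicode_normalize` are existing core methods; the `graphemes` and `each_grapheme` aliases are only the names proposed in the issue, defined here by hand for illustration. The issue is marked Closed, so they should not be assumed to exist in core Ruby. `SAMPLE` is a made-up constant name.

```ruby
# "été" where the first "é" is decomposed (e + U+0301 combining acute)
# and the last "é" is the precomposed U+00E9.
SAMPLE = "\u0065\u0301\u0074\u00e9"

# Hypothetical aliases, named as proposed in the issue (not core Ruby).
class String
  alias_method :graphemes, :grapheme_clusters
  alias_method :each_grapheme, :each_grapheme_cluster
end

p SAMPLE.chars.size                          # 4 code points
p SAMPLE.grapheme_clusters.size              # 3 user-perceived characters
p SAMPLE.graphemes.size                      # same result via the alias
p SAMPLE.unicode_normalize(:nfc).chars.size  # 3 after NFC normalization

# Iterating with the aliased enumerator form.
SAMPLE.each_grapheme { |g| puts g.codepoints.map { |c| format("U+%04X", c) }.join(" ") }
```

The difference between the first two counts is the case described in the reply: `chars` splits the decomposed "é" into two elements, while grapheme clusters keep it together without requiring normalization first.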