[ruby-core:107958] [Ruby master Feature#18563] Add "graphemes" and "each_grapheme" aliases
From: "Dan0042 (Daniel DeLorme)" <noreply@...>
Date: 2022-03-17 18:18:21 UTC
List: ruby-core #107958
Issue #18563 has been updated by Dan0042 (Daniel DeLorme).
nobu (Nobuyoshi Nakada) wrote in #note-4:
> How about `letters` and `each_letter`?
I like the general idea, but to me "letters" means \p{L}.
Ideally, what is now a "char" should be called a grapheme (like "a" and "\u0300"), and "grapheme_clusters" should be called chars (like "a" and "a\u0300").
It may sound like a radical idea, but what about having `each_char` yield grapheme clusters? The vast majority of the time they are the same thing, and for the few exceptions we probably want `"été".chars` to return 3 characters even if the string is encoded as "\u0065\u0301\u0074\u00e9" (i.e. to give the "intuitively correct" result even without Unicode normalization).
Or how about `characters` and `each_character`?
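To make the difference concrete, here is a minimal sketch of how the existing methods behave on the mixed-form string from the example above. The helper name `segment_counts` is hypothetical and only there for illustration; `chars`, `grapheme_clusters` and `unicode_normalize` are the existing String methods.

```ruby
# Hypothetical helper (not part of Ruby): counts the units of a string
# as seen by the existing String methods.
def segment_counts(str)
  {
    chars: str.chars.length,                              # code points
    grapheme_clusters: str.grapheme_clusters.length,      # user-perceived characters
    nfc_chars: str.unicode_normalize(:nfc).chars.length   # code points after NFC
  }
end

# "été" with a decomposed first "é" (e + U+0301) and a precomposed last "é" (U+00E9)
mixed = "\u0065\u0301\u0074\u00e9"
counts = segment_counts(mixed)

puts counts[:chars]              # 4
puts counts[:grapheme_clusters]  # 3
puts counts[:nfc_chars]          # 3
```

So today `chars` only gives the "intuitive" count of 3 after normalizing to NFC, while `grapheme_clusters` gives it on the raw string.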
----------------------------------------
Feature #18563: Add "graphemes" and "each_grapheme" aliases
https://bugs.ruby-lang.org/issues/18563#change-96908
* Author: shan (Shannon Skipper)
* Status: Closed
* Priority: Normal
----------------------------------------
https://bugs.ruby-lang.org/issues/13780#note-10
> grapheme sounds like an element in the grapheme cluster. How about each_grapheme_cluster?
> If everyone gets used to the grapheme as an alias of grapheme cluster, we'd love to add an alias each_grapheme.
> Matz.
Languages that have added grapheme cluster support seem to be almost exclusively opting for the shorter "graphemes" alias as a part that stands for the whole.
* JavaScript/TypeScript grapheme-splitter library: `splitGraphemes`
* PHP: `grapheme_extract`
* Zig ziglyph library: `GraphemeIterator`
* Golang uniseg library: `NewGraphemes`
* Matlab: `splitGraphemes`
* Python grapheme library: `graphemes`
* Elixir: `graphemes`
* Crystal uni_text_seg library: `graphemes`
* Nim nim-graphemes library: `graphemes`
* Rust unicode-segmentation library: `graphemes`
Now that some time has passed and the "graphemes" alias for "grapheme clusters" has been fairly widely adopted by languages and libraries, I'd like to go ahead and propose a `graphemes` alias for `grapheme_clusters` and an `each_grapheme` alias for `each_grapheme_cluster`.
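As a rough sketch of what the proposed aliases would amount to, assuming they simply forward to the existing methods, one could reopen String in user code like this. This is only an illustration of the proposal, not how Ruby core would necessarily implement it, and reopening a core class is something to do with care.

```ruby
# Illustration only: user-level aliases matching the names proposed above.
class String
  alias_method :graphemes, :grapheme_clusters
  alias_method :each_grapheme, :each_grapheme_cluster
end

str = "e\u0301t\u00e9"  # decomposed "é", "t", precomposed "é"

# Same result as str.grapheme_clusters
p str.graphemes.length  # 3

# With a block, yields one grapheme cluster at a time
str.each_grapheme { |g| puts g.codepoints.length }  # prints 2, 1, 1
```

Without a block, `each_grapheme` returns an enumerator, the same as `each_grapheme_cluster` does.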