From: duerst@...
Date: 2018-10-03T05:45:35+00:00
Subject: [ruby-core:89259] [Ruby trunk Feature#14839] How to deal with capitalizing Georgian in Unicode 11.0.0

Issue #14839 has been updated by duerst (Martin Dürst).

duerst (Martin Dürst) wrote:
> I just noticed that `String#capitalize` is actually more difficult than I thought. It is a no-op when applied to lowercase, but it will produce mixed case when applied to all uppercase text.

On the Unicode mailing list, I got the following ideas:

* Provide an option to keep non-start characters (from Markus Scherer, this is available in ICU, see https://www.unicode.org/mail-arch/unicode-ml/y2018-m10/0010.html)
* Formally (re)define `str.capitalize` as `str.downcase.capitalize` (from Ken Whistler, see https://www.unicode.org/mail-arch/unicode-ml/y2018-m10/0013.html). This should not change anything for other scripts, but for Georgian, `#capitalize` and `#downcase` would be the same, and `#capitalize` would not produce mixed-case words.

I'm currently leaning towards the second proposal. It looks like this may make the operation a lot slower, but I think it's easy to avoid a major slowdown.

----------------------------------------
Feature #14839: How to deal with capitalizing Georgian in Unicode 11.0.0
https://bugs.ruby-lang.org/issues/14839#change-74288

* Author: duerst (Martin Dürst)
* Status: Feedback
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* Target version:
----------------------------------------
This is a request for feedback. In particular, if you are from Georgia (the country, not the US state), or if you know somebody (who knows somebody, ...) from Georgia, feedback on this issue is very much appreciated. If I don't get any feedback, I'll proceed as explained below.

Unicode 11.0.0 introduces an upper-case version of present-day Georgian letters called Mtavruli (the lower-case letters are called Mkhedruli).
Mtavruli letters are only used to emphasize whole words; there is no initial-letter capitalization in Georgian. Therefore, the Mkhedruli letters do not have Mtavruli letters as their titlecase, but are explicitly mapped to themselves. This means that in Ruby, `mkhedruli.capitalize` would be a no-op although `mkhedruli.upcase` would convert to Mtavruli letters.

Additional pointers:
http://www.unicode.org/versions/Unicode11.0.0/#Migration
http://www.unicode.org/charts/PDF/Unicode-11.0/U110-1C90.pdf
http://www.unicode.org/versions/Unicode11.0.0/ch07.pdf (Section 7.7, Georgian, pp. 320-321)

--
https://bugs.ruby-lang.org/
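
To illustrate the mixed-case problem and the second proposal (`str.downcase.capitalize`), here is a minimal sketch. It is a toy model restricted to Georgian, not Ruby's actual implementation: the helper names (`georgian_downcase`, `georgian_titlecase`, `naive_capitalize`, `proposed_capitalize`) are hypothetical, and the mappings are written out by hand from the Unicode 11.0.0 code chart (Mtavruli U+1C90..U+1CBA and U+1CBD..U+1CBF, Mkhedruli U+10D0..U+10FA and U+10FD..U+10FF), so the result does not depend on which Unicode version a given Ruby ships.

```ruby
# Toy model for Georgian only; all names here are hypothetical helpers,
# not part of Ruby's String API.
MTAVRULI  = "\u1C90-\u1CBA\u1CBD-\u1CBF" # upper-case letters added in Unicode 11.0.0
MKHEDRULI = "\u10D0-\u10FA\u10FD-\u10FF" # present-day lower-case letters

# Lowercase mapping: Mtavruli -> Mkhedruli, everything else unchanged.
def georgian_downcase(str)
  str.tr(MTAVRULI, MKHEDRULI)
end

# Titlecase mapping: Georgian letters map to themselves.
def georgian_titlecase(char)
  char
end

# Titlecase the first character, downcase the rest.
# An all-Mtavruli word keeps its first letter and becomes mixed case.
def naive_capitalize(str)
  return str.dup if str.empty?
  georgian_titlecase(str[0]) + georgian_downcase(str[1..-1])
end

# Second proposal: downcase everything first, then titlecase the first character.
def proposed_capitalize(str)
  lowered = georgian_downcase(str)
  return lowered if lowered.empty?
  georgian_titlecase(lowered[0]) + lowered[1..-1]
end

upper = "\u1C97\u1C91\u1C98\u1C9A\u1C98\u1CA1\u1C98" # "Tbilisi" in Mtavruli
puts naive_capitalize(upper)    # first letter Mtavruli, rest Mkhedruli
puts proposed_capitalize(upper) # all Mkhedruli, same as georgian_downcase(upper)
```

In this model the second variant gives the same result as `georgian_downcase` for Georgian text, which matches the behaviour described for the second proposal.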