From: duerst
Date: 2022-02-27T05:36:28+00:00
Subject: [ruby-core:107753] [Ruby master Bug#18590] String#downcase and CAPITAL LETTER I WITH DOT ABOVE

Issue #18590 has been updated by duerst (Martin Dürst).

mame (Yusuke Endoh) wrote in #note-9:

> I wanted to create a PR to fix the document, but I am unsure what document is the best reference for full case mapping. @duerst Could you please fix it? Or should we wait until the chart will be fixed?

The best reference is section 3.13 (Default Case Algorithms) of https://www.unicode.org/versions/latest/ch03.pdf. This is a lot of text, and not as easy to understand as a table. But maybe this is better. People don't need a table; it's easy to create one with Ruby :-).

[Please note that this URI currently redirects to https://www.unicode.org/versions/Unicode14.0.0/ch03.pdf, but I still have to upgrade Ruby to Unicode 14.0.0; I hope to be able to do this in the next couple of weeks.]

----------------------------------------
Bug #18590: String#downcase and CAPITAL LETTER I WITH DOT ABOVE
https://bugs.ruby-lang.org/issues/18590#change-96678

* Author: andrykonchin (Andrew Konchin)
* Status: Open
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* ruby -v: 3.1.0p0
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
Downcasing the "İ" character works in an unexpected way:

```ruby
'İ'.downcase # => "i̇"
```

Expected result: downcasing should return "i". Instead, it returns a small "i" followed by an additional "dot" character:

```ruby
'İ'.downcase.chars # => ["i", "̇"]
```

According to the standard Unicode case mapping, the character 'İ' (0130) maps to lowercase 'i' (0069).

```
0130;LATIN CAPITAL LETTER I WITH DOT ABOVE;Lu;0;L;0049 0307;;;;N;LATIN CAPITAL LETTER I DOT;;;0069;
```

https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt

-- 
https://bugs.ruby-lang.org/
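
As a follow-up to the remark that a table is easy to create with Ruby, here is a minimal sketch of one way to do it. It lists the code points in the Basic Multilingual Plane whose default `String#downcase` result is longer than one character, which is where full case mapping differs from the simple one-to-one mapping in UnicodeData.txt. The helper name `multi_char_downcase` and the scanned range are assumptions made for illustration; they are not part of Ruby or of the issue. The output depends on the Unicode version of the Ruby that runs it.

```ruby
# Hypothetical helper (not part of Ruby): collect every character in `range`
# whose String#downcase result has more than one character.
def multi_char_downcase(range = 0x0000..0xFFFF)
  range.each_with_object({}) do |cp, table|
    next if (0xD800..0xDFFF).cover?(cp) # surrogates are not valid characters

    ch = [cp].pack("U")                 # UTF-8 string for this code point
    lower = ch.downcase                 # default (full) case mapping
    table[ch] = lower if lower.length > 1
  end
end

# Print the table as "U+XXXX -> U+XXXX U+XXXX".
multi_char_downcase.each do |ch, lower|
  from = format("U+%04X", ch.ord)
  to = lower.codepoints.map { |c| format("U+%04X", c) }.join(" ")
  puts "#{from} -> #{to}"
end

# For comparison, the language-specific mapping selected by the :turkic option.
puts "\u0130".downcase(:turkic).codepoints.inspect
```

Comparing `String#codepoints` rather than the printed strings makes the combining U+0307 visible, which is easy to miss when the result is rendered as a single glyph.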