From: duerst
Date: 2022-02-24T09:50:29+00:00
Subject: [ruby-core:107737] [Ruby master Bug#18590] String#downcase and CAPITAL LETTER I WITH DOT ABOVE

Issue #18590 has been updated by duerst (Martin Dürst).

mame (Yusuke Endoh) wrote in #note-7:

> BTW, the rdoc of String#downcase in 3.1 and master is very less informative, and has a broken link (which is maybe the same issue as #18468). It was changed at commit:f7e266e6d2ccad63e4245a106a80c82ef2b38cbf between 3.0 and 3.1. Personally I strongly prefer [the 3.0 style](https://ruby-doc.org/core-3.0.0/String.html#method-i-downcase).

I also prefer the 3.0 version, but that's probably because I wrote the documentation for these methods (when I implemented them). Anyway, I think the 3.1 way of documenting things could also work, but the options link on each casing method should include a fragment and point to https://ruby-doc.org/core-3.1.0/doc/case_mapping_rdoc.html#label-Default+Case+Mapping, not just to https://ruby-doc.org/core-3.1.0/doc/case_mapping_rdoc.html. @BurdetteLamar

mame (Yusuke Endoh) wrote in #note-6:

> @duerst Let me confirm. The rdoc of 3.1 and master refers to https://www.unicode.org/charts/case/.
>
> > Default Case Mapping
> >
> > By default, all of these methods use full Unicode case mapping, which is suitable for most languages. See [Unicode Latin Case Chart](https://www.unicode.org/charts/case/).
>
> It is not clear to me that the document says "0069 0307 for 'İ'.downcase".

That document does NOT say "0069 0307 for 'İ'.downcase".

> Is it okay?

I reported to Unicode that they should check it and clarify how this chart was made.

> Should it be replaced with https://www.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt ?

In the Ruby documentation, probably yes. SpecialCasing.txt is an official Unicode data file. The case charts are just a Web page. But the case charts may be easier to understand for non-experts.
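For readers following the thread, here is a minimal sketch of the behavior being discussed. It assumes a Ruby version whose casing methods accept the `:turkic` and `:ascii` options described in the case mapping documentation linked above. The helper `hex_codepoints` is a hypothetical name used only for display and is not part of Ruby.

```ruby
# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE, written as an escape
# so the example does not depend on the source file's encoding.
capital_i_dot = "\u0130"

# Hypothetical display helper: code points as four-digit hex strings.
def hex_codepoints(str)
  str.codepoints.map { |cp| format("%04X", cp) }
end

default_result = capital_i_dot.downcase           # default: full Unicode case mapping
turkic_result  = capital_i_dot.downcase(:turkic)  # mapping adapted for Turkic languages
ascii_result   = capital_i_dot.downcase(:ascii)   # only A-Z are affected

p hex_codepoints(default_result)  # => ["0069", "0307"]
p hex_codepoints(turkic_result)   # => ["0069"]
p hex_codepoints(ascii_result)    # => ["0130"]
```

The default result is the "0069 0307" sequence quoted above, i.e. a small "i" followed by COMBINING DOT ABOVE, while the `:turkic` option yields the single "i" that the original report expected.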
----------------------------------------
Bug #18590: String#downcase and CAPITAL LETTER I WITH DOT ABOVE
https://bugs.ruby-lang.org/issues/18590#change-96663

* Author: andrykonchin (Andrew Konchin)
* Status: Open
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* ruby -v: 3.1.0p0
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
Downcasing for "İ" character works in an unexpected way:

```ruby
'İ'.downcase # => "i̇"
```

Expected result - downcasing should return "i". Instead, it returns small "i" and additional "dot" character:

```ruby
'İ'.downcase.chars # => ["i", "̇"]
```

According to the standard Unicode case mapping character 'İ'(0130) maps to lowercased 'i' (0069).

```
0130;LATIN CAPITAL LETTER I WITH DOT ABOVE;Lu;0;L;0049 0307;;;;N;LATIN CAPITAL LETTER I DOT;;;0069;
```

https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt

--
https://bugs.ruby-lang.org/