[ruby-core:107731] [Ruby master Bug#18590] String#downcase and CAPITAL LETTER I WITH DOT ABOVE

From: duerst <noreply@...>
Date: 2022-02-23 08:17:52 UTC
List: ruby-core #107731
Issue #18590 has been updated by duerst (Martin Dürst).

Status changed from Assigned to Closed

andrykonchin (Andrew Konchin) wrote in #note-3:
> Thank you for the suggestion.
> 
> I am wondering whether `String#downcase` (when called without arguments) follows only Unicode case mapping rules (as stated in the [documentation]). Or also the folding ones?
> 
> I would expect that a call of `String#downcase` without arguments uses the one-to-one case mapping rules, that are specified in the [UnicodeData.txt] file.

It should use the mappings in https://www.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt.

For 'İ'.downcase, that mapping is 0069 0307 (i.e. 'i' followed by a combining dot above).

> [documentation]: https://ruby-doc.org/core-3.0.0/String.html#method-i-downcase
> [UnicodeData.txt]: https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt

The data in UnicodeData is restricted to simple case mappings (i.e. mappings that don't change the length of the string in terms of number of codepoints). In Ruby, there is no need for such a restriction. See also https://www.sw.it.aoyama.ac.jp/2016/pub/RubyKaigi/, slide 23.
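As a rough illustration of the difference between the full and the simple mappings, the following sketch compares the default `String#downcase` result with the `:turkic` option, which is the documented way to get a single 'i' back. The constant names are made up for this example, and the printed values are what I would expect on Ruby 2.4 or later.

```ruby
# Sketch: full (SpecialCasing.txt) vs. Turkic lowercase mapping of U+0130.
# The constant names below are illustrative only.
DOTTED_CAPITAL_I = "\u0130" # 'İ', LATIN CAPITAL LETTER I WITH DOT ABOVE

# Default mapping: one codepoint in, two codepoints out.
DEFAULT_LOWER = DOTTED_CAPITAL_I.downcase
p DEFAULT_LOWER.codepoints.map { |cp| format("%04X", cp) } # => ["0069", "0307"]

# Turkic/Azeri mapping: 'İ' lowercases to a plain 'i'.
TURKIC_LOWER = DOTTED_CAPITAL_I.downcase(:turkic)
p TURKIC_LOWER.codepoints.map { |cp| format("%04X", cp) }  # => ["0069"]

# Another full mapping that changes the string length.
SHARP_S_UPPER = "ß".upcase
p SHARP_S_UPPER # => "SS"
```

Code that needs a one-to-one result for 'İ' can pass `:turkic` explicitly, keeping in mind that this option also changes how a plain 'I' is lowercased.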

I'm closing this, because it works as intended/described, as far as I can see.


----------------------------------------
Bug #18590: String#downcase and CAPITAL LETTER I WITH DOT ABOVE
https://bugs.ruby-lang.org/issues/18590#change-96654

* Author: andrykonchin (Andrew Konchin)
* Status: Closed
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* ruby -v: 3.1.0p0
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
Downcasing the "İ" character works in an unexpected way:

```ruby
'İ'.downcase
# => "i̇"
```

Expected result: downcasing should return "i". Instead, it returns a small "i" followed by an additional combining "dot" character:

```ruby
'İ'.downcase.chars
# => ["i", "̇"]
```

According to the standard Unicode case mapping, the character 'İ' (0130) maps to lowercase 'i' (0069):

```
0130;LATIN CAPITAL LETTER I WITH DOT ABOVE;Lu;0;L;0049 0307;;;;N;LATIN CAPITAL LETTER I DOT;;;0069;
```

https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
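To make the field being referred to explicit, here is a minimal sketch that splits the record quoted above and picks out the simple lowercase mapping. It assumes the usual UnicodeData.txt layout, where field 5 is the decomposition and field 13 is the simple lowercase mapping (counting from 0). The constant names are hypothetical.

```ruby
# The UnicodeData.txt record quoted above, copied verbatim.
RECORD = "0130;LATIN CAPITAL LETTER I WITH DOT ABOVE;Lu;0;L;0049 0307;;;;N;" \
         "LATIN CAPITAL LETTER I DOT;;;0069;"

# A limit of -1 keeps trailing empty fields, so the indexes stay aligned.
FIELDS = RECORD.split(";", -1)

DECOMPOSITION    = FIELDS[5]  # assumed layout: field 5 is the decomposition
SIMPLE_LOWERCASE = FIELDS[13] # assumed layout: field 13 is the simple lowercase mapping

p FIELDS.size       # => 15
p DECOMPOSITION     # => "0049 0307"
p SIMPLE_LOWERCASE  # => "0069"

# The no-argument String#downcase result is longer than this one-codepoint mapping.
p "\u0130".downcase.length # => 2
```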




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
