[ruby-core:75858] Re: Important: Somewhat backwards-incompatible change (Fwd: [ruby-cvs:62388] duerst:r55225 (trunk): * string.c: Activate full Unicode case mapping for UTF-8)
From:
Martin J. Dürst <duerst@...>
Date:
2016-06-06 10:08:31 UTC
List:
ruby-core #75858
There was a hiccup with the tests, but since r55277, this is now for real.
I just added UTF-16BE/LE and UTF-32BE/LE to the supported encodings, too.

Regards, Martin.

On 2016/05/31 10:25, Martin J. Dürst wrote:
> With the change below, I have activated full Unicode case mapping for
> UTF-8 in trunk. This may create minor incompatibilities.
>
> Essentially, up to r55224, you get:
>
>     'résumé ĭñŧėřŋãţijňőńæłĩżàťïōņ'.upcase
>     # -> 'RéSUMé ĭñŧėřŋãţijňőńæłĩżàťïōņ'
>
> Starting with r55225, you get:
>
>     'résumé ĭñŧėřŋãţijňőńæłĩżàťïōņ'.upcase
>     # -> 'RÉSUMÉ ĬÑŦĖŘŊÃŢIJŇŐŃÆŁĨŻÀŤÏŌŅ'
>
> In general, the latter is highly desirable, and that's why Matz
> explicitly proposed it (https://bugs.ruby-lang.org/issues/10085#note-5).
>
> However, there are some exceptions. For example, DNS servers match
> only ASCII case-insensitively and are otherwise case-sensitive. So to
> match, you want to case-map only ASCII
> (https://bugs.ruby-lang.org/issues/10085#note-9). You can do that now with
>
>     'résumé ĭñŧėřŋãţijňőńæłĩżàťïōņ'.upcase(:ascii)
>     # -> 'RéSUMé ĭñŧėřŋãţijňőńæłĩżàťïōņ'
>
> So if you have code that needs such behavior, please add the :ascii
> option (it's not needed if you know that all your data is ASCII anyway).
>
> You can find documentation of the various options in the String#downcase
> documentation, but they apply to all casing methods
> (upcase/downcase/capitalize/swapcase).
>
> Implementations for encodings other than UTF-8 will follow (hopefully
> soon).
>
> Regards, Martin.
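For code that needs the old behavior, the migration described above might look roughly like the sketch below. It assumes a Ruby build that includes r55225 or later. The helper name ascii_case_equal? is hypothetical and not part of Ruby; it only illustrates the DNS-style comparison mentioned above, where ASCII letters are matched case-insensitively and everything else is compared as-is.

```ruby
# Hypothetical helper (not part of Ruby): compares two names while
# case-mapping only ASCII letters, as in the DNS example above.
def ascii_case_equal?(a, b)
  a.downcase(:ascii) == b.downcase(:ascii)
end

word = 'résumé'

# Full Unicode case mapping: the default starting with r55225.
full_upcase = word.upcase

# ASCII-only case mapping: matches the result from r55224 and earlier.
ascii_upcase = word.upcase(:ascii)

puts full_upcase   # RÉSUMÉ
puts ascii_upcase  # RéSUMé

# ASCII letters differ only in case, so these compare as equal.
puts ascii_case_equal?('Example.COM', 'example.com')        # true

# Non-ASCII letters are left untouched, so these stay different.
puts ascii_case_equal?('RÉSUMÉ.example', 'résumé.example')  # false
```

If all of the data is known to be ASCII, the :ascii option makes no difference and can be left out.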
>
> -------- Forwarded Message --------
> Subject: [ruby-cvs:62388] duerst:r55225 (trunk): * string.c: Activate
>   full Unicode case mapping for UTF-8 by removing
> Date: Tue, 31 May 2016 01:10:07 +0000
> From: duerst@ruby-lang.org
> To: ruby-cvs@ruby-lang.org
>
> duerst	2016-05-31 10:10:06 +0900 (Tue, 31 May 2016)
>
>   New Revision: 55225
>
>   https://svn.ruby-lang.org/cgi-bin/viewvc.cgi?view=revision&revision=55225
>
>   Log:
>     * string.c: Activate full Unicode case mapping for UTF-8 by removing
>       the protective check for the presence of an option.
>       Update documentation.
>     * test/ruby/enc/test_case_comprehensive.rb: Adjust tests for above
>       change.
>
>   Modified files:
>     trunk/ChangeLog
>     trunk/string.c
>     trunk/test/ruby/enc/test_case_comprehensive.rb
>
> Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
> <http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>