[ruby-core:80215] [Ruby trunk Bug#13321] String#codepoints for one-byte encodings
From: duerst@...
Date: 2017-03-18 01:50:33 UTC
List: ruby-core #80215
Issue #13321 has been updated by duerst (Martin Dürst).
InfraRuby (InfraRuby Vision) wrote:
> Please update the documentation for `String#codepoints` too.
That says "This is a shorthand for `str.each_codepoint.to_a`".
> `String#codepoints` does return (Unicode) codepoints for US-ASCII and ISO-8859-1 as those encodings are the basis of Unicode.
Well, yes, and for almost all encodings, the returned values are Unicode code points for the ASCII characters, and for some other encodings, there is a bit more overlap. I don't think we need to go into too much detail.
> Maybe add `Encoding#unicode_codepoints?` which returns `true` for these encodings: US-ASCII, ISO-8859-1, UTF-8, UTF-16(BE|LE), UTF-32(BE|LE).
There are quite a few other cases where the behavior of String methods changes depending on the string's Encoding. I think it would be good to have access to this information, but methods with more general names may be needed.
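For illustration only, the quoted proposal could be approximated in user code roughly as follows. This is a sketch, not an existing Ruby API: the constant `UNICODE_CODEPOINT_ENCODINGS` and the method `unicode_codepoints?` are hypothetical names, and the list simply repeats the encodings named in the quote.

```ruby
# Hypothetical list: the encodings named in the quoted proposal, for which
# String#codepoints already returns Unicode code points.
UNICODE_CODEPOINT_ENCODINGS = [
  Encoding::US_ASCII, Encoding::ISO_8859_1, Encoding::UTF_8,
  Encoding::UTF_16BE, Encoding::UTF_16LE,
  Encoding::UTF_32BE, Encoding::UTF_32LE
].freeze

# Hypothetical predicate (not part of Ruby's Encoding class).
def unicode_codepoints?(encoding)
  UNICODE_CODEPOINT_ENCODINGS.include?(encoding)
end

utf8_ok = unicode_codepoints?(Encoding::UTF_8)
cp1252_ok = unicode_codepoints?(Encoding.find("Windows-1252"))
```

Keeping this as a plain method rather than patching `Encoding` leaves room for the more general naming mentioned above.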
Anyway, to get Unicode codepoints out of an arbitrary string, `string.encode('UTF-8').codepoints` will always do the job.
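A minimal sketch of that approach, using the byte from the original report. The helper name `unicode_codepoints` is hypothetical, and the `rescue` is an added assumption: transcoding can raise for strings that have no mapping to UTF-8 (for example binary data with high bytes), so the helper returns `nil` in that case.

```ruby
# The byte from the report: 0x80 is the euro sign in Windows-1252.
raw = "\x80".force_encoding("WINDOWS-1252")

# Without transcoding, codepoints returns the encoding's own code value.
native = raw.codepoints

# After transcoding to UTF-8, codepoints returns Unicode code points.
unicode = raw.encode("UTF-8").codepoints

# Hypothetical helper (not part of Ruby): Unicode code points of a string,
# or nil when the string cannot be transcoded to UTF-8.
def unicode_codepoints(str)
  str.encode("UTF-8").codepoints
rescue EncodingError
  nil
end

puts format("native:  0x%X", native.first)
puts format("unicode: 0x%X", unicode.first)
```

Here `native` holds 0x80 and `unicode` holds 0x20AC, which is the value the reporter expected.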
> (Also, there's an unrelated change in that revision.)
Yes, thanks for noticing, fixed.
----------------------------------------
Bug #13321: String#codepoints for one-byte encodings
https://bugs.ruby-lang.org/issues/13321#change-63650
* Author: InfraRuby (InfraRuby Vision)
* Status: Rejected
* Priority: Normal
* Assignee:
* Target version:
* ruby -v:
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
On many versions of Ruby, including 2.4.0:
"\x80".force_encoding("WINDOWS-1252").codepoints.first # => 0x80
I expected 0x20AC: https://en.wikipedia.org/wiki/Windows-1252
See:
https://github.com/ruby/ruby/blob/v2_4_0/string.c#L7817-L7818
https://github.com/ruby/ruby/blob/v2_4_0/string.c#L422-L424
--
https://bugs.ruby-lang.org/
Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>