From: ruby-lang@...
Date: 2017-03-17T18:39:22+00:00
Subject: [ruby-core:80210] [Ruby trunk Bug#13321] String#codepoints for one-byte encodings

Issue #13321 has been updated by InfraRuby (InfraRuby Vision).

Please update the documentation for `String#codepoints` too.

`String#codepoints` does return (Unicode) codepoints for US-ASCII and ISO-8859-1 as those encodings are the basis of Unicode.

Maybe add `Encoding#unicode_codepoints?` which returns `true` for these encodings: US-ASCII, ISO-8859-1, UTF-8, UTF-16(BE|LE), UTF-32(BE|LE).

(Also, there's an unrelated change in that revision.)

----------------------------------------
Bug #13321: String#codepoints for one-byte encodings
https://bugs.ruby-lang.org/issues/13321#change-63647

* Author: InfraRuby (InfraRuby Vision)
* Status: Rejected
* Priority: Normal
* Assignee:
* Target version:
* ruby -v:
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
On many versions of Ruby, including 2.4.0:

    "\x80".force_encoding("WINDOWS-1252").codepoints.first # => 0x80

I expected 0x20AC: https://en.wikipedia.org/wiki/Windows-1252

See:
https://github.com/ruby/ruby/blob/v2_4_0/string.c#L7817-L7818
https://github.com/ruby/ruby/blob/v2_4_0/string.c#L422-L424

--
https://bugs.ruby-lang.org/
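
For readers who need Unicode codepoints from a string in a one-byte encoding such as Windows-1252, one possible workaround is to transcode to UTF-8 before calling `String#codepoints`. The sketch below illustrates the behaviour reported above next to that workaround. The helper name `unicode_codepoints` is hypothetical and is not part of Ruby; it is also not the `Encoding#unicode_codepoints?` predicate suggested in the comment.

```ruby
# Hypothetical helper (not part of Ruby core): transcode to UTF-8 first,
# so the returned integers are Unicode codepoints for any encoding that
# Ruby is able to transcode.
def unicode_codepoints(str)
  str.encode(Encoding::UTF_8).codepoints
end

# String#b returns a fresh binary copy, so force_encoding never mutates a literal.
euro = "\x80".b.force_encoding("Windows-1252")

raw_codepoints = euro.codepoints         # value in the source encoding (the byte)
unicode_values = unicode_codepoints(euro) # Unicode codepoint after transcoding

puts format("raw: 0x%X, unicode: 0x%X", raw_codepoints.first, unicode_values.first)

# For ISO-8859-1 the two agree, because each byte value equals its Unicode codepoint.
e_acute = "\xE9".b.force_encoding("ISO-8859-1")
latin1_raw = e_acute.codepoints
latin1_unicode = unicode_codepoints(e_acute)
```

Here `raw_codepoints` holds 0x80 (the behaviour described in the report) while `unicode_values` holds 0x20AC, the value the reporter expected. Transcoding raises an exception for invalid or unmappable bytes, so callers may want to rescue `EncodingError` or pass `invalid:`/`undef:` options to `String#encode`.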