[ruby-core:80210] [Ruby trunk Bug#13321] String#codepoints for one-byte encodings

From: ruby-lang@...
Date: 2017-03-17 18:39:22 UTC
List: ruby-core #80210
Issue #13321 has been updated by InfraRuby (InfraRuby Vision).


Please update the documentation for `String#codepoints` too.

`String#codepoints` does return Unicode codepoints for US-ASCII and ISO-8859-1, because those encodings coincide with the first 128 and the first 256 Unicode codepoints respectively.

Maybe add `Encoding#unicode_codepoints?` which returns `true` for these encodings: US-ASCII, ISO-8859-1, UTF-8, UTF-16(BE|LE), UTF-32(BE|LE).
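
A minimal sketch of what such a predicate could look like, written as a standalone helper rather than a patch to `Encoding`. The method and constant names here are hypothetical and not part of Ruby. The sketch also assumes that "UTF-16(BE|LE)" and "UTF-32(BE|LE)" mean only the explicit big-endian and little-endian variants:

```ruby
# Hypothetical helper, not part of Ruby: a standalone stand-in for the
# proposed Encoding#unicode_codepoints?.
# Assumption: only the explicit BE/LE variants of UTF-16 and UTF-32 count.
UNICODE_CODEPOINT_ENCODINGS = [
  Encoding::US_ASCII,
  Encoding::ISO_8859_1,
  Encoding::UTF_8,
  Encoding::UTF_16BE,
  Encoding::UTF_16LE,
  Encoding::UTF_32BE,
  Encoding::UTF_32LE,
].freeze

# Returns true when String#codepoints on this encoding yields Unicode codepoints.
def unicode_codepoints?(encoding)
  UNICODE_CODEPOINT_ENCODINGS.include?(encoding)
end

latin1_ok = unicode_codepoints?(Encoding::ISO_8859_1)            # true
cp1252_ok = unicode_codepoints?(Encoding.find("Windows-1252"))   # false
```

A caller could use the predicate to decide whether to transcode a string before calling `codepoints`.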

(Also, there's an unrelated change in that revision.)


----------------------------------------
Bug #13321: String#codepoints for one-byte encodings
https://bugs.ruby-lang.org/issues/13321#change-63647

* Author: InfraRuby (InfraRuby Vision)
* Status: Rejected
* Priority: Normal
* Assignee: 
* Target version: 
* ruby -v: 
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
On many versions of Ruby, including 2.4.0:

    "\x80".force_encoding("WINDOWS-1252").codepoints.first # => 0x80

I expected 0x20AC: https://en.wikipedia.org/wiki/Windows-1252

See:
  https://github.com/ruby/ruby/blob/v2_4_0/string.c#L7817-L7818
  https://github.com/ruby/ruby/blob/v2_4_0/string.c#L422-L424
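
For comparison, a small sketch of the reported behavior next to one way to get the expected value: transcoding to UTF-8 before calling `codepoints`. The variable names are illustrative only, and `dup` is there just so the example does not depend on whether string literals are frozen:

```ruby
# dup so that force_encoding never touches a frozen literal.
s = "\x80".dup.force_encoding("Windows-1252")

# codepoints returns the encoding's own code value, here the raw byte.
raw = s.codepoints.first                      # 0x80

# Transcoding to UTF-8 first yields the Unicode codepoint (EURO SIGN).
unicode = s.encode("UTF-8").codepoints.first  # 0x20AC
```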




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
