From: usa@...
Date: 2018-01-31T13:30:21+00:00
Subject: [ruby-core:85304] [Ruby trunk Bug#13952] String#succ not updating code range

Issue #13952 has been updated by usa (Usaku NAKAMURA).

Backport changed from 2.3: REQUIRED, 2.4: DONE to 2.3: DONE, 2.4: DONE

ruby_2_3 r62139 merged revision(s) 60066.

----------------------------------------
Bug #13952: String#succ not updating code range
https://bugs.ruby-lang.org/issues/13952#change-70084

* Author: nirvdrum (Kevin Menard)
* Status: Closed
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-linux]
* Backport: 2.3: DONE, 2.4: DONE
----------------------------------------
I'm seeing some strange behavior with `String#succ` and updating code ranges. I haven't yet traced the code to see what the culprit is, but I'm reproducing my findings here so they don't get lost (and maybe someone has a better idea of what's going on.)

This sequence of calls produces the expected output.

```
x = "\xFF".force_encoding("binary")
y = x.succ

z = String.new
z << 0x01 << 0x00

puts "x ASCII-only?: #{x.ascii_only?}"
puts "y ASCII-only?: #{y.ascii_only?}"
puts "z ASCII-only?: #{z.ascii_only?}"
puts "y Encoding: #{y.encoding}"
puts "y Bytes: #{y.bytes}"
puts "z Encoding: #{z.encoding}"
puts "z Bytes: #{z.bytes}"
```

The output is:

```
x ASCII-only?: false
y ASCII-only?: true
z ASCII-only?: true
y Encoding: ASCII-8BIT
y Bytes: [1, 0]
z Encoding: ASCII-8BIT
z Bytes: [1, 0]
```

However, by inserting a call that would force `x` to calculate its code range prior to the `String#succ` call, we get a different set of results:

```
x = "\xFF".force_encoding("binary")
x.ascii_only?
y = x.succ

z = String.new
z << 0x01 << 0x00

puts "x ASCII-only?: #{x.ascii_only?}"
puts "y ASCII-only?: #{y.ascii_only?}"
puts "z ASCII-only?: #{z.ascii_only?}"
puts "y Encoding: #{y.encoding}"
puts "y Bytes: #{y.bytes}"
puts "z Encoding: #{z.encoding}"
puts "z Bytes: #{z.bytes}"
```

Now we see that `y` isn't considered to be ASCII-only, even though it has the exact same encoding and byte sequence as `z` (and as `y` in the previous call sequence that did work):

```
x ASCII-only?: false
y ASCII-only?: false
z ASCII-only?: true
y Encoding: ASCII-8BIT
y Bytes: [1, 0]
z Encoding: ASCII-8BIT
z Bytes: [1, 0]
```

Having not looked at it, it looks like the code range isn't updated and we only get the correct result if `CR_UNKNOWN` hasn't been replaced by some other call that needs the code range.

--
https://bugs.ruby-lang.org/
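
One way to check a given Ruby build for the discrepancy described in the report is to compare `ascii_only?` against a plain scan of the bytes, which cannot be affected by a cached code range. The sketch below is only an illustration: the helper name `ascii_bytes_only?` is hypothetical and not part of Ruby, and the script assumes the same reproduction steps as the report.

```ruby
# Hypothetical helper (not part of Ruby): decides ASCII-only status from the
# raw bytes alone, so it does not depend on any cached code range.
def ascii_bytes_only?(str)
  str.bytes.all? { |b| b < 0x80 }
end

x = "\xFF".dup.force_encoding("binary")
x.ascii_only?                      # force x's code range to be computed first
y = x.succ

expected = ascii_bytes_only?(y)    # answer derived from the bytes
actual   = y.ascii_only?           # answer derived from the code range

puts "y Bytes: #{y.bytes}"
puts "byte scan says ASCII-only?: #{expected}"
puts "y.ascii_only? says:         #{actual}"
puts(expected == actual ? "consistent" : "MISMATCH: stale code range on y")
```

If the two answers disagree, the build presumably still has the stale code range on the result of `String#succ`; on a build that includes the merged fix they should agree.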