From: ruby@...
Date: 2017-09-29T13:46:56+00:00
Subject: [ruby-core:83062] [Ruby trunk Bug#13952] String#succ not updating code range

Issue #13952 has been reported by nirvdrum (Kevin Menard).

----------------------------------------
Bug #13952: String#succ not updating code range
https://bugs.ruby-lang.org/issues/13952

* Author: nirvdrum (Kevin Menard)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-linux]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
I'm seeing some strange behavior with `String#succ` and updating code ranges. I haven't yet traced the code to see what the culprit is, but I'm reproducing my findings here so they don't get lost (and maybe someone has a better idea of what's going on.)

This sequence of calls produces the expected output.

```
x = "\xFF".force_encoding("binary")
y = x.succ
z = String.new
z << 0x01 << 0x00

puts "x ASCII-only?: #{x.ascii_only?}"
puts "y ASCII-only?: #{y.ascii_only?}"
puts "z ASCII-only?: #{z.ascii_only?}"
puts "y Encoding: #{y.encoding}"
puts "y Bytes: #{y.bytes}"
puts "z Encoding: #{z.encoding}"
puts "z Bytes: #{z.bytes}"
```

The output is:

```
x ASCII-only?: false
y ASCII-only?: true
z ASCII-only?: true
y Encoding: ASCII-8BIT
y Bytes: [1, 0]
z Encoding: ASCII-8BIT
z Bytes: [1, 0]
```

However, by inserting a call that would force `x` to calculate its code range prior to the `String#succ` call, we get a different set of results:

```
x = "\xFF".force_encoding("binary")
x.ascii_only?
y = x.succ
z = String.new
z << 0x01 << 0x00

puts "x ASCII-only?: #{x.ascii_only?}"
puts "y ASCII-only?: #{y.ascii_only?}"
puts "z ASCII-only?: #{z.ascii_only?}"
puts "y Encoding: #{y.encoding}"
puts "y Bytes: #{y.bytes}"
puts "z Encoding: #{z.encoding}"
puts "z Bytes: #{z.bytes}"
```

Now we see that `y` isn't considered to be ASCII-only, even though it has the exact same encoding and byte sequence as `z` (and as `y` in the previous call sequence that did work):

```
x ASCII-only?: false
y ASCII-only?: false
z ASCII-only?: true
y Encoding: ASCII-8BIT
y Bytes: [1, 0]
z Encoding: ASCII-8BIT
z Bytes: [1, 0]
```

Having not looked at it, it looks like the code range isn't updated and we only get the correct result if `CR_UNKNOWN` hasn't been replaced by some other call that needs the code range.

-- 
https://bugs.ruby-lang.org/
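
One way to separate what the string actually contains from what the cached code range reports is to compare `String#ascii_only?` against a plain scan of the bytes. The snippet below is only a sketch of that idea: `ascii_only_by_bytes?` is a hypothetical helper introduced here for illustration, not something from the report or from Ruby itself, and the `dup` is added solely so the literal can be mutated on interpreters that freeze or chill string literals.

```ruby
# Hypothetical helper (not part of the original report): decide ASCII-only
# status from the raw bytes, without consulting any cached code range.
def ascii_only_by_bytes?(str)
  str.bytes.all? { |byte| byte < 0x80 }
end

x = "\xFF".dup.force_encoding("binary")
x.ascii_only? # forces x to compute and cache its code range, as in the report
y = x.succ

puts "y Bytes: #{y.bytes}"
puts "y ASCII-only? (String#ascii_only?): #{y.ascii_only?}"
puts "y ASCII-only? (byte scan):          #{ascii_only_by_bytes?(y)}"
```

On an interpreter that shows the behavior described above (the report names ruby 2.4.2p198), the last two lines would be expected to disagree; where they agree, the cached code range matches the bytes.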