From: sawadatsuyoshi@...
Date: 2020-05-09T14:35:12+00:00
Subject: [ruby-core:98231] [Ruby master Bug#16842] `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable

Issue #16842 has been reported by sawa (Tsuyoshi Sawada).

----------------------------------------
Bug #16842: `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable
https://bugs.ruby-lang.org/issues/16842

* Author: sawa (Tsuyoshi Sawada)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.8.0dev (2020-05-09T13:24:57Z master 889b0fe46f) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
The UTF-8 character U+0085 (NEXT LINE) is not printable, but `inspect` prints the character verbatim (within double quotation marks):

```ruby
0x85.chr(Encoding::UTF_8).match?(/\p{print}/) # => false
0x85.chr(Encoding::UTF_8).inspect # => "\" \""
```

My understanding is that non-printable characters are not printed verbatim with `inspect`:

```ruby
"\n".match?(/\p{print}/) # => false
"\n".inspect # => "\"\\n\""
```

while printable characters are:

```ruby
"a".match?(/\p{print}/) # => true
"a".inspect # => "\"a\""
```

I ran the following script, and found that U+0085 is the only character within the range U+0000 to U+FFFF that behaves like this.

```ruby
def verbatim?(char)
  !char.inspect.start_with?(%r{\"\\[a-z]})
end

def printable?(char)
  char.match?(/\p{print}/)
end

(0x0000..0xffff).each do |i|
  begin
    char = i.chr(Encoding::UTF_8)
  rescue RangeError
    next
  end
  puts '%#x' % i unless verbatim?(char) == printable?(char)
end
```

-- 
https://bugs.ruby-lang.org/
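For checking a single code point by hand, the two predicates from the script above can be combined into one summary. This is only a minimal sketch: the helper name `describe` and the shape of the returned hash are hypothetical, not part of the report, and the result for U+0085 may differ depending on the Ruby version in use.

```ruby
# Hypothetical helper (not from the original report): summarize how one
# code point is classified by \p{print} and how `inspect` renders it.
def describe(codepoint)
  char = codepoint.chr(Encoding::UTF_8)
  {
    codepoint: format('U+%04X', codepoint),
    # Same test as printable? in the report's script.
    printable: char.match?(/\p{print}/),
    # Same test as verbatim? in the report's script: inspect output that
    # begins with a quote, a backslash and a lowercase letter is an escape.
    verbatim: !char.inspect.start_with?(/"\\[a-z]/),
    inspect: char.inspect
  }
end

# LINE FEED, LATIN SMALL LETTER A, and the character in question.
[0x0A, 0x61, 0x85].each { |cp| p describe(cp) }
```

A code point is consistent in the sense of the report when `printable` and `verbatim` agree; the report's observation is that they disagree for U+0085.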