[ruby-core:98231] [Ruby master Bug#16842] `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable
From: sawadatsuyoshi@...
Date: 2020-05-09 14:35:12 UTC
List: ruby-core #98231
Issue #16842 has been reported by sawa (Tsuyoshi Sawada).
----------------------------------------
Bug #16842: `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable
https://bugs.ruby-lang.org/issues/16842
* Author: sawa (Tsuyoshi Sawada)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.8.0dev (2020-05-09T13:24:57Z master 889b0fe46f) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
The UTF-8 character U+0085 (NEXT LINE) is not printable, but `inspect` prints the character verbatim (inside the double quotes):
```ruby
0x85.chr(Encoding::UTF_8).match?(/\p{print}/) # => false
0x85.chr(Encoding::UTF_8).inspect
# => "\"<U+0085>\""   (<U+0085> stands for the raw NEXT LINE character, which renders as a line break)
```
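Because a raw U+0085 is invisible or shows up as a line break in most terminals, it can be easier to look at the code points of the `inspect` result than at the result itself. The following is only a minimal sketch; `inspect_codepoints` is a hypothetical helper name introduced here for illustration, not something from Ruby or from the original report, and what it prints for U+0085 may depend on the Ruby version and on the default external encoding.

```ruby
# Hypothetical helper (not part of Ruby): returns the code points of
# `str.inspect` as "U+XXXX" strings, so that an unescaped control
# character cannot hide in terminal output.
def inspect_codepoints(str)
  str.inspect.codepoints.map { |cp| format('U+%04X', cp) }
end

nel = 0x85.chr(Encoding::UTF_8)

# If U+0085 is emitted verbatim, the list is U+0022, U+0085, U+0022.
# If it is escaped, the second entry is U+005C (a backslash) instead.
p inspect_codepoints(nel)

# For comparison, a character that `inspect` is known to escape:
p inspect_codepoints("\n")
```

A backslash (U+005C) right after the opening quote indicates that the character was escaped; its absence indicates that the character was passed through as-is.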
My understanding is that non-printable characters are not printed verbatim with `inspect`:
```ruby
"\n".match?(/\p{print}/) # => false
"\n".inspect # => "\"\\n\""
```
while printable characters are:
```ruby
"a".match?(/\p{print}/) # => true
"a".inspect # => "\"a\""
```
I ran the following script and found that U+0085 is the only character in the range U+0000 to U+FFFF for which the two disagree, that is, the only one that is not printable yet is printed verbatim by `inspect`.
```ruby
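# True unless the inspected form starts with a backslash escape such as \n, \u or \x.
def verbatim?(char)
  !char.inspect.start_with?(%r{\"\\[a-z]})
end

# True if the character matches the `print` property.
def printable?(char)
  char.match?(/\p{print}/)
end

(0x0000..0xffff).each do |i|
  begin
    char = i.chr(Encoding::UTF_8)
  rescue RangeError
    # Surrogate code points are not valid in UTF-8; skip them.
    next
  end
  puts '%#x' % i unless verbatim?(char) == printable?(char)
end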
```
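The same check can also be written so that it returns the disagreeing code points instead of printing them, which makes it easy to rerun on a sub-range or on another Ruby build. This is a sketch only: `mismatches` is a hypothetical helper name, the two predicates are repeated from the script above so the block runs on its own, and it assumes that the default external encoding is UTF-8 (with a different encoding, `inspect` escapes many more characters and the list grows accordingly).

```ruby
# Same predicates as in the script above, repeated so this block is self-contained.
def verbatim?(char)
  !char.inspect.start_with?(%r{\"\\[a-z]})
end

def printable?(char)
  char.match?(/\p{print}/)
end

# Hypothetical helper: returns the code points in `range` for which
# `verbatim?` and `printable?` disagree.
def mismatches(range)
  range.select do |i|
    begin
      char = i.chr(Encoding::UTF_8)
    rescue RangeError
      next false # surrogate code points are not valid in UTF-8
    end
    verbatim?(char) != printable?(char)
  end
end

# Print every disagreement in the Basic Multilingual Plane.
mismatches(0x0000..0xffff).each { |i| puts format('U+%04X', i) }
```

Restricting the range, for example `mismatches(0x0080..0x00ff)`, narrows the search to the C1 controls and the Latin-1 Supplement, where U+0085 lives.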