[ruby-core:102611] [Ruby master Bug#16842] `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable
From: merch-redmine@...
Date: 2021-02-25 23:54:06 UTC
List: ruby-core #102611
Issue #16842 has been updated by jeremyevans0 (Jeremy Evans).
Assignee set to duerst (Martin Dürst)
Status changed from Open to Assigned
Behavior here seems to be dependent on the encoding:
```
$ LC_ALL=C ruby -e "p 0x85.chr(Encoding::UTF_8).inspect.b"
"\"\\u0085\""
$ LC_ALL=en_US.UTF-8 ruby -e "p 0x85.chr(Encoding::UTF_8).inspect.b"
"\"\xC2\x85\""
```
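The same comparison can be sketched inside a single Ruby process. This assumes the difference between the two runs above comes from the default external encoding that Ruby derives from the locale (US-ASCII under `LC_ALL=C`); `inspect_with_external` is a hypothetical helper written for this illustration, not part of Ruby.

```ruby
# Hypothetical helper: returns str.inspect (as a binary string) computed
# while Encoding.default_external is temporarily set to `external`.
# Assumption: the locale affects inspect only through the default encodings.
def inspect_with_external(str, external)
  saved_external = Encoding.default_external
  saved_internal = Encoding.default_internal
  saved_verbose = $VERBOSE
  $VERBOSE = nil # silence the warning Ruby may print when defaults change
  Encoding.default_internal = nil
  Encoding.default_external = external
  str.inspect.b
ensure
  # Always restore the process-wide settings.
  Encoding.default_external = saved_external
  Encoding.default_internal = saved_internal
  $VERBOSE = saved_verbose
end

nel = 0x85.chr(Encoding::UTF_8)
# Escaped form, matching the LC_ALL=C run above.
p inspect_with_external(nel, Encoding::US_ASCII)
# Raw bytes or an escape, depending on whether the Ruby in use has the fix.
p inspect_with_external(nel, Encoding::UTF_8)
```

Changing `Encoding.default_external` at runtime affects the whole process, so a helper like this is only suitable for experiments.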
I've submitted a pull request to fix the behavior, though the implementation is rather crude: https://github.com/ruby/ruby/pull/4229
@duerst Is there a better fix that handles the Unicode properties differently?
----------------------------------------
Bug #16842: `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable
https://bugs.ruby-lang.org/issues/16842#change-90598
* Author: sawa (Tsuyoshi Sawada)
* Status: Assigned
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* ruby -v: ruby 2.8.0dev (2020-05-09T13:24:57Z master 889b0fe46f) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
The UTF-8 character U+0085 (NEXT LINE) is not printable, but `inspect` prints the character verbatim (within double quotation marks):
```ruby
0x85.chr(Encoding::UTF_8).match?(/\p{print}/) # => false
0x85.chr(Encoding::UTF_8).inspect
#=> "\"
\""
```
My understanding is that non-printable characters are not printed verbatim with `inspect`:
```ruby
"\n".match?(/\p{print}/) # => false
"\n".inspect # => "\"\\n\""
```
while printable characters are:
```ruby
"a".match?(/\p{print}/) # => true
"a".inspect # => "\"a\""
```
I ran the following script, and found that U+0085 is the only character within the range U+0000 to U+FFFF that behaves like this.
```ruby
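# True when inspect leaves the character as-is instead of emitting a
# backslash escape such as \n, \x.. or \u....
def verbatim?(char)
  !char.inspect.start_with?(%r{\"\\[a-z]})
end

# True when the character matches the \p{print} property.
def printable?(char)
  char.match?(/\p{print}/)
end

(0x0000..0xffff).each do |i|
  begin
    char = i.chr(Encoding::UTF_8)
  rescue RangeError
    next # skip code points that are not valid in UTF-8 (surrogates)
  end
  # Report code points where the two predicates disagree.
  puts '%#x' % i unless verbatim?(char) == printable?(char)
end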
```
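For re-running the check on a narrower range or under a different locale, the script can be wrapped in a method that returns the disagreeing code points instead of printing them. This is only a sketch: `inspect_mismatches` is a hypothetical name, and the logic is the same `inspect` versus `\p{print}` comparison as above.

```ruby
# Hypothetical wrapper around the check from the report: returns the code
# points in `range` for which "inspect prints it verbatim" and
# "matches \p{print}" disagree.
def inspect_mismatches(range)
  range.each_with_object([]) do |i, found|
    begin
      char = i.chr(Encoding::UTF_8)
    rescue RangeError
      next # surrogates cannot be encoded in UTF-8
    end
    verbatim = !char.inspect.start_with?(%r{\"\\[a-z]})
    printable = char.match?(/\p{print}/)
    found << i if verbatim != printable
  end
end

# Scan the first 256 code points, which include U+0085.
inspect_mismatches(0x0000..0x00ff).each { |i| puts '%#x' % i }
```

Because `inspect` escapes differently depending on the default external encoding, the result for non-ASCII code points is expected to vary with the locale the script runs under.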