[ruby-core:102613] [Ruby master Bug#16842] `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable

From: naruse@...
Date: 2021-02-26 05:43:37 UTC
List: ruby-core #102613
Issue #16842 has been updated by naruse (Yui NARUSE).


The reason U+0085 is categorized as `Print` in Ruby is historical: Oniguruma treats it that way.
https://moriyoshi.hatenablog.com/entry/20090307/1236410006

I'm neutral about the change, but I would like it to come with a detailed comment or a link to this ticket.

----------------------------------------
Bug #16842: `inspect` prints the UTF-8 character U+0085 (NEXT LINE) verbatim even though it is not printable
https://bugs.ruby-lang.org/issues/16842#change-90601

* Author: sawa (Tsuyoshi Sawada)
* Status: Assigned
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* ruby -v: ruby 2.8.0dev (2020-05-09T13:24:57Z master 889b0fe46f) [x86_64-linux]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN
----------------------------------------
The UTF-8 character U+0085 (NEXT LINE) is not printable, but `inspect` prints the character verbatim (within double quotes):

```ruby
0x85.chr(Encoding::UTF_8).match?(/\p{print}/) # => false
0x85.chr(Encoding::UTF_8).inspect
# The raw U+0085 between the escaped quotes renders as a line break:
# => "\"
# \""
```

My understanding is that non-printable characters are not printed verbatim with `inspect`:

```ruby
"\n".match?(/\p{print}/) # => false
"\n".inspect # => "\"\\n\""
```

while printable characters are:

```ruby
"a".match?(/\p{print}/) # => true
"a".inspect # => "\"a\""
```

I ran the following script, and found that U+0085 is the only character within the range U+0000 to U+FFFF that behaves like this.

```ruby
def verbatim?(char)
  !char.inspect.start_with?(%r{\"\\[a-z]})
end

def printable?(char)
  char.match?(/\p{print}/)
end

(0x0000..0xffff).each do |i|
  begin
    char = i.chr(Encoding::UTF_8)
  rescue RangeError
    next
  end
  puts '%#x' % i unless verbatim?(char) == printable?(char)
end
```
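
For readers who want to see in which direction the two checks disagree, the same scan can be sketched so that it returns the mismatches instead of printing them. This is only an illustrative variant: the method name `print_inspect_mismatches` and the shape of the returned hashes are assumptions, not part of the original report, and `filter_map` needs Ruby 2.7 or later.

```ruby
# Hypothetical helper (not from the original report): returns one entry per
# code point in `range` for which "inspect leaves it verbatim" and
# "matches \p{print}" disagree.
def print_inspect_mismatches(range)
  range.filter_map do |i|
    begin
      char = i.chr(Encoding::UTF_8)
    rescue RangeError
      next # code points that cannot be encoded (e.g. surrogates) are skipped
    end

    # Same heuristic as the script above: an escaped character shows up as
    # a backslash plus a lowercase letter right after the opening quote.
    verbatim  = !char.inspect.start_with?(/"\\[a-z]/)
    printable = char.match?(/\p{print}/)
    next if verbatim == printable

    { codepoint: format('U+%04X', i), verbatim: verbatim, printable: printable }
  end
end
```

Calling `print_inspect_mismatches(0x0000..0xffff)` covers the same range as the script above; the result depends on the Ruby build it is run on.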



