[ruby-core:118314] [Ruby master Bug#20578] Tokenizing string literal that have newline and invalid escape is wrong
From: "tompng (tomoya ishida) via ruby-core" <ruby-core@...>
Date: 2024-06-13 12:37:33 UTC
List: ruby-core #118314
Issue #20578 has been reported by tompng (tomoya ishida).
----------------------------------------
Bug #20578: Tokenizing string literal that have newline and invalid escape is wrong
https://bugs.ruby-lang.org/issues/20578
* Author: tompng (tomoya ishida)
* Status: Open
* ruby -v: ruby 3.4.0dev (2024-06-13T09:49:46Z master 8b843b0dc7) [x86_64-linux]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
Tokenizing string literal that have newline and invalid escape is wrong
When a string literal includes `\n` and an invalid escape after it, the tokenize result is wrong.
~~~ruby
Ripper.tokenize "\"hello\\x world"
# => ["\"", "hello\\x", " world"] # looks good
Ripper.tokenize "\"\nhello\\x world"
# => ["\"", "\n world", "hello\\x"] # order is reversed
~~~
These invalid escapes are also tokenized incorrectly:
~~~ruby
Ripper.tokenize("\"\n\\Cxx\"") #=> ["\"", "\nx", "\\Cx", "\""]
Ripper.tokenize("\"\n\\Mxx\"") #=> ["\"", "\nx", "\\Mx", "\""]
Ripper.tokenize("\"\n\\c\\cx\"") #=> ["\"", "\nx", "\\c\\c", "\""]
Ripper.tokenize("\"\n\\ux\"") #=> ["\"", "\nx", "\""]
Ripper.tokenize("\"\n\\xx\"") #=> ["\"", "\nx", "\\x", "\""]
~~~
These other kinds of literals are also tokenized incorrectly:
~~~ruby
Ripper.tokenize("<<A\n\n\\xyz") #=> ["<<A", "\n", "\nyz", "\\x"]
Ripper.tokenize("%(\n\\xyz)") #=> ["%(", "\nyz", "\\x", ")"]
Ripper.tokenize("%Q(\n\\xyz)") #=> ["%Q(", "\nyz", "\\x", ")"]
Ripper.tokenize(":\"\n\\xyz\"") #=> [":\"", "\nyz", "\\x", "\""]
~~~
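One way to see where the order goes wrong might be to compare the positions reported by `Ripper.lex` with the positions implied by the token text itself. This is only a sketch: the helper name `misplaced_tokens` is hypothetical (not part of Ripper), and it assumes columns are counted in bytes and that tokens are expected to be contiguous.

```ruby
require 'ripper'

# Hypothetical helper: returns the Ripper.lex entries whose reported
# [line, column] differs from the position implied by the text of all
# earlier tokens. Assumes columns are byte offsets within the line.
def misplaced_tokens(src)
  line = 1
  col = 0
  Ripper.lex(src).reject do |(tok_line, tok_col), _event, tok, _state|
    in_place = (tok_line == line && tok_col == col)
    bytes = tok.b
    if (last_nl = bytes.rindex("\n"))
      # Token contains a newline: move to the line after its last newline.
      line += bytes.count("\n")
      col = bytes.bytesize - last_nl - 1
    else
      col += bytes.bytesize
    end
    in_place
  end
end

# Inspect one of the inputs from this report (output depends on the Ruby build).
pp misplaced_tokens("%(\n\\xyz)")
```

Because the expected position is advanced by token text only, one misplaced token can make later entries show up as well, so the first entry in the result is the one to look at.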
I encountered this while typing a valid string literal into IRB
~~~ruby
irb(main):001> "
irb(main):002> \x█
~~~
Another invalid escape sequence disappears from the tokenize result entirely:
~~~ruby
Ripper.tokenize('"\u{123')
# => ["\""]
~~~
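A minimal sketch of a check that would catch both the reordered and the dropped tokens, assuming the tokens returned by `Ripper.tokenize` are expected to concatenate back to the original source. The helper name `tokens_roundtrip?` is hypothetical, not part of Ripper.

```ruby
require 'ripper'

# Hypothetical helper: true when the tokens, joined in the order returned,
# reproduce the source exactly (nothing reordered and nothing dropped).
def tokens_roundtrip?(src)
  Ripper.tokenize(src).join == src
end

p tokens_roundtrip?("a = 1\n")                 # plain, valid code
p tokens_roundtrip?("\"\nhello\\x world")      # reordered case from this report
p tokens_roundtrip?('"\u{123')                 # dropped-escape case from this report
```

The results for the last two inputs depend on whether the Ruby build in use has this bug, so no expected output is shown for them.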
-- 
https://bugs.ruby-lang.org/