[ruby-core:120417] [Ruby master Bug#20526] File.open(encoding: "bom|utf-8") converts "\r\n" to "\n" on Windows
From: "YO4 (Yoshinao Muramatsu) via ruby-core" <ruby-core@...>
Date: 2024-12-26 10:37:40 UTC
List: ruby-core #120417
Issue #20526 has been updated by YO4 (Yoshinao Muramatsu).
There is similar strangeness around encoding specifiers.
Preparation:
```ruby
RUBY_VERSION # => "3.3.5"
File.write("a.txt", "a\r\n")
File.binread("a.txt").bytes # => [97, 13, 13, 10]
```
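The doubled `13` above comes from `File.write` opening the file in text mode on Windows, where each `"\n"` is written as `"\r\n"`. For comparison, here is a minimal sketch of producing a fixture that holds exactly one CRLF by writing in binary mode. The helper name `write_exact` is hypothetical, and this is not the setup used in the experiments in this report.

```ruby
require "tmpdir"

# Hypothetical helper: write content byte-for-byte (no newline translation)
# and return the bytes that actually ended up on disk.
def write_exact(path, content)
  File.binwrite(path, content)
  File.binread(path).bytes
end

Dir.mktmpdir do |dir|
  p write_exact(File.join(dir, "a.txt"), "a\r\n") # => [97, 13, 10]
end
```

Because binary mode bypasses newline translation, this should give the same bytes on every platform.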
Experiments:
```ruby
File.open("a.txt") {|f| f.read.bytes} # => [97, 13, 10] # expected(msvcrt[_*] newline)
File.open("a.txt", "r:utf-8") {|f| f.read.bytes} # => [97, 13, 10] # expected
File.open("a.txt", "r", encoding: "utf-8") {|f| f.read.bytes} # => [97, 13, 10] # expected
File.open("a.txt", encoding: "utf-8") {|f| f.read.bytes} # => [97, 10, 10] # XXX: universal newline enabled?
```
Omitting the mode parameter seems to enable universal newline.
```ruby
File.open("a.txt", "rt:utf-8") {|f| f.read.bytes} # => [97, 10, 10] # expected(universal newline)
File.open("a.txt", "rt:bom|utf-8") {|f| f.read.bytes} # => [97, 10] # XXX
File.open("a.txt", "rt", encoding: "utf-8") {|f| f.read.bytes} # => [97, 10, 10] # expected(universal newline)
File.open("a.txt", "rt", encoding: "bom|utf-8") {|f| f.read.bytes} # => [97, 10] # XXX
```
XXX: This is odd because universal newline and msvcrt newline conversion appear to be applied together, collapsing "\r\r\n" into a single "\n".
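To make the hypothesis concrete, the sketch below models the two conversions with plain string substitution and applies them to the on-disk bytes `"a\r\r\n"`. The method names `crt_text_read` and `universal_newline` are hypothetical, and this is only a model of the observed results, not how Ruby's IO layer is implemented.

```ruby
# Simulated msvcrt text-mode read: CRLF becomes LF, a lone CR is left alone.
def crt_text_read(str)
  str.gsub("\r\n", "\n")
end

# Simulated universal newline: CRLF and a lone CR both become LF.
def universal_newline(str)
  str.gsub(/\r\n?/, "\n")
end

on_disk = "a\r\r\n"
p crt_text_read(on_disk).bytes                    # => [97, 13, 10]
p universal_newline(on_disk).bytes                # => [97, 10, 10]
p universal_newline(crt_text_read(on_disk)).bytes # => [97, 10]
```

Under this model, the `[97, 10]` results match what you would get if both conversions ran in sequence.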
----------------------------------------
Bug #20526: File.open(encoding: "bom|utf-8") converts "\r\n" to "\n" on Windows
https://bugs.ruby-lang.org/issues/20526#change-111198
* Author: kou (Kouhei Sutou)
* Status: Open
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x64-mingw-ucrt]
* Backport: 3.1: REQUIRED, 3.2: REQUIRED, 3.3: REQUIRED
----------------------------------------
I'm not sure whether this is intentional, but it seems that `encoding: "utf-8"` doesn't change newline conversion while `encoding: "bom|utf-8"` does:
```ruby
File.write("a.txt", "a\r\n")
File.read("a.txt").bytes # => [97, 13, 10]
File.open("a.txt", encoding: "utf-8") {|f| f.read.bytes} # => [97, 13, 10]
File.open("a.txt", encoding: "bom|utf-8") {|f| f.read.bytes} # => [97, 10] XXX: \r\n -> \n
File.open("a.txt", encoding: "bom|utf-8", universal_newline: false) {|f| f.read.bytes} # => [97, 13, 10]
```
Note the `XXX:` line in the code above. Is this intentional behavior?
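Besides `universal_newline: false`, one possible way to sidestep the question is to read in binary mode, which disables newline conversion while still allowing BOM detection. A minimal sketch follows; the helper name `read_utf8_keep_newlines` is hypothetical, and the fixture is written with `File.binwrite` so the bytes are the same on any platform.

```ruby
require "tmpdir"

# Hypothetical helper: read a UTF-8 file, dropping a leading BOM if present
# but leaving every newline byte untouched (binary mode, no conversion).
def read_utf8_keep_newlines(path)
  File.open(path, "rb:bom|utf-8") { |f| f.read }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "bom.txt")
  File.binwrite(path, "\xEF\xBB\xBFa\r\n".b) # UTF-8 BOM, then "a\r\n"
  p read_utf8_keep_newlines(path).bytes      # => [97, 13, 10]
end
```

This does not answer whether the text-mode behavior is intended; it only avoids depending on it.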
--
https://bugs.ruby-lang.org/