[ruby-core:108816] [Ruby master Bug#16143] BOM UTF-8 is not removed after rewind

From: "mame (Yusuke Endoh)" <noreply@...>
Date: 2022-06-09 06:51:50 UTC
List: ruby-core #108816
Issue #16143 has been updated by mame (Yusuke Endoh).

Status changed from Open to Feedback

I think this issue can be easily worked around by using `IO#set_encoding_by_bom` which was introduced by #15210.

``` ruby
require 'csv'

csv = CSV.open('bom_test.csv', 'r:BOM|UTF-8', headers: true)
p csv.shift  #=> #<CSV::Row "Name":"John Doe" "City":"New York">

# workaround: rewind, then re-run BOM detection on the underlying IO
csv.rewind
csv.binmode
csv.to_io.set_encoding_by_bom

p csv.shift  #=> #<CSV::Row "Name":"John Doe" "City":"New York">
```

Do we really need any change? It would be surprising to me if `IO#pos` were non-zero after `IO#rewind`. Something like `IO#rewind(bom: true)`, which @akr proposes, may be less surprising. But IMHO, `IO#set_encoding_by_bom` is enough.
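For plain `IO` objects, the same workaround can be wrapped in a small helper. The following is only a sketch: `rewind_and_strip_bom`, its `fallback:` parameter, and `read_first_line_twice` are hypothetical names invented for illustration, not part of Ruby. It assumes Ruby 2.7 or later (where `IO#set_encoding_by_bom` exists) and that falling back to UTF-8 is acceptable when no BOM is found.

```ruby
require "tempfile"

# Hypothetical helper, not part of Ruby's IO API.
# Rewinds the stream and re-runs BOM detection so the BOM is consumed again.
def rewind_and_strip_bom(io, fallback: Encoding::UTF_8)
  io.rewind
  # set_encoding_by_bom needs binmode and no text encoding set;
  # binmode resets the external encoding to ASCII-8BIT.
  io.binmode
  detected = io.set_encoding_by_bom
  # Assumption: without a BOM, treat the data as `fallback`.
  io.set_encoding(fallback) unless detected
  io
end

# Reads the first line, rewinds via the helper, and reads it again.
def read_first_line_twice
  Tempfile.create("bom_demo") do |tmp|
    File.binwrite(tmp.path, "\xEF\xBB\xBFa,b,c\n")
    File.open(tmp.path, "r:BOM|UTF-8") do |f|
      first = f.gets
      rewind_and_strip_bom(f)
      second = f.gets
      [first, second]
    end
  end
end

first, second = read_first_line_twice
p first == second
```

Calling `binmode` on every rewind matters here, because `set_encoding_by_bom` raises if a text encoding is already set from a previous call.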

----------------------------------------
Bug #16143: BOM UTF-8 is not removed after rewind
https://bugs.ruby-lang.org/issues/16143#change-97893

* Author: Dirk (Dirk Meier-Eickhoff)
* Status: Feedback
* Priority: Normal
* ruby -v: ruby 2.6.2p47 (2019-03-13 revision 67232) [x86_64-darwin17]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN
----------------------------------------
I have a CSV file with "forced quotes" and a UTF-8 BOM (\xEF\xBB\xBF) which CSV cannot read after a `rewind`. I get "CSV::MalformedCSVError: Illegal quoting in line 1."

My UTF-8 CSV file with BOM:
``` ruby
File.open('bom_test.csv', 'w') do |io|
  io.write("\xEF\xBB\xBF\"Name\",\"City\"\n\"John Doe\",\"New York\"")
end
```

Reproduce error:

``` ruby
require 'csv'

# Case 1: CSV#shift after rewind
csv = CSV.open('bom_test.csv', 'r:BOM|UTF-8', headers: true)
csv.shift
# => #<CSV::Row "Name":"John Doe" "City":"New York">
csv.rewind
csv.shift
# => CSV::MalformedCSVError (Illegal quoting in line 1.)

# Case 2: CSV#readline after rewind
csv = CSV.open('bom_test.csv', 'r:BOM|UTF-8', headers: true)
csv.readline
# => #<CSV::Row "Name":"John Doe" "City":"New York">
csv.rewind
csv.readline
# => CSV::MalformedCSVError (Illegal quoting in line 1.)
```

Sutou Kouhei posted another reproducible example on my original issue in the CSV gem: https://github.com/ruby/csv/issues/103
``` ruby
File.open("/tmp/a.txt", "w") do |x|
  x.puts("\xEF\xBB\xBFa,b,c")
end
File.open("/tmp/a.txt", "r:BOM|UTF-8") do |x|
  p x.gets.unpack("U*") # => [97, 44, 98, 44, 99, 10]
  x.rewind
  p x.gets.unpack("U*") # => [65279, 97, 44, 98, 44, 99, 10]
end
```

He said: "This [CSV] library rely on Ruby's BOM processing. It seems that Ruby's BOM processing doesn't support rewind."
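One way to sidestep this for a plain `File`, without touching its encoding, is to skip the BOM bytes by hand after rewinding. The sketch below assumes a UTF-8 BOM only; `rewind_past_utf8_bom` and `codepoints_before_and_after_rewind` are hypothetical names made up for illustration, not existing Ruby methods.

```ruby
require "tempfile"

# Hypothetical helper, not part of Ruby's IO API.
# Rewinds, then leaves the position just past a UTF-8 BOM if one is present.
def rewind_past_utf8_bom(io)
  io.rewind
  head = io.read(3)                              # byte read; returns a binary string or nil
  io.rewind unless head == "\xEF\xBB\xBF".b      # no BOM: go back to the start
  io
end

# Mirrors the example above: read a line, rewind, read it again.
def codepoints_before_and_after_rewind
  Tempfile.create("bom_rewind") do |tmp|
    File.binwrite(tmp.path, "\xEF\xBB\xBFa,b,c\n")
    File.open(tmp.path, "r:BOM|UTF-8") do |x|
      first = x.gets.unpack("U*")
      rewind_past_utf8_bom(x)
      second = x.gets.unpack("U*")
      [first, second]
    end
  end
end

first, second = codepoints_before_and_after_rewind
p first   # => [97, 44, 98, 44, 99, 10]
p second  # => [97, 44, 98, 44, 99, 10]
```

Unlike the `set_encoding_by_bom` approach, this keeps the encoding chosen at open time, but it only handles the UTF-8 BOM.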

My expectation is that reading a file with a BOM always returns the same content, whether on the first read or after a rewind.



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
