From: "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" <ruby-core@...>
Date: 2023-12-28T05:55:20+00:00
Subject: [ruby-core:115947] [Ruby master Bug#20101] rb_file_open and rb_io_fdopen don't perform CRLF -> LF conversion when encoding is set

Issue #20101 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

----------------------------------------
Bug #20101: rb_file_open and rb_io_fdopen don't perform CRLF -> LF conversion when encoding is set
https://bugs.ruby-lang.org/issues/20101

* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Assignee: kjtsanaktsidis (KJ Tsanaktsidis)
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When opening a file with `File.open`, as long as `'b'` is not set in the mode, Ruby will perform CRLF -> LF conversion on Windows when reading text files - i.e. CRLF line endings on disk get converted to Ruby strings with only "\n" in them. If you explicitly set the encoding with `IO#set_encoding`, this still works properly.
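
A minimal sketch of the `File.open` behaviour described above. Because the default text-mode conversion only applies on Windows, the sketch uses mode `"rt"` (which requests universal newline conversion on every platform) as a stand-in for Windows text mode; that substitution is an assumption made for portability, and `read_crlf_sample` is a hypothetical helper name, not something from this report.

```ruby
require "tempfile"

# Hypothetical helper: writes a small CRLF file and reads it back three ways.
def read_crlf_sample
  Tempfile.create("crlf_sample") do |tmp|
    tmp.binmode                    # write the bytes exactly as given
    tmp.write("one\r\ntwo\r\n")
    tmp.close

    # Binary mode: no newline conversion, CRLF pairs are preserved.
    raw = File.open(tmp.path, "rb") { |f| f.read }

    # Text mode with universal newline conversion: CRLF becomes LF.
    text = File.open(tmp.path, "rt") { |f| f.read }

    # Same as above, but with the encoding set explicitly after opening.
    # The report states that conversion keeps working in this case.
    reencoded = File.open(tmp.path, "rt") do |f|
      f.set_encoding("UTF-8")
      f.read
    end

    { raw: raw, text: text, reencoded: reencoded }
  end
end
```

Comparing `read_crlf_sample[:raw]` with `read_crlf_sample[:text]` shows whether the CRLF pairs survived the read.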

If you open the file in C with either the `rb_io_fdopen` or `rb_file_open` APIs in text mode, CRLF -> LF conversion also works. However, if you then call `IO#set_encoding` on this file, the CRLF -> LF conversion stops happening.

Concretely, this means that the conversion doesn't happen in the following circumstances:
  * When loading Ruby files with `require` (which calls `rb_io_fdopen`)
  * When parsing Ruby files with `RubyVM::AbstractSyntaxTree` (which calls `rb_file_open`)

This causes the ErrorHighlight tests to fail on Windows if git has checked them out with CRLF line endings: the error messages under test wind up with literal \r\n sequences in them, because the iseq text from the parser contains strings that have not been newline-converted.
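
To check whether a given checkout is affected, it can help to see whether a file actually has CRLF line endings on disk. The following is a small diagnostic sketch; `crlf_line_endings?` is a hypothetical helper name and is not part of Ruby or of the test suite.

```ruby
# Hypothetical helper: true if the file at `path` contains at least one
# CRLF pair. File.binread is used so that no newline conversion can hide
# the carriage returns, regardless of platform.
def crlf_line_endings?(path)
  File.binread(path).include?("\r\n".b)
end
```

For example, `crlf_line_endings?(__FILE__)` reports whether the current script was checked out with CRLF line endings.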

This seems to happen because, in `File.open`, the file's `encflags` get the flag `ECONV_DEFAULT_NEWLINE_DECORATOR` in `rb_io_extract_modeenc`. That function isn't called for `rb_io_fdopen` or `rb_file_open`, so their `encflags` never get `ECONV_DEFAULT_NEWLINE_DECORATOR` set. Without that flag, the `NEED_NEWLINE_DECORATOR_ON_READ_CHECK` macro switches the underlying file descriptor to binary mode.
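
As a Ruby-level analogue of what the newline decorator flag does, the sketch below uses `String#encode` with and without the `universal_newline: true` option. This is only an illustration of the presence or absence of a newline decorator; it does not exercise the C code path in `io.c`, and `decode_with_flags` is a hypothetical name.

```ruby
# Hypothetical helper: mimics "decorator flag set" vs "decorator flag missing".
def decode_with_flags(bytes, newline_decorator:)
  if newline_decorator
    # With a universal newline decorator, CRLF (and bare CR) become LF.
    bytes.encode("UTF-8", universal_newline: true)
  else
    # Without any newline decorator, the bytes pass through unchanged.
    bytes.encode("UTF-8")
  end
end

with_flag    = decode_with_flags("one\r\ntwo\r\n", newline_decorator: true)
without_flag = decode_with_flags("one\r\ntwo\r\n", newline_decorator: false)
```

Here `with_flag` corresponds to the `File.open` path, and `without_flag` to the behaviour reported for `rb_io_fdopen` and `rb_file_open` after `IO#set_encoding`.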



-- 
https://bugs.ruby-lang.org/