From: "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" <ruby-core@...>
Date: 2023-12-28T05:55:20+00:00
Subject: [ruby-core:115947] [Ruby master Bug#20101] rb_file_open and rb_io_fdopen don't perform CRLF -> LF conversion when encoding is set

Issue #20101 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

----------------------------------------
Bug #20101: rb_file_open and rb_io_fdopen don't perform CRLF -> LF conversion when encoding is set
https://bugs.ruby-lang.org/issues/20101

* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Assignee: kjtsanaktsidis (KJ Tsanaktsidis)
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
When opening a file with `File.open`, as long as `'b'` is not set in the mode, Ruby will perform CRLF -> LF conversion on Windows when reading text files, i.e. CRLF line endings on disk get converted to Ruby strings with only "\n" in them. If you explicitly set the encoding with `IO#set_encoding`, this still works properly.

If you open the file in C with either the `rb_io_fdopen` or `rb_file_open` APIs in text mode, CRLF -> LF conversion also works. However, if you then call `IO#set_encoding` on this file, the CRLF -> LF conversion stops happening.

Concretely, this means that the conversion doesn't happen in the following circumstances:

* When loading Ruby files with `require` (which calls `rb_io_fdopen`)
* When parsing Ruby files with `RubyVM::AbstractSyntaxTree` (which calls `rb_file_open`)

This then causes the ErrorHighlight tests to fail on Windows if git has checked them out with CRLF line endings: the error messages being tested wind up with literal \r\n sequences in them, because the iseq text from the parser contains un-newline-converted strings.
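For reference, the `File.open` behaviour described above can be observed from plain Ruby with a sketch like the following. The helper name `read_crlf_sample` and the sample text are made up for illustration and are not part of the report. The script only exercises the `File.open` path; reproducing the `rb_io_fdopen` / `rb_file_open` case needs a C caller, which is not shown here.

```ruby
require "tempfile"

# Hypothetical helper (not from the report): writes CRLF-terminated text to a
# temporary file, then reads it back under three different open modes.
def read_crlf_sample
  Tempfile.create("crlf_sample") do |tmp|
    tmp.binmode                       # make sure the bytes on disk are exactly CRLF
    tmp.write("first\r\nsecond\r\n")
    tmp.flush

    path = tmp.path
    {
      # Binary mode never converts newlines.
      binary: File.open(path, "rb") { |f| f.read },
      # Default text mode: converts CRLF -> LF on Windows only.
      text: File.open(path, "r") { |f| f.read },
      # Text mode plus an explicit IO#set_encoding call.
      text_with_encoding: File.open(path, "r") { |f|
        f.set_encoding("UTF-8")
        f.read
      },
    }
  end
end

read_crlf_sample.each do |mode, text|
  puts "#{mode}: #{text.inspect}"
end
```

On Windows, going by the description above, both text-mode rows should show only "\n", while the binary row keeps "\r\n". On other platforms the default text mode does no newline conversion, so all three rows should keep "\r\n".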
This seems to happen because, in `File.open`, the file's encflags get the flag `ECONV_DEFAULT_NEWLINE_DECORATOR` in `rb_io_extract_modeenc`; however, this method isn't called for `rb_io_fdopen` or `rb_file_open`, so `encflags` doesn't get set to `ECONV_DEFAULT_NEWLINE_DECORATOR`. Without that flag, the underlying file descriptor's mode gets changed to binary mode by the `NEED_NEWLINE_DECORATOR_ON_READ_CHECK` macro.