From: naruse@...
Date: 2021-04-02T07:01:36+00:00
Subject: [ruby-core:103170] [Ruby master Bug#17729] Fix infinite loop when parsing RUBYLIB with locale-invalid bytes

Issue #17729 has been updated by naruse (Yui NARUSE).

Backport changed from 2.5: REQUIRED, 2.6: REQUIRED, 2.7: DONE, 3.0: REQUIRED to 2.5: REQUIRED, 2.6: REQUIRED, 2.7: DONE, 3.0: DONE

ruby_3_0 1a47de64f44da6d4339ba8b2c5220eeaba82954c merged revision(s) f748b911c9157a0bb86f38280ddfba72a55049b6.

----------------------------------------
Bug #17729: Fix infinite loop when parsing RUBYLIB with locale-invalid bytes
https://bugs.ruby-lang.org/issues/17729#change-91243

* Author: nobu (Nobuyoshi Nakada)
* Status: Closed
* Priority: Normal
* Backport: 2.5: REQUIRED, 2.6: REQUIRED, 2.7: DONE, 3.0: DONE
----------------------------------------
https://github.com/ruby/ruby/pull/4281

> `ruby.c` sets up the interpreter `$LOAD_PATH` by parsing a path
> separator-delimited list of paths from the `RUBYLIB` environment
> variable. The parser delegates to the C standard library function
> `mblen` to advance a pointer into the result of `getenv("RUBYLIB")` to
> break up the list by path separators.
>
> `mblen` is a locale-aware API which is documented to return -1 when it
> encounters an invalid byte sequence for the current LOCALE. When
> invoking the `ruby` CLI with a `RUBYLIB` environment variable containing
> an invalid byte sequence or when Ruby is installed to a path containing
> invalid byte sequences, the interpreter will enter an infinite loop
> during its boot sequence.
>
> For example, passing in an `\xFF` byte when the locale is set to
> `en_US.UTF-8` will result in `mblen` returning -1, which causes the loop
> in `push_include` to spin infinitely.
>
> I have also seen this bug expressed as attempting to allocate a `String`
> with a negative length, which seems to imply that if the result of
> `getenv` is prefixed in memory with a NUL byte or UTF-8-invalid bytes
> greater than `\x7F`, the -1 return value of `mblen` results in a buffer
> under read.
>
> I do not believe this buffer under read to be exploitable because
> depending on the byte sequence, the interpreter will infinite loop or
> the loop will terminate with a negative pointer offset, which when used
> to compute the capacity of an `RString`, will result in an
> `ArgumentError` for a negative capacity.
>
> The fix is to not treat the result of `getenv` as a locale-encoded
> string. The return values of `getenv` are platform strings whose only
> guarantee is that they are NUL-terminated.
>
> This fix is applied in `push_include` and the CYGWIN target-specific
> `push_include_cygwin`.
>
> After this patch is applied, `RUBYLIB` with invalid UTF-8 bytes is
> parsed properly with a UTF-8 locale:
>
> ```console
> $ env RUBYLIB="$(echo -ne "\xFF")" LOCALE="en_US.UTF-8" LC_ALL="en_US.UTF-8" ./ruby -e 'puts $LOAD_PATH.map(&:inspect)'
> `RubyGems' were not loaded.
> `did_you_mean' was not loaded.
> "\xFF"
> "/usr/local/lib/ruby/site_ruby/3.1.0"
> "/usr/local/lib/ruby/site_ruby/3.1.0/x86_64-darwin19"
> "/usr/local/lib/ruby/site_ruby"
> "/usr/local/lib/ruby/vendor_ruby/3.1.0"
> "/usr/local/lib/ruby/vendor_ruby/3.1.0/x86_64-darwin19"
> "/usr/local/lib/ruby/vendor_ruby"
> "/usr/local/lib/ruby/3.1.0"
> "/usr/local/lib/ruby/3.1.0/x86_64-darwin19"
> ```

--
https://bugs.ruby-lang.org/
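
For reference, the idea behind the fix (treating the `getenv` result as a NUL-terminated byte string rather than a locale-encoded one) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code merged in the revisions above: `split_path_list` and `segment_cb` are hypothetical names, and the separator is passed in by the caller rather than taken from the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one path segment as (pointer, length). */
typedef void (*segment_cb)(const char *seg, size_t len, void *ctx);

/*
 * Split a NUL-terminated path list on `sep`, scanning plain bytes.
 * No locale-aware function such as mblen() is called, so bytes that are
 * invalid in the current locale are passed through unchanged and the scan
 * pointer always moves forward. Empty segments are skipped.
 * Returns the number of segments reported.
 */
static size_t
split_path_list(const char *list, char sep, segment_cb cb, void *ctx)
{
    size_t count = 0;
    const char *p = list;

    assert(list != NULL);

    while (*p) {
        const char *end;

        if (*p == sep) {                 /* skip runs of separators */
            p++;
            continue;
        }
        end = strchr(p, sep);            /* byte search, independent of locale */
        if (!end) end = p + strlen(p);   /* last segment runs up to the NUL */
        if (cb) cb(p, (size_t)(end - p), ctx);
        count++;
        p = end;                         /* end > p here, so the loop makes progress */
    }
    return count;
}
```

With this sketch, `split_path_list("a:\xFF:b", ':', NULL, NULL)` reports three segments, and the `\xFF` byte is simply a one-byte segment. Because the loop never asks the locale how long a "character" is, there is no -1 return value that could stall or rewind the pointer.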