[#118784] [Ruby master Feature#20664] Add `before` and `until` options to Enumerator.produce — "knu (Akinori MUSHA) via ruby-core" <ruby-core@...>

Issue #20664 has been reported by knu (Akinori MUSHA).

12 messages 2024/08/03

[#118791] [Ruby master Bug#20666] Segmentation fault instead of LoadError exception — "ErezGeva2@... (Erez Geva) via ruby-core" <ruby-core@...>

Issue #20666 has been reported by ErezGeva2@gmail.com (Erez Geva).

9 messages 2024/08/04

[#118811] [Ruby master Feature#20669] Add error classes to differentiate Marshal ArgumentErrors — "olleolleolle (Olle Jonsson) via ruby-core" <ruby-core@...>

Issue #20669 has been reported by olleolleolle (Olle Jonsson).

7 messages 2024/08/08

[#118844] [Ruby master Feature#20676] Pathnames aren't Comparable — "gmcgibbon (Gannon McGibbon) via ruby-core" <ruby-core@...>

SXNzdWUgIzIwNjc2IGhhcyBiZWVuIHJlcG9ydGVkIGJ5IGdtY2dpYmJvbiAoR2Fubm9uIE1jR2li

8 messages 2024/08/13

[#118879] [Ruby master Bug#20682] Slave PTY output is lost after a child process exits in macOS — "ono-max (Naoto Ono) via ruby-core" <ruby-core@...>

Issue #20682 has been reported by ono-max (Naoto Ono).

9 messages 2024/08/19

[#118932] [Ruby master Bug#20693] Dir.tmpdir should perform a real access check before warning about writability — "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" <ruby-core@...>

Issue #20693 has been reported by kjtsanaktsidis (KJ Tsanaktsidis).

9 messages 2024/08/22

[#118979] [Ruby master Feature#20705] Should "0.E-9" be a valid float value? — "kou (Kouhei Sutou) via ruby-core" <ruby-core@...>

Issue #20705 has been reported by kou (Kouhei Sutou).

11 messages 2024/08/29

[#118983] [Ruby master Bug#20706] Can't build Ruby on macOS Sonoma and Sequoia due to: ignoring duplicate libraries, archive member '/' not a mach-o file in libruby.3.3-static.a — "wkoszek (Adam Koszek) via ruby-core" <ruby-core@...>

Issue #20706 has been reported by wkoszek (Adam Koszek).

7 messages 2024/08/29

[ruby-core:118822] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors

From: "Eregon (Benoit Daloze) via ruby-core" <ruby-core@...>
Date: 2024-08-09 19:31:42 UTC
List: ruby-core #118822
Issue #20669 has been updated by Eregon (Benoit Daloze).


Isn't it fair to assume any Exception from calling `Marshal.load` is an error while de-marshaling the data?
I think it is.

If it's about `serializer.load(value)` and `serializer` not being always Marshal, that's easy to deal with: just do a `if serializer == Marshal` and `rescue` things differently there if you want to wrap in a `UnmarshalError`.

----------------------------------------
Feature #20669: Add Marshal::MarshalError class to differentiate ArgumentErrors
https://bugs.ruby-lang.org/issues/20669#change-109388

* Author: olleolleolle (Olle Jonsson)
* Status: Open
----------------------------------------
Make it possible to differentiate general non-marshal errors from failures to decode Marshal data using a specific exception class.

## Background

There are a variety of error conditions that can cause Marshal to fail, including both corrupt data, and between versions of Ruby/Marshal binary formats, e.g.

```
> Marshal.load("foobar")
<internal:marshal>:34:in `load': incompatible marshal file format (can't be read) (TypeError)

> Marshal.load(Marshal.dump(Object.new).slice(0, 10))
<internal:marshal>:34:in `load': marshal data too short (ArgumentError)

> MyThing = Struct.new(:name, :age)
=> MyThing
> Marshal.dump(MyThing.new("Alice", 20))
=> "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"

(in a separate session)
> Marshal.load "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19"
<internal:marshal>:34:in `load': undefined class/module MyThing (ArgumentError)
```

When data is corrupt, or incompatible, Marshal may raise either `ArgumentError` or `TypeError` with various messages. Not all errors are from `marshal.c`.

Generally, `TypeError` and `ArgumentError` are used to indicate some semantic problem with the program, rather than a runtime issue with the data. For example, `ArgumentError` may be used to indicate less than the required number of arguments passed to a method, and `TypeError` may indicate than the wrong data type was given as an argument.

Dalli is a library for accessing memcached servers. By default it uses Marshal to store serialised Ruby objects. However, when updating major versions of Ruby, or application code, `Marshal.load` may start raising `TypeError` or `ArgumentError`. This class of error needs to be handled differently from "normal" `TypeError` and `ArgumentError`. As such, the only way to differentiate right now is to pattern match the exception message, [which is done here](https://github.com/petergoldstein/dalli/blob/1d4cbfc78e6470ad57a883801a783bf30890c400/lib/dalli/protocol/value_serializer.rb#L36-L72):

```ruby
      # TODO: Some of these error messages need to be validated.  It's not obvious
      # that all of them are actually generated by the invoked code
      # in current systems
      # rubocop:disable Layout/LineLength
      TYPE_ERR_REGEXP = %r{needs to have method `_load'|exception class/object expected|instance of IO needed|incompatible marshal file format}.freeze
      ARGUMENT_ERR_REGEXP = /undefined class|marshal data too short/.freeze
      NAME_ERR_STR = 'uninitialized constant'
      # rubocop:enable Layout/LineLength

      def retrieve(value, bitflags)
        serialized = (bitflags & FLAG_SERIALIZED) != 0
        serialized ? serializer.load(value) : value
      rescue TypeError => e
        filter_type_error(e)
      rescue ArgumentError => e
        filter_argument_error(e)
      rescue NameError => e
        filter_name_error(e)
      end

      def filter_type_error(err)
        raise err unless TYPE_ERR_REGEXP.match?(err.message)

        raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
      end

      def filter_argument_error(err)
        raise err unless ARGUMENT_ERR_REGEXP.match?(err.message)

        raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
      end

      def filter_name_error(err)
        raise err unless err.message.include?(NAME_ERR_STR)

        raise UnmarshalError, "Unable to unmarshal value: #{err.message}"
      end
```

Unfortunately, there have been several incidents where the complexity of the current behaviour has lead to outages in production.

## Proposal

Ideally, the above code could use a specific (one or more) error class for detecting failed `Marshal.load`.

```ruby
      def retrieve(value, bitflags)
        serialized = (bitflags & FLAG_SERIALIZED) != 0
        serialized ? serializer.load(value) : value
      rescue Marshal::LoadError => error
        raise UnmarshalError, "Unable to unmarshal value!"
      end
```

One difficulty is retaining backwards compatibility. We may be able to create a sub-class of `class LoadError < ArgumentError` but it fundamentally seems wrong to me (should really be a `RuntimeError`).

## See also

- [Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008)
- https://github.com/search?q=Marshal.load+ArgumentError&type=code - Git code search showing some variations of the problem (`rescue ArgumentError` and similar).



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/


In This Thread