From: "olleolleolle (Olle Jonsson) via ruby-core" Date: 2024-10-03T09:46:13+00:00 Subject: [ruby-core:119431] [Ruby master Feature#20669] Add Marshal::MarshalError class to differentiate ArgumentErrors Issue #20669 has been updated by olleolleolle (Olle Jonsson). In the meantime, Dalli merged a change that sounds a lot like Eregon's comment. https://github.com/petergoldstein/dalli/pull/1011 My immediate needs are met, and I will close this feature request. ---------------------------------------- Feature #20669: Add Marshal::MarshalError class to differentiate ArgumentErrors https://bugs.ruby-lang.org/issues/20669#change-110046 * Author: olleolleolle (Olle Jonsson) * Status: Open ---------------------------------------- Make it possible to differentiate general non-marshal errors from failures to decode Marshal data using a specific exception class. ## Background There are a variety of error conditions that can cause Marshal to fail, including both corrupt data, and between versions of Ruby/Marshal binary formats, e.g. ``` > Marshal.load("foobar") :34:in `load': incompatible marshal file format (can't be read) (TypeError) > Marshal.load(Marshal.dump(Object.new).slice(0, 10)) :34:in `load': marshal data too short (ArgumentError) > MyThing = Struct.new(:name, :age) => MyThing > Marshal.dump(MyThing.new("Alice", 20)) => "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19" (in a separate session) > Marshal.load "\x04\bS:\fMyThing\a:\tnameI\"\nAlice\x06:\x06ET:\bagei\x19" :34:in `load': undefined class/module MyThing (ArgumentError) ``` When data is corrupt, or incompatible, Marshal may raise either `ArgumentError` or `TypeError` with various messages. Not all errors are from `marshal.c`. Generally, `TypeError` and `ArgumentError` are used to indicate some semantic problem with the program, rather than a runtime issue with the data. For example, `ArgumentError` may be used to indicate less than the required number of arguments passed to a method, and `TypeError` may indicate than the wrong data type was given as an argument. Dalli is a library for accessing memcached servers. By default it uses Marshal to store serialised Ruby objects. However, when updating major versions of Ruby, or application code, `Marshal.load` may start raising `TypeError` or `ArgumentError`. This class of error needs to be handled differently from "normal" `TypeError` and `ArgumentError`. As such, the only way to differentiate right now is to pattern match the exception message, [which is done here](https://github.com/petergoldstein/dalli/blob/1d4cbfc78e6470ad57a883801a783bf30890c400/lib/dalli/protocol/value_serializer.rb#L36-L72): ```ruby # TODO: Some of these error messages need to be validated. It's not obvious # that all of them are actually generated by the invoked code # in current systems # rubocop:disable Layout/LineLength TYPE_ERR_REGEXP = %r{needs to have method `_load'|exception class/object expected|instance of IO needed|incompatible marshal file format}.freeze ARGUMENT_ERR_REGEXP = /undefined class|marshal data too short/.freeze NAME_ERR_STR = 'uninitialized constant' # rubocop:enable Layout/LineLength def retrieve(value, bitflags) serialized = (bitflags & FLAG_SERIALIZED) != 0 serialized ? serializer.load(value) : value rescue TypeError => e filter_type_error(e) rescue ArgumentError => e filter_argument_error(e) rescue NameError => e filter_name_error(e) end def filter_type_error(err) raise err unless TYPE_ERR_REGEXP.match?(err.message) raise UnmarshalError, "Unable to unmarshal value: #{err.message}" end def filter_argument_error(err) raise err unless ARGUMENT_ERR_REGEXP.match?(err.message) raise UnmarshalError, "Unable to unmarshal value: #{err.message}" end def filter_name_error(err) raise err unless err.message.include?(NAME_ERR_STR) raise UnmarshalError, "Unable to unmarshal value: #{err.message}" end ``` Unfortunately, there have been several incidents where the complexity of the current behaviour has lead to outages in production. ## Proposal Ideally, the above code could use a specific (one or more) error class for detecting failed `Marshal.load`. ```ruby def retrieve(value, bitflags) serialized = (bitflags & FLAG_SERIALIZED) != 0 serialized ? serializer.load(value) : value rescue Marshal::LoadError => error raise UnmarshalError, "Unable to unmarshal value!" end ``` One difficulty is retaining backwards compatibility. We may be able to create a sub-class of `class LoadError < ArgumentError` but it fundamentally seems wrong to me (should really be a `RuntimeError`). ## See also - [Dalli PR where another error message would be added](https://github.com/petergoldstein/dalli/pull/1008) - https://github.com/search?q=Marshal.load+ArgumentError&type=code - Git code search showing some variations of the problem (`rescue ArgumentError` and similar). -- https://bugs.ruby-lang.org/ ______________________________________________ ruby-core mailing list -- ruby-core@ml.ruby-lang.org To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/