From: marcandre-ruby-core@...
Date: 2020-06-04T07:29:57+00:00
Subject: [ruby-core:98651] [Ruby master Feature#16352] Modify Marshal to dump objects larger than 2 GiB

Issue #16352 has been updated by marcandre (Marc-Andre Lafortune).

Couldn't we dedicate a special "size" value to indicate "extended marshal size" (say `SIZEOF_LONG - 1`), such that compatibility with all current and future marshal dumps is maintained, with the exception of a marshal object that would actually happen to have exactly a size of `SIZEOF_LONG - 1`?

```
def marshal_dump
  if size < SIZEOF_LONG - 1
    # business as usual, proceed with old dump
  else
    # write the sentinel, then the real size as a 64-bit integer;
    # presumably the rest of the output is similar...
    io << SIZEOF_LONG - 1 << size_as_int_64
  end
end

def marshal_load
  io >> size
  if size == SIZEOF_LONG - 1
    # Assume new format: the real size follows as a 64-bit integer
    size = io.read_int_64
    # ...
  else
    # ... as before
  end
end
```

----------------------------------------
Feature #16352: Modify Marshal to dump objects larger than 2 GiB
https://bugs.ruby-lang.org/issues/16352#change-85980

* Author: seoanezonjic (Pedro Seoane)
* Status: Open
* Priority: Normal
----------------------------------------
While using the numo-narray gem to handle matrix operations, I got the following error when saving a large matrix:

```
in `dump': long too big to dump (TypeError)
```

The GitHub thread is https://github.com/ruby-numo/numo-narray/issues/144. Digging with the authors, I found the following code that reproduces the error:

```
ruby -e 'Marshal.dump(" "*2**31)'
```

Executed in: ruby 2.7.0dev (2019-11-12T12:03:22Z master 3816622fbe) [x86_64-linux]

The marshal library has a limit based on the constant `SIZEOF_LONG`. This check is performed [here](https://github.com/ruby/ruby/blob/e7ea6e078fecb70fbc91b04878b69f696749afac/marshal.c#L301L321). I don't understand the motivation for this limit. It has a great impact on libraries that need to serialize large objects such as numeric matrices. In this case, the 2 GiB limit is reached easily, and it blocks Ruby development.
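To make the limit a bit more concrete, here is a rough Ruby sketch of the integer encoding that the linked `marshal.c` lines appear to implement. Lengths and small integers are written as a one-byte count followed by at most four payload bytes, so anything outside the signed 32-bit range cannot be represented. The names `encode_marshal_long`, `MARSHAL_LONG_MIN` and `MARSHAL_LONG_MAX` are made up for illustration; only the error message is taken from the report. Treat this as an approximation of the C code, not a copy of it.

```ruby
# Hypothetical constants: the signed 32-bit range the dump format can express.
MARSHAL_LONG_MIN = -(2**31)
MARSHAL_LONG_MAX = 2**31 - 1

# Encode an Integer the way Marshal seems to encode lengths and small integers.
# Returns a binary String; raises TypeError outside the 32-bit range.
def encode_marshal_long(x)
  unless x.between?(MARSHAL_LONG_MIN, MARSHAL_LONG_MAX)
    raise TypeError, "long too big to dump"
  end

  # Zero and small values fit in a single byte.
  return [0].pack("C") if x.zero?
  return [x + 5].pack("C") if x.between?(1, 122)
  return [(x - 5) & 0xff].pack("C") if x.between?(-123, -1)

  # Otherwise: little-endian payload bytes, stopping once only the
  # sign extension (0 or -1) remains. Within range this is at most 4 bytes.
  payload = []
  loop do
    payload << (x & 0xff)
    x >>= 8
    break if x.zero? || x == -1
  end

  # The leading byte is the payload length, negated for negative values.
  count = x.zero? ? payload.size : -payload.size
  ([count & 0xff] + payload).pack("C*")
end
```

With this encoding the largest length that can be written is `2**31 - 1`. The string in the one-liner above has length `2**31`, one past that maximum, which is why the dump fails.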
I found another related bug report: #1560, but the Marshal problem was not addressed in it.

--
https://bugs.ruby-lang.org/

Unsubscribe: