[ruby-core:98584] [Ruby master Feature#16352] Modify Marshal to dump objects larger than 2 GiB
From:
merch-redmine@...
Date:
2020-05-29 22:25:19 UTC
List:
ruby-core #98584
Issue #16352 has been updated by jeremyevans0 (Jeremy Evans).
Backport deleted (2.5: UNKNOWN, 2.6: UNKNOWN)
ruby -v deleted (ruby 2.7.0dev (2019-11-12T12:03:22Z master 3816622fbe) [x86_64-linux])
Subject changed from Marshal limit of >= 2 GiB to Modify Marshal to dump objects larger than 2 GiB
Tracker changed from Bug to Feature
It's currently expected that Marshal cannot dump objects larger than 2 GiB, so this isn't a bug, though arguably RangeError would be more appropriate than TypeError if the data is too large. Supporting the dumping of larger objects does seem like a useful feature, but as @shyouhei mentioned, it requires a format change, which would break backwards compatibility, as a marshal dump from Ruby 3 would not be restorable on Ruby 2.7. It does seem like Ruby 3 would be a good time to implement such a format change if we want to support the marshaling of larger objects. We would probably want to keep supporting the old format so that a marshal dump from Ruby 2.7 will work in Ruby 3, and maybe consider working on a gem that you could install in older Ruby versions to support the new marshal format.
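For context on the compatibility concern: every dump begins with a two-byte format version, and Marshal.load checks it before reading anything else. The following is a minimal sketch of that behavior, not a proposal for the new format. It uses only `Marshal::MAJOR_VERSION` and `Marshal::MINOR_VERSION`; the `future` string is a hypothetical stand-in for a dump written with a newer major format version.

```ruby
# The first two bytes of any dump are the format's major and minor version.
data = Marshal.dump("hello")
major, minor = data.bytes.first(2)
puts "dump header:      #{major}.#{minor}"
puts "this Ruby writes: #{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}"

# Simulate a dump written with a hypothetical newer major format version
# by bumping the first header byte.
future = data.dup
future.setbyte(0, major + 1)

load_error = nil
begin
  Marshal.load(future)
rescue TypeError => e
  # A reader that only knows the older format refuses the newer one outright.
  load_error = e
end
puts load_error.message if load_error
```

This is why a changed format would not load on older Ruby versions without something like the gem suggested above, while a newer Ruby could still accept old dumps by continuing to recognize the old header.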
----------------------------------------
Feature #16352: Modify Marshal to dump objects larger than 2 GiB
https://bugs.ruby-lang.org/issues/16352#change-85898
* Author: seoanezonjic (Pedro Seoane)
* Status: Open
* Priority: Normal
----------------------------------------
Hi
While using numo-narray, a gem for matrix operations, I hit the following error when saving a large matrix:
in `dump': long too big to dump (TypeError)
Github thread: https://github.com/ruby-numo/numo-narray/issues/144
Digging into this with the authors, we found the following code, which reproduces the error:
```
ruby -e 'Marshal.dump(" "*2**31)'
```
Executed on:
ruby 2.7.0dev (2019-11-12T12:03:22Z master 3816622fbe) [x86_64-linux]
The Marshal library has a limit that is checked using the SIZEOF_LONG constant. The check is performed on lines 301 to 321 of the marshal.c file: https://github.com/ruby/ruby/blob/e7ea6e078fecb70fbc91b04878b69f696749afac/marshal.c#L301. I don't understand the motivation for this limit, and it has a great impact on libraries that need to serialize large objects such as numeric matrices. In this case, the 2 GiB limit is reached easily, and it blocks Ruby development in scientific projects such as the one cited. I found another related bug, #1560, but the Marshal problem itself was not addressed there.
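To make the limit concrete, here is a minimal Ruby sketch of the range test as I read it from the linked C code: a length is accepted only if it fits in a signed 32-bit value. Both `fits_marshal_long?` and the chunking helpers `dump_in_chunks` / `load_chunks` are hypothetical names written for this illustration and are not part of Marshal. The chunking idea assumes the payload is available as a binary String, and it has not been verified at multi-GiB scale.

```ruby
# Ruby rendering of the signed 32-bit range test (hypothetical helper).
def fits_marshal_long?(n)
  shifted = n >> 31            # arithmetic shift keeps the sign
  shifted == 0 || shifted == -1
end

# Possible workaround (an assumption, not an official API): split the bytes
# into slices whose lengths stay below the limit, then dump the array of
# slices so that no single length prefix exceeds 32 bits.
def dump_in_chunks(bytes, chunk_size = 2**30)
  chunks = []
  offset = 0
  while offset < bytes.bytesize
    chunks << bytes.byteslice(offset, chunk_size)
    offset += chunk_size
  end
  Marshal.dump(chunks)
end

# Inverse of dump_in_chunks: load the slices and join them back together.
def load_chunks(data)
  Marshal.load(data).join
end

puts fits_marshal_long?(2**31 - 1)   # prints true
puts fits_marshal_long?(2**31)       # prints false

# Small round trip with a tiny chunk size, to show the mechanics only.
payload = "abcdefghij".b
restored = load_chunks(dump_in_chunks(payload, 3))
puts restored == payload             # prints true
```

The string from the reproduction above has a length of exactly 2**31 bytes, which is the first value that fails this test.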
Thank you in advance
Pedro Seoane
--
https://bugs.ruby-lang.org/