From: "Eregon (Benoit Daloze) via ruby-core"
Date: 2023-04-16T13:36:29+00:00
Subject: [ruby-core:113267] [Ruby master Bug#4040] SystemStackError with Hash[*a] for Large _a_

Issue #4040 has been updated by Eregon (Benoit Daloze).

@jeremyevans0
> I rebased my branch against master, and then ran all of the app_* benchmarks, here are the results:

Are the +N% there improvements or regressions?
From those numbers it sounds like `+` would be regressions (i.e., more time to execute the same thing).

---

I am thinking a bit more about the implications of this for Ruby implementations and JITs.

Passing arguments only on the stack means a huge number of arguments cannot be passed (the case on TruffleRuby).

Passing them only as a heap array seems inefficient in general (it would cause extra allocations, at least in the interpreter, for `foo(1, 2)`).

I guess one could use 2 different calling conventions: on the stack if there is no rest parameter, on the heap if there is one.
But more calling conventions are a clear cost, as they cause extra checks for every call, even more so for polymorphic call sites (plus it's messy to do callee-specific logic in the caller).

If passing arguments both on the stack and in a heap array is supported, then the called method (the callee) will most likely need to branch to find out where to read arguments from.
It always seems an anti-pattern for the callee to have to deal with two calling conventions.
That may actually be easier to deal with in C because a `VALUE*` pointer can represent both; it would then be one check on method entry for which pointer and size to use.
In Java, if arguments are passed as an `Object[]` with hidden arguments at the start of the array, there is no way to share the logic with a Ruby Array from the heap, or it would need some offset for every argument access, which seems very expensive.
I suppose one could technically compile 2 variants of a method, one for on-stack and one for heap-array arguments, but that seems very expensive from a warmup and memory perspective, and it again means more calling conventions.

Also, when using array storage strategies, the array might be an `int[]` behind the scenes, and then passing it as a single argument instead of a splat is much, much faster.

Basically, I think efficient Ruby implementations and JITs might not want to deal with the complexity of on-heap arguments.
Such a usage pattern is intrinsically inefficient.
For example, `m(:name, *array)` is quite expensive if `array` is big; `m(:name, array)` is strictly better from a performance POV.
`m(*array)` can at best be as fast as `m(array)`, but can be much worse, e.g. if passed on the stack (and < 128 for your PR) or if `array` is an `int[]`.

Of course CRuby devs will decide what they want here.
The real issue is if CRuby accepts this:
* There is probably no hope to ever revert that decision and to remove those costs, because some code will likely start to depend on it.
* It might encourage Ruby users to abuse splats more since they seem not much slower than non-splat on CRuby.

----------------------------------------
Bug #4040: SystemStackError with Hash[*a] for Large _a_
https://bugs.ruby-lang.org/issues/4040#change-102829

* Author: runpaint (Run Paint Run Run)
* Status: Open
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 1.9.3dev (2010-11-09 trunk 29737) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
=begin
I've been hesitating over whether to file a ticket about this, so please feel free to close if I've made the wrong choice.

I often use Hash[*array.flatten] in IRB to convert arrays of arrays into hashes. Today I noticed that if the array is big enough, this would raise a SystemStackError. Puzzled, I looked deeper.
I assumed I was hitting the maximum number of arguments a method's argc can hold, but realised that the minimum size of the array needed to trigger this exception differed depending on whether I used IRB or not. So, presumably this is indeed exhausting the stack...

In IRB, the following is the minimal reproduction of this problem:

    Hash[*130648.times.map{ 1 }]; true

I haven't looked for the minimum value needed with `ruby -e`, but the following reproduces:

    ruby -e 'Hash[*1380888.times.map{ 1 }]'

I suppose this isn't technically a bug, but maybe it offers another argument for either #666 or an extension of #3131.
=end

--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/
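
As a side note on the `Hash[*array.flatten]` idiom from the report: the conversion can be written without a splat, so the number of arguments stays at one regardless of array size. The following is only a minimal sketch, not something proposed in the thread. The `pairs` data is hypothetical sample input, and `flatten(1)` is an assumption that the input is an array of two-element arrays (the report used a plain `flatten`).

```ruby
# Hypothetical sample data: an array of [key, value] pairs.
pairs = Array.new(200_000) { |i| [i, i.to_s] }

# Splat-free conversions: the pairs travel as a single Array argument,
# so the argument count does not grow with pairs.size.
h1 = pairs.to_h
h2 = Hash[pairs]

# A flat key/value list can also be regrouped without a splat.
flat = pairs.flatten(1)
h3 = flat.each_slice(2).to_h

puts h1.size                 # prints 200000
puts h1 == h2 && h2 == h3    # prints true
```

This matches the point made in the reply above that `m(array)` avoids the cost `m(*array)` can incur for large arrays.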