From: "byroot (Jean Boussier)"
Date: 2022-01-31T12:14:43+00:00
Subject: [ruby-core:107389] [Ruby master Feature#18559] Allocation tracing: Objects created by the parser are attributed to Kernel.require

Issue #18559 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #18559: Allocation tracing: Objects created by the parser are attributed to Kernel.require
https://bugs.ruby-lang.org/issues/18559

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
Marking this as a feature, because I think it should be improved but can hardly be considered a bug.

### Repro

Consider the following script:

```ruby
# /tmp/allocation-source.rb
require 'objspace'
require 'tmpdir'

source = File.join(Dir.tmpdir, "foo.rb")
File.write(source, <<~RUBY)
  # frozen_string_literal: true
  class Foo
    def plop
      "fizz"
    end
  end
RUBY

ObjectSpace.trace_object_allocations_start
GC.start
gen = GC.count

require(source)

ObjectSpace.dump_all(output: $stdout, since: gen)
```

### Expected behavior

I'd expect the `ObjectSpace.dump_all` output to attribute all new objects, including `T_IMEMO` etc., to `foo.rb`.

### Actual behavior

Instead, they are attributed to the source file that called `Kernel.require` (here, running with `--disable-gems`):

```
{"address":"0x11acaec78", "type":"CLASS", "class":"0x11acaebb0", "superclass":"0x10fa4a848", "name":"Foo", "references":["0x10fa4a848", "0x11acaea98", "0x11acaf790"], "file":"/var/folders/vy/srfpq1vn6hv5r6bzkvcw13y80000gn/T/foo.rb", "line":2, "generation":1, "memsize":544, "flags":{"wb_protected":true}}
{"address":"0x11acaeca0", "type":"IMEMO", "class":"0x8", "imemo_type":"cref", "references":["0x10fa4a848"], "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x11acaecc8", "type":"STRING", "class":"0x10fa42418", "frozen":true, "embedded":true, "fstring":true, "bytesize":4, "value":"fizz", "encoding":"UTF-8", "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x11acaecf0", "type":"ARRAY", "class":"0x10fa28f68", "frozen":true, "length":2, "embedded":true, "references":["0x11acaff88", "0x11acaf240"], "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x11acaed18", "type":"IMEMO", "imemo_type":"iseq", "references":["0x11acaecc8", "0x11acaf600", "0x11acaf600", "0x11acaecf0"], "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":416, "flags":{"wb_protected":true}}
{"address":"0x11acaf1a0", "type":"ARRAY", "class":"0x10fa28f68", "frozen":true, "length":2, "embedded":true, "references":["0x11acaff88", "0x11acaf240"], "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x11acaf1c8", "type":"IMEMO", "imemo_type":"iseq", "references":["0x11acaed18", "0x11acaf1f0", "0x11acaf1f0", "0x11acaf1a0", "0x11acaf290"], "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":456, "flags":{"wb_protected":true}}
{"address":"0x11acaf1f0", "type":"STRING", "class":"0x10fa42418", "frozen":true, "embedded":true, "fstring":true, "bytesize":11, "value":"", "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x11acaf218", "type":"ARRAY", "class":"0x10fa28f68", "frozen":true, "length":2, "embedded":true, "references":["0x11acaff88", "0x11acaf240"], "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":40, "flags":{"wb_protected":true}}
{"address":"0x11acaf240", "type":"STRING", "class":"0x10fa42418", "frozen":true, "fstring":true, "bytesize":63, "value":"/private/var/folders/vy/srfpq1vn6hv5r6bzkvcw13y80000gn/T/foo.rb", "encoding":"UTF-8", "file":"/tmp/allocation-source.rb", "line":19, "method":"require", "generation":1, "memsize":104, "flags":{"wb_protected":true}}
....
```

### Why is it a problem?

This behavior makes it impossible to properly analyze which parts of an application use the most memory.

For instance, when using `heap-profiler` on an app using `Bootsnap`, all objects created as a result of loading source files are attributed to bootsnap:

```
retained memory by gem
-----------------------------------
 351.64 MB  bootsnap-1.10.2
```

If this behaved as I expect, `heap-profiler` would be able to report how much each gem contributes to the app's RAM usage.

### Possible solution

I think `ObjectSpace` should have an API to override `get_trace_arg()` / `EC->trace_arg` in the context of allocation tracing, so that `Kernel.require` and `RubyVM::InstructionSequence.load_from_binary` could set it to the source file they're loading.

### Additional use cases?

A very similar issue arises with objects created by static data parsers such as `YAML`, `JSON`, etc. All the objects they create as part of parsing are attributed to them.

So it would be very useful if there was a Ruby API so that we could do something like this:

```ruby
module YAMLAllocationTracing
  def load_file(path, ...)
    # ObjectSpace.set_allocation_source is the API proposed here; it does not exist yet.
    ObjectSpace.set_allocation_source(file: path, line: 1, class_path: :YAML, method_id: :load_file) do
      super
    end
  end
end

YAML.singleton_class.prepend(YAMLAllocationTracing)
```

-- 
https://bugs.ruby-lang.org/

Unsubscribe: