[ruby-core:69622] [Ruby trunk - Bug #11264] Memory leak in JSON stdlib ext (JSON generation)

From: luke.gru@...
Date: 2015-06-16 23:33:10 UTC
List: ruby-core #69622
Issue #11264 has been updated by Luke Gruber.


Thanks for getting back to me.

That's true, but my guess is the most common leak scenario would be when calling a method that's basically an extension point to the library itself (such as `to_json`. At least if you leak the internals of the buffer, it won't be too many bytes per leak :) Otherwise you would have to wrap the whole 'cState_partial_generate' method in a rescue block.

As the library stands now, lots of methods that raise will result in a leak, not just `to_json`:

Object#respond_to?(:to_json)
Object#to_s
Array#[]
Hash#keys
Hash#[]
String#encode

But the chances of those methods raising errors are much lower, I think.

Should we continue this here or on https://github.com/flori/json/issues/251 ? I don't know what the right approach is
for std libs that are maintained separately as gems.

----------------------------------------
Bug #11264: Memory leak in JSON stdlib ext (JSON generation)
https://bugs.ruby-lang.org/issues/11264#change-52965

* Author: Luke Gruber
* Status: Third Party's Issue
* Priority: Normal
* Assignee: 
* ruby -v: 2.2-head
* Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN
----------------------------------------
Hi,

I'm not sure if this is a bug, or just undocumented behaviour, but here's a script to reproduce the memory leak:

~~~ruby
require 'json'

class MyClass
  def to_json(*)
    "a" * 1048576 # 1 megabytes of chars
  end
end

class MyOther
  def to_json(*)
    raise "OMG"
  end
end

1000.times do |i| # will leak up to ~ 4 gigs
  puts i
  JSON.dump([MyClass.new, MyClass.new, MyClass.new, MyOther.new]) rescue nil
end
~~~

What's happening is that the C extension is iterating over the array to eventually dump it out to JSON. It's going through the array in order, appending to the `fbuffer` as needed. The problem is that that the API extension point of adding a `to_json` method to a class (or object), without wrapping the code in some sort of 'begin...rescue , free(buffer), re-raise' block results in the buffer never being freed. Normally this isn't too bad, except if a lot of data was appended to the buffer before the error got raised.

To test it against normal behaviour in the above script, take out the offending `MyOther.new` in the array. It should run much more smoothly this way :)

Note that since the `fbuffer`s aren't GC marked (not that they should be), it isn't possible to trace this leak using `GC.stat`.

Once again, not sure if this is a bug or if we should never raise errors from custom `to_json` methods (ie: always wrap them in a begin... rescue block.

Thanks,

I also reported this to the JSON gem maintainer here: https://github.com/flori/json/issues/251



-- 
https://bugs.ruby-lang.org/

In This Thread

Prev Next