[#33640] [Ruby 1.9-Bug#4136][Open] Enumerable#reject should not inherit the receiver's instance variables — Hiro Asari <redmine@...>

Bug #4136: Enumerable#reject should not inherit the receiver's instance variables

10 messages 2010/12/08

[#33667] [Ruby 1.9-Bug#4149][Open] Documentation submission: syslog standard library — mathew murphy <redmine@...>

Bug #4149: Documentation submission: syslog standard library

11 messages 2010/12/10

[#33683] [feature:trunk] Enumerable#categorize — Tanaka Akira <akr@...>

Hi.

14 messages 2010/12/12
[#33684] Re: [feature:trunk] Enumerable#categorize — "Martin J. Dst" <duerst@...> 2010/12/12

[#33687] Towards a standardized AST for Ruby code — Magnus Holm <judofyr@...>

Hey folks,

23 messages 2010/12/12
[#33688] Re: Towards a standardized AST for Ruby code — Charles Oliver Nutter <headius@...> 2010/12/12

On Sun, Dec 12, 2010 at 9:55 AM, Magnus Holm <judofyr@gmail.com> wrote:

[#33689] Re: Towards a standardized AST for Ruby code — "Haase, Konstantin" <Konstantin.Haase@...> 2010/12/12

On Dec 12, 2010, at 17:46 , Charles Oliver Nutter wrote:

[#33763] [Ruby 1.9-Bug#4168][Open] WeakRef is unsafe to use in Ruby 1.9 — Brian Durand <redmine@...>

Bug #4168: WeakRef is unsafe to use in Ruby 1.9

43 messages 2010/12/17

[#33815] trunk warnflags build issue with curb 0.7.9? — Jon <jon.forums@...>

As this may turn out to be a 3rd party issue rather than a bug, I'd like some feedback.

11 messages 2010/12/22

[#33833] Ruby 1.9.2 is going to be released — "Yuki Sonoda (Yugui)" <yugui@...>

-----BEGIN PGP SIGNED MESSAGE-----

15 messages 2010/12/23

[#33846] [Ruby 1.9-Feature#4197][Open] Improvement of the benchmark library — Benoit Daloze <redmine@...>

Feature #4197: Improvement of the benchmark library

15 messages 2010/12/23

[#33910] [Ruby 1.9-Feature#4211][Open] Converting the Ruby and C API documentation to YARD syntax — Loren Segal <redmine@...>

Feature #4211: Converting the Ruby and C API documentation to YARD syntax

10 messages 2010/12/26

[#33923] [Ruby 1.9-Bug#4214][Open] Fiddle::WINDOWS == false on Windows — Jon Forums <redmine@...>

Bug #4214: Fiddle::WINDOWS == false on Windows

15 messages 2010/12/27

[ruby-core:33763] [Ruby 1.9-Bug#4168][Open] WeakRef is unsafe to use in Ruby 1.9

From: Brian Durand <redmine@...>
Date: 2010-12-17 16:57:58 UTC
List: ruby-core #33763
Bug #4168: WeakRef is unsafe to use in Ruby 1.9
http://redmine.ruby-lang.org/issues/show/4168

Author: Brian Durand
Status: Open, Priority: Normal
Category: lib, Target version: 1.9.x
ruby -v: ruby 1.9.2p0 (2010-08-18 revision 29036) [x86_64-darwin10.4.0]

I've found a couple issues with the implementation of WeakRef in Ruby 1.9.

1. WeakRef is unsafe to use because the ObjectSpace will recycle object ids for use with newly allocated objects after the old objects have been garbage collected. This problem was not present in 1.8, but with the 1.9 threading improvements there is no guarantee that the garbage collector thread has finished running the finalizers before new objects are allocated. This can result in a WeakRef returning a different object than the one it originally referenced.

2. In Ruby 1.9 a Mutex is used to synchronize access to the shared data structures. However, Mutexes are not reentrant and the finalizers are silently throwing deadlock errors if they run while new WeakRefs are being created.

I've included in my patch a reimplementation of WeakRef called WeakReference that does not extend Delegator. This code is based on the original WeakRef code but has been rewritten so the variable names are a little clearer. It replaces the Mutex with a Monitor and adds an additional check to make sure that the object returned by the reference is the same one it originally referred to. WeakRef is rewritten as a Delegator wrapper around a WeakReference for backward compatibility.

I went this route because in some Ruby implementations (like Ruby 1.8), Delegator is very heavy weight to instantiate and creating a collection of thousands of WeakRefs will be slow and use up far too much memory (in fact, it would be nice to either backport this patch to Ruby 1.8 or the delegate changes from 1.9 to 1.8 if possible). I also don't really see the value of having weak references be delegators since is very unsafe to do anything without wrapping the call in a "rescue RefError" block. The object can be reclaimed at any time so the only safe method to be sure you still have it is to create a strong reference to the object at which point you might as well use that reference instead of the delegated methods on the WeakRef. It also should be simpler for other implementations of Ruby like Jruby or Rubinius to map native weak references to a simpler interface.

Sample code with WeakReference

    orig = Object.new
    ref = WeakReference.new(orig)
    # ...
    obj = ref.object
    if obj
      # Do something
    end
    
I also have a version of the patch which just fixes WeakRef to work without introducing a new class, but I feel this version is the right way to go.

Also included are unit tests for weak references but the test that checks for the broken functionality is not 100% reliable since garbage collection is not deterministic. I've also included a script that shows the problem in more detail with the existing weak reference implementation.

    ruby show_bug.rb 100000     # Run 100,000 iterations on the current implementation of WeakRef and report any problems
    ruby -I. show_bug.rb 100000 # Run 100,000 iterations on the new implementation of WeakRef and report any problems


----------------------------------------
http://redmine.ruby-lang.org

Attachments (4)

weakref.rb (5.2 KB, text/x-ruby-script)
require 'monitor'

# WeakReference is a class to represent a reference to an object that is not seen by
# the tracing phase of the garbage collector.  This allows the referenced
# object to be garbage collected as if nothing is referring to it.
#
# The difference between this class and WeakRef is that this class does not
# use the delegator pattern and so has an interface more suited for detecting
# if the referenced object has been reclaimed.
#
# Usage:
#
#   foo = Object.new
#   ref = ActiveSupport::WeakReference.new(foo)
#   ref.alive?			# should be true
#   ref.object			# should be foo
#   ObjectSpace.garbage_collect
#   ref.alive?			# should be false
#   ref.object			# should be nil
class WeakReference
  attr_reader :referenced_object_id
  
  # Map of references to the object_id's they refer to.
  @@referenced_object_ids = {}

  # Map of object_ids to references to them.
  @@object_id_references = {}

  @@monitor = Monitor.new

  # Finalizer that cleans up weak references when an object is destroyed.
  @@object_finalizer = lambda do |object_id|
    @@monitor.synchronize do
      reference_ids = @@object_id_references[object_id]
      if reference_ids
    	  reference_ids.each do |reference_object_id|
    	    @@referenced_object_ids.delete(reference_object_id)
    	  end
    	  @@object_id_references.delete(object_id)
  	  end
    end
  end

  # Finalizer that cleans up weak references when references are destroyed.
  @@reference_finalizer = lambda do |object_id|
    @@monitor.synchronize do
      referenced_id = @@referenced_object_ids.delete(object_id)
      if referenced_id
        obj = ObjectSpace._id2ref(referenced_object_id) rescue nil
        if obj
          backreferences = obj.instance_variable_get(:@__weak_backreferences__) if obj.instance_variable_defined?(:@__weak_backreferences__)
          if backreferences
            backreferences.delete(object_id)
            obj.remove_instance_variable(:@__weak_backreferences__) if backreferences.empty?
          end
        end
        references = @@object_id_references[referenced_id]
        if references
          references.delete(object_id)
      	  @@object_id_references.delete(referenced_id) if references.empty?
    	  end
  	  end
    end
  end

  # Create a new weak reference to an object. The existence of the weak reference
  # will not prevent the garbage collector from reclaiming the referenced object.
  def initialize(obj)
    @referenced_object_id = obj.__id__
    ObjectSpace.define_finalizer(obj, @@object_finalizer)
    ObjectSpace.define_finalizer(self, @@reference_finalizer)
    @@monitor.synchronize do
      @@referenced_object_ids[self.__id__] = obj.__id__
      add_backreference(obj)
      references = @@object_id_references[obj.__id__]
      unless references
        references = []
        @@object_id_references[obj.__id__] = references
      end
      references.push(self.__id__)
    end
  end

  # Get the reference object. If the object has already been garbage collected,
  # then this method will return nil.
  def object
    obj = nil
    begin
      if referenced_object_id == @@referenced_object_ids[self.object_id]
        obj = ObjectSpace._id2ref(referenced_object_id)
        obj = nil unless verify_backreferences(obj)
      end
    rescue RangeError
      # Object has been garbage collected.
    end
    obj
  end

  private

    def add_backreference(obj)
      backreferences = obj.instance_variable_get(:@__weak_backreferences__) if obj.instance_variable_defined?(:@__weak_backreferences__)
      unless backreferences
        backreferences = []
        obj.instance_variable_set(:@__weak_backreferences__, backreferences)
      end
      backreferences << object_id
    end

    def verify_backreferences(obj)
      backreferences = obj.instance_variable_get(:@__weak_backreferences__) if obj.instance_variable_defined?(:@__weak_backreferences__)
      backreferences && backreferences.include?(object_id)
    end
end

require "delegate"

# WeakRef is a class to represent a reference to an object that is not seen by the tracing
# phase of the garbage collector. This allows the referenced object to be garbage collected
# as if nothing is referring to it. Because WeakRef delegates method calls to the referenced
# object, it may be used in place of that object, i.e. it is of the same duck type.
#
# If you don't need to use the delegator pattern, you can use WeakReference instead.
#
# Usage:
#   foo = Object.new
#   foo = Object.new
#   p foo.to_s			# original's class
#   foo = WeakRef.new(foo)
#   p foo.to_s			# should be same class
#   ObjectSpace.garbage_collect
#   p foo.to_s			# should raise exception (recycled)
class WeakRef < Delegator
  class RefError < StandardError
  end
  
  def initialize(obj)
    super
  end
  
  def __getobj__
    obj = @reference.object
    Kernel::raise(RefError, "Invalid Reference - probably recycled", Kernel::caller(1)) unless obj
    obj
  end
  
  def __setobj__(obj)
    @reference = WeakReference.new(obj)
  end
  
  def weakref_alive?
    !!@reference.object
  end
end

if __FILE__ == $0
  foo = Object.new
  p foo.to_s			# original's class
  foo = WeakRef.new(foo)
  p foo.to_s			# should be same class
  ObjectSpace.garbage_collect
  ObjectSpace.garbage_collect
  p foo.to_s			# should raise exception (recycled)
end
test_weakref.rb (2.39 KB, text/x-ruby-script)
require 'test/unit'
require 'weakref'

class TestWeakRef < Test::Unit::TestCase
  def test_can_get_non_garbage_collected_objects
    obj = Object.new
    ref = WeakRef.new(obj)
    assert_equal obj, ref.__getobj__
    assert_equal true, ref.weakref_alive?
  end

  def test_get_the_correct_object
    # Since we can't reliably control the garbage collector, this is a brute force test.
    # It might not always fail if the garbage collector and memory allocator don't
    # cooperate, but it should fail often enough on continuous integration to
    # hilite any problems.
    id_to_ref = {}
    50000.times do |i|
      obj = Object.new
      if id_to_ref.key?(obj.object_id)
        ref = id_to_ref[obj.object_id]
        refobj = ref.__getobj__ rescue nil
        if refobj
          flunk "weak reference found with a live reference to an object that was not the one it was created with"
          break
        end
      end
      %w(Here are a bunch of objects that are allocated and can then be cleaned up by the garbage collector)
      id_to_ref[obj.object_id] = WeakRef.new(obj)
      if i % 1000 == 0
        GC.start
        sleep(0.01)
      end
    end
  end
end

class TestWeakReference < Test::Unit::TestCase
  def test_can_get_non_garbage_collected_objects
    obj = Object.new
    ref = WeakReference.new(obj)
    assert_equal obj, ref.object
    assert_equal obj.object_id, ref.referenced_object_id
  end

  def test_get_the_correct_object
    # Since we can't reliably control the garbage collector, this is a brute force test.
    # It might not always fail if the garbage collector and memory allocator don't
    # cooperate, but it should fail often enough on continuous integration to
    # hilite any problems.
    id_to_ref = {}
    50000.times do |i|
      obj = Object.new
      if id_to_ref.key?(obj.object_id)
        ref = id_to_ref[obj.object_id]
        if ref.object
          flunk "weak reference found with a live reference to an object that was not the one it was created with"
          break
        end
      end
      %w(Here are a bunch of objects that are allocated and can then be cleaned up by the garbage collector)
      id_to_ref[obj.object_id] = WeakReference.new(obj)
      if i % 1000 == 0
        GC.start
        sleep(0.01)
      end
    end
  end

  def test_inspect
    ref = WeakReference.new(Object.new)
    assert ref.inspect
    GC.start
    GC.start
    assert ref.inspect
  end
end
show_bug.rb (1.71 KB, text/x-ruby-script)
require 'weakref'

def test_weak_references(klass, iterations)
  puts "Testing if object id's can be reused by #{klass}..."
  i = 0
  n = 0
  id_to_ref = {}
  failures = 0
  iterations = 100000 if iterations <= 0
  
  iterations.times do
    i += 1
    obj = Object.new

    if id_to_ref.key?(obj.object_id)
      n += 1
      ref = id_to_ref[obj.object_id]
      if ref.weakref_alive?
        STDOUT.write('x')
        STDOUT.flush
        failures += 1
      end
    end

    ref = klass.new(obj)
    id_to_ref[obj.object_id] = ref
    raise "#{klass} did not return the correct object!" unless (klass.name == "WeakReference" ? ref.object : ref.__getobj__) == obj
    if i % 5000 == 0
      STDOUT.write('.')
      STDOUT.flush
    end
  end
  STDOUT.write("\n")
  
  collected = n
  id_to_ref.values.each do |ref|
    collected += 1 if ref.weakref_alive?
  end
  puts "#{collected} instances out of #{iterations} (#{((collected.to_f / iterations) * 100).round}%) objects refererenced by #{klass} were reclaimed by the garbage collector"
  
  if failures == 0
    puts "SUCCESS: even with #{n} object id's being reused in #{iterations} iterations"
  else
    puts "FAILURE: #{failures} instances of object ids being incorrectly referenced in #{iterations} iterations"
  end
end

if $0 == __FILE__
  iterations = ARGV.first.to_i
  if defined?(WeakReference)
    # Compatibilty hack to make WeakReference duck type like a WeakRef for the test.
    module WeakRefCompatibility
      def __getobj__
        object
      end
      
      def weakref_alive?
        object != nil
      end
    end
    WeakReference.send(:include, WeakRefCompatibility)
    test_weak_references WeakReference, iterations
  end
  test_weak_references WeakRef, iterations
end
weakref_patch.diff (3.95 KB, text/x-diff)
13c13
< require 'thread'
---
> require 'monitor'
22,24c22,25
<   @@mutex = Mutex.new
<   @@final = lambda {|id|
<     @@mutex.synchronize {
---
>   @@monitor = Monitor.new
> 
>   @@object_finalizer = lambda do |id|
>     @@monitor.synchronize do
27,30c28,31
< 	for rid in rids
< 	  @@id_rev_map.delete(rid)
< 	end
< 	@@id_map.delete(id)
---
>         rids.each do |rid|
>           @@id_rev_map.delete(rid)
>         end
>         @@id_map.delete(id)
31a33,37
>     end
>   end
>   
>   @@reference_finalizer = lambda do |id|
>     @@monitor.synchronize do
34,36c40,50
< 	@@id_rev_map.delete(id)
< 	@@id_map[rid].delete(id)
< 	@@id_map.delete(rid) if @@id_map[rid].empty?
---
>         obj = ObjectSpace._id2ref(referenced_object_id) rescue nil
>         if obj
>           backreferences = obj.instance_variable_get(:@__weak_backreferences__) if obj.instance_variable_defined?(:@__weak_backreferences__)
>           if backreferences
>             backreferences.delete(object_id)
>             obj.remove_instance_variable(:@__weak_backreferences__) if backreferences.empty?
>           end
>         end
>         @@id_rev_map.delete(id)
>         @@id_map[rid].delete(id)
>         @@id_map.delete(rid) if @@id_map[rid].empty?
38,39c52,53
<     }
<   }
---
>     end
>   end
42,45c56,59
<     @__id = orig.object_id
<     ObjectSpace.define_finalizer orig, @@final
<     ObjectSpace.define_finalizer self, @@final
<     @@mutex.synchronize {
---
>     @__id = orig.__id__
>     ObjectSpace.define_finalizer orig, @@object_finalizer
>     ObjectSpace.define_finalizer self, @@reference_finalizer
>     @@monitor.synchronize do
47,49c61,64
<     }
<     @@id_map[@__id].push self.object_id
<     @@id_rev_map[self.object_id] = @__id
---
>       @@id_map[@__id].push self.__id__
>       @@id_rev_map[__id__] = @__id
>       __add_weakref_backreference__(orig)
>     end
54,56c69
<     unless @@id_rev_map[self.object_id] == @__id
<       Kernel::raise RefError, "Invalid Reference - probably recycled", Kernel::caller(2)
<     end
---
>     obj = nil
58c71,74
<       ObjectSpace._id2ref(@__id)
---
>       if @@id_rev_map[self.__id__] == @__id
>         obj = ObjectSpace._id2ref(@__id)
>         obj = nil unless __verify_weakref_backreference__(obj)
>       end
60c76
<       Kernel::raise RefError, "Invalid Reference - probably recycled", Kernel::caller(2)
---
>       # Object has been garbage collected.
61a78,79
>     Kernel::raise(RefError, "Invalid Reference - probably recycled", Kernel::caller(1)) unless obj
>     obj
62a81
>   
67c86,93
<     @@id_rev_map[self.object_id] == @__id
---
>     if @@id_rev_map[self.__id__] == @__id
>       obj = ObjectSpace._id2ref(@__id)
>       __verify_weakref_backreference__(obj)
>     else
>       false
>     end
>   rescue RangeError
>     false
68a95,116
>   
>   private
> 
>     # Add a backreference to the weakref object id in the referenced object to protect
>     # against the object id being recycled before the finalizers have a chance to run.
>     def __add_weakref_backreference__(obj)
>       backreferences = obj.instance_variable_get(:@__weak_backreferences__) if obj.instance_variable_defined?(:@__weak_backreferences__)
>       unless backreferences
>         backreferences = []
>         obj.instance_variable_set(:@__weak_backreferences__, backreferences)
>       end
>       backreferences << object_id
>     end
> 
>     # Verify that the object is the one the weakref thinks it is. On runtimes where
>     # object ids are recycled after garbage collection and reallocated to new objects,
>     # there is a chance that the reference could find an object at the specified
>     # location in the ObjectSpace that is not the one it originally referenced.
>     def __verify_weakref_backreference__(obj)
>       backreferences = obj.instance_variable_get(:@__weak_backreferences__) if obj.instance_variable_defined?(:@__weak_backreferences__)
>       backreferences && backreferences.include?(self.__id__)
>     end
72c120
< #  require 'thread'
---
>   #  require 'thread'

In This Thread

Prev Next