[ruby-core:31895] Re: Garbage Collection Question (Fixed: Plaintext)

From: Asher <asher@...>
Date: 2010-08-27 15:28:55 UTC
List: ruby-core #31895
Sorry for all the junk in the previous message, as well as for this duplication. Not sure how the previous message became non-plaintext. This is the same email, hopefully readable.

Asher

Previous message:

I very much appreciate the response, and this is helpful in describing the narrative, but it's still a few steps behind my question - but it may very well have clarified some points that help us get there. 

Let's stick with the example: a local variable is set as a reference to an object, and the local variable is then set to nil so there is no longer a live reference to the object. No other Ruby-level operations have taken place, so unless Ruby is keeping junk behind the scenes, there should be no references - not on the Ruby stack, not on the C stack. How does this object get collected? As shown by the example, it is missed during the next attempt to GC (as well as any repeated attempts at this point).

So what will change to make that object collectable? Are you suggesting that because it is in Ruby's root node that it gets treated such that it won't be GC'd until the program terminates? 

Let's assume this is the case. I should therefore be able to write a script that creates a non-root object as a child to another object inside method scope, allow that method to go out of scope, and expect that the object will be GC'd (as it is neither a root node nor does it have any live references). 

This seems to be validated by the following Ruby code:

> require 'pp'
> 
> class Hash::Weak < Hash
>   
>   def []( key )
> 
>     # get the stored ID - a Fixnum, not an object reference to our weak-referenced object
>     obj_id = super( key.to_sym )
> 
>     # theoretically this should cause non-referenced objects to get cleaned up 
>     # so long as nothing looks like a pointer or reference to it
>     ObjectSpace.garbage_collect
> 
>     # now get our object from ID
>     # if it had no references it should have been GC'd and we should get an 
>     # rb_eRangeError "is not id value" (expected) or "is recycled object" (possible)
>     obj = ObjectSpace._id2ref( obj_id )
> 
>     return obj
> 
>   end
>   
>   def []=( key, object )
> 
>     # A Fixnum is an immediate value whose ID is fixed by its value, so it is never
>     # garbage collected and holds no reference to object; storing object.__id__
>     # therefore cannot prevent garbage collection of the object
>     super( key.to_sym, object.__id__ )
> 
>   end
>   
> end
> 
> ##################################################
> 
> # non-rootnode demo
> 
> $weak_hash = Hash::Weak.new
> 
> class TestClass
>   def test_method
>     child_test_object = Object.new
>     puts 'storing test object'
>     $weak_hash[ :key ] = child_test_object
>     puts 'hash now contains object id: ' + $weak_hash.pretty_inspect    
>   end
> end
> test_object = TestClass.new
> 
> test_object.test_method
> puts 'id in hash should no longer be valid, as it is out of scope: '
> invalid_key = $weak_hash[ :key ]
> pp invalid_key

Output: 

> storing test object
> hash now contains object id: {:key=>2160173880}
> id in hash should no longer be valid, as it is out of scope:
> /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:21:in `_id2ref': 0x00000080c1a338 is recycled object (RangeError)	
> from /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:21:in `[]'
> from /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:56:in `<main>'

So that works as expected (so long as GC is run manually; if not run, object is obviously still valid). 
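For comparison, a similar weak-valued lookup can be sketched with the standard library's `weakref` instead of `ObjectSpace._id2ref`. This is not the code from this message: `WeakValueHash` is a hypothetical name, and when (or whether) an unreferenced object is actually collected is still up to the GC, so the sketch only demonstrates the case where a strong reference is held.

```ruby
require 'weakref'

# Hypothetical helper (not part of the original script): stores WeakRef
# wrappers so the hash itself never keeps the stored objects alive.
class WeakValueHash
  def initialize
    @refs = {}
  end

  def []=(key, object)
    @refs[key] = WeakRef.new(object)
  end

  # Returns the object while it is alive, or nil once it has been collected.
  def [](key)
    ref = @refs[key]
    return nil unless ref
    ref.__getobj__
  rescue WeakRef::RefError
    # The referent is gone; drop the stale entry.
    @refs.delete(key)
    nil
  end
end

strong = Object.new
cache = WeakValueHash.new
cache[:key] = strong

# While `strong` still refers to the object, the lookup returns that same object.
puts cache[:key].equal?(strong)
```

The lookup reports a collected object as `nil` rather than raising, which avoids depending on the exact `RangeError` message seen above.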

So let's try the same thing in this example that we did with the root node example:

> <weak hash code identical to above, so omitted>
> 
> # non-rootnode demo with nil-setting
> 
> $weak_hash = Hash::Weak.new
> 
> class TestClass
>   def test_method
>     child_test_object = Object.new
>     puts 'storing test object'
>     $weak_hash[ :key ] = child_test_object
>     puts 'hash now contains object id: ' + $weak_hash.pretty_inspect    
> 
>     puts 'setting variable referring to test object (ID: ' + child_test_object.__id__.to_s + ') to nil'
>     child_test_object = nil
>     puts 'ID for variable referring to test object is now: ' + child_test_object.__id__.to_s
> 
>     print 'getting test object (should fail with rb_eRangeError): '
>     invalid_key = $weak_hash[ :key ]
>     pp invalid_key
>   end
> end
> test_object = TestClass.new
> 
> test_object.test_method
> puts 'id in hash should no longer be valid, as it is out of scope: '
> invalid_key = $weak_hash[ :key ]
> pp invalid_key

Output: 

> storing test object
> hash now contains object id: {:key=>2160328280}
> setting variable referring to test object (ID: 2160328280) to nil
> ID for variable referring to test object is now: 4
> getting test object (should fail with rb_eRangeError):
> /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:21:in `_id2ref': 0x00000080c3fe58 is recycled object (RangeError)
> from /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:21:in `[]'
> from /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:56:in `test_method'
> from /Users/ahaig/Projects/rp/ruby/weakhash/projects/RPWeakHash/weakhash.rb:62:in `<main>'

So that also works as expected. 

It seems the only time that it does not work as expected, then, is when the object was instantiated with a reference in the root node. 

Of course, that is one of the most likely places for people to instantiate a reference. So can anything be done about this? Are we simply doomed to wait until program termination for any objects allocated in the root node to disappear? 

I intend to look into the patch suggested by brabuhr@gmail.com (https://sites.google.com/site/brentsrubypatches/), which (so far as this issue is concerned) appears to amount to:

> VALUE *rb_gc_stack_end = (VALUE *)STACK_GROW_DIRECTION;
> #define rb_gc_wipe_stack() {      \
>   VALUE *sp = alloca(0);          \
>   VALUE *end = rb_gc_stack_end;   \
>   rb_gc_stack_end = sp;           \
>   __stack_zero(end, sp);          \
> }

And some other basic support. I will follow up on that once I have some time to experiment (particularly since the patch is intended for 1.8.7, not 1.9.2). Any particular thoughts on this approach? Presumably there is some reason it has not been patched to do so?
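To make the idea behind the wipe concrete, here is a toy model in Ruby. It is only a sketch under stated assumptions: an array stands in for the C stack, integers stand in for object addresses, and `ToyStack`, `run`, and `OBJECT_ADDRESS` are made-up names. None of it is MRI code or the patch itself.

```ruby
# Toy model: an Array plays the C stack, integers play object addresses.
class ToyStack
  def initialize(size)
    @slots = Array.new(size, 0)
    @sp = 0          # index of the next free slot
    @high_water = 0  # deepest point reached since the last wipe
  end

  # Enter a frame: reserve slots without clearing them, as C does for locals.
  def enter(slot_count)
    base = @sp
    @sp += slot_count
    @high_water = [@high_water, @sp].max
    base
  end

  # Leave a frame: only the stack pointer moves; old contents stay in memory.
  def leave(slot_count)
    @sp -= slot_count
  end

  def write(index, value)
    @slots[index] = value
  end

  # Zero the region between the current top and the deepest point reached.
  def wipe
    (@sp...@high_water).each { |i| @slots[i] = 0 }
    @high_water = @sp
  end

  # Conservative scan of the live region: any non-zero word may be a pointer.
  def scan
    @slots[0...@sp].reject(&:zero?)
  end
end

OBJECT_ADDRESS = 0xBEEF0 # made-up address for illustration

def run(wipe_after_return)
  stack = ToyStack.new(8)
  base = stack.enter(2)             # frame that holds the only reference
  stack.write(base, OBJECT_ADDRESS)
  stack.leave(2)                    # the reference is "gone", but the word remains
  stack.wipe if wipe_after_return
  stack.enter(3)                    # the collector's own frame, locals uninitialized
  stack.scan
end

without_wipe = run(false)
with_wipe = run(true)
puts "without wipe: #{without_wipe.inspect}"
puts "with wipe:    #{with_wipe.inspect}"
```

In this model the stale word is found by the scan only when the dead region was not zeroed before the next frame reused it, which is the effect the wipe is meant to remove.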

In any case, any thoughts on any of this will be much appreciated.

Asher

On Aug 26, 2010, at 10:43 PM, Kurt Stephens wrote:

> On 8/26/10 11:51 AM, Asher wrote:
>> Right - so how does a pointer ever get off the stack?
>> 
> When a C function returns, the C stack pointer register (usually called "SP") is reset to the frame pointer (sometimes this register is called "FP").  The FP points to the current function arguments.  The area between the SP and the FP +- the space for arguments (and the other machine registers) represent the local variables, temporaries and arguments of the current function call (sometimes called an "activation record").
> 
> Load any C program under a debugger and you can see the assembly code.
> 
> The MRI GC knows where the "top" (SP) and the bottom of the stack are because of mostly portable conventions on how C compilers generate code that manipulates SP and FP and how the operating system lays out the process's memory.  The stack, the machine registers and some global variables are part of what is sometimes called the "root set".
> 
> The MRI GC scans the root set for values that "look like they point to Ruby objects" and "marks" those objects recursively as "in use".  Any unmarked objects ("not in use") are definitely not referenced by anything else and can be deallocated ("swept").  The GC must "stop-the-world" while it does this "marking" and "sweeping" -- nothing else can happen till this finishes.   If the GC couldn't sweep anything, it allocates more memory from the OS by calling malloc(), which calls something at a much lower level (sbrk() or mmap() or something else).
> 
>> For instance, in my example, where the variable with reference to the object has been assigned nil - the same thing occurs if the variable goes out of scope.
>> 
>> So in both of those cases, the object "should" be garbage collected; I understand that it's possible, due to conservative GC, that it might mistake a number on the stack (a long), etc. as a valid pointer, but generally when GC runs it should decide that the var (which has no valid ruby references) is no longer live and should be GC'd. Or am I missing something?
>> 
>> So we have a var with no references in Ruby that is being marked as live by the GC because the pointer has not yet been deallocated. So how does it ever get deallocated in order to not be marked as live?
>> 
>> If what I am seeing is the case (and I assume it cannot be and that I am missing something) then the object would never be garbage collected.
>> 
>> So how does GC actually occur?
> 
> Collection occurs in MRI when a new object is needed and there are no unused objects left around and/or there was a certain number of allocations since the last GC.
> 
>> What causes the pointer to be deallocated?
>> 
> "Pointers" are never allocated or deallocated as in malloc()/free(). Only objects that have no references to them are deallocated.
> 
> The C compiler generates code that simply increments or decrements the SP or changes the FP -- Stacks are LIFOs.
> 
> The MRI GC is a very simple "stop-the-world", "mark-and-sweep" "conservative" collector.  Conservative meaning "treat anything that looks like a pointer to an object as a pointer to an object".  This can cause conservative collectors to keep some objects around longer than they should.  This is also because most C compilers leave garbage (old pointers) on the stack.

<snip>
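The mark-and-sweep description quoted above can be sketched as a toy simulation. This is an illustration only, not MRI's collector: the heap is a plain Hash from made-up integer ids to the ids each object references, the root set is a list of raw words, and `mark` and `sweep` are hypothetical helper names.

```ruby
require 'set'

# Toy heap: each key is an object id, each value lists the ids it references.
heap = {
  1 => [2],
  2 => [],
  3 => [4],
  4 => [3] # 3 and 4 reference each other, but nothing else reaches them
}

# Mark phase: treat every root word that matches a heap id as a reference
# (the conservative part) and follow references recursively.
def mark(heap, root_words)
  marked = Set.new
  pending = root_words.select { |word| heap.key?(word) }
  until pending.empty?
    id = pending.pop
    next unless marked.add?(id) # add? returns nil if id was already marked
    pending.concat(heap.fetch(id))
  end
  marked
end

# Sweep phase: keep only marked objects; everything else is freed.
def sweep(heap, marked)
  heap.select { |id, _| marked.include?(id) }
end

# 99 does not match any object, so it is ignored.
clean_roots = [1, 99]
# A stale word that happens to equal id 3 keeps 3 and 4 alive.
stale_roots = [1, 99, 3]

survivors_clean = sweep(heap, mark(heap, clean_roots)).keys
survivors_stale = sweep(heap, mark(heap, stale_roots)).keys
puts "clean roots keep: #{survivors_clean.inspect}"
puts "stale roots keep: #{survivors_stale.inspect}"
```

The second run shows why a leftover word in the root set is enough to keep an otherwise unreachable object, and everything it references, from being swept.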

