[#31843] Garbage Collection Question — Asher <asher@...>

This question is no doubt a function of my own lack of understanding, but I think that asking it will at least help some other folks see what's going on with the internals during garbage collection.

17 messages 2010/08/25
[#31861] Re: Garbage Collection Question — Roger Pack <rogerdpack2@...> 2010/08/26

> The question in short: when an object goes out of scope and has no

[#31862] Re: Garbage Collection Question — Asher <asher@...> 2010/08/26

Right - so how does a pointer ever get off the stack?

[#31873] Re: Garbage Collection Question — Kurt Stephens <ks@...> 2010/08/27

On 8/26/10 11:51 AM, Asher wrote:

[#31894] Re: Garbage Collection Question — Asher <asher@...> 2010/08/27

I very much appreciate the response, and this is helpful in describing the narrative, but it's still a few steps behind my question - but it may very well have clarified some points that help us get there.

[#31896] Re: Garbage Collection Question — Evan Phoenix <evan@...> 2010/08/27

You have introduced something called a "root node" without defining it. What do you mean by this?

[ruby-core:31902] Re: Garbage Collection Question

From: Evan Phoenix <evan@...>
Date: 2010-08-27 18:05:34 UTC
List: ruby-core #31902
My knowledge about the insides of 1.9 is less strong than 1.8, so I'm not fully versed in how 1.9 now stores locals.

Anyway, one issue with your testing methodology is that you don't define when the GC will happen. If you do some work in a method and return from it, then even though there are no references to an object, _id2ref will still be able to return it if the GC hasn't run yet. If you demand that returning from a local scope cause the object to be treated as garbage, then you can't blindly use _id2ref. Ruby, and just about all GC languages, don't work that way.

Your examples do not include calling GC.start to force a GC, so I wonder if this is the source of your problem. Remember that by default, the GC runs whenever it wants, so you can't depend on it to run at certain times.

 - Evan
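
A minimal sketch of the timing point above, assuming a Ruby with the standard `weakref` library. It uses `WeakRef` rather than `_id2ref` to observe the object without keeping it alive; `liveness` and `observe_lifetime` are hypothetical helper names, not anything from the thread. Only the first observation is guaranteed. The other two depend on when the collector runs and on what the conservative stack scan happens to find.

```ruby
require 'weakref'

# Hypothetical helper: report whether the object behind a WeakRef is still there.
def liveness(ref)
  ref.weakref_alive? ? :alive : :collected
end

# Hypothetical helper: record what can be observed at three points in time.
def observe_lifetime
  strong = Object.new
  ref = WeakRef.new(strong)
  result = {}

  # A strong reference exists, so the object must survive a forced GC.
  GC.start
  result[:referenced] = liveness(ref)

  # Dropping the last reference collects nothing by itself.
  # Usually still :alive here, simply because no GC has run yet.
  strong = nil
  result[:unreferenced_no_gc] = liveness(ref)

  # Even a forced GC gives no guarantee: a stale copy of the pointer
  # left on the C stack can keep the object alive.
  GC.start
  result[:unreferenced_after_gc] = liveness(ref)

  result
end

p observe_lifetime
```

The second and third values may differ between runs, Ruby versions, and platforms, which is why a test that relies on them is fragile.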

On Aug 27, 2010, at 9:33 AM, Asher wrote:

> 
> On Aug 27, 2010, at 12:09 PM, Evan Phoenix wrote:
> 
>> You have introduced something called a "root node" without defining it. What do you mean by this?
> 
> The first node that runs when you run a script (i.e. the call to ruby_run_node), which also defines the set of root references. 
> 
>> I'm assuming here you mean that in your case, if you allocate the object in the script body, then set the local to nil, you can observe that the object appears to not be collected.
> 
> What you can see with my examples, though, is that it does happen with _all_ objects allocated on the root node.
> 
>> As has been stated in the thread already, this is an artifact of the conservative GC. Even though you have set the local to nil, a reference to the object may still remain on the C stack. That reference can't be seen by ruby code because it is in stack memory that gcc setup and didn't clear when the value wasn't needed anymore.
> 
> Right - I understand this conceptually. I want to know "where" on the C stack this "might" remain. It shouldn't be an obtuse question - Ruby is allocating each and every object, and I'm not using any C pointers for the particular example, so there is nothing else in my C stack (in this case, "I" don't have a C stack, only Ruby does). 
> 
> So Ruby is holding a reference somewhere in its stack, possibly because of 
> 
>> This is unfortunate, but not the end of the world.
> 
> In my particular use case (not the example), it is the end of the world and requires re-designing the entire way I'm handling T_DATA, such that I pass back a new T_DATA every time an existing underlying C object is requested. I want to store the first T_DATA created for this object in a weak hash and pass it back as requested - allowing it to be collected as appropriate. This seems to work in all contexts but the root node, where the result is that one expects to get a GC'd object (which can thus be caught and returned as nil) but ends up with a valid obj (which shouldn't be valid). 
> 
> The result is that one can ask for an object that doesn't exist, and instead of being told that it doesn't exist get back an old object that wasn't what one wanted (one wanted to know that it did not exist in this context, not get whatever random last object was created in the slot).
> 
> This example also, I believe, makes it evident that "root node" is not necessarily the actual root but can also be any root relative to execution context. In other words, a variable _will not_ be GC'd until one has left the frame in which it was defined, even if all references are set nil. 
> 
> Example: 
> 
>  it "can be created with a name string and home directory string" do
>    @environment = RPDB::Environment.new( $environment_name.to_s, $environment_path )
>    @environment.should_not == nil
>    @environment.is_a?( RPDB::Environment ).should == true
>    @environment.directory.should == $environment_path
>  end
> 
>  it "can be created with a name symbol" do
>    environment = RPDB.environment_with_name( $environment_name )
>    environment.should == nil
>    @environment = RPDB::Environment.new( $environment_name )
>    @environment.should_not == nil
>    @environment.is_a?( RPDB::Environment ).should == true
>    @environment.directory.should == './'
>  end
> 
> The last line of the second example does not end up with the default path ('./') because an existing reference is found when it should not be.
> 
> It seems, thus, that writing a weak hash is impossible given the current state of GC. This seems rather problematic. 
> 
>> It doesn't happen with every object allocated in a script body, only sometimes.
> 
> No, it happens _every_ time. See examples. 
> 
>> The patch set you were pointed to goes to lengths to clear the stack space as much as it can so that there are none of these phantom references to confuse the GC. It does this by breaking up the main eval function into smaller functions (allowing stack space to be allocated and deallocated within the eval itself) and forcibly clearing the stack with memset.
> 
> Right... and I was trying to see where that would be appropriately integrated into 1.9.2, but my attempts have not been successful. I believe this indicates that it is not the problem in question here: the problem has to do with clearing the present stack frame, rather than clearing stack frames that have already been left. 
> 
> In other words, the patch clears old stack frames, but the problem here is that we have data remaining in the present stack frame that is not expected to still exist. 
> 
> This is obviously a function of the GC's conservative nature, but I am trying to figure out what my best option is for circumventing the unexpected behavior. 
> 
> Additionally, On Aug 27, 2010, at 12:13 PM, Roger Pack wrote:
> 
>> Unfortunately you'll have to assume that there is still some "bad ref"
>> around to it.
>> One trick is to try and nest whatever you "violently" need to be
>> collected deep in some sub routine, then call GC.start *after*
>> recursing back up from that sub routine.
> 
> 
> It does seem to be the answer that things are leaning toward, but I want to at least understand at a lower level precisely what is occurring to prevent this specific collection. It seems (based on my description of when it occurs) to be systemic rather than sporadic, so it should be possible to at least narrow it down to a specific place in code where a reference is being left, even if it is not so easy to adapt that code to do otherwise. 
> 
> Best,
> Asher
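
A rough sketch of the approach discussed above, combining the weak cache with Roger Pack's suggestion to allocate deep in a subroutine and call `GC.start` only after returning from it. `Handle`, `WeakCache`, and `allocate_nested` are hypothetical names standing in for the T_DATA wrapper and its cache, and the recursion depth of 20 is arbitrary. The sketch assumes the standard `weakref` library and does not guarantee collection. It only makes a stale stack reference less likely, and the lookup tolerates both outcomes.

```ruby
require 'weakref'

# Hypothetical stand-in for the wrapper object (the T_DATA in the thread).
class Handle
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Hypothetical weak cache: stores WeakRefs, never the objects themselves.
class WeakCache
  def initialize
    @refs = {}
  end

  def store(name, object)
    @refs[name] = WeakRef.new(object)
    object
  end

  # Returns the live object, or nil if it was collected or never stored.
  def fetch(name)
    ref = @refs[name]
    return nil unless ref && ref.weakref_alive?
    ref.__getobj__
  rescue WeakRef::RefError
    # Collected between the liveness check and the dereference.
    @refs.delete(name)
    nil
  end
end

# Allocate several frames down, so temporaries live in frames that are
# popped before the GC is asked to run. Never returns the object.
def allocate_nested(cache, name, depth = 20)
  if depth.zero?
    cache.store(name, Handle.new(name))
  else
    allocate_nested(cache, name, depth - 1)
  end
  nil
end

cache = WeakCache.new
allocate_nested(cache, "env")
GC.start # only after unwinding back out of the nested calls

handle = cache.fetch("env")
puts handle.nil? ? "collected" : "still alive: #{handle.name}"
```

Because either result is possible, callers of such a cache have to treat a returned object as valid for its key and a `nil` as "create a new one", rather than assume the old wrapper is gone.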