Re: Unrecovered memory leak thoughts.

From: "Roger Pack" <rogerpack2005@...>
Date: 2007-11-08 18:59:51 UTC
List: ruby-core #13309
> >> First, it's pretty inefficient in that it imposes an overhead on every
> >> variable assignment to adjust reference counts.

Not an extremely big (code-wide) amount of overhead: the equivalent of
ptr->as.whatever.count++
on each assignment.  That's probably not too expensive.  It does dirty
the memory, but since the object is in cache anyway, the write expense
will probably be low (compared to the benefit of not marking the entire
heap as the GC does).
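
To make the bookkeeping concrete, here is a toy model in Ruby of what "adjust reference counts on every assignment" means.  It is only a sketch: Cell and Slot are made-up names, and in a real implementation the counter would live in the C object header in gc.c rather than in a Ruby object.

```ruby
# Toy model only: Cell and Slot are hypothetical, not part of Ruby.
class Cell
  attr_reader :name, :count

  def initialize(name)
    @name = name
    @count = 0
  end

  def inc
    @count += 1
  end

  # Returns true when the count drops to zero ("could be freed right now").
  def dec
    @count -= 1
    @count.zero?
  end
end

# A Slot stands in for a variable; each assignment touches two counters.
class Slot
  attr_reader :value

  def assign(cell)
    cell.inc if cell                     # new referent gains a reference
    freed = @value ? @value.dec : false  # old referent loses one
    @value = cell
    freed
  end
end

a = Cell.new("a")
b = Cell.new("b")
slot = Slot.new
slot.assign(a)
freed = slot.assign(b)
puts "a freed: #{freed}, a=#{a.count}, b=#{b.count}"  # a freed: true, a=0, b=1
```

Incrementing the new referent before decrementing the old one keeps a self-assignment (assigning the same object again) from dropping the count to zero by accident.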


> >> Second, it can easily
> >> leave permanently noncollectable constellations of  unreachable
> >> objects which form reference cycles.

Looks like to avoid those you need to require 'container objects'
(those that can contain references to others) to provide a method that
returns a list of objects they point to (so you can traverse them and
stomp out those cycles).

This means that the Ruby C containers would need to provide those, and
also (as you noted), extensions that create objects that reference
others (in their C code) would need to provide the same.  Basically
any extension that could reference itself somehow would need to
provide them.  One problem is that classes can reference
themselves, too, in Ruby, so we'd need to think about it for a while.

We would also need to create traverse methods for ruby's container
classes (can an array reference itself?  I think it can).  I didn't
say it would be easy :)
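
A rough sketch of how such a traverse method could be used to find cycles, again as a toy in Ruby.  Node, point_to, children and cyclic_garbage are invented names for illustration, and the approach (subtract the references that come from inside the tracked set, then keep whatever is still referenced from outside) is one possible technique, not something taken from gc.c.

```ruby
# Toy objects with a hand-maintained count and a "traverse" hook.
class Node
  attr_reader :name, :refs
  attr_accessor :count

  def initialize(name)
    @name = name
    @refs = []
    @count = 0
  end

  def point_to(other)
    @refs << other
    other.count += 1
  end

  # The traverse method: every object this container references.
  def children
    @refs
  end
end

# Returns tracked nodes that are kept alive only by other tracked nodes.
def cyclic_garbage(nodes)
  external = {}
  nodes.each { |n| external[n] = n.count }
  # Subtract references that originate inside the tracked set.
  nodes.each do |n|
    n.children.each { |c| external[c] -= 1 if external.key?(c) }
  end
  # Anything still positive is referenced from outside, so it is live,
  # and so is everything reachable from it.
  live = nodes.select { |n| external[n] > 0 }
  work = live.dup
  until work.empty?
    work.pop.children.each do |c|
      next if live.include?(c) || !external.key?(c)
      live << c
      work << c
    end
  end
  nodes - live
end

a = Node.new("a")
b = Node.new("b")
c = Node.new("c")
a.point_to(b)
b.point_to(a)   # a <-> b form a cycle with no outside references
c.count += 1    # pretend a local variable still holds c
puts cyclic_garbage([a, b, c]).map(&:name).inspect  # ["a", "b"]

arr = []
arr << arr      # a real Ruby array can contain itself
puts arr[0].equal?(arr)  # true
```

The last two lines answer the array question directly: an array can hold a reference to itself, so Ruby's own containers would need the traverse treatment too.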

> > I question how significant reference counting would be compared to
> > everything else going on in the interpreter.  Python uses reference
> > counting and seems to have decent performance compared to Ruby.
> >
> > The bigger issue is that extension writers would have to rewrite their
> > extensions to increment the reference count when they do an assignment.

Yeah--if they do assignment.  Most assignment calls are probably made
by Ruby, so tweaking that to increment the reference count would probably do
most of the dirty work.  I could be wrong, being familiar only with
gc.c, and not eval.c (and not with extensions).  My guess is that most
extensions just use their own object types, which aren't theoretically
self referencing, so they wouldn't have to change too much.  Could be
wrong.

Normal assignments call (ruby's) = (you can't override it) so I think
that modifying that would kill most assignment probs.  Maybe :)

Extensions themselves for their own personal objects usually use
ruby_xmalloc or whatever it is, which isn't tracked by the GC, and
isn't a Ruby object, and extensions already provide their own
finalizers, but yeah, they may need to change.  Their finalizers would
need to call 'dec' on contained (Ruby) objects.  I'm not sure what the
implication is, and to what extent real code changes would be
necessary.


> So my challenge to the Ruby community is to come up with test suites for
> the garbage collector. Maybe I'm asking on the wrong list, though --
> should I be asking for this on the Rails list?
>
> <ducking>

I know that when I run rails the GC collects like 3x per page request,
which is...not good (at least in development mode)
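
For anyone who wants to check a number like that, here is a small sketch for counting collections around a piece of code.  It assumes a Ruby that has GC.count (1.9 and later; 1.8.6 does not have it), and gc_runs_during is a made-up helper name.

```ruby
# Assumes GC.count exists (Ruby 1.9+); gc_runs_during is a hypothetical helper.
def gc_runs_during
  before = GC.count
  yield
  GC.count - before
end

runs = gc_runs_during do
  200_000.times { "x" * 100 }  # churn through short-lived strings
end
puts "GC ran #{runs} times"
```

Wrapping a single request handler in a block like this gives a per-request count, though the result depends heavily on the Ruby version and its GC settings.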

<swinging> :)


> P.S.: I hear implementing reference counting is so non-trivial that it
> would literally have to have a *compelling* performance advantage to be
> worth the effort. Again, someone needs to come up with the test suites.

Ruby's GC fires 'every 8MB of alloc'ed Ruby objects' (current 1.8.6
SVN version), so say you have a prog that uses...500MB of memory, on
purpose, and then it's going to fluctuate, stably, between 500 and
600MB.
That means that every 8MB of alloc'ed memory, it is going to traverse
all ~500MB of heap space, marking and traversing happily, then
sweeping the entire thing, most of which is not freed.
I've heard complaints of people who use large amounts of memory (>1GB)
that the GC takes up to 50 seconds to complete.  That makes Ruby
unsuitable for large memory real-time apps.  Also note that for large
apps with a large heap, collecting will be pulling every page from
virtual memory, accessing it, and marking it dirty.  So the current GC
basically requires the entire heap to fit into memory (regardless of
whether it's actually being used).  If you run out of RAM that would
really hurt you, as every collection has to thrash.  It
also resets the L2 cache and then you get to start over and wait for
the ominous next GC.
Only a problem for large memory use, however.  Unfortunately this
seems quite common for current Ruby apps :)
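
Back-of-the-envelope arithmetic for the case above, as a sketch.  The 500MB heap and 8MB trigger are the figures from this message; the 100MB of allocation churn is just an assumed workload, and gc_work is a made-up helper.

```ruby
# Rough cost model: each collection marks the whole live heap.
def gc_work(heap_mb, trigger_mb, allocated_mb)
  collections = allocated_mb / trigger_mb  # integer division
  scanned_mb  = collections * heap_mb
  [collections, scanned_mb]
end

collections, scanned_mb = gc_work(500, 8, 100)
puts "#{collections} collections, ~#{scanned_mb}MB marked"  # 12 collections, ~6000MB marked
```

So under these assumptions the collector walks about 6GB of heap to reclaim at most 100MB, which is the redundancy being complained about.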

Your point is well taken, however: for most cases, the GC doesn't hurt
us, so why bother :)

> P.P.S: If your Ruby application is spending a lot of time in the garbage
> collector, you may be doing something wrong at the Ruby code level.

Believe it or not, almost any Ruby app that is 'chugging' (running and
processing something) is most likely alloc'ing tons of mem and calling
the GC frequently, and the GC is doing most of its work redundantly
(it's marking the same live objects over and over again).  If the app is
legitimately using a lot of heap (as mentioned above), then that means
it is most likely operating very slowly :)  So it is possible for the
code to be legit.


TODO However, as pointed out by a recent email to this list, there may be some

> Or
> you might need to allocate a bigger heap (assuming that's possible
> without changing the interpreter at the source level -- I haven't looked
> at this yet.)

It is indeed possible at the C source level.  It's a commonly applied
band-aid :)  I have thought of changing it so users can dynamically
set how frequently the GC runs.  Again that feels like a
band-aid, with expensive GCs still occurring, and still randomly, but
with less frequency.  Ever had momentary lag in a game...?

Anyway back to the memory leak hunt.

Thanks for reading!
-Roger
