[#15707] Schedule for the 1.8.7 release — "Akinori MUSHA" <knu@...> (21 messages, 2008/03/01)

[#15740] Copy-on-write friendly garbage collector — Hongli Lai <hongli@...99.net> (31 messages, 2008/03/03)
  [#15742] Re: Copy-on-write friendly garbage collector — Yukihiro Matsumoto <matz@...> 2008/03/03
  [#15829] Re: Copy-on-write friendly garbage collector — Daniel DeLorme <dan-ml@...42.com> 2008/03/08

[#15756] embedding Ruby 1.9.0 inside pthread — "Suraj Kurapati" <sunaku@...> (18 messages, 2008/03/03)
  [#15759] Re: embedding Ruby 1.9.0 inside pthread — Nobuyoshi Nakada <nobu@...> 2008/03/04
  [#15760] Re: embedding Ruby 1.9.0 inside pthread — Yukihiro Matsumoto <matz@...> 2008/03/04
  [#15762] Re: embedding Ruby 1.9.0 inside pthread — "Suraj N. Kurapati" <sunaku@...> 2008/03/04

[#15783] Adding startup and shutdown to Test::Unit — Daniel Berger <Daniel.Berger@...> (15 messages, 2008/03/04)

[#15835] TimeoutError in core, timeouts for ConditionVariable#wait — MenTaLguY <mental@...> (10 messages, 2008/03/09)

[#15990] Recent changes in Range#step behavior — "Vladimir Sizikov" <vsizikov@...> (35 messages, 2008/03/23)
  [#15991] Re: Recent changes in Range#step behavior — Dave Thomas <dave@...> 2008/03/23
  [#15993] Re: Recent changes in Range#step behavior — "Vladimir Sizikov" <vsizikov@...> 2008/03/23
  [#15997] Re: Recent changes in Range#step behavior — Dave Thomas <dave@...> 2008/03/23
  [#16024] Re: Recent changes in Range#step behavior — "Vladimir Sizikov" <vsizikov@...> 2008/03/26
  [#16025] Re: Recent changes in Range#step behavior — Yukihiro Matsumoto <matz@...> 2008/03/26
  [#16026] Re: Recent changes in Range#step behavior — Dave Thomas <dave@...> 2008/03/26
  [#16027] Re: Recent changes in Range#step behavior — Yukihiro Matsumoto <matz@...> 2008/03/26
  [#16029] Re: Recent changes in Range#step behavior — Dave Thomas <dave@...> 2008/03/26
  [#16030] Re: Recent changes in Range#step behavior — Yukihiro Matsumoto <matz@...> 2008/03/26
  [#16031] Re: Recent changes in Range#step behavior — Dave Thomas <dave@...> 2008/03/26
  [#16032] Re: Recent changes in Range#step behavior — "Vladimir Sizikov" <vsizikov@...> 2008/03/26
  [#16033] Re: Recent changes in Range#step behavior — Dave Thomas <dave@...> 2008/03/26
  [#16041] Re: Recent changes in Range#step behavior — David Flanagan <david@...> 2008/03/26

Re: Copy-on-write friendly garbage collector

From: Hongli Lai <hongli@...99.net>
Date: 2008-03-14 15:53:39 UTC
List: ruby-core #15902
Daniel Berger wrote:
> I recommend against this. Most people are simply not going to know
> when they should invoke it and when they shouldn't. It's an
> implementation detail that should be hidden from the user IMHO.
> Besides, is it really worth splitting it out like this over a 5%
> speed hit? Or am I missing some other implementation detail?

Apparently people care about even a 5% speed hit, or we wouldn't still 
be discussing this.

However, it is true that most scripts will not benefit from a
copy-on-write friendly garbage collector. Only a very few scripts rely
on it to save memory. As far as I know, the program that I'm developing
is the first Ruby program that relies on copy-on-write friendliness for
anything important. So there are legitimate reasons both for wanting
the garbage collector to be fast in the common case, and for wanting it
to be somewhat slower but copy-on-write friendly in certain edge cases.
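
To make the trade-off concrete, here is a minimal sketch of the
pre-forking pattern that benefits from copy-on-write friendliness. The
"process_requests" helper is hypothetical; the GC methods are the ones
proposed in this patch:

  # Load a large, read-only dataset once in the parent process.
  large_data = File.readlines("big_dataset.txt")

  # Ask the garbage collector to keep mark bits out of the object slots.
  GC.copy_on_write_friendly = true

  4.times do
    fork do
      # Each child only reads large_data. With the in-object (fast) mark
      # table, the first GC cycle in a child sets FL_MARK on every live
      # object, dirtying every heap page and forcing the kernel to copy
      # it. With the bitfield mark table, the object pages stay shared.
      process_requests(large_data)
    end
  end
  Process.waitall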

That said, exposing implementation details is generally not a nice
thing to do. That's why I've written extensive documentation for the
"copy_on_write_friendly?" and "copy_on_write_friendly=" methods. It
explains that the behavior may differ across Ruby implementations and
platforms, and that users should only call these methods if they know
what they're doing.
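
For example, a script that wants the copy-on-write friendly collector
could guard itself along these lines (again just a sketch, using only
the API proposed here):

  # Only touch the flag on implementations that expose it, and only
  # where fork() is available.
  if GC.respond_to?(:copy_on_write_friendly=) && Process.respond_to?(:fork)
    GC.copy_on_write_friendly = true unless GC.copy_on_write_friendly?
  end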

Please see the attached patch. This patch is made against Ruby 1.8, 
relative to my last patch.

Attachments (1)

ruby-pluggable-mark-tables.diff (13.6 KB, text/x-patch)
(in /home/hongli/Projects/ruby/cowruby)
diff --git a/common.mk b/common.mk
index f57a0da..404cd9d 100644
--- a/common.mk
+++ b/common.mk
@@ -386,7 +386,8 @@ gc.$(OBJEXT): {$(VPATH)}gc.c {$(VPATH)}ruby.h config.h \
   {$(VPATH)}defines.h {$(VPATH)}intern.h {$(VPATH)}missing.h \
   {$(VPATH)}rubysig.h {$(VPATH)}st.h {$(VPATH)}node.h \
   {$(VPATH)}env.h {$(VPATH)}re.h {$(VPATH)}regex.h \
-  {$(VPATH)} pointerset.h {$(VPATH)} marktable.c
+  {$(VPATH)}pointerset.h {$(VPATH)}marktable.h \
+  {$(VPATH)}marktable.c {$(VPATH)}fastmarktable.c
 hash.$(OBJEXT): {$(VPATH)}hash.c {$(VPATH)}ruby.h config.h \
   {$(VPATH)}defines.h {$(VPATH)}intern.h {$(VPATH)}missing.h \
   {$(VPATH)}st.h {$(VPATH)}util.h {$(VPATH)}rubysig.h
@@ -415,7 +416,7 @@ parse.$(OBJEXT): {$(VPATH)}parse.c {$(VPATH)}ruby.h config.h \
   {$(VPATH)}defines.h {$(VPATH)}intern.h {$(VPATH)}missing.h \
   {$(VPATH)}env.h {$(VPATH)}node.h {$(VPATH)}st.h \
   {$(VPATH)}regex.h {$(VPATH)}util.h {$(VPATH)}lex.c
-pointerset.$(OBJEXT): {$(VPATH)}pointerset.c
+pointerset.$(OBJEXT): {$(VPATH)}pointerset.c {$(VPATH)}pointerset.h
 prec.$(OBJEXT): {$(VPATH)}prec.c {$(VPATH)}ruby.h config.h \
   {$(VPATH)}defines.h {$(VPATH)}intern.h {$(VPATH)}missing.h
 process.$(OBJEXT): {$(VPATH)}process.c {$(VPATH)}ruby.h config.h \
diff --git a/fastmarktable.c b/fastmarktable.c
new file mode 100644
index 0000000..8f8526d
--- /dev/null
+++ b/fastmarktable.c
@@ -0,0 +1,83 @@
+/**
+ * A mark table, used during a mark-and-sweep garbage collection cycle.
+ *
+ * This implementation is faster than marktable.c, but is *not*
+ * copy-on-write friendly. It stores mark information directly inside objects.
+ */
+#ifndef _FAST_MARK_TABLE_C_
+#define _FAST_MARK_TABLE_C_
+
+static void
+rb_fast_mark_table_init() {
+}
+
+static void
+rb_fast_mark_table_prepare() {
+}
+
+static void
+rb_fast_mark_table_finalize() {
+}
+
+static inline void
+rb_fast_mark_table_add(RVALUE *object) {
+	object->as.basic.flags |= FL_MARK;
+}
+
+static inline void
+rb_fast_mark_table_heap_add(struct heaps_slot *hs, RVALUE *object) {
+	object->as.basic.flags |= FL_MARK;
+}
+
+static inline int
+rb_fast_mark_table_contains(RVALUE *object) {
+	return object->as.basic.flags & FL_MARK;
+}
+
+static inline int
+rb_fast_mark_table_heap_contains(struct heaps_slot *hs, RVALUE *object) {
+	return object->as.basic.flags & FL_MARK;
+}
+
+static inline void
+rb_fast_mark_table_remove(RVALUE *object) {
+	object->as.basic.flags &= ~FL_MARK;
+}
+
+static inline void
+rb_fast_mark_table_heap_remove(struct heaps_slot *hs, RVALUE *object) {
+	object->as.basic.flags &= ~FL_MARK;
+}
+
+static inline void
+rb_fast_mark_table_add_filename(char *filename) {
+	filename[-1] = 1;
+}
+
+static inline int
+rb_fast_mark_table_contains_filename(const char *filename) {
+	return filename[-1];
+}
+
+static inline void
+rb_fast_mark_table_remove_filename(char *filename) {
+	filename[-1] = 0;
+}
+
+static void
+rb_use_fast_mark_table() {
+	rb_mark_table_init          = rb_fast_mark_table_init;
+	rb_mark_table_prepare       = rb_fast_mark_table_prepare;
+	rb_mark_table_finalize      = rb_fast_mark_table_finalize;
+	rb_mark_table_add           = rb_fast_mark_table_add;
+	rb_mark_table_heap_add      = rb_fast_mark_table_heap_add;
+	rb_mark_table_contains      = rb_fast_mark_table_contains;
+	rb_mark_table_heap_contains = rb_fast_mark_table_heap_contains;
+	rb_mark_table_remove        = rb_fast_mark_table_remove;
+	rb_mark_table_heap_remove   = rb_fast_mark_table_heap_remove;
+	rb_mark_table_add_filename  = rb_fast_mark_table_add_filename;
+	rb_mark_table_contains_filename = rb_fast_mark_table_contains_filename;
+	rb_mark_table_remove_filename   = rb_fast_mark_table_remove_filename;
+}
+
+#endif /* _FAST_MARK_TABLE_C_ */
diff --git a/gc.c b/gc.c
index 3ba5ec5..2843895 100644
--- a/gc.c
+++ b/gc.c
@@ -397,7 +397,9 @@ static int heap_slots = HEAP_MIN_SLOTS;
 
 static RVALUE *himem, *lomem;
 
+#include "marktable.h"
 #include "marktable.c"
+#include "fastmarktable.c"
 
 static void
 add_heap()
@@ -1718,6 +1720,7 @@ void ruby_init_stack(VALUE *addr
 void
 Init_heap()
 {
+    rb_use_fast_mark_table();
     rb_mark_table_init();
     if (!rb_gc_stack_start) {
 	Init_stack(0);
@@ -2209,17 +2212,52 @@ os_statistics()
 }
 
 /*
- * Returns whether this garbage collector is copy-on-write friendly.
+ *  call-seq:
+ *     GC.copy_on_write_friendly?     => true or false
+ *
+ *  Returns whether the garbage collector is copy-on-write friendly.
+ *
+ *  This method only has meaning on platforms that support the _fork_ system call.
+ *  Please consult the documentation for GC.copy_on_write_friendly= for additional
+ *  notes.
  */
 static VALUE
-rb_gc_cow_friendly()
+rb_gc_copy_on_write_friendly()
 {
-    return Qtrue;
+    if (rb_mark_table_init == rb_bf_mark_table_init) {
+	return Qtrue;
+    } else {
+	return Qfalse;
+    }
 }
 
+/*
+ *  call-seq:
+ *     GC.copy_on_write_friendly = _boolean_
+ *
+ *  Tell the garbage collector whether to be copy-on-write friendly.
+ *
+ *  Note that this is an implementation detail of the garbage collector. On some Ruby
+ *  implementations, the garbage collector may always be copy-on-write friendly. In that
+ *  case, this method will do nothing. Furthermore, copy-on-write friendliness has no
+ *  meaning on some platforms (such as Microsoft Windows), so setting this flag on those
+ *  platforms is futile.
+ *
+ *  Please keep in mind that this flag is only advisory. Do not rely on it for anything
+ *  truly important.
+ *
+ *  In the mainline Ruby implementation, the copy-on-write friendly garbage collector is
+ *  slightly slower than the non-copy-on-write friendly version.
+ */
 static VALUE
-rb_gc_test()
+rb_gc_set_copy_on_write_friendly(VALUE self, VALUE val)
 {
+    if (RTEST(val)) {
+	rb_use_bf_mark_table();
+    } else {
+	rb_use_fast_mark_table();
+    }
+    rb_mark_table_init();
     return Qnil;
 }
 
@@ -2239,8 +2277,8 @@ Init_GC()
     rb_define_singleton_method(rb_mGC, "enable", rb_gc_enable, 0);
     rb_define_singleton_method(rb_mGC, "disable", rb_gc_disable, 0);
     rb_define_method(rb_mGC, "garbage_collect", rb_gc_start, 0);
-    rb_define_singleton_method(rb_mGC, "cow_friendly?", rb_gc_cow_friendly, 0);
-    rb_define_singleton_method(rb_mGC, "test", rb_gc_test, 0);
+    rb_define_singleton_method(rb_mGC, "copy_on_write_friendly?", rb_gc_copy_on_write_friendly, 0);
+    rb_define_singleton_method(rb_mGC, "copy_on_write_friendly=", rb_gc_set_copy_on_write_friendly, 1);
 
     rb_mObSpace = rb_define_module("ObjectSpace");
     rb_define_module_function(rb_mObSpace, "each_object", os_each_obj, -1);
diff --git a/marktable.c b/marktable.c
index 412616f..84f88fe 100644
--- a/marktable.c
+++ b/marktable.c
@@ -1,3 +1,14 @@
+/**
+ * A mark table, used during a mark-and-sweep garbage collection cycle.
+ *
+ * This implementation is somewhat slower than fastmarktable.c, but is
+ * copy-on-write friendly. It stores mark information for objects in a bit
+ * field located at the beginning of the heap. Mark information for filenames
+ * are stored in a pointer set.
+ */
+#ifndef _MARK_TABLE_C_
+#define _MARK_TABLE_C_
+
 #include "pointerset.h"
 
 /* A mark table for filenames and objects that are not on the heap. */
@@ -58,19 +69,21 @@ find_position_in_bitfield(struct heaps_slot *hs, RVALUE *object,
 
 
 static void
-rb_mark_table_init()
+rb_bf_mark_table_init()
 {
-	mark_table = pointer_set_new();
+	if (mark_table == NULL) {
+		mark_table = pointer_set_new();
+	}
 }
 
 static void
-rb_mark_table_prepare()
+rb_bf_mark_table_prepare()
 {
 	last_heap = NULL;
 }
 
 static void
-rb_mark_table_finalize()
+rb_bf_mark_table_finalize()
 {
 	int i;
 
@@ -83,7 +96,7 @@ rb_mark_table_finalize()
 }
 
 static inline void
-rb_mark_table_add(RVALUE *object)
+rb_bf_mark_table_add(RVALUE *object)
 {
 	struct heaps_slot *hs;
 	unsigned int bitfield_index, bitfield_offset;
@@ -98,7 +111,7 @@ rb_mark_table_add(RVALUE *object)
 }
 
 static inline void
-rb_mark_table_heap_add(struct heaps_slot *hs, RVALUE *object)
+rb_bf_mark_table_heap_add(struct heaps_slot *hs, RVALUE *object)
 {
 	unsigned int bitfield_index, bitfield_offset;
 	find_position_in_bitfield(hs, object, &bitfield_index, &bitfield_offset);
@@ -106,7 +119,7 @@ rb_mark_table_heap_add(struct heaps_slot *hs, RVALUE *object)
 }
 
 static inline int
-rb_mark_table_contains(RVALUE *object)
+rb_bf_mark_table_contains(RVALUE *object)
 {
 	struct heaps_slot *hs;
 	unsigned int bitfield_index, bitfield_offset;
@@ -121,7 +134,7 @@ rb_mark_table_contains(RVALUE *object)
 }
 
 static inline int
-rb_mark_table_heap_contains(struct heaps_slot *hs, RVALUE *object)
+rb_bf_mark_table_heap_contains(struct heaps_slot *hs, RVALUE *object)
 {
 	unsigned int bitfield_index, bitfield_offset;
 	find_position_in_bitfield(hs, object, &bitfield_index, &bitfield_offset);
@@ -130,7 +143,7 @@ rb_mark_table_heap_contains(struct heaps_slot *hs, RVALUE *object)
 }
 
 static inline void
-rb_mark_table_remove(RVALUE *object)
+rb_bf_mark_table_remove(RVALUE *object)
 {
 	struct heaps_slot *hs;
 	unsigned int bitfield_index, bitfield_offset;
@@ -145,7 +158,7 @@ rb_mark_table_remove(RVALUE *object)
 }
 
 static inline void
-rb_mark_table_heap_remove(struct heaps_slot *hs, RVALUE *object)
+rb_bf_mark_table_heap_remove(struct heaps_slot *hs, RVALUE *object)
 {
 	unsigned int bitfield_index, bitfield_offset;
 	find_position_in_bitfield(hs, object, &bitfield_index, &bitfield_offset);
@@ -153,19 +166,37 @@ rb_mark_table_heap_remove(struct heaps_slot *hs, RVALUE *object)
 }
 
 static inline void
-rb_mark_table_add_filename(const char *filename)
+rb_bf_mark_table_add_filename(char *filename)
 {
 	pointer_set_insert(mark_table, (void *) filename);
 }
 
 static inline int
-rb_mark_table_contains_filename(const char *filename)
+rb_bf_mark_table_contains_filename(const char *filename)
 {
 	return pointer_set_contains(mark_table, (void *) filename);
 }
 
 static inline void
-rb_mark_table_remove_filename(const char *filename)
+rb_bf_mark_table_remove_filename(char *filename)
 {
 	pointer_set_delete(mark_table, (void *) filename);
 }
+
+static void
+rb_use_bf_mark_table() {
+	rb_mark_table_init          = rb_bf_mark_table_init;
+	rb_mark_table_prepare       = rb_bf_mark_table_prepare;
+	rb_mark_table_finalize      = rb_bf_mark_table_finalize;
+	rb_mark_table_add           = rb_bf_mark_table_add;
+	rb_mark_table_heap_add      = rb_bf_mark_table_heap_add;
+	rb_mark_table_contains      = rb_bf_mark_table_contains;
+	rb_mark_table_heap_contains = rb_bf_mark_table_heap_contains;
+	rb_mark_table_remove        = rb_bf_mark_table_remove;
+	rb_mark_table_heap_remove   = rb_bf_mark_table_heap_remove;
+	rb_mark_table_add_filename  = rb_bf_mark_table_add_filename;
+	rb_mark_table_contains_filename = rb_bf_mark_table_contains_filename;
+	rb_mark_table_remove_filename   = rb_bf_mark_table_remove_filename;
+}
+
+#endif /* _MARK_TABLE_C_ */
diff --git a/marktable.h b/marktable.h
new file mode 100644
index 0000000..3904fd6
--- /dev/null
+++ b/marktable.h
@@ -0,0 +1,17 @@
+#ifndef _MARK_TABLE_H_
+#define _MARK_TABLE_H_
+
+static void (*rb_mark_table_init)();
+static void (*rb_mark_table_prepare)();
+static void (*rb_mark_table_finalize)();
+static void (*rb_mark_table_add)(RVALUE *object);
+static void (*rb_mark_table_heap_add)(struct heaps_slot *hs, RVALUE *object);
+static int  (*rb_mark_table_contains)(RVALUE *object);
+static int  (*rb_mark_table_heap_contains)(struct heaps_slot *hs, RVALUE *object);
+static void (*rb_mark_table_remove)(RVALUE *object);
+static void (*rb_mark_table_heap_remove)(struct heaps_slot *hs, RVALUE *object);
+static void (*rb_mark_table_add_filename)(char *filename);
+static int  (*rb_mark_table_contains_filename)(const char *filename);
+static void (*rb_mark_table_remove_filename)(char *filename);
+
+#endif /* _MARK_TABLE_H_ */
diff --git a/pointerset.h b/pointerset.h
index 15ccab7..bc4733b 100644
--- a/pointerset.h
+++ b/pointerset.h
@@ -1,17 +1,56 @@
+/**
+ * A specialized set data structure, designed to only contain pointers.
+ * It will grow and shrink dynamically.
+ */
 #ifndef _POINTER_SET_H_
 #define _POINTER_SET_H_
 
 typedef void * PointerSetElement;
 typedef struct _PointerSet PointerSet;
 
+/**
+ * Create a new, empty pointer set.
+ */
 PointerSet   *pointer_set_new();
+
+/**
+ * Free the given pointer set.
+ */
 void         pointer_set_free(PointerSet *set);
 
+/**
+ * Insert the given pointer into the pointer set. The data that the
+ * pointer points to is not touched, so <tt>element</tt> may even be
+ * an invalid pointer.
+ */
 void         pointer_set_insert(PointerSet *set, PointerSetElement element);
+
+/**
+ * Remove the given pointer from the pointer set. Nothing will happen
+ * if the pointer is not in the set.
+ */
 void         pointer_set_delete(PointerSet *set, PointerSetElement element);
+
+/**
+ * Check whether the given pointer is in the pointer set.
+ */
 int          pointer_set_contains(PointerSet *set, PointerSetElement element);
+
+/**
+ * Clear the pointer set.
+ */
 void         pointer_set_reset(PointerSet *set);
+
+/**
+ * Return the number of pointers in the pointer set.
+ */
 unsigned int pointer_set_get_size(PointerSet *set);
+
+/**
+ * Return the amount of space that is used to store the pointers in the set.
+ *
+ * @invariant pointer_set_get_capacity(set) >= pointer_set_get_size(set)
+ */
 unsigned int pointer_set_get_capacity(PointerSet *set);
 
 #endif /* _POINTER_SET_H_ */
diff --git a/ruby.h b/ruby.h
index 32ea164..c43164e 100644
--- a/ruby.h
+++ b/ruby.h
@@ -454,11 +454,12 @@ struct RBignum {
 #define FL_SINGLETON FL_USER0
 #define FL_ALLOCATED (1<<6)
 #define FL_FINALIZE  (1<<7)
+#define FL_MARK      (1<<11)
 #define FL_TAINT     (1<<8)
 #define FL_EXIVAR    (1<<9)
 #define FL_FREEZE    (1<<10)
 
-#define FL_USHIFT    11
+#define FL_USHIFT    12
 
 #define FL_USER0     (1<<(FL_USHIFT+0))
 #define FL_USER1     (1<<(FL_USHIFT+1))
