[ruby-core:28549] Re: [Feature #905] Add String.new(fixnum) to preallocate large buffer

From: Charles Oliver Nutter <headius@...>
Date: 2010-03-07 15:34:31 UTC
List: ruby-core #28549
On Sun, Mar 7, 2010 at 4:58 AM, Yusuke ENDOH <mame@tsg.ne.jp> wrote:
> Ko1 told me that GC makes the second benchmark slower than JRuby.
> In MRI, a string literal is duplicated whenever evaluated.
> I moved the literals out of the loop:

JRuby behaves the same, since literal strings are still separate
objects and mutable.

>  results.report "'' <<" do
>    s = ''
>    s1, s2 = '.', 'word'
>    N.times { s << s1 << s2 }
>  end
>
>  ruby19
>                            user     system      total        real
>  '' <<                 6.810000   0.040000   6.850000 (  6.851979)
>
>  jruby
>                            user     system      total        real
>  '' <<                 7.159000   0.000000   7.159000 (  7.126000)
>
> Indeed, there is room for optimization in MRI, but in this case,
> it is not in string concatenation, I guess.

My numbers came out somewhat differently. Make sure you're running
with the JVM's "server" mode if you run on Hotspot (Sun/OpenJDK):

~/projects/jruby ➔ jruby --server string_bench.rb
                          user     system      total        real
loop                  0.572000   0.000000   0.572000 (  0.523000)
'' <<                 1.470000   0.000000   1.470000 (  1.470000)

~/projects/jruby ➔ ruby1.9 string_bench.rb
                          user     system      total        real
loop                  0.810000   0.000000   0.810000 (  0.838414)
'' <<                 2.670000   0.040000   2.710000 (  2.733041)

Here are the numbers with a prototype String.buffer implementation:

~/projects/jruby ➔ jruby --server string_bench.rb
                          user     system      total        real
loop                  0.655000   0.000000   0.655000 (  0.606000)
'' <<                 1.390000   0.000000   1.390000 (  1.390000)
                          user     system      total        real
loop                  0.321000   0.000000   0.321000 (  0.321000)
'' <<                 1.241000   0.000000   1.241000 (  1.241000)
                          user     system      total        real
loop                  0.314000   0.000000   0.314000 (  0.314000)
'' <<                 1.229000   0.000000   1.229000 (  1.229000)

Of course, this 10-15% improvement could simply be because the JVM
does not provide a "realloc" for its arrays (for various reasons,
presumably including that it moves objects around in memory a lot).
In order to grow a string, we have to allocate a new array and copy
its contents. Under those circumstances, String.buffer makes a lot of
sense, since the copying can get expensive at large sizes.
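
To make the copying cost concrete, here is a rough sketch that counts how many bytes an allocate-and-copy buffer moves while being appended to. It is a model only, not JRuby's ByteList code: the doubling growth policy, the starting capacity of 16, and the count of 1,000,000 appends are all assumptions chosen for illustration. The 5-byte chunk matches '.' plus 'word' from the benchmark.

```ruby
# Count the bytes copied while appending `appends` chunks of `chunk`
# bytes to a buffer that can only grow by allocating a new array and
# copying the old contents into it (no realloc).
# The doubling policy is an assumption, not JRuby's actual policy.
def bytes_copied(appends, chunk, initial_capacity)
  capacity = [initial_capacity, 1].max
  length = 0
  copied = 0
  appends.times do
    needed = length + chunk
    if needed > capacity
      capacity *= 2 while capacity < needed
      copied += length # existing contents move into the new array
    end
    length = needed
  end
  copied
end

grown        = bytes_copied(1_000_000, 5, 16)
preallocated = bytes_copied(1_000_000, 5, 5_000_000)
puts "grown from 16 bytes: #{grown} bytes copied"
puts "preallocated:        #{preallocated} bytes copied"
```

With the full size reserved up front the model never copies, which is the effect a preallocating constructor is meant to buy. How much of that shows up in wall-clock time depends on the runtime's real growth policy.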

I don't know enough about MRI internals to implement an equivalent
String.buffer, but here's the patch to JRuby:

diff --git a/src/org/jruby/RubyString.java b/src/org/jruby/RubyString.java
index 71e6b63..e618ec8 100644
--- a/src/org/jruby/RubyString.java
+++ b/src/org/jruby/RubyString.java
@@ -451,6 +451,11 @@ public class RubyString extends RubyObject implements EncodingCapable {
     public static RubyString newStringLight(Ruby runtime, int size) {
         return new RubyString(runtime, runtime.getString(), new ByteList(size), false);
     }
+
+    @JRubyMethod(meta = true)
+    public static IRubyObject buffer(ThreadContext context, IRubyObject self, IRubyObject size) {
+        return newStringLight(context.getRuntime(), (int)size.convertToInteger().getLongValue());
+    }

     public static RubyString newString(Ruby runtime, CharSequence str) {
         return new RubyString(runtime, runtime.getString(), str);
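
For reference, a benchmark comparing the two cases from the Ruby side might look like the sketch below. String.buffer does not exist in stock Ruby, so the hypothetical helper `string_buffer` stands in for it; it is built on `String.new(capacity:)`, which MRI accepts from 2.4 onward, and that substitution is an assumption of this sketch rather than anything proposed in the thread. The iteration count is arbitrary, since the N used in string_bench.rb is not shown.

```ruby
require 'benchmark'

# Hypothetical stand-in for the proposed String.buffer(size):
# an empty string with `size` bytes reserved up front.
def string_buffer(size)
  String.new(capacity: size)
end

# Append '.' and 'word' to `s` `n` times, as in the quoted benchmark.
def build(s, n)
  s1, s2 = '.', 'word'
  n.times { s << s1 << s2 }
  s
end

n = 200_000 # arbitrary iteration count

Benchmark.bm(22) do |results|
  results.report("'' <<")           { build(String.new, n) }
  results.report("preallocated <<") { build(string_buffer(n * 5), n) }
end
```

Each iteration appends 5 bytes, so `n * 5` reserves exactly the final size. Any difference between the two rows will vary by implementation and version.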
