[#15359] Timeout::Error — Jeremy Thurgood <jerith@...>

Good day,

41 messages 2008/02/05
[#15366] Re: Timeout::Error — Eric Hodel <drbrain@...7.net> 2008/02/06

On Feb 5, 2008, at 06:20 AM, Jeremy Thurgood wrote:

[#15370] Re: Timeout::Error — Jeremy Thurgood <jerith@...> 2008/02/06

Eric Hodel wrote:

[#15373] Re: Timeout::Error — Nobuyoshi Nakada <nobu@...> 2008/02/06

Hi,

[#15374] Re: Timeout::Error — Jeremy Thurgood <jerith@...> 2008/02/06

Nobuyoshi Nakada wrote:

[#15412] Re: Timeout::Error — Nobuyoshi Nakada <nobu@...> 2008/02/07

Hi,

[#15413] Re: Timeout::Error — Jeremy Thurgood <jerith@...> 2008/02/07

Nobuyoshi Nakada wrote:

[#15414] Re: Timeout::Error — Nobuyoshi Nakada <nobu@...> 2008/02/07

Hi,

[#15360] reopen: can't change access mode from "w+" to "w"? — Sam Ruby <rubys@...>

I ran 'rake test' on test/spec [1], using

16 messages 2008/02/05
[#15369] Re: reopen: can't change access mode from "w+" to "w"? — Nobuyoshi Nakada <nobu@...> 2008/02/06

Hi,

[#15389] STDIN encoding differs from default source file encoding — Dave Thomas <dave@...>

This seems strange:

21 messages 2008/02/06
[#15392] Re: STDIN encoding differs from default source file encoding — Yukihiro Matsumoto <matz@...> 2008/02/06

Hi,

[#15481] very bad character performance on ruby1.9 — "Eric Mahurin" <eric.mahurin@...>

I'd like to bring up the issue of how characters are represented in

16 messages 2008/02/10

[#15528] Test::Unit maintainer — Kouhei Sutou <kou@...>

Hi Nathaniel, Ryan,

22 messages 2008/02/13

[#15551] Proc#curry — ts <decoux@...>

21 messages 2008/02/14
[#15557] Re: [1.9] Proc#curry — David Flanagan <david@...> 2008/02/15

ts wrote:

[#15558] Re: [1.9] Proc#curry — Yukihiro Matsumoto <matz@...> 2008/02/15

Hi,

[#15560] Re: Proc#curry — Trans <transfire@...> 2008/02/15

[#15585] Ruby M17N meeting summary — Martin Duerst <duerst@...>

This is a rough translation of the Japanese meeting summary

19 messages 2008/02/18

[#15596] possible bug in regexp lexing — Ryan Davis <ryand-ruby@...>

current:

17 messages 2008/02/19

[#15678] Re: [ANN] MacRuby — "Rick DeNatale" <rick.denatale@...>

On 2/27/08, Laurent Sansonetti <laurent.sansonetti@gmail.com> wrote:

18 messages 2008/02/28
[#15679] Re: [ANN] MacRuby — "Laurent Sansonetti" <laurent.sansonetti@...> 2008/02/28

On Thu, Feb 28, 2008 at 6:33 AM, Rick DeNatale <rick.denatale@gmail.com> wrote:

[#15680] Re: [ANN] MacRuby — Yukihiro Matsumoto <matz@...> 2008/02/28

Hi,

[#15683] Re: [ANN] MacRuby — "Laurent Sansonetti" <laurent.sansonetti@...> 2008/02/28

On Thu, Feb 28, 2008 at 1:51 PM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

Re: very bad character performance on ruby1.9

From: "Eric Mahurin" <eric.mahurin@...>
Date: 2008-02-10 23:27:54 UTC
List: ruby-core #15490
On Feb 10, 2008 2:16 PM, Vincent Isambart <vincent.isambart@gmail.com>
wrote:

> Hi,
>
> > I'd like to bring up the issue of how characters are represented in
> > ruby 1.9 from a performance standpoint.  In a recent ruby-quiz
> > (parsing JSON), the fastest pure-ruby solution was simply an LL(1)
> > parser that looked at one character at a time (it beat various
> > Regexp solutions).  With ruby 1.9, the runtime increased by 4X
> > making it a slow solution.  A simple benchmark is at the end of this
> > message that counts words in an LL(1) fashion.  With ruby 1.8.6, it
> > can count the words in Homer's Iliad in 1.46s on my machine and in
> > ruby 1.9 (from ubuntu gutsy) it takes 52.87s (36X increase in
> > runtime).
>
> I'm surprised that the fastest parsing done in Ruby was with a
> handwritten parser.


I was also surprised that the LL(1) JSON parser was so fast.  I was
expecting a StringScanner solution to beat it.  I also optimized the
StringScanner solution with some of the same techniques.
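
The benchmark from my original message isn't reproduced in the quote
above; as a rough sketch (not the exact code I posted), the
character-at-a-time loop it times looks something like this, assuming
text already holds the whole input:

  # Rough sketch only -- not the original benchmark code.  Counts words,
  # runs of whitespace, and punctuation one character at a time, LL(1)
  # style, so every step goes through String#[].
  word  = /[a-zA-Z_]/
  space = /\s/
  punctuation = spacing = words = 0
  pos = 0
  len = text.length
  while pos < len
    ch = text[pos, 1]               # one character (one byte under 1.8)
    if ch =~ word
      words += 1
      pos += 1 while pos < len && text[pos, 1] =~ word
    elsif ch =~ space
      spacing += 1
      pos += 1 while pos < len && text[pos, 1] =~ space
    else
      punctuation += 1
      pos += 1
    end
  end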


> When I tried to code a small XML parser in Ruby
> the fastest solution was using StringScanner. And for your small
> example, I rewrote a version using StringScanner that's faster in both
> 1.8 and 1.9 (and it runs faster in 1.9 than in 1.8). And it's shorter and (I think)
> more readable.
>
> require 'strscan'
>
> strscan = StringScanner.new(text)
> punctuation = spacing = words = 0
> while not strscan.eos?
>   if strscan.skip(/[a-zA-Z_]+/)
>     words += 1
>   elsif strscan.skip(/\s+/)
>     spacing += 1
>   else
>     strscan.skip(/./)
>     punctuation += 1
>   end
> end


I also wrote roughly the same thing for this benchmark.  To make it work for
an IO (reading one line at a time), you'll need a little more.  I'm not
arguing that the StringScanner solution isn't faster in this case.  It is.  For
lexers and parsers I've done by hand, I usually find LL(1) and StringScanner
solutions are in the same ballpark.
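
Something like the following would do it (a sketch, not code from the
thread; the file name is only a placeholder).  A word can never span a
line boundary here since every line ends in whitespace, though a
whitespace run that crosses lines gets counted once per line instead of
once overall:

  require 'strscan'

  punctuation = spacing = words = 0
  File.open("iliad.txt") do |io|    # placeholder file name
    io.each_line do |line|
      ss = StringScanner.new(line)
      until ss.eos?
        if ss.skip(/[a-zA-Z_]+/)
          words += 1
        elsif ss.skip(/\s+/)
          spacing += 1
        else
          ss.skip(/./)              # single non-word, non-space character
          punctuation += 1
        end
      end
    end
  end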

The benchmark I posted was meant to show the slowdown for an application
dealing directly with characters.

In 1.8.6, you could think about making a pure-ruby replacement for
Regexp-like pattern matching.  This is not feasible in 1.9 because of the
performance.

> Most popular parsers written in Ruby (ERB, json-pure, RedCloth) use
> Regexps (some with and some without StringScanner).
>

Anything using Regexp won't have an issue.

The stuff I'm doing generates parsers and lexers from the ground up (LL(1)
with LL(*) where necessary).  I don't use Regexp mainly because Regexp is
too limiting (hard to apply to an IO).  Since I found the performance in
1.8.6 to be reasonable without Regexp (and I can already do much more than
Regexp), I didn't see the need to deal with the complexity of adding
Regexp.
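
To make the "hard to apply to an IO" point concrete: an LL(1) lexer only
ever needs one character of lookahead, which an IO can supply directly
without buffering the whole input the way a Regexp match would.  A
hypothetical wrapper (an illustration, not my actual generator) might look
like:

  # Hypothetical illustration, not the real generator: an LL(1) lexer only
  # needs single-character lookahead, so it can run straight off an IO.
  class CharSource
    def initialize(io)
      @io = io
      @lookahead = @io.getc    # a 1-char String in 1.9, an Integer in 1.8
    end

    def peek
      @lookahead
    end

    def next_char
      ch = @lookahead
      @lookahead = @io.getc
      ch
    end

    def eof?
      @lookahead.nil?
    end
  end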


>
> > Please consider this significant performance issue in ruby 1.9.
>
> I am not sure this particular case is really a significant issue.
>

For me it definitely is.  Based on the JSON parser, I expect any of my
generated lexers or character parsers to be around 4X slower.
