Re: csv.rb a start on refactoring.

From: James Edward Gray II <james@...>
Date: 2005-10-30 16:06:27 UTC
List: ruby-core #6502
On Oct 29, 2005, at 12:11 PM, Ara.T.Howard wrote:

> it may or may not be tricky to get these failing cases working though:

This version passes all of your edge cases:

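require "stringio"

module CSV2
   # Parse one CSV record from a String or IO. Returns an Array of fields,
   # with nil standing in for each empty unquoted field.
   def self::parse_line( data )
     io = if data.is_a?(IO) then data else StringIO.new(data) end
     line = ""

     loop do
       # Keep appending physical lines until the whole record parses;
       # a quoted field may contain embedded newlines.
       line  += io.gets
       parse = line.dup
       parse.chomp!

       # Leading commas are empty fields: strip them, one nil apiece.
       csv = if parse.sub!(/\A,+/, "") then [nil] * $&.length else Array.new end
       # Consume one field per match: $1 is a quoted field, $2 an unquoted one.
       parse.gsub!(/\G(?:^|,)(?:"((?>[^"]*)(?>""[^"]*)*)"|([^",]*))/) do
         csv << if $1.nil?
           if $2 == "" then nil else $2 end
         else
           $1.gsub('""', '"')
         end
         ""
       end

       # Leftover text means an unfinished quoted field: read another line.
       break csv if parse.empty?
     end
   end
end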
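
To make the behaviour easier to poke at, here is a minimal standalone sketch of the same regexp approach. The module name CSVSketch, the FIELD constant, and the ArgumentError raised when input runs out mid-record are illustrative additions, not part of the code above; the second input is a made-up case, not one of the edge cases quoted earlier.

```ruby
require "stringio"

# Illustrative restatement of the regexp approach (names are hypothetical).
module CSVSketch
  # One field per match: group 1 is a quoted field (with "" as an escaped
  # quote), group 2 is an unquoted field.
  FIELD = /\G(?:^|,)(?:"((?>[^"]*)(?>""[^"]*)*)"|([^",]*))/

  def self.parse_line(data)
    io   = data.is_a?(IO) ? data : StringIO.new(data)
    line = ""

    loop do
      chunk = io.gets
      # Assumption: running out of input mid-record is reported as an error.
      raise ArgumentError, "incomplete CSV record" if chunk.nil?

      line += chunk
      parse = line.chomp

      # Leading commas are empty fields: strip them, one nil apiece.
      fields = parse.sub!(/\A,+/, "") ? [nil] * $&.length : []

      # Consume the record field by field, leaving nothing behind on success.
      parse.gsub!(FIELD) do
        quoted, bare = $1, $2
        fields << (quoted ? quoted.gsub('""', '"') : (bare.empty? ? nil : bare))
        ""
      end

      # Leftover text means an open quoted field: pull in the next line.
      return fields if parse.empty?
    end
  end
end

row = %Q{Ten Thousand,10000, 2710 ,,"10,000","It's ""10 Grand"", baby",10K}
p CSVSketch.parse_line(row)
p CSVSketch.parse_line(%Q{,,a,"two\nlines",b})
```

The first call parses the benchmark row, with the empty fourth field coming back as nil and the doubled quotes collapsed. The second call exercises the leading-comma shortcut and a quoted field that spans two physical lines.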

Here's how it is holding up speed-wise:

Neo:~/Desktop$ cat bm_csv.rb
#!/usr/local/bin/ruby -w

require "csv"
require "benchmark"
require "stringio"
require "fast_csv"

def parse_csv( data )
   io = if data.is_a?(IO) then data else StringIO.new(data) end
   line = ""

   loop do
     line  += io.gets
     parse = line.dup
     parse.chomp!

     csv = if parse.sub!(/\A,+/, "") then [nil] * $&.length else Array.new end
     parse.gsub!(/\G(?:^|,)(?:"((?>[^"]*)(?>""[^"]*)*)"|([^",]*))/) do
       csv << if $1.nil?
         if $2 == "" then nil else $2 end
       else
         $1.gsub('""', '"')
       end
       ""
     end

     break csv if parse.empty?
   end
end

DATA  = %Q{Ten Thousand,10000, 2710 ,,"10,000","It's ""10 Grand"", baby",10K}
TESTS = 50000

fast = FastCsv.new
Benchmark.bm do |timings|
   timings.report("CSV") { TESTS.times { CSV.parse_line(DATA) } }
   timings.report("FastCsv") { TESTS.times { fast.parse(DATA) } }
   timings.report("Regexp") { TESTS.times { parse_csv(DATA) } }
end
Neo:~/Desktop$ ruby bm_csv.rb
       user     system      total        real
CSV 18.370000   0.060000  18.430000 ( 18.498160)
FastCsv  3.640000   0.010000   3.650000 (  3.671689)
Regexp  3.530000   0.020000   3.550000 (  3.560493)
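
For a rough sense of the gap, the "total" column above can be turned into ratios against the standard library. This is just arithmetic on the figures printed in that run, rounded to two decimals, and says nothing about other machines or inputs.

```ruby
# Totals (seconds) copied from the benchmark output above.
totals   = { "CSV" => 18.43, "FastCsv" => 3.65, "Regexp" => 3.55 }
baseline = totals["CSV"]

# How many times faster each parser ran than the standard CSV library.
speedups = totals.map { |name, secs| [name, (baseline / secs).round(2)] }.to_h

speedups.each { |name, ratio| printf("%-8s %.2fx\n", name, ratio) }
```

On these numbers both alternatives come out at roughly five times the speed of CSV.parse_line, with the regexp version slightly ahead of FastCsv.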

FastCsv was posted to Ruby Talk last night, but I'm using the
refactored version by Stefen Lang that was added today.  FastCsv does
not pass all of your tests.

James Edward Gray II

