[#58149] [ruby-trunk - Feature #9076][Open] New one-argument block syntax: &. — "asterite (Ary Borenszweig)" <ary@...>

23 messages 2013/11/04

[#58176] [ruby-trunk - Bug #9082][Open] popen3 hangs when stderr gets lots of output — "rosenfeld (Rodrigo Rosenfeld Rosas)" <rr.rosas@...>

15 messages 2013/11/05

[#58207] [ruby-trunk - Bug #9089][Open] rb_fix2uint no longer raises a RangeError when given negative values — "NoKarma (Arthur Schreiber)" <schreiber.arthur@...>

9 messages 2013/11/06

[#58243] [ruby-trunk - Feature #9098][Open] Indent heredoc against the left margin by default when "indented closing identifier" is turned on. — "sikachu (Prem Sichanugrist)" <s@...>

24 messages 2013/11/09

[#58306] [ruby-trunk - Bug #9106][Open] 'gem install' doesn't copy .so files of ext libs — "tagomoris (Satoshi TAGOMORI)" <tagomoris@...>

15 messages 2013/11/13

[#58324] [ruby-trunk - Feature #9108][Open] Hash sub-selections — "wardrop (Tom Wardrop)" <tom@...>

28 messages 2013/11/14

[#58342] [ruby-trunk - Feature #9112][Open] Make module lookup more dynamic (Including modules into a module after it has already been included) — "PragTob (Tobias Pfeiffer)" <pragtob@...>

16 messages 2013/11/14

[#58350] [ruby-trunk - Feature #9113][Open] Ship Ruby for Linux with jemalloc out-of-the-box — "sam.saffron (Sam Saffron)" <sam.saffron@...>

59 messages 2013/11/15

[#58374] [ruby-trunk - Bug #9115][Open] Logger traps all exceptions; breaks Timeout — "cphoenix (Chris Phoenix)" <cphoenix@...>

10 messages 2013/11/16

[#58375] [ruby-trunk - Feature #9116][Open] String#rsplit missing — "artagnon (Ramkumar Ramachandra)" <artagnon@...>

12 messages 2013/11/16

[#58396] [ruby-trunk - Bug #9121][Open] [PATCH] Remove rbtree implementation of SortedSet due to performance regression — "xshay (Xavier Shay)" <contact@...>

15 messages 2013/11/18

[#58404] [ruby-trunk - Feature #9123][Open] Make Numeric#nonzero? behavior consistent with Numeric#zero? — "sferik (Erik Michaels-Ober)" <sferik@...>

40 messages 2013/11/18

[#58411] [ruby-trunk - Bug #9124][Open] TestSocket errors in test-all on Arch 64-bit — "jonforums (Jon Forums)" <redmine@...>

14 messages 2013/11/18

[#58438] [ruby-trunk - Bug #9129][Open] Regression in support for IPv6 literals in URIs with Net::HTTP — "kallistec (Daniel DeLeo)" <dan@...>

11 messages 2013/11/19

[#58545] [ruby-trunk - Feature #9145][Open] Queue#pop(true) return nil if empty instead of raising ThreadError — "jsc (Justin Collins)" <redmine@...>

9 messages 2013/11/24

[#58653] [ruby-trunk - Bug #9170][Open] Math.sqrt returns different types when mathn is included; breaks various gems - this bug can be reproduced in Ruby 1.8 as well — "kranzky (Jason Hutchens)" <JasonHutchens@...>

7 messages 2013/11/28

[ruby-core:58540] [ruby-trunk - Bug #8560][Closed] CSV, skip_lines option causes error when passing a string

From: "JEG2 (James Gray)" <jeg2@...>
Date: 2013-11-23 23:14:16 UTC
List: ruby-core #58540
Issue #8560 has been updated by JEG2 (James Gray).

Category set to lib
Status changed from Open to Closed
% Done changed from 0 to 100


----------------------------------------
Bug #8560: CSV, skip_lines option causes error when passing a string
https://bugs.ruby-lang.org/issues/8560#change-43121

Author: kstevens715 (Kyle Stevens)
Status: Closed
Priority: Low
Assignee: JEG2 (James Gray)
Category: lib
Target version: 
ruby -v: ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


There seems to be a bug in the CSV class when using the skip_lines option.This option is currently undocumented, but according to the GitHub PR, it accepts any object that responds to `match`. String responds to match, so one would imagine a string can be used. However, String#match can take either a Regexp or another String. If the argument passed is a string, it will first be converted to a Regexp.

So if you pass a string to the skip_lines option, it will attempt to convert the row in the csv file to a Regexp. This doesn't make sense, and it can also lead to exceptions. For example, if the csv contains a row with the data, "# )", you will get an error, "RegexpError: unmatched close parenthesis: /# )/".

My particular use case when I found this problem was I was trying to ignore lines beginning with a "#". 

This was my first, unsuccessful attempt:
csv = CSV.open(FILE_NAME, skip_lines: "#", encoding: "ISO8859-1")

What I ended up having to do was:
csv = CSV.open(FILE_NAME, skip_lines: /\A#/, encoding: "ISO8859-1")

This isn't a huge problem, since there's a perfectly acceptable work-around. However, it would be a very easy mistake to make and it could be a difficult problem for someone to debug. It could lead to quite strange behavior if each row in the csv is converted to a Regexp.

I think the skip_lines option should be converted to a Regexp if it's a string, because the alternative is the CSV row is going to be converted to a Regexp.

My proposal is if a string is passed to skip_lines, it should be converted to a regular expression to match the beginning of a line, excluding whitespace:
"#" => /\A\s+#/

I'd be willing to work on a pull request to implement a fix, but I'd love to hear some feedback first. I definitely think this should be fixed, but I'm not positive my solution is the best option...

Here is the original pull request that implemented this option:
https://github.com/ruby/ruby/pull/161

Thank you



-- 
http://bugs.ruby-lang.org/

In This Thread

Prev Next