[ruby-core:55590] [ruby-trunk - Bug #8560][Open] CSV, skip_lines option causes error when passing a string

From: "kstevens715 (Kyle Stevens)" <kstevens715@...>
Date: 2013-06-22 06:53:22 UTC
List: ruby-core #55590
Issue #8560 has been reported by kstevens715 (Kyle Stevens).

----------------------------------------
Bug #8560: CSV, skip_lines option causes error when passing a string
https://bugs.ruby-lang.org/issues/8560

Author: kstevens715 (Kyle Stevens)
Status: Open
Priority: Low
Assignee: 
Category: 
Target version: 
ruby -v: ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux]
Backport: 1.9.3: UNKNOWN, 2.0.0: UNKNOWN


There seems to be a bug in the CSV class when using the skip_lines option. This option is currently undocumented, but according to the GitHub PR, it accepts any object that responds to `match`. String responds to `match`, so one would expect a string to work. However, String#match accepts either a Regexp or another String, and if the argument is a String, it is first converted to a Regexp.

So if you pass a string to the skip_lines option, each row of the CSV file is converted to a Regexp and matched against that string. This doesn't make sense, and it can also raise exceptions. For example, if the CSV contains a row with the data "# )", you will get the error "RegexpError: unmatched close parenthesis: /# )/".
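
To illustrate, here is a minimal sketch of that behavior outside of CSV. It assumes CSV effectively calls `match` on the skip_lines value with each raw line as the argument; the variable names `skip` and `line` are hypothetical stand-ins, not CSV internals.

```ruby
# Hypothetical stand-ins: `skip` plays the role of the skip_lines value,
# `line` is one raw line read from the CSV file.
skip = "#"
line = "# )"

# String#match converts a String argument to a Regexp, so the *line*
# becomes the pattern and the skip_lines value becomes the subject.
error = begin
  skip.match(line)
  nil
rescue RegexpError => e
  e
end
puts error.class # the line "# )" is not a valid pattern

# Even when the line happens to be a valid pattern, the roles are reversed:
# this tests whether "#" contains "# comment", which it does not.
reversed = skip.match("# comment")
p reversed

# With a Regexp the roles are the intended ones: the line is the subject.
intended = /\A#/.match(line)
p intended
```

In the reversed case a comment line would silently not be skipped, which is arguably harder to notice than the exception.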

I ran into this problem while trying to ignore lines beginning with a "#".

This was my first, unsuccessful attempt:
csv = CSV.open(FILE_NAME, skip_lines: "#", encoding: "ISO8859-1")

What I ended up having to do was:
csv = CSV.open(FILE_NAME, skip_lines: /\A#/, encoding: "ISO8859-1")

This isn't a huge problem, since there's a perfectly acceptable workaround. However, it is a very easy mistake to make, and it could be difficult for someone to debug. Converting each row of the CSV to a Regexp can lead to quite strange behavior.

I think the skip_lines value should be converted to a Regexp if it's a string, because otherwise each CSV row is converted to a Regexp instead.

My proposal is that if a string is passed to skip_lines, it should be converted to a regular expression that matches the string at the beginning of a line, ignoring any leading whitespace:
"#" => /\A\s*#/

I'd be willing to work on a pull request to implement a fix, but I'd love to hear some feedback first. I definitely think this should be fixed, but I'm not positive my solution is the best option...
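
As a starting point for discussion, the proposed conversion could be sketched roughly as follows. `normalize_skip_lines` is a hypothetical helper name, not an existing CSV method, and the use of `Regexp.escape` is my assumption so that characters such as ")" in the string are matched literally. The sketch uses `\s*` (zero or more whitespace characters) so that a line starting directly with the marker also matches.

```ruby
# Hypothetical helper, not part of CSV: turns a skip_lines value into
# something whose `match` treats the CSV line as the subject.
def normalize_skip_lines(value)
  case value
  when String
    # Escape the string so it is matched literally, and anchor it at the
    # start of the line, allowing optional leading whitespace.
    /\A\s*#{Regexp.escape(value)}/
  else
    # Assumed to already respond to `match` (for example a Regexp).
    value
  end
end

pattern = normalize_skip_lines("#")
lines = ["# )", "  # indented comment", "a,b,c"]

# Drop every line the pattern matches, as skip_lines is meant to do.
kept = lines.reject { |l| pattern.match(l) }
p kept # => ["a,b,c"]
```

Anything that is not a String is passed through unchanged, so existing callers that already supply a Regexp would not be affected.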

Here is the original pull request that implemented this option:
https://github.com/ruby/ruby/pull/161

Thank you



-- 
http://bugs.ruby-lang.org/
