From: Tomoaki Nishiyama
Date: 2011-12-14T18:05:54+09:00
Subject: [ruby-core:41643] [ruby-trunk - Feature #5749] new method String#match_all needed

Issue #5749 has been updated by Tomoaki Nishiyama.

I proposed a similar method, each_match:
http://bugs.ruby-lang.org/issues/5606

One difference is that each_match resumes the search at m.begin(0)+1
rather than at m.end(0), so

    "AKASATANA".each_match(/A.A/)

will recognize AKA ASA ATA ANA.
(I think this cannot be done with scan. Can it?)
Such different behavior might be controlled with an optional argument.

I think we might merge the discussion into this issue rather than
keeping two separate issues.

Anyway, I'm glad to hear a similar demand for a function that returns the
MatchData objects, rather than resorting to a trick with scan().

----------------------------------------
Feature #5749: new method String#match_all needed
http://redmine.ruby-lang.org/issues/5749

Author: Joey Zhou
Status: Open
Priority: Normal
Assignee: 
Category: 
Target version: 

The String class should contain an instance method 'match_all', which is a
mixture of 'match' and 'scan'.

The method 'scan' is not a very powerful tool: its result (the value it
yields) is just a matched string or an array of captured strings.

    p 'a1bc2de3f'.scan(/(.)\d(.)/) # [["a", "b"], ["c", "d"], ["e", "f"]]

If the regex argument contains groups, I cannot even get the whole matched
string, and I get no information about the match offsets.

So a 'match_all' method is very necessary. It scans the string, finds every
match, and yields a *MatchData instance* to the given block.

Here's a simple implementation in Ruby:

    class String
      def match_all(re, i = 0)
        if block_given?
          # Block form: yield each MatchData, resuming after the previous match.
          while m = self.match(re, i)
            yield m
            i = m.end(0)
          end
          return self
        else
          # No block: collect every MatchData into an array.
          ary = []
          while m = self.match(re, i)
            ary << m
            i = m.end(0)
          end
          return ary
        end
      end
    end

However, the 'while m = self.match(re,i)' approach is not efficient, because
it scans the string again and again.
If the string is UTF-8 encoded and contains non-ASCII characters, I'm afraid
locating the start index each time is expensive.

So I think a built-in 'match_all' method, which behaves just like 'scan' but
yields MatchData, is needed.

Please consider it, thank you!

-- 
http://redmine.ruby-lang.org
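
For illustration only, the two resume rules discussed in this thread (m.end(0) versus m.begin(0)+1) could be put behind one optional argument roughly as follows. This is a minimal sketch, not code from either issue: the module name EachMatchSketch, the each_match signature, and the overlap: keyword are all hypothetical. It also steps past empty matches so the loop always terminates, which the simple implementation quoted above does not do.

```ruby
# Hypothetical sketch: EachMatchSketch, each_match and overlap: are
# illustrative names, not an existing Ruby API.
module EachMatchSketch
  # Finds every match of +re+ in +str+ as a MatchData object.
  # With a block, yields each MatchData and returns +str+;
  # without a block, returns an array of MatchData.
  #
  # overlap: false  resumes at m.end(0)      (same matches as scan)
  # overlap: true   resumes at m.begin(0)+1  (overlapping matches too)
  def self.each_match(str, re, overlap: false)
    matches = []
    pos = 0
    while pos <= str.length && (m = re.match(str, pos))
      block_given? ? yield(m) : matches << m
      # Always advance by at least one character, so that an empty
      # match cannot be found again at the same position.
      pos = if overlap || m.end(0) == m.begin(0)
              m.begin(0) + 1
            else
              m.end(0)
            end
    end
    block_given? ? str : matches
  end
end

p EachMatchSketch.each_match("AKASATANA", /A.A/, overlap: true).map { |m| m[0] }
p EachMatchSketch.each_match("AKASATANA", /A.A/).map { |m| m[0] }
p EachMatchSketch.each_match("a1bc2de3f", /(.)\d(.)/).map { |m| [m[0], m.begin(0)] }

# For comparison, two idioms that already work with plain scan:
# a lookahead with a capture group finds the overlapping matches, and
# $~ inside the scan block holds the MatchData of the current match.
p "AKASATANA".scan(/(?=(A.A))/).flatten
found = []
"a1bc2de3f".scan(/(.)\d(.)/) { found << $~ }
p found.map { |m| m[0] }
```

Like the implementation quoted above, this sketch still restarts the search from a character offset on every iteration, so it does not address the efficiency concern for multibyte strings; only a built-in method could do that.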