From: shyouhei@...
Date: 2016-07-02T08:09:04+00:00
Subject: [ruby-core:76231] [Ruby trunk Bug#12452][Rejected] Regexp alternation does not backtrack to check the other alternatives if a match is found on the first one

Issue #12452 has been updated by Shyouhei Urabe.

Status changed from Open to Rejected

This is how a regexp works. AFAIK perl, php, python, nodejs all behave the same way as ruby does. I'm afraid changing this would bring more harm than good.

```perl
% perl -e "'ab' =~ /a|ab/; warn $&"
a at -e line 1.
```

```php
% php -a
Interactive shell

php > preg_match("/a|ab/", "ab", $m); echo $m[0];
a
php >
```

```python
% python
Python 2.7.11 (default, Jan 22 2016, 08:29:18)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> m = re.search("(a|ab)", "ab")
>>> m.group(0)
'a'
>>>
```

```javascript
% node --interactive
> "ab".match(/a|ab/);
[ 'a', index: 0, input: 'ab' ]
>
```

----------------------------------------
Bug #12452: Regexp alternation does not backtrack to check the other alternatives if a match is found on the first one
https://bugs.ruby-lang.org/issues/12452#change-59463

* Author: Lucas Farias
* Status: Rejected
* Priority: Normal
* Assignee:
* ruby -v: 2.2.4
* Backport: 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: UNKNOWN
----------------------------------------
Hi, there is a problem with Regexps containing alternation: the match returns the first alternative that matches, even if another alternative would have resulted in a longer match. It's probably caused by not backtracking and checking the remaining alternatives. If you have /a|ab/.match("ab") it returns just "a".
Many discussions that I found on the internet suggested that whoever was making the complaint should just change the order of the alternatives so that the longer alternative is tested first, but for regular expressions like /(abc|ab)(de|cdef)/.match("abcdef") the alternative of the first alternation that would result in a longer match is the shorter one. And these examples don't even have repetition.

`irb(main):001:0> /a|ab/.match("ab")`
`=> #`
`irb(main):002:0> /(abc|ab)(de|cedf)/.match("abcdef")`
`=> #`
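
For readers of the archive, here is a minimal Ruby sketch of the behaviour discussed in this thread. It is an illustration, not something posted by either participant. The first part shows the leftmost-first result the reporter describes. The second part is one possible workaround, assuming the alternatives are plain literal strings: expand every combination into its own regexp and keep the leftmost, then longest, match. The helper name `leftmost_longest` and the `firsts`/`seconds` arrays are hypothetical.

```ruby
# Leftmost-first: the first alternative that lets the whole pattern
# succeed wins, even when another alternative would match more text.
m1 = /a|ab/.match("ab")
puts m1[0]   # a

m2 = /(abc|ab)(de|cdef)/.match("abcdef")
puts m2[0]   # abcde

# Hypothetical helper (not a Ruby built-in): match each candidate regexp
# separately and pick the match that starts earliest, breaking ties by
# preferring the longest matched text.
def leftmost_longest(candidates, str)
  candidates
    .map { |re| re.match(str) }
    .compact
    .min_by { |m| [m.begin(0), -m[0].length] }
end

# Assumption: the alternatives are literal strings, so each combination
# can be expanded into its own two-group regexp.
firsts  = %w[abc ab]
seconds = %w[de cdef]
candidates = firsts.product(seconds).map do |a, b|
  /(#{Regexp.escape(a)})(#{Regexp.escape(b)})/
end

best = leftmost_longest(candidates, "abcdef")
puts best[0]   # abcdef
```

The number of expanded patterns grows multiplicatively with each alternation, so this only suits small, literal alternations. It does not change how the regexp engine itself chooses between alternatives.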