From: "sam.saffron (Sam Saffron)"
Date: 2013-04-03T09:13:41+09:00
Subject: [ruby-core:53912] [ruby-trunk - Feature #8110] Regex methods not changing global variables

Issue #8110 has been updated by sam.saffron (Sam Saffron).

@naruse

There is a perf implication here that really needs addressing, and fixing it would help elsewhere: in re.c, a whole bunch of work can be avoided when NO_BACKREF is passed in for the match. In particular:

        match = match_alloc(rb_cMatch);
        onig_region_copy(RMATCH_REGS(match), regs);
        onig_region_free(regs, 0);
    }
    else {
        if (rb_safe_level() >= 3)
            OBJ_TAINT(match);
        else
            FL_UNSET(match, FL_TAINT);
    }

    RMATCH(match)->str = rb_str_new4(str);
    RMATCH(match)->regexp = re;
    RMATCH(match)->rmatch->char_offset_updated = 0;
    rb_backref_set(match);

    OBJ_INFECT(match, re);
    OBJ_INFECT(match, str);

This in turn should improve the performance of regex matching with the /B option quite a lot.

I have been looking at this recently due to some performance issues I noticed in Active Support's String#blank?

The C implementation of:

    def blank?
      self !~ /[^[:space:]]/
    end

is the somewhat crazy: https://github.com/SamSaffron/fast_blank/blob/master/ext/fast_blank/fast_blank.c#L16-L55

This implementation is 5 to 8x faster.

I vote for:

* new option for Regexp like Regexp.new("foo", Regexp::NO_BACKREF) AND /foo/B

You can then feature detect if it's available by looking for Regexp::NO_BACKREF

I do wonder how much faster this will be for my micro benchmark vs the native C implementation. When you are done, can you ping me so I can bench it? (at sam.saffron@gmail.com)

----------------------------------------
Feature #8110: Regex methods not changing global variables
https://bugs.ruby-lang.org/issues/8110#change-38128

Author: prijutme4ty (Ilya Vorontsov)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: next minor

It is useful to have methods allowing pattern matching without setting global variables.
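
A minimal sketch of the side effect in question, for illustration only. The helper name and the sample strings are hypothetical and not taken from any real code; the point is that an unrelated match inserted between two steps silently replaces `$~`, and therefore `$1`:

```ruby
# Hypothetical helper, for illustration only: returns what $1 holds before
# and after an unrelated "debugging" match is inserted between two steps.
def capture_before_and_after_debug_match(line)
  line =~ /(\w+)@(\w+)/            # sets $~, so $1 is the part before "@"
  before = $1

  probe = /(\d+)/ === "port 8080"  # looks harmless, but Regexp#=== also sets $~
  after = $1                       # now refers to the probe's match instead

  [before, after, probe]
end

p capture_before_and_after_debug_match("sam@example")
# => ["sam", "8080", true]
```

Any code that reads `$1` after the probe line sees the probe's capture rather than the original one, which is why the failure can show up far from the inserted line.
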
It can be very hard to understand where the problem is when you, for example, insert a line like `puts pat === my_str` and your program fails in a place far away from the inserted line. This can happen because the inserted match replaces the global variables set by a previous pattern match. I ran into this when I placed a pattern match inside a case statement and it shadowed the global variables that had initially been filled by the match in the when clause.

For now, one can extract the pattern matching into another method, thus defining a method scope for those variables. But sometimes that looks like overkill. Maybe a simple method like #match_globalsafe could prevent this kind of error. At least when a programmer sees such a method in a list of methods, he's warned that the usual match can cause such problems.

--
http://bugs.ruby-lang.org/