From: janosch-x
Date: 2022-05-18T08:21:52+00:00
Subject: [ruby-core:108609] [Ruby master Feature#18788] Support passing Regexp options as String to Regexp.new

Issue #18788 has been updated by janosch-x (Janosch Müller).

> Please don't allow symbols. It may look cute (in some cases), but the options are essentially a set of single letters, and that's not what Symbols are about.

My reasoning for supporting them was not cuteness but avoiding confusion. If we change only the processing of Strings, we will have this behavior:

```
Regexp.new('foo', 'i') # => /foo/i
Regexp.new('foo', 'm') # => /foo/m
Regexp.new('foo', :i) # => /foo/i # looks like it also works
Regexp.new('foo', :m) # => /foo/i # slightly surprising
```

I'm happy to support only Strings, though. In this case we might want to consider raising an ArgumentError when a Symbol is passed, or even for anything that is not nil/true/false/Int/String?

> please make it something like `Regexp.new('foo', :ignorecase, :multiline, :extend) # => /foo/imx`

I don't think this is a viable option. `Regexp.new` already accepts up to 3 arguments. The third one is undocumented as far as I can tell, but it is used in the wild. [If a String starting with "n" or "N" is passed as third argument, the Regexp encoding is set to ASCII](https://github.com/ruby/ruby/blob/b41de3a1e8c36a5cc336b6f7cd3cb71126cf1a60/re.c#L3622-L3651). Arguably that makes consistency another reason for my proposal.
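To make the stricter "Strings only" variant concrete, here is a minimal sketch written as a plain Ruby wrapper around the existing Integer interface. It is not how `Regexp.new` is or would be implemented. The names `FLAG_BITS`, `strict_regexp_options`, and `strict_regexp` are hypothetical, and raising on unknown letters is an assumption, since the proposal leaves open whether they should be ignored or rejected.

```ruby
# Hypothetical illustration only; none of these names exist in Ruby itself.
FLAG_BITS = {
  'i' => Regexp::IGNORECASE,
  'm' => Regexp::MULTILINE,
  'x' => Regexp::EXTENDED
}.freeze

# Map an options argument to the Integer bitmask that Regexp.new accepts.
# Assumption: anything other than nil/true/false/Integer/String is rejected.
def strict_regexp_options(opts)
  case opts
  when nil, false then 0
  when true       then Regexp::IGNORECASE
  when Integer    then opts
  when String
    # OR together the bit for each letter; repeated letters are harmless.
    opts.each_char.reduce(0) do |bits, char|
      bits | FLAG_BITS.fetch(char) { raise ArgumentError, "unknown regexp option: #{char.inspect}" }
    end
  else
    raise ArgumentError, "unsupported regexp options: #{opts.inspect}"
  end
end

def strict_regexp(source, opts = nil)
  Regexp.new(source, strict_regexp_options(opts))
end

strict_regexp('foo', 'i')  # => /foo/i
strict_regexp('foo', '')   # => /foo/
# strict_regexp('foo', :i)  would raise ArgumentError (Symbols are rejected)
# strict_regexp('foo', 'q') would raise ArgumentError (unknown letter)
```

With this shape, `:m` can no longer silently turn into `/foo/i`, which is the confusion described above.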
----------------------------------------
Feature #18788: Support passing Regexp options as String to Regexp.new
https://bugs.ruby-lang.org/issues/18788#change-97646

* Author: janosch-x (Janosch Müller)
* Status: Open
* Priority: Normal
----------------------------------------
## Current situation

`Regexp.new` takes an integer as second argument which needs to be ORed together from multiple constants:

```
Regexp.new('foo', Regexp::IGNORECASE | Regexp::MULTILINE | Regexp::EXTENDED) # => /foo/imx
```

Any other non-nil value is treated as `i` flag:

```
Regexp.new('foo', Object.new) # => /foo/i
```

## Suggestion

`Regexp.new` should support passing the regexp flags not only as an Integer, but also as a String or Symbol, like so:

```
Regexp.new('foo', 'i')   # => /foo/i
Regexp.new('foo', :i)    # => /foo/i
Regexp.new('foo', 'imx') # => /foo/imx
Regexp.new('foo', :imx)  # => /foo/imx

# edge cases
Regexp.new('foo', 'iii') # => /foo/i
Regexp.new('foo', :iii)  # => /foo/i
Regexp.new('foo', '')    # => /foo/
Regexp.new('foo', :'')   # => /foo/

# unsupported flags could be ignored -
# or raise an ArgumentError to reveal changed behavior?
Regexp.new('foo', 'jmq') # => /foo/m
Regexp.new('foo', :jmq)  # => /foo/m
Regexp.new('foo', '-m')  # => /foo/m
Regexp.new('foo', :'-m') # => /foo/m
```

## Reasons

1. The constants are a bit cumbersome to use, particularly when building the regexp from variable data:

```
def make_regexp(regexp_body, opt_string)
  opt_int = 0
  opt_int |= Regexp::IGNORECASE if opt_string.include?('i')
  opt_int |= Regexp::MULTILINE if opt_string.include?('m')
  opt_int |= Regexp::EXTENDED if opt_string.include?('x')
  Regexp.new(regexp_body, opt_int)
end
```

2. Passing a String or Symbol is already silently accepted, and people might get the wrong impression that it works:

```
Regexp.new('foo', 'i') # => /foo/i
Regexp.new('foo', :i)  # => /foo/i
```

...
but it doesn't really work:

```
Regexp.new('foo', 'x') # => /foo/i
Regexp.new('foo', :x)  # => /foo/i
```

## Backwards compatibility

This change would not be fully backwards compatible. Code that relies on the second argument being either a String/Symbol or nil to decide whether the Regexp should be case insensitive would break (unless the String or Symbol contains "i"). I can't come up with a scenario where one would write such code, though - except maybe code golfing?

-- 
https://bugs.ruby-lang.org/