From: "phluid61 (Matthew Kerwin)" <matthew@...>
Date: 2013-08-01T09:24:11+09:00
Subject: [ruby-core:56291] [ruby-trunk - Feature #8714] Non-interpolated regular expression literal


Issue #8714 has been updated by phluid61 (Matthew Kerwin).


Eregon (Benoit Daloze) wrote:
> > Off the top of my head, I can't think of how to construct a regexp literal to match a hash character at the end of the string (i.e. /#$/), without first constructing a string.
> 
> Well you can escape the "#": /\#$/ =~ "#" # => 0.

Of course!

> %r{#$} works too.

  irb(main):004:0> %r{#$}
  SyntaxError: (irb):4: syntax error, unexpected $undefined
  %r{#$}
       ^
  	from /usr/local/bin/irb:12:in `<main>'
  irb(main):005:0> %r{\#$}
  => /\#$/


> If you want to match at the end of the String, you should use /#\z/.

At the end of the line, then.  ;)
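
To make the line/string distinction concrete, here is a minimal sketch (the method names are made up for illustration) comparing the escaped /\#$/ form with /#\z/ on a multi-line string:

```ruby
# "$" anchors at the end of any line; "\z" anchors only at the very end
# of the string. The hash is escaped in the first pattern so that "#$"
# is not read as the start of an interpolation.
def hash_ends_a_line?(text)
  /\#$/.match?(text)
end

def hash_ends_the_string?(text)
  /#\z/.match?(text)
end

text = "a#\nb"
puts hash_ends_a_line?(text)      # true: "#" is the last character of line 1
puts hash_ends_the_string?(text)  # false: the string goes on after the newline
puts hash_ends_the_string?("a#")  # true
```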

> But indeed simply /#$/ gives "unterminated regexp meets end of file".
> After all $/ is a global variable (the input record separator), so it is only logical it interpolates it.

Even if it's not a (valid, defined) global variable, the parser still attempts to interpolate it.  For example: /#$]/  (there is no $] in Ruby).

> Also, /regexp/ literal needs escape only for #, \ and / if I am not mistaken,
> which is quite restricted compared to what must be escaped in "" or %Q.

That's only partly true.  A # only needs to be escaped when it is followed by $, @ or {.  Therein lies the source of a lot of confusion.  From what I can see, ruby-doc.org says "Arbitrary Ruby expressions can be embedded into patterns with the #{...} construct.", which is very easy to miss, and it's not always clear that "#$x" or /#$x/ are part of the #{...} construct.
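
A rough sketch of those cases, assuming a throwaway global $x that exists only for this example (the method names are likewise hypothetical):

```ruby
$x = "ab"  # hypothetical global, defined only for this illustration

# "#$x" is interpolated, exactly like "#{$x}" would be.
def short_form_source
  /#$x+/.source
end

def brace_form_source
  /#{$x}c/.source
end

# Escaping the hash keeps the text literal; the backslash stays in the source.
def escaped_source
  /\#$x/.source
end

# A hash that is not followed by $, @ or { needs no escaping at all.
def plain_hash_source
  /a#b/.source
end

puts short_form_source  # ab+
puts brace_form_source  # abc
puts escaped_source     # \#$x
puts plain_hash_source  # a#b
```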

I admit that this is a standard part of Ruby interpolation, but "#$x#@y" is not commonly encountered in the wild, and is much more likely to occur in a (symbol-rich) regexp than a (typically human readable) string.  Thus I propose an option to construct regexps that don't treat # as special.

Note: I'd still expect other backslash-escapes (like \u{...}) to work in uninterpolated regexps, because even uninterpolated regexps should be able to do normal perly things like %R/\u{263a}\n/
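
Until something like %R exists, the closest thing I know of is the string-first route mentioned at the top: a single-quoted string passed to Regexp.new is not interpolated, while regexp escapes such as \u{...} and \n are still handled by the regexp itself.  A minimal sketch (method names are hypothetical):

```ruby
# Single-quoted strings are never interpolated, so "#$" needs no escaping here.
def hash_or_dollar
  Regexp.new('[#$]')
end

# The backslash sequences reach the regexp untouched and are interpreted there.
def smiley_line
  Regexp.new('\u{263a}\n')
end

puts hash_or_dollar.source           # [#$]
puts hash_or_dollar.match?("#")      # true
puts smiley_line =~ "\u{263a}\n"     # 0
```

The catch is that single-quoted strings still treat \\ and \' specially, so this is only an approximation of what %R would give.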
----------------------------------------
Feature #8714: Non-interpolated regular expression literal
https://bugs.ruby-lang.org/issues/8714#change-40782

Author: phluid61 (Matthew Kerwin)
Status: Open
Priority: Normal
Assignee: 
Category: core
Target version: 


=begin

I propose a new %string for non-interpolated regexp literals: %R

It is common to see erroneous bug reports around the use of ((%#%)) in regexp literals, for example where (({/[#$]/})) raises a syntax error "unexpected $undefined", and this confuses people.  The only solution is to rearrange the regular expression (such as (({/[$#]/}))), which is not always desirable.

A non-interpolated regexp, such as (({%R/[#$]/})), would allow a much simpler resolution.

=== Known Issues

* the capitalisation is the opposite of %Q(interpolated) and %q(uninterpolated)
* %R was also proposed for literal Rationals in #8430, although I believe this has been superseded by the (({1.2r})) syntax

=end



-- 
http://bugs.ruby-lang.org/