[ruby-core:113841] [Ruby master Feature#19720] Warning for non-linear Regexps
From:
duerst via ruby-core <ruby-core@...>
Date:
2023-06-09 06:32:51 UTC
List:
ruby-core #113841
Issue #19720 has been updated by duerst (Martin Dürst).
Introducing such a warning *might* be a good idea. But there are several issues:
1) The warning should only be used when asked for with an option (i.e. default off).
2) To a very large extent, whether a regular expression is linear or not depends on the implementation. (The recent improvements to the CRuby implementation show that very clearly). Does that mean that different implementations would warn for different regular expressions?
3) In some cases, it may not be possible to conclusively say whether a regular expression will run in linear time or not. The proposed warning text makes this clear with the word "might".
4) Non-linear can be quadratic, cubic, or exponential, and so on. A quadratic case on data with limited length (e.g. out of a database with fixed field lengths) might be absolutely harmless. Even an exponential case on very short data can be harmless.
5) In many cases, the only person who could mount a DoS attack would be the programmer him/herself.
6) Overall, careful design and implementation are needed to make sure that this doesn't become a "crying wolf" warning that quickly gets deactivated and then no longer helps anybody.
----------------------------------------
Feature #19720: Warning for non-linear Regexps
https://bugs.ruby-lang.org/issues/19720#change-103489
* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
----------------------------------------
I believe the best way to solve ReDoS is to ensure all Regexps used in the process are linear.
Using `Regexp.timeout = 5.0` or so does not really prevent ReDoS: given enough requests that hit that timeout, the servers will still be very unresponsive.
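As a rough illustration of that point, here is a minimal sketch of how the timeout is typically used. It assumes a Ruby where `Regexp.timeout=` and `Regexp::TimeoutError` exist (CRuby 3.2 or later); `match_or_timeout` is a hypothetical helper name, not an existing API.

```ruby
# Sketch only: the timeout bounds each individual match, not the total load.
# The guard keeps the snippet loadable where Regexp.timeout= is missing.
Regexp.timeout = 5.0 if Regexp.respond_to?(:timeout=)

# Hypothetical helper: returns the MatchData (or nil when nothing matches),
# or :timeout when the match was aborted by the timeout.
def match_or_timeout(regexp, input)
  regexp.match(input)
rescue Regexp::TimeoutError
  :timeout
end
```

Every request that ends in `:timeout` has still spent up to the full timeout on CPU before being aborted, so enough such requests in parallel can keep all workers busy.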
To this end, we should make it easy to identify non-linear Regexps and fix them.
I suggest we either use
1. a performance warning (enabled with `Warning[:performance] = true`, #19538) or
2. a new regexp warning category (enabled with `Warning[:regexp] = true`).
I think we should warn only once per non-linear Regexp, to avoid too many such warnings.
We could warn as soon as the Regexp is created, or on first match.
Warning on first match might make more sense for Ruby implementations which compile the Regexp lazily (since compiling is costly during startup), and it also avoids warning for Regexps which are never used (which can be good or bad).
OTOH, if the warning is enabled, we could always compile the Regexp eagerly (or at least check whether it's linear), and that would then provide a better way to guarantee that all Regexps created so far are linear.
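For comparison, something close to "check all Regexps created so far" can already be approximated by hand. The sketch below assumes CRuby, where `ObjectSpace.each_object` is available, and a Ruby that provides `Regexp.linear_time?` (3.2 or later); `non_linear_regexps` is a hypothetical helper name. The result only describes the running interpreter, so another implementation or version may report a different set.

```ruby
# Hypothetical helper: list every live Regexp object that the running
# interpreter does not report as matching in linear time.
def non_linear_regexps
  # Assumption: Regexp.linear_time? exists; return nothing if it does not.
  return [] unless Regexp.respond_to?(:linear_time?)

  ObjectSpace.each_object(Regexp).reject { |re| Regexp.linear_time?(re) }
end

# Print one line per suspicious Regexp to stderr.
non_linear_regexps.each do |re|
  warn "Regexp #{re.inspect} might not match in linear time"
end
```

Unlike the proposed warning, this gives no FILE:LINE for where each Regexp was created, which is a large part of what the proposal would add.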
Because warnings are easily customizable, it is also possible to e.g. `raise/abort` on such a warning, if one wants to ensure their application does not use a non-linear Regexp and so cannot be vulnerable to ReDoS:
```ruby
Warning.extend Module.new {
  def warn(message, category: nil, **)
    raise message if category == :regexp
    super
  end
}
```
A regexp warning category seems better for that, as it makes it easy to filter by category. With a performance warning, one would need to match the message, which is less clean.
As a note, TruffleRuby already has a similar warning, as a command-line option:
```
$ truffleruby --experimental-options --warn-truffle-regex-compile-fallback -e 'Gem'
truffleruby-dev/lib/mri/rubygems/version.rb:176: warning: Regexp /\A\s*([0-9]+(?>\.[0-9a-zA-Z]+)*(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?)?\s*\z/ at_start=false encoding=US-ASCII requires backtracking and will not match in linear time
truffleruby-dev/lib/mri/rubygems/requirement.rb:105: warning: Regexp /\A\s*(=|!=|>|<|>=|<=|~>)?\s*([0-9]+(?>\.[0-9a-zA-Z]+)*(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?)\s*\z/ at_start=false encoding=US-ASCII requires backtracking and will not match in linear time
```
So the warning message could be like
`FILE:LINE: warning: Regexp /REGEXP/ requires backtracking and might not match in linear time and might cause ReDoS`
or more concise:
`FILE:LINE: warning: Regexp /REGEXP/ requires backtracking and might cause ReDoS`
-- 
https://bugs.ruby-lang.org/