From: "Eregon (Benoit Daloze) via ruby-core"
Date: 2023-07-20T09:24:51+00:00
Subject: [ruby-core:114244] [Ruby master Feature#19720] Warning for non-linear Regexps

Issue #19720 has been updated by Eregon (Benoit Daloze).

@matz Given that ReDoS is AFAIK the most common (by far) security vulnerability in Ruby, it warrants a reliable way to detect it, fix it and avoid it. This way seems the best to me. In fact I think only this approach or something very similar would be able to truly address ReDoS in Ruby. For instance https://www.ruby-lang.org/en/news/ shows there were 3 ReDoS vulnerabilities in stdlib in just 3 months (28 March - 29 June).

The warning would be opt-in, like performance warnings, so it would not be disruptive.

For gem (and app) maintainers who care about security, I think it would make sense to enable the warning and potentially customize `Warning.warn` for the :regexp category to `raise` (as shown in the description). That would discourage gems from using these often unsafe and non-linear regexp features. I think this is exactly what we want for the Ruby community to solve this very frequent and serious security problem.

People can still use back references (or other extended regexp features) in their local scripts/programs if they like, and it won't even warn them unless they choose to enable regexp warnings.

BTW the exact list of unsupported features is at https://bugs.ruby-lang.org/issues/19104#note-3
I believe these Regexp features are not necessary for 99+% of gems, and not using them is by far the easiest way to guarantee no ReDoS is possible. As we know, it's very hard to estimate whether a Regexp is susceptible to ReDoS if it's not detected as linear by the regexp engine (IOW it's very easy for programmers to accidentally write a non-linear regexp without the help of this warning).

@matz Could you reconsider? Maybe I should attend a dev meeting to understand what the concerns about this feature are?

----------------------------------------
Feature #19720: Warning for non-linear Regexps
https://bugs.ruby-lang.org/issues/19720#change-103932

* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
----------------------------------------
I believe the best way to solve ReDoS is to ensure all Regexps used in the process are linear. Using `Regexp.timeout = 5.0` or so does not really prevent ReDoS: given enough requests hitting that timeout, the servers will still be very unresponsive.

To this purpose, we should make it easy to identify non-linear Regexps and fix them.

I suggest we either use
1. a performance warning (enabled with `Warning[:performance] = true`, #19538) or
2. a new regexp warning category (enabled with `Warning[:regexp] = true`).

I think we should warn only once per non-linear Regexp, to avoid too many such warnings.

We could warn as soon as the Regexp is created, or on first match. Warning on first match might make more sense for Ruby implementations which compile the Regexp lazily (since compilation is costly during startup), and it also avoids warning for Regexps which are never used (which can be good or bad). OTOH, if the warning is enabled, we could always compile the Regexp eagerly (or at least check whether it's linear), and that would then provide a better way to guarantee that all Regexps created so far are linear.

Because warnings are easily customizable, it is also possible to e.g. `raise/abort` on such a warning, if one wants to ensure their application does not use a non-linear Regexp and so cannot be vulnerable to ReDoS:

```ruby
Warning.extend Module.new {
  def warn(message, category: nil, **)
    # Turn any warning from the proposed :regexp category into an exception.
    raise message if category == :regexp
    super
  end
}
```

A regexp warning category seems better for that, as it makes it easy to filter by category. With a performance warning, one would need to match on the message, which is less clean.

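To illustrate the "warn once per non-linear Regexp" idea, here is a minimal sketch that approximates it in user code. It assumes Ruby 3.3+, where `Regexp.linear_time?` is available; `LinearRegexpAudit` and `check` are hypothetical names used only for illustration, not an existing or proposed API. The answer depends on the regexp engine of the running interpreter, so the same Regexp may be classified differently elsewhere.

```ruby
require "set"

# Hypothetical helper (not part of Ruby): reports a Regexp that the running
# interpreter cannot match in linear time, at most once per Regexp.
module LinearRegexpAudit
  @reported = Set.new

  # Returns true if a warning was emitted for +regexp+, false otherwise.
  def self.check(regexp, location = caller_locations(1, 1).first)
    # Regexp.linear_time? only exists on Ruby 3.3+; do nothing on older versions.
    return false unless Regexp.respond_to?(:linear_time?)
    return false if Regexp.linear_time?(regexp)
    # Set#add? returns nil when the entry is already present: warn only once.
    return false unless @reported.add?([regexp.source, regexp.options])

    warn "#{location.path}:#{location.lineno}: warning: Regexp #{regexp.inspect} " \
         "requires backtracking and might cause ReDoS"
    true
  end
end

LinearRegexpAudit.check(/\A[a-z]+\d*\z/) # expected to be linear: no report
LinearRegexpAudit.check(/(\w+) \1/)      # uses a back reference: reported
LinearRegexpAudit.check(/(\w+) \1/)      # same Regexp again: stays silent
```

Unlike the warning proposed here, this only covers Regexps that are explicitly passed to the helper, which is why a check built into the engine would be more reliable.
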
As a note, TruffleRuby already has a similar warning, as a command-line option:

```
$ truffleruby --experimental-options --warn-truffle-regex-compile-fallback -e 'Gem'
truffleruby-dev/lib/mri/rubygems/version.rb:176: warning: Regexp /\A\s*([0-9]+(?>\.[0-9a-zA-Z]+)*(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?)?\s*\z/ at_start=false encoding=US-ASCII requires backtracking and will not match in linear time
truffleruby-dev/lib/mri/rubygems/requirement.rb:105: warning: Regexp /\A\s*(=|!=|>|<|>=|<=|~>)?\s*([0-9]+(?>\.[0-9a-zA-Z]+)*(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?)\s*\z/ at_start=false encoding=US-ASCII requires backtracking and will not match in linear time
```

So the warning message could be like
`FILE:LINE: warning: Regexp /REGEXP/ requires backtracking and might not match in linear time and might cause ReDoS`
or more concise:
`FILE:LINE: warning: Regexp /REGEXP/ requires backtracking and might cause ReDoS`