From: sam.saffron@...
Date: 2021-05-11T07:33:58+00:00
Subject: [ruby-core:103780] [Ruby master Feature#17837] Add support for Regexp timeouts

Issue #17837 has been updated by sam.saffron (Sam Saffron).

I tested with:

```
diff --git a/thread.c b/thread.c
index 47e43ecb63..811b6e88a8 100644
--- a/thread.c
+++ b/thread.c
@@ -1573,25 +1573,29 @@ rb_thread_reg_match_time_limit_get()
 void
 rb_thread_reg_match_start(void)
 {
-    rb_thread_t *th = GET_THREAD();
     if (reg_match_time_limit) {
-        th->reg_match_end_time = rb_hrtime_add(reg_match_time_limit, rb_hrtime_now());
-    }
-    else {
-        th->reg_match_end_time = 0;
+        rb_thread_t *th = GET_THREAD();
+        if (reg_match_time_limit) {
+            th->reg_match_end_time = rb_hrtime_add(reg_match_time_limit, rb_hrtime_now());
+        }
+        else {
+            th->reg_match_end_time = 0;
+        }
     }
 }
 
 void
 rb_thread_reg_check_ints(void)
 {
-    rb_thread_t *th = GET_THREAD();
+    if (reg_match_time_limit) {
+        rb_thread_t *th = GET_THREAD();
 
-    if (th->reg_match_end_time && th->reg_match_end_time < rb_hrtime_now()) {
-        VALUE argv[2];
-        argv[0] = rb_eRuntimeError;
-        argv[1] = rb_str_new2("regexp match timeout");
-        rb_threadptr_raise(th, 2, argv);
+        if (th->reg_match_end_time && th->reg_match_end_time < rb_hrtime_now()) {
+            VALUE argv[2];
+            argv[0] = rb_eRuntimeError;
+            argv[1] = rb_str_new2("regexp match timeout");
+            rb_threadptr_raise(th, 2, argv);
+        }
     }
     rb_thread_check_ints();
```

'10000000.times { /(abc)+/ =~ "abcabcabc" }'

Before (min over 10 runs): 1.590
After: 1.610

~ 1.2% slower

I can't figure out, though, how to win back the performance on the big regex.

----------------------------------------
Feature #17837: Add support for Regexp timeouts
https://bugs.ruby-lang.org/issues/17837#change-91893

* Author: sam.saffron (Sam Saffron)
* Status: Open
* Priority: Normal
----------------------------------------
### Background

ReDoS is a very common security issue. At Discourse we have seen a few through the years.
https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS

In a nutshell, there are hundreds of ways this can happen in production apps. The key is for an attacker (or possibly an innocent person) to supply either a problematic Regexp or a bad string to test it with.

```
/A(B|C+)+D/ =~ "A" + "C" * 100 + "X"
```

Having a problem Regexp somewhere in a large app is a universal constant; it will happen as long as you are using Regexps.

Currently the only feasible way of supplying a consistent safeguard is by using `Thread#raise` and managing all execution. This kind of pattern requires usage of a third party implementation. There are possibly issues with JRuby and TruffleRuby when taking approaches like this.

### Prior art

.NET provides a `MatchTimeout` property per: https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.matchtimeout?view=net-5.0

Java has nothing built in as far as I can tell: https://stackoverflow.com/questions/910740/cancelling-a-long-running-regex-match

Node has nothing built in as far as I can tell: https://stackoverflow.com/questions/38859506/cancel-regex-match-if-timeout

Golang and Rust use RE2-style engines, which are not vulnerable to this kind of DoS because they limit features (RE2 is available in Ruby via the RE2 gem)

```
irb(main):003:0> r = RE2::Regexp.new('A(B|C+)+D')
=> #<RE2::Regexp /A(B|C+)+D/>
irb(main):004:0> r.match("A" + "C" * 100 + "X")
=> nil
```

### Proposal

Implement `Regexp.timeout`, which allows us to specify a global timeout for all Regexp operations in Ruby.

A per-Regexp setting would require massive application changes; almost all web apps would do just fine with a 1 second Regexp timeout.

If `timeout` is set to `nil` everything would work as it does today; when set to a number of seconds, a "monitor" thread would track running regexps and time them out according to the global value.

### Alternatives

I recommend against a "per Regexp" API as this decision is at the application level. You want to apply it to all regular expressions in all the gems you are consuming.
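
To make the per-call-site cost concrete, here is a minimal sketch of what guarding a single match looks like today using the stdlib `Timeout` module, which relies on `Thread#raise` under the hood. `match_with_timeout` is a hypothetical helper written only for illustration; it is not an existing Ruby API and not part of this proposal.

```ruby
require "timeout"

# Hypothetical helper, not part of Ruby or of this proposal.
# Returns true/false for a completed match, or nil if it ran past `seconds`.
def match_with_timeout(regexp, string, seconds)
  # Timeout.timeout interrupts the block via Thread#raise if it overruns.
  Timeout.timeout(seconds) { regexp.match?(string) }
rescue Timeout::Error
  nil
end

# A well-behaved pattern finishes long before the limit.
fast = match_with_timeout(/(abc)+/, "abcabcabc", 1)

# The problematic pattern from the Background section. Depending on the Ruby
# version this either completes without a match or is cut off at the limit.
slow = match_with_timeout(/A(B|C+)+D/, "A" + "C" * 100 + "X", 0.5)
```

Every `=~`, `match`, `scan`, `gsub` and so on, including those inside gems, would have to be routed through a wrapper like this, which is why a global setting looks more practical.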
I recommend against a move to RE2 at the moment, as way too much would break.

### See also:

https://people.cs.vt.edu/davisjam/downloads/publications/Davis-Dissertation-2020.pdf
https://levelup.gitconnected.com/the-regular-expression-denial-of-service-redos-cheat-sheet-a78d0ed7d865

--
https://bugs.ruby-lang.org/