From: yoheimuta@...
Date: 2021-03-02T09:51:03+00:00
Subject: [ruby-core:102706] [Ruby master Bug#17669] An exception still breaks monitor state and causes deadlock in 2.6.7

Issue #17669 has been reported by yoheimuta (Yohei Yoshimuta).

----------------------------------------
Bug #17669: An exception still breaks monitor state and causes deadlock in 2.6.7
https://bugs.ruby-lang.org/issues/17669

* Author: yoheimuta (Yohei Yoshimuta)
* Status: Open
* Priority: Normal
* ruby -v: ruby 2.6.7p153 (2021-01-31 revision 67892) [x86_64-darwin19]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
`lib/monitor.rb` provides Monitor. However, its state handling is still fragile against interrupts such as Thread.kill or those raised by timeout libraries, even after some uses of Thread.handle_interrupt were introduced in https://bugs.ruby-lang.org/issues/15992.

A timeout exception can be raised at any point. If it is raised while the thread is executing right before the `begin` block (just after the line marked `->`),

```rb
  def mon_synchronize
    # Prevent interrupt on handling interrupts; for example timeout errors
    # it may break locking state.
->  Thread.handle_interrupt(Exception => :never){ mon_enter }
    begin
      yield
    ensure
      Thread.handle_interrupt(EXCEPTION_NEVER){ mon_exit }
    end
  end
```

the `ensure` clause never runs, so `mon_exit` is skipped. This breaks the state of the monitor and causes a deadlock.

I can confirm that this happens in both the 2.6.7 head and the 2.6.6 release.

```
/bin/bash -c \
  "date; ruby -v; ruby reproducible.rb; tail -n 10 /tmp/tmp.txt; date;" | tee ruby:2.6.7-macosx.log
```

```
docker run -it --rm -v `pwd`:`pwd` -w `pwd` ruby:2.6.6-alpine3.13 /bin/ash -c \
  "date; ruby -v; ruby reproducible.rb; tail -n 10 /tmp/tmp.txt; date;" | tee ruby:2.6.6-alpine3.13.log
```

Technically, 2.5.8 should reproduce it as well because it shares the same related code.

Incidentally, this doesn't happen in either 2.7.2 or 3.0.0 because [the monitor was reimplemented in C](https://bugs.ruby-lang.org/issues/16255).
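For illustration, the window between acquiring the lock and entering `begin` can be closed by masking interrupts around the whole acquire/release sequence and re-enabling them only around the caller's block. The following is a minimal sketch under that assumption, not code taken from any Ruby patch: `ToyLock`, `enter`, and `leave` are hypothetical names standing in for the monitor and its `mon_enter`/`mon_exit`.

```ruby
# ToyLock is a hypothetical stand-in for a monitor; it is not lib/monitor.rb.
class ToyLock
  attr_reader :owner

  def initialize
    @mutex = Mutex.new
    @owner = nil
  end

  def enter
    @mutex.lock
    @owner = Thread.current
  end

  def leave
    @owner = nil
    @mutex.unlock
  end

  # Acquire, run the block, and release. Interrupts are deferred everywhere
  # except inside the caller's block, so none can land between `enter`
  # and the `begin` that arms `ensure`.
  def synchronize
    Thread.handle_interrupt(Exception => :never) do
      enter
      begin
        Thread.handle_interrupt(Exception => :immediate) { yield }
      ensure
        leave
      end
    end
  end
end

lock = ToyLock.new
entered = Queue.new

worker = Thread.new do
  Thread.current.report_on_exception = false # keep the demo output quiet
  lock.synchronize do
    entered << :ok # tell the main thread that the lock is held
    sleep          # wait until interrupted
  end
end

entered.pop # the worker is now inside the block
worker.raise(RuntimeError, "simulated timeout")

# Thread#join re-raises the exception that terminated the worker.
error = begin
  worker.join
  nil
rescue RuntimeError => e
  e
end

puts "owner after interrupt: #{lock.owner.inspect}"
puts "worker failed with: #{error.message}"
```

The demo only interrupts the worker inside the block, which is the deterministic case. The point of the outer `:never` mask is that an interrupt arriving earlier is held pending until the inner `:immediate` block starts, by which time `ensure` is already in effect.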
Our busy production puma servers have suffered from this weakness: timeouts frequently leave worker threads in a process completely hung.

The commit https://github.com/ruby/ruby/pull/4204/commits/e99c823f16918677b823255c44142910e02922c1 should fix this issue.

---Files--------------------------------
reproducible.rb (1.71 KB)
ruby_2.6.6-alpine3.13.log (12.8 KB)
ruby_2.6.7-macosx.log (3.73 KB)

-- 
https://bugs.ruby-lang.org/