From: eregontp@... Date: 2018-08-19T21:42:54+00:00 Subject: [ruby-core:88557] [Ruby trunk Bug#14999] ConditionVariable doesn't reacquire the Mutex if Thread#kill-ed Issue #14999 has been updated by Eregon (Benoit Daloze). normalperson (Eric Wong) wrote: > r64467 seems to make CI happy, at least Great! > I figured out I needed to reproduce the failure > reliably by using a single CPU (schedtool -a 0x1 -e ...) I could also verify it fails before your commit, and passes after with: taskset -c 1 mspec-run -R1000 library/conditionvariable/wait_spec.rb > "git bisect" points to r63498 ("thread_pthread.c: enable thread > cache by default"), which is HIGHLY unexpected. So it seems it doesn't happen in Ruby <= 2.5 due to thread caching being disabled then? > I'm not seeing how thread caching can be the cause of the bug, > but rather it makes an existing bug more apparent by being > faster and hitting the race sooner. No idea either, but I guess a bug could lurk in there. I did experience a few weird bugs when doing thread caching in TruffleRuby (e.g., inline caches based on pthread_self() are incorrect as multiple Ruby threads could use the same pthread thread) > What thread-caching does do is make the MSPECOPT "-R" option > noticeably faster Right, because creating threads is so much faster. Maybe the fact pthread startup is done each time has some effect on races (e.g., the pthread thread has a perfect view of the caller thread when starting, but not true with caching, but maybe the GVL still ensure that) > I think the only way to work reliably is to disable interrupt > checking when attempting to lock a mutex in ensure: At least if the interrupt causes to escape the function (e.g., Thread#raise/Thread#kill), then that escape must be delayed after re-acquiring the Mutex. In normal Mutex#lock and Mutex#synchronize, it's fine to raise/kill before taking the Mutex, because it's before entering the critical section, but unfortunately here we are inside the critical section. > it could still affect user-experience if Ctrl-C takes a while When would that happen? If some other Thread holds the Mutex too long? It's probably never a good idea to hold a Mutex for seconds or a long time, especially when using a ConditionVariable. It seems fair enough to delay the interrupt in such a case, as the main Thread is waiting for the Mutex. Maybe we could warn if re-acquiring takes too long and it becomes a problem in practice (I think it won't). ---------------------------------------- Bug #14999: ConditionVariable doesn't reacquire the Mutex if Thread#kill-ed https://bugs.ruby-lang.org/issues/14999#change-73615 * Author: Eregon (Benoit Daloze) * Status: Closed * Priority: Normal * Assignee: normalperson (Eric Wong) * Target version: * ruby -v: ruby 2.6.0dev (2018-08-16 trunk 64394) [x86_64-linux] * Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN ---------------------------------------- These specs reproduce the issue: https://raw.githubusercontent.com/oracle/truffleruby/master/spec/ruby/library/conditionvariable/wait_spec.rb I can commit them, but of course they will fail. Feel free to just copy it to MRI's spec/ruby/library/conditionvariable/wait_spec.rb (I would do it when synchronizing specs anyway) I'd guess it's caused by the recent timer thread changes or so. This spec works fine on 2.5.1 and older. Just copy-pasting the spec out allows for a standalone reproducer: ~~~ ruby m = Mutex.new cv = ConditionVariable.new in_synchronize = false owned = nil th = Thread.new do m.synchronize do in_synchronize = true begin cv.wait(m) ensure owned = m.owned? $stderr.puts "\nThe Thread doesn't own the Mutex!" unless owned end end end # wait for m to acquire the mutex Thread.pass until in_synchronize # wait until th is sleeping (ie waiting) Thread.pass while th.status and th.status != "sleep" m.synchronize { cv.signal # Wait that the thread is blocked on acquiring the Mutex sleep 0.001 # Kill the thread, yet the thread should first acquire the Mutex before going on th.kill } th.join ~~~ Result on trunk: ~~~ The Thread doesn't own the Mutex! # terminated with exception (report_on_exception is true): Traceback (most recent call last): 1: from cv_bug.rb:7:in `block in
' cv_bug.rb:7:in `synchronize': Attempt to unlock a mutex which is not locked (ThreadError) Traceback (most recent call last): 1: from cv_bug.rb:7:in `block in
' cv_bug.rb:7:in `synchronize': Attempt to unlock a mutex which is not locked (ThreadError) ~~~ -- https://bugs.ruby-lang.org/ Unsubscribe: