[ruby-core:75048] [Ruby trunk Bug#12298] Indeterministic ruby behavior when another thread is killed

From: drbrain@...7.net
Date: 2016-04-20 18:17:13 UTC
List: ruby-core #75048
Issue #12298 has been updated by Eric Hodel.


In ruby_cleanup:

First we mark the main thread as killed: https://github.com/ruby/ruby/blob/trunk/eval.c#L186

Then we kill all threads: https://github.com/ruby/ruby/blob/trunk/eval.c#L191

Then we start at_exit hooks: (via ruby_finalize_1) https://github.com/ruby/ruby/blob/trunk/eval.c#L222

Sometimes ruby switches to the killed thread right away, so it sleeps in the ensure in your test case.  It can't switch back to the main thread because it is marked as killed.

Sometimes ruby doesn't switch to the killed thread so it gets two signals and never sleeps (because the second arrives in the ensure when it tries to sleep).

----------------------------------------
Bug #12298: Indeterministic ruby behavior when another thread is killed
https://bugs.ruby-lang.org/issues/12298#change-58171

* Author: Robert Pankowecki
* Status: Rejected
* Priority: Normal
* Assignee: 
* ruby -v: ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-linux]
* Backport: 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: UNKNOWN
----------------------------------------
```
#!ruby
require 'securerandom'
class MyThread < ::Thread; end

def delay
  15
end

def run
  loop { work }
rescue Exception => e
  puts "#{Time.now} Exception"
ensure
  puts "#{Time.now} stopping agent"
end


def work
  puts "#{Time.now} start work"
  10_000_000.times { SecureRandom.hex }
  puts "finished work"
rescue StandardError => e
  puts "#{Time.now} Error"
ensure
  puts "#{Time.now} start sleep"
  sleep(delay)
  puts "#{Time.now} finished sleep"
end

t = MyThread.new{ run }

at_exit do
  puts "#{Time.now} killing thread"
  Thread.kill(t)
  puts "#{Time.now} killed thread"
end

sleep(10)
exit
```

I tried running this script multiple times in ruby 2.1.10, 2.2.4, 2.3.0 and I get inconsistent behavior. Sometimes the main thread does not wait for the second thread.

```
2016-04-15 11:07:09 +0200 start work
2016-04-15 11:07:19 +0200 killing thread
2016-04-15 11:07:19 +0200 killed thread
2016-04-15 11:07:19 +0200 stopping agent
```

And sometimes it does.

```
2016-04-15 11:07:26 +0200 start work
2016-04-15 11:07:36 +0200 killing thread
2016-04-15 11:07:36 +0200 killed thread
2016-04-15 11:07:36 +0200 start sleep
2016-04-15 11:07:51 +0200 finished sleep
2016-04-15 11:07:51 +0200 stopping agent
```

I appears that under higher CPU usage the 2nd scenario is more likely. In normal conditions the 1st happens more often probably.

I described [the whole story in my blogpost](http://blog.arkency.com/2016/04/how-i-hunted-the-most-odd-ruby-bug/)

I am not sure which behavior is ruby default (I assume not waiting for other threads) but sometimes apparently ruby does wait for other threads to finish.

---Files--------------------------------
a.txt (3.09 KB)
b.txt (3.58 KB)
12998.rb (508 Bytes)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread

Prev Next