From: "kjtsanaktsidis (KJ Tsanaktsidis) via ruby-core" <ruby-core@...>
Date: 2023-08-18T14:28:21+00:00
Subject: [ruby-core:114405] [Ruby master Bug#19837] Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2

Issue #19837 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).


OK - I hope I've done this right, let me know otherwise :)

* Ruby 3.0: https://github.com/ruby/ruby/pull/8248
* Ruby 3.1: https://github.com/ruby/ruby/pull/8246
* Ruby 3.2: https://github.com/ruby/ruby/pull/8247

Also, I opened a PR against the master branch which adds _just_ the test I wrote for this issue. The existing implementation there already passes it, so no other changes are needed: https://github.com/ruby/ruby/pull/8245

----------------------------------------
Bug #19837: Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2
https://bugs.ruby-lang.org/issues/19837#change-104155

* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.4p236 (2023-07-26 revision a8670865c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
On Ruby 3.1 & 3.2, if one thread is blocked in a directed call to `Process.waitpid2` (that is, with a specific pid), a concurrent call to `Process.waitpid2 -1` cannot find & reap any other terminated child process, even one with a different pid that is not individually being waited on.

I've attached a Ruby program which should terminate but, because of this bug, doesn't, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) behave as you would expect. My reproduction creates two processes: a long-running process that does not exit, and a short-lived one that does. A background thread calls `Process.waitpid2` on the long-running process. A concurrent call to `Process.waitpid2 -1` then fails to notice that the short-lived process has exited.
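
For readers without the attachments, here is a minimal sketch of the scenario described above. It is not the attached `wait_bug.rb`: the use of `fork`, the variable names, the 0.5 second settle delay and the 5 second timeout are all assumptions made for illustration. The timeout is there only so that the sketch terminates even on an affected Ruby.

```ruby
require "timeout"

# Hypothetical reconstruction of the described scenario, NOT the attached wait_bug.rb.
long_pid  = fork { sleep 60 } # long-running child that does not exit on its own
short_pid = fork { exit 0 }   # short-lived child that exits straight away

# Background thread blocked in a directed wait on the long-running child.
waiter = Thread.new { Process.waitpid2(long_pid) }
sleep 0.5 # assumed delay to let the thread block inside waitpid2

reaped_pid = nil
begin
  # Undirected wait: should return the short-lived child once it has exited.
  Timeout.timeout(5) { reaped_pid, _status = Process.waitpid2(-1) }
rescue Timeout::Error
  # The undirected wait never returned within the assumed 5 second limit.
  reaped_pid = nil
end

if reaped_pid == short_pid
  puts "waitpid2(-1) reaped the short-lived child"
else
  puts "waitpid2(-1) did not return the short-lived child"
end

# Clean up: terminate the long-running child so the directed wait can finish.
Process.kill("TERM", long_pid)
waiter.join
```

On a Ruby without the bug, the undirected wait should return the short-lived child's pid almost immediately. On an affected 3.1 or 3.2 build, the expectation from the description above is that the timeout fires instead.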

My `wait_bug.rb` program _does_ work properly and terminate on the current master branch of Ruby; I assume this is because the MJIT-related process management code (the `waiting_pids` handling and so on) was cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the `ruby_3_2` and `ruby_3_1` branches?

Thanks!

---Files--------------------------------
wait_bug.c (1.55 KB)
wait_bug.rb (735 Bytes)


-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/