From: charlie@...
Date: 2017-08-09T23:49:16+00:00
Subject: [ruby-core:82321] [Ruby trunk Bug#13794] Infinite loop of sched_yield

Issue #13794 has been updated by catphish (Charlie Smurthwaite).

> Can you also check the value of timer_thread_pipe.owner_process?

I don't have any broken processes available right now, but I will check as soon as I can.

> How about checking owner_process before incrementing?

I'm afraid that fix doesn't match the failure I have in mind. To clarify, I am suggesting that timer_thread_pipe.writing is being incremented in the parent process before the fork occurs. An owner_process check would not prevent that, because the PID still matches at that point.

----------------------------------------
Bug #13794: Infinite loop of sched_yield
https://bugs.ruby-lang.org/issues/13794#change-66120

* Author: catphish (Charlie Smurthwaite)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.3.4p301 (2017-03-30 revision 58214) [x86_64-linux]
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
I have been encountering an issue with processes hanging in an infinite loop of calling sched_yield(). The looping code can be found at https://github.com/ruby/ruby/blob/v2_3_4/thread_pthread.c#L1663

    while (ATOMIC_CAS(timer_thread_pipe.writing, (rb_atomic_t)0, 0)) {
        native_thread_yield();
    }

It is my belief that, by some mechanism I have not been able to identify, timer_thread_pipe.writing is incremented but never decremented, causing this loop to run forever.

I am not able to create a reproducible test case; however, this issue occurs regularly in my production application. I have attached backtraces and thread lists from 2 processes exhibiting this behaviour. gdb confirms that timer_thread_pipe.writing = 1 in these processes.

I believe one possible cause is that rb_thread_wakeup_timer_thread() or rb_thread_wakeup_timer_thread_low() is called and, before it returns, another thread calls fork(). The child then inherits the incremented value of timer_thread_pipe.writing, but not the thread that would normally decrement it. If this is correct, one solution would be to reset timer_thread_pipe.writing to 0 in native_reset_timer_thread() immediately after a fork.

Other examples of similar bugs being reported:

https://github.com/resque/resque/issues/578
https://github.com/zk-ruby/zk/issues/50

---Files--------------------------------
backtrace_1.txt (14 KB)
backtrace_2.txt (10.9 KB)

-- 
https://bugs.ruby-lang.org/