From: dazuma@...
Date: 2018-12-14T21:32:47+00:00
Subject: [ruby-core:90534] [Ruby trunk Bug#15362] [PATCH] Avoid GCing dead stack after switching away from a fiber

Issue #15362 has been updated by dazuma (Daniel Azuma).

Did this get backported to ruby_2_5? I don't see a corresponding commit in the github mirror https://github.com/ruby/ruby/commits/ruby_2_5

----------------------------------------
Bug #15362: [PATCH] Avoid GCing dead stack after switching away from a fiber
https://bugs.ruby-lang.org/issues/15362#change-75687

* Author: alanwu (Alan Wu)
* Status: Closed
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* Target version: 2.6
* ruby -v:
* Backport: 2.5: REQUIRED
----------------------------------------
Hello! I have a patch that fixes Bug #14561. It's not a platform-specific issue, but it affects the default build configuration for MacOS and is causing segfaults on 2.5.x.

I've put the test for this in a separate patch because I'm not sure if we want to have a 5 second test that only matters for non-default build configs and doesn't catch things reliably on Linux.

I tested this on both trunk and ruby_2_5, on MacOS and on Linux, on various build configs.

Please let me know if anything in my understanding is wrong. I've pasted my commit message below.

----

Fibers save execution contexts, and execution contexts include a native stack pointer. It may happen that a Fiber outlives the native thread it executed on. Consider the following code adapted from Bug #14561:

```ruby
enum = Enumerator.new { |y| y << 1 }
thread = Thread.new { enum.peek } # fiber constructed inside the
                                  # block and saved inside `enum`
thread.join
sleep 5     # thread finishes and thread cache wait time runs out.
            # Native thread exits, possibly freeing its stack.
GC.start    # segfault because GC tries to mark the dangling stack pointer
            # inside `enum`'s fiber
```

The problem is masked by FIBER_USE_COROUTINE and FIBER_USE_NATIVE, as those implementations already do what this commit does.
Generally on Linux systems, FIBER_USE_NATIVE is 1 even when one uses `./configure --disable-fiber-coroutine`, since most Linux systems have getcontext() and setcontext(), whose presence turns on FIBER_USE_NATIVE. (compile with `make DEFS="-DFIBER_USE_NATIVE=0"` to explicitly disable it)

Furthermore, when both FIBER_USE_COROUTINE and FIBER_USE_NATIVE are off, and the GC reads from the stack of a dead native thread, MRI does not segfault on Linux. This is probably due to libpthread not marking the page where the dead stack lives as unreadable. Nevertheless, this use-after-free is visible through Valgrind.

On ruby_2_5, this is an acute problem, since it doesn't have FIBER_USE_COROUTINE. Thread cache is also unavailable for 2.5.x, triggering this issue more often. (thread cache gives this bug a grace period since it makes native threads wait a little before exiting)

This issue is very visible on MacOS on 2.5.x since libpthread marks the dead stack as unreadable, consistently turning this use-after-free into a segfault.

Fixes Bug #14561

* cont.c: Set saved_ec.machine.stack_end to NULL when switching away from a fiber to keep the GC from marking it. `saved_ec` gets rehydrated with a stack pointer if/when the fiber runs again.

---Files--------------------------------
0001-Avoid-GCing-dead-stack-after-switching-away-from-a-f.patch (2.63 KB)
0001-Add-a-test-for-Bug-14561.patch (1.21 KB)

-- 
https://bugs.ruby-lang.org/
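
For readers who want the idea behind the cont.c change without reading C, here is a toy Ruby model of it. This is only a sketch under stated assumptions: `ToyStack`, `ToyFiber`, and `toy_mark` are hypothetical names made up for illustration and do not correspond to MRI's real data structures or to the attached patch. The model only shows why clearing the saved stack pointer on switch-away leaves the marker nothing dangling to follow.

```ruby
# Toy model only: these classes are hypothetical and do not mirror
# MRI's actual structures in cont.c.

# Stand-in for a native thread's machine stack.
class ToyStack
  attr_reader :freed

  def initialize
    @freed = false
  end

  # Models the native thread exiting and its stack being released.
  def free!
    @freed = true
  end
end

# Stand-in for a fiber and its saved execution context.
class ToyFiber
  attr_reader :saved_stack_end

  def initialize
    @saved_stack_end = nil
  end

  # Running the fiber "rehydrates" the saved context with a live stack.
  def switch_to(stack)
    @saved_stack_end = stack
  end

  # Switching away clears the pointer, as the commit message describes.
  def switch_away
    @saved_stack_end = nil
  end
end

# Stand-in for the GC mark step. Touching a freed stack models the
# use-after-free; a nil pointer is simply skipped.
def toy_mark(fiber)
  stack = fiber.saved_stack_end
  return :skipped if stack.nil?
  raise "use-after-free: marked a freed stack" if stack.freed

  :marked
end

stack = ToyStack.new
fiber = ToyFiber.new

fiber.switch_to(stack)   # fiber runs on the thread's stack
fiber.switch_away        # forget the stack pointer on switch-away
stack.free!              # native thread exits, stack is gone
RESULT = toy_mark(fiber) # => :skipped
```

Without the `switch_away` call, `toy_mark` would raise in this model, which is the toy equivalent of the segfault seen on MacOS.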