[ruby-core:121127] [Ruby master Bug#21150] Segfault in Ractor messes up libunwind (c backtrace info)
From: "luke-gru (Luke Gruber) via ruby-core" <ruby-core@...>
Date: 2025-02-19 20:16:18 UTC
List: ruby-core #121127
Issue #21150 has been reported by luke-gru (Luke Gruber).
----------------------------------------
Bug #21150: Segfault in Ractor messes up libunwind (c backtrace info)
https://bugs.ruby-lang.org/issues/21150
* Author: luke-gru (Luke Gruber)
* Status: Open
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
On macOS arm64 with LLVM 18, libunwind crashes with a SEGV when it is called from within a Ractor. As a result,
the bug report itself dies with a SEGV right before it would print the C-level backtrace information. It looks like this:
```
-- Ruby level backtrace information ----------------------------------------
../ruby/test.rb:49:in 'block in <main>'
<internal:ractor>:902:in 'fail_assert'
-- Threading information ---------------------------------------------------
Total ractor count: 2
Ruby thread count for this ractor: 1
-- C level backtrace information -------------------------------------------
<internal:ractor>:902: [BUG] Segmentation fault at 0xfffffffffffffff8
```
The fault address 0xfffffffffffffff8 is -8 when read as a signed 64-bit integer, so it looks like it tried to dereference the value -8.
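To make the address arithmetic concrete, here is a small C sketch that reads a 64-bit fault address as a signed two's-complement offset. The function names are hypothetical and are not part of Ruby or libunwind; this only illustrates why the reported address reads as -8.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 64-bit address as a signed two's-complement value,
 * without relying on an implementation-defined unsigned-to-signed cast. */
static int64_t addr_as_signed_offset(uint64_t addr)
{
    if (addr <= (uint64_t)INT64_MAX) {
        return (int64_t)addr;
    }
    /* Top bit set: the two's-complement value is -(~addr) - 1. */
    return -(int64_t)(~addr) - 1;
}

/* Hypothetical self-check using the address from the report above. */
static void check_reported_address(void)
{
    assert(addr_as_signed_offset(UINT64_C(0xfffffffffffffff8)) == -8);
}
```

An address this close to the top of the address space is usually what you get from a small negative offset applied to a NULL (or zero) base pointer, which would fit an unwinder reading just below a frame or stack pointer that it computed as 0. That part is an assumption, not something confirmed by the report.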
### To reproduce:
`test.rb`:
```ruby
r = Ractor.new do
  Ractor.fail_assert # to produce a bug report
end
r.take
```
`ractor.rb` (added inside `class Ractor` in the Ruby source tree):
```ruby
def self.fail_assert
  __builtin_cexpr! %q{
    VM_ASSERT(0), Qfalse
  }
end
```
### System info:
`clang --version`:
Homebrew clang version 18.1.8
Target: arm64-apple-darwin24.3.0
Thread model: posix
`otool -L miniruby`:
/opt/homebrew/opt/llvm@18/lib/libunwind.1.dylib (compatibility version 1.0.0, current version 1.0.0)
I haven't tried to reproduce it on another system, but I did try with clang 16 and got the same results.
### Possible Causes
This is just a guess, but I think the coroutine context switching is messing up libunwind's stack unwinding heuristic.
### Other issues that this causes
Right now, if Ruby receives a SEGV in a Ractor, it tries to print the bug report and then receives a second SEGV while
running the libunwind code. This hangs the program: the sigaction for the SEGV signal was installed without SA_NODEFER, so
the second SEGV is blocked (masked) by the still-running handler. The program can't make any forward progress, so it hangs.
The solution here is just to install the fatal signal handlers with SA_NODEFER. There is already code that checks whether the
bug report has been entered before and, if so, just aborts the process.
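The masking behaviour can be seen in a small standalone C sketch. This is not Ruby's signal-handling code: it uses SIGUSR1 and `raise()` as a safe stand-in for a second SEGV raised inside the handler, and all function and variable names are hypothetical. It records how deeply the handler nests with and without SA_NODEFER.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t depth = 0;     /* current handler nesting */
static volatile sig_atomic_t max_depth = 0; /* deepest nesting observed */
static volatile sig_atomic_t calls = 0;     /* total handler invocations */

static void nesting_handler(int sig)
{
    depth++;
    calls++;
    if (depth > max_depth) max_depth = depth;
    /* Stand-in for "the handler itself faults": re-raise once. */
    if (calls == 1) raise(sig);
    depth--;
}

/* Install the handler with the given sa_flags, raise SIGUSR1 once,
 * and return the deepest handler nesting seen (-1 on error). */
static int max_nesting_with_flags(int flags)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = nesting_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = flags;
    if (sigaction(SIGUSR1, &sa, &old) != 0) return -1;

    depth = 0; max_depth = 0; calls = 0;
    raise(SIGUSR1);

    sigaction(SIGUSR1, &old, NULL); /* restore previous disposition */
    return max_depth;
}

/* Hypothetical self-check of the two cases. */
static void check_nodefer_behaviour(void)
{
    /* Default: the signal is masked inside its own handler, so the
     * second one stays pending until the first handler returns. */
    assert(max_nesting_with_flags(0) == 1);
    /* SA_NODEFER: the second signal is delivered inside the handler. */
    assert(max_nesting_with_flags(SA_NODEFER) == 2);
}
```

With a real SEGV the masked case is worse than in this sketch: the pending signal comes from a faulting instruction, so the handler can never return to let it be delivered, which matches the hang described above. With SA_NODEFER the handler is re-entered instead, and the existing "bug report already in progress" check gets a chance to abort.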