From: "ivoanjo (Ivo Anjo) via ruby-core" Date: 2023-09-06T14:19:02+00:00 Subject: [ruby-core:114660] [Ruby master Bug#19865] Segfault when calling user signal handlers during VM shutdown Issue #19865 has been updated by ivoanjo (Ivo Anjo). Thanks for the quick fix! :) ---------------------------------------- Bug #19865: Segfault when calling user signal handlers during VM shutdown https://bugs.ruby-lang.org/issues/19865#change-104490 * Author: ivoanjo (Ivo Anjo) * Status: Open * Priority: Normal * ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux] * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- Howdy ����! I work for Datadog [on the `ddtrace` gem](https://github.com/DataDog/dd-trace-rb) . I've found this issue while investigating a [customer crash report](https://github.com/DataDog/dd-trace-rb/issues/2980). ### Background The original issue was found in a production app. A number of things need to be in play to cause it. The [`ruby-odbc`](https://rubygems.org/gems/ruby-odbc) gem provides a way of accessing databases through the ODBC API. It wraps a database connection as a Data object, with a free function that, prior to freeing the native resources, disconnects from the database if the connection was still active. Because disconnecting from the database is a blocking operation, the gem (reasonably, in my opinion), releases the global VM lock before disconnecting. The trigger for the crash is: 1. The app in question used puma, and puma installs a `Signal.trap('TERM')` 2. The database object was still connected when the app started to shut down 3. A VM shutdown starts... 4. Half-way through shutdown, the VM received a SIGTERM signal, and queued it for processing 5. The VM calls the free function on all objects 6. The ruby-odbc gem sees there's an active database connection, and tries to release the GVL to call the blocking disconnect 7. Before releasing the GVL, the VM checks for pending interruptions 8. The VM tries to run the Ruby-level signal handler method half-way through VM shutdown, when you can no longer run Ruby code 9. Segfault ### How to reproduce (Ruby version & script) I was able to reproduce on a minimal example on Ruby 3.2.2 (`ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]`) and recent master (`ruby 3.3.0dev (2023-08-17T07:30:01Z master d26b015e83) [x86_64-linux]`). I've put the test-case up on github as well , but here's the important bits: `signal-bug-testcase.rb`: ```ruby require 'signal_bug_testcase' Signal.trap("TERM") { puts "Hello, world" } FOO = SignalBugTestcase.new ``` `signal_bug_testcase.c`: ```ruby #include #include #include #include typedef struct { int dummy; } BugTestcase; void *test_nogvl(void *unused) { fprintf(stderr, "GVL released!\n"); return NULL; } static void bug_testcase_free(void* ptr) { fprintf(stderr, "Free getting called! Sending signal...\n"); kill(getpid(), SIGTERM); fprintf(stderr, "SIGTERM signal queued, trying to release GVL...\n"); rb_thread_call_without_gvl(test_nogvl, NULL, NULL, NULL); fprintf(stderr, "After releasing GVL!\n"); free(ptr); } const rb_data_type_t bug_testcase_data_type = { .wrap_struct_name = "SignalBugTestcase", .function = { NULL, bug_testcase_free, NULL }, .flags = RUBY_TYPED_FREE_IMMEDIATELY }; VALUE bug_testcase_alloc(VALUE klass) { BugTestcase* obj = calloc(1, sizeof(BugTestcase)); return TypedData_Make_Struct(klass, BugTestcase, &bug_testcase_data_type, obj); } void Init_signal_bug_testcase(void) { VALUE cBugTestcase = rb_define_class("SignalBugTestcase", rb_cObject); rb_define_alloc_func(cBugTestcase, bug_testcase_alloc); } ``` ### Expectation and result No segfault happens. Interestingly, on Ruby 2.7, the VM exits half-way through but doesn't always segfault, but running it a few times always triggers the issue. On 3.2+ it crashes every time for me. I suspect the right thing here is to no longer accept/try to run any Ruby-level signal handlers after VM shutdown starts. Here's what I see with this test-case: ``` $ bundle exec ruby lib/signal-bug-testcase.rb Free getting called! Sending signal... SIGTERM signal queued, trying to release GVL... lib/signal-bug-testcase.rb:3: [BUG] Segmentation fault at 0x0000000000000007 ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux] -- Control frame information ----------------------------------------------- c:0003 p:---- s:0011 e:000010 CFUNC :puts c:0002 p:0005 s:0006 e:000005 BLOCK lib/signal-bug-testcase.rb:3 [FINISH] c:0001 p:0000 s:0003 E:0001a0 DUMMY [FINISH] -- Ruby level backtrace information ---------------------------------------- lib/signal-bug-testcase.rb:3:in `block in
' lib/signal-bug-testcase.rb:3:in `puts' -- Machine register context ------------------------------------------------ RIP: 0x000070aa64cedbe7 RBP: 0x000070aa648e8fd0 RSP: 0x00007ffc5c057608 RAX: 0x0000000000004171 RBX: 0x00007ffc5c057630 RCX: 0x0000000000000001 RDX: 0x00007ffc5c057630 RDI: 0x0000000000000007 RSI: 0x0000000000004171 R8: 0x000000000000021b R9: 0x0000000000000000 R10: 0x000070aa63eff048 R11: 0x0000000000000000 R12: 0x000070aa648e8fd0 R13: 0x0000000000004171 R14: 0x0000000000000007 R15: 0x000070aa648e8ff0 EFL: 0x0000000000010202 -- C level backtrace information ------------------------------------------- ruby-3.2.2/lib/libruby.so.3.2(rb_print_backtrace+0xd) [0x70aa64d5bb5f] ruby-3.2.2/vm_dump.c:785 ruby-3.2.2/lib/libruby.so.3.2(rb_vm_bugreport) ruby-3.2.2/vm_dump.c:1080 ruby-3.2.2/lib/libruby.so.3.2(rb_bug_for_fatal_signal+0xf4) [0x70aa64b52164] ruby-3.2.2/error.c:813 ruby-3.2.2/lib/libruby.so.3.2(sigsegv+0x4d) [0x70aa64cab0fd] ruby-3.2.2/signal.c:964 /lib/x86_64-linux-gnu/libc.so.6(0x70aa64642520) [0x70aa64642520] ruby-3.2.2/lib/libruby.so.3.2(hash_table_index+0x0) [0x70aa64cedbe7] ruby-3.2.2/symbol.h:72 ruby-3.2.2/lib/libruby.so.3.2(rb_id_table_lookup) ruby-3.2.2/id_table.c:230 ruby-3.2.2/lib/libruby.so.3.2(cached_callable_method_entry+0x24) [0x70aa64d356bb] ruby-3.2.2/vm_method.c:1295 ruby-3.2.2/lib/libruby.so.3.2(callable_method_entry_or_negative) ruby-3.2.2/vm_method.c:1365 ruby-3.2.2/lib/libruby.so.3.2(callable_method_entry) ruby-3.2.2/vm_method.c:1402 ruby-3.2.2/lib/libruby.so.3.2(rb_callable_method_entry) ruby-3.2.2/vm_method.c:1409 ruby-3.2.2/lib/libruby.so.3.2(gccct_method_search_slowpath+0x38) [0x70aa64d36258] ruby-3.2.2/vm_eval.c:434 ruby-3.2.2/lib/libruby.so.3.2(rb_call0+0x267) [0x70aa64d4e877] ruby-3.2.2/vm_eval.c:483 ruby-3.2.2/lib/libruby.so.3.2(rb_call+0x32) [0x70aa64d4f406] ruby-3.2.2/vm_eval.c:877 ruby-3.2.2/lib/libruby.so.3.2(rb_funcallv_kw) ruby-3.2.2/vm_eval.c:1074 ruby-3.2.2/lib/libruby.so.3.2(vm_call_cfunc_with_frame+0x127) [0x70aa64d30277] ruby-3.2.2/vm_insnhelper.c:3268 ruby-3.2.2/lib/libruby.so.3.2(vm_sendish+0x97) [0x70aa64d407a4] ruby-3.2.2/vm_insnhelper.c:5080 ruby-3.2.2/lib/libruby.so.3.2(vm_exec_core) ruby-3.2.2/insns.def:820 ruby-3.2.2/lib/libruby.so.3.2(rb_vm_exec+0xd3) [0x70aa64d460d3] ruby-3.2.2/vm.c:2374 ruby-3.2.2/lib/libruby.so.3.2(rb_vm_invoke_proc+0x5f) [0x70aa64d4bfcf] ruby-3.2.2/vm.c:1603 ruby-3.2.2/lib/libruby.so.3.2(vm_call0_body+0x5df) [0x70aa64d4c5cf] ruby-3.2.2/vm_eval.c:274 ruby-3.2.2/lib/libruby.so.3.2(vm_call0_cc+0x77) [0x70aa64d4e7e7] ruby-3.2.2/vm_eval.c:87 ruby-3.2.2/lib/libruby.so.3.2(rb_call0) ruby-3.2.2/vm_eval.c:551 ruby-3.2.2/lib/libruby.so.3.2(rb_call+0x32) [0x70aa64d4f406] ruby-3.2.2/vm_eval.c:877 ruby-3.2.2/lib/libruby.so.3.2(rb_funcallv_kw) ruby-3.2.2/vm_eval.c:1074 ruby-3.2.2/lib/libruby.so.3.2(rb_eval_cmd_kw+0x142) [0x70aa64d4f562] ruby-3.2.2/vm_eval.c:1923 ruby-3.2.2/lib/libruby.so.3.2(signal_exec+0xf6) [0x70aa64caae16] ruby-3.2.2/signal.c:1064 ruby-3.2.2/lib/libruby.so.3.2(rb_threadptr_execute_interrupts+0x36b) [0x70aa64cf7078] ruby-3.2.2/thread.c:2334 ruby-3.2.2/lib/libruby.so.3.2(rb_threadptr_execute_interrupts) ruby-3.2.2/thread.c:2291 ruby-3.2.2/lib/libruby.so.3.2(rb_vm_check_ints+0xb) [0x70aa64cf7ac5] ruby-3.2.2/vm_core.h:1994 ruby-3.2.2/lib/libruby.so.3.2(rb_vm_check_ints) ruby-3.2.2/vm_core.h:1985 ruby-3.2.2/lib/libruby.so.3.2(unblock_function_set) ruby-3.2.2/thread.c:320 ruby-3.2.2/lib/libruby.so.3.2(blocking_region_begin) ruby-3.2.2/thread.c:1485 ruby-3.2.2/lib/libruby.so.3.2(rb_nogvl+0xbf) [0x70aa64cf90cf] ruby-3.2.2/thread.c:1548 signal-bug-testcase-2/lib/signal_bug_testcase.so(fprintf+0x0) [0x70aa6518f299] ../../../../ext/signal_bug_testcase/signal_bug_testcase.c:17 signal-bug-testcase-2/lib/signal_bug_testcase.so(bug_testcase_free) ../../../../ext/signal_bug_testcase/signal_bug_testcase.c:18 ruby-3.2.2/lib/libruby.so.3.2(run_final+0xf) [0x70aa64b73172] ruby-3.2.2/gc.c:4388 ruby-3.2.2/lib/libruby.so.3.2(finalize_list) ruby-3.2.2/gc.c:4407 ruby-3.2.2/lib/libruby.so.3.2(finalize_deferred_heap_pages) ruby-3.2.2/gc.c:4436 ruby-3.2.2/lib/libruby.so.3.2(rb_objspace_call_finalizer+0x350) [0x70aa64b80d70] ruby-3.2.2/gc.c:4573 ruby-3.2.2/lib/libruby.so.3.2(rb_ec_finalize+0x2a) [0x70aa64b5d6d1] ruby-3.2.2/eval.c:168 ruby-3.2.2/lib/libruby.so.3.2(rb_ec_cleanup) ruby-3.2.2/eval.c:262 ruby-3.2.2/lib/libruby.so.3.2(ruby_run_node+0x9d) [0x70aa64b5d91d] ruby-3.2.2/eval.c:330 ruby-3.2.2/bin/ruby(rb_main+0x21) [0x5d5d1295f187] ./main.c:38 ruby-3.2.2/bin/ruby(main) ./main.c:57 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_call_main+0x80) [0x70aa64629d90] ../sysdeps/nptl/libc_start_call_main.h:58 /lib/x86_64-linux-gnu/libc.so.6(call_init+0x0) [0x70aa64629e40] ../csu/libc-start.c:392 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main_impl) ../csu/libc-start.c:379 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main) (null):0 [0x5d5d1295f1d5] ``` ---Files-------------------------------- signal-bug-testcase-main.zip (7.73 KB) -- https://bugs.ruby-lang.org/ ______________________________________________ ruby-core mailing list -- ruby-core@ml.ruby-lang.org To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/