From: "ioquatix (Samuel Williams)"
Date: 2022-06-06T05:18:57+00:00
Subject: [ruby-core:108782] [Ruby master Bug#18818] SEGV (Fiber scheduler?)

Issue #18818 has been updated by ioquatix (Samuel Williams).

@ko1 I saw this problem because fiber is not retained while waiting, because we have waiting threads but not waiting fibers at VM level IIRC.

Probably we need to make mutex/queue mark the wait list correctly? Is there performance issue?

----------------------------------------
Bug #18818: SEGV (Fiber scheduler?)
https://bugs.ruby-lang.org/issues/18818#change-97847

* Author: nevans (Nicholas Evans)
* Status: Open
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
* ruby -v: 3.1.2, 3.0.4, master
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
The attached script (and/or others like it) can cause SEGV in 3.0, 3.1, and master. It has always behaved as expected when I use `optflags=-O0`.

When I use it with `make run` on `master`:

```
./miniruby -I../lib -I. -I.ext/common -r./x86_64-linux-fake ../test.rb
========================================================================
fiber_queue
completed in 0.00031349004711955786
========================================================================
fiber_sized_queue
../test.rb:62: [BUG] Segmentation fault at 0x0000000000000000
ruby 3.2.0dev (2022-06-05T06:18:26Z master 5ce0be022f) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0005 p:---- s:0023 e:000022 CFUNC  :%
c:0004 p:0031 s:0018 e:000015 METHOD ../test.rb:62 [FINISH]
c:0003 p:---- s:0010 e:000009 CFUNC  :pop
c:0002 p:0009 s:0006 e:000005 BLOCK  ../test.rb:154 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

-- Ruby level backtrace information ----------------------------------------
../test.rb:154:in `block (2 levels) in '
../test.rb:154:in `pop'
../test.rb:62:in `unblock'
../test.rb:62:in `%'

-- Machine register context ------------------------------------------------
 RIP: 0x000055eae9ffa417 RBP: 0x00007f80aba855d8 RSP: 0x00007f80a9789598
 RAX: 0x000000000000009b RBX: 0x00007f80a9789628 RCX: 0x00007f80ab9c37a0
 RDX: 0x00007f80a97895c0 RDI: 0x0000000000000000 RSI: 0x000000000000009b
  R8: 0x0000000000000000  R9: 0x00007f80a97895c0 R10: 0x0000000055550083
 R11: 0x00007f80ac32ace0 R12: 0x00007f80aba855d8 R13: 0x00007f80ab9c3780
 R14: 0x00007f80a97895c0 R15: 0x000000000000009b EFL: 0x0000000000010202

-- C level backtrace information -------------------------------------------
./miniruby(rb_vm_bugreport+0x5cf) [0x55eaea06b0ef]
./miniruby(rb_bug_for_fatal_signal+0xec) [0x55eae9e4fc2c]
./miniruby(sigsegv+0x4d) [0x55eae9fba30d]
[0x7f80ac153520]
./miniruby(rb_id_table_lookup+0x7) [0x55eae9ffa417]
./miniruby(callable_method_entry+0x103) [0x55eaea046bd3]
./miniruby(vm_respond_to+0x3f) [0x55eaea056c1f]
./miniruby(rb_check_funcall_default_kw+0x19c) [0x55eaea05788c]
./miniruby(rb_check_convert_type_with_id+0x8e) [0x55eae9f1b85e]
./miniruby(rb_str_format_m+0x1a) [0x55eae9fce82a]
./miniruby(vm_call_cfunc_with_frame+0x127) [0x55eaea041ac7]
./miniruby(vm_exec_core+0x114) [0x55eaea05d684]
./miniruby(rb_vm_exec+0x187) [0x55eaea04e747]
./miniruby(rb_funcallv_scope+0x1b0) [0x55eaea05a770]
./miniruby(rb_fiber_scheduler_unblock+0x3e) [0x55eae9fb979e]
./miniruby(sync_wakeup+0x10d) [0x55eae9ffd45d]
./miniruby(rb_szqueue_pop+0xf5) [0x55eae9ffefd5]
./miniruby(vm_call_cfunc_with_frame+0x127) [0x55eaea041ac7]
./miniruby(vm_exec_core+0x114) [0x55eaea05d684]
./miniruby(rb_vm_exec+0x187) [0x55eaea04e747]
./miniruby(rb_vm_invoke_proc+0x5f) [0x55eaea05584f]
./miniruby(rb_fiber_start+0x1da) [0x55eae9e1e24a]
./miniruby(fiber_entry+0x0) [0x55eae9e1e550]
```

I've attached the rest of the VM dump. `make runruby` gives a nearly identical dump. I can post a core dump or `rr` recording, if needed.
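For readers following the backtrace: the Ruby-level frames (`pop`, then `unblock`, then `%`) indicate that the crash happened while the scheduler's `unblock` hook was building a message with `String#%`. The following is only a minimal sketch of that kind of hook. The class name `SketchScheduler`, the `ready` list, and the message text are assumptions made for illustration; this is not the code from the attached test.rb.

```ruby
# Hypothetical stand-in for a fiber scheduler's unblock hook.
# Names and message text are illustrative, not taken from test.rb.
class SketchScheduler
  attr_reader :ready

  def initialize
    @ready = [] # fibers that may be resumed on the next scheduler pass
  end

  # A real scheduler receives this call when a blocked fiber may continue,
  # for example after SizedQueue#pop frees a slot for a waiting producer.
  def unblock(blocker, fiber)
    unless fiber.is_a?(Fiber)
      # Formatting the unexpected object is where a garbage reference
      # would be dereferenced.
      raise ArgumentError, "unblock expected a Fiber, got %p" % [fiber]
    end
    @ready << fiber
  end
end

scheduler = SketchScheduler.new
scheduler.unblock(:queue, Fiber.new { })

begin
  scheduler.unblock(:queue, "not a fiber")
rescue ArgumentError => e
  puts e.message
end
```

With a valid Fiber the hook simply queues it, and with any other live object the guard raises. In the dump above the argument was not a live object at all, and the C frames under `rb_str_format_m` suggest the fault occurred during a method lookup on that object, before an exception could be raised.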
I'm sorry I didn't simplify the script more; small, seemingly irrelevant changes can change the failure or allow it to pass. Sometimes it raises a bizarre exception instead of SEGV, most commonly a NoMethodError which seemingly indicates that the local vars have been shifted or scrambled. For example, this particular SEGV was caused by a guard clause checking that `unblock(blocker, fiber)` was given a Fiber object. Here, that object is invalid, but I've seen it be a string or some other object from elsewhere in the process.

For comparison, this is what the script output should look like:

```
========================================================================
fiber_queue
completed in 0.00031569297425448895
========================================================================
fiber_sized_queue
completed in 0.1176840600091964
========================================================================
fiber_sized_queue2
completed in 0.19209402799606323
========================================================================
fiber_sized_queue3
completed in 0.21404067997355014
========================================================================
fiber_sized_queue4
completed in 0.30277197097893804
```

I was attempting to create some simple benchmarks for `Queue` and `SizedQueue` with fibers, to mimic `benchmark/vm_thread_*queue*.rb`. I never completed the benchmarks because of this SEGV. :)

---Files--------------------------------
test.rb (5.6 KB)
segv-master-5ce0be022f.txt (11.8 KB)

-- 
https://bugs.ruby-lang.org/