[#105882] [Ruby master Bug#18280] Segmentation Fault in rb_utf8_str_new_cstr(NULL) — "ukolovda (Dmitry Ukolov)" <noreply@...>

Issue #18280 has been reported by ukolovda (Dmitry Ukolov).

13 messages 2021/11/01

[#105897] [Ruby master Bug#18282] Rails CI raises Segmentation fault with ruby 3.1.0dev supporting `Class#descendants` — "yahonda (Yasuo Honda)" <noreply@...>

Issue #18282 has been reported by yahonda (Yasuo Honda).

12 messages 2021/11/02

[#105909] [Ruby master Misc#18285] NoMethodError#message uses a lot of CPU/is really expensive to call — "ivoanjo (Ivo Anjo)" <noreply@...>

Issue #18285 has been reported by ivoanjo (Ivo Anjo).

37 messages 2021/11/02

[#105920] [Ruby master Bug#18286] Universal arm64/x86_84 binary built on an x86_64 machine segfaults/is killed on arm64 — "ccaviness (Clay Caviness)" <noreply@...>

Issue #18286 has been reported by ccaviness (Clay Caviness).

16 messages 2021/11/03

[#105928] [Ruby master Feature#18287] Support nil value for sort in Dir.glob — "Strech (Sergey Fedorov)" <noreply@...>

Issue #18287 has been reported by Strech (Sergey Fedorov).

16 messages 2021/11/04

[#105944] [Ruby master Bug#18289] Enumerable#to_a should delegate keyword arguments to #each — "Ethan (Ethan -)" <noreply@...>

Issue #18289 has been reported by Ethan (Ethan -).

8 messages 2021/11/05

[#105967] [Ruby master Bug#18293] Time.at in master branch was 25% slower then Ruby 3.0 — "watson1978 (Shizuo Fujita)" <noreply@...>

Issue #18293 has been reported by watson1978 (Shizuo Fujita).

17 messages 2021/11/08

[#106008] [Ruby master Bug#18296] Custom exception formatting should override `Exception#full_message`. — "ioquatix (Samuel Williams)" <noreply@...>

Issue #18296 has been reported by ioquatix (Samuel Williams).

14 messages 2021/11/10

[#106033] [Ruby master Bug#18330] Make failure on 32-bit Linux (Android) with Clang due to implicit 64-to-32-bit integer truncation — "xtkoba (Tee KOBAYASHI)" <noreply@...>

Issue #18330 has been reported by xtkoba (Tee KOBAYASHI).

10 messages 2021/11/11

[#106053] [Ruby master Misc#18335] openindiana ruby 3.1 preview needs --disable-dtrace — "stes (David Stes)" <noreply@...>

Issue #18335 has been reported by stes (David Stes).

14 messages 2021/11/14

[#106069] [Ruby master Feature#18339] GVL instrumentation API — "byroot (Jean Boussier)" <noreply@...>

Issue #18339 has been reported by byroot (Jean Boussier).

13 messages 2021/11/15

[#106145] [Ruby master Misc#18346] DevelopersMeeting20211209Japan — "mame (Yusuke Endoh)" <noreply@...>

Issue #18346 has been reported by mame (Yusuke Endoh).

11 messages 2021/11/18

[#106173] [Ruby master Feature#18349] Let --jit enable YJIT — "k0kubun (Takashi Kokubun)" <noreply@...>

Issue #18349 has been reported by k0kubun (Takashi Kokubun).

8 messages 2021/11/19

[#106175] [Ruby master Feature#18351] Support anonymous rest and keyword rest argument forwarding — "jeremyevans0 (Jeremy Evans)" <noreply@...>

Issue #18351 has been reported by jeremyevans0 (Jeremy Evans).

10 messages 2021/11/19

[#106279] [Ruby master Feature#18364] Add GC.stat_size_pool for Variable Width Allocation — "peterzhu2118 (Peter Zhu)" <noreply@...>

Issue #18364 has been reported by peterzhu2118 (Peter Zhu).

14 messages 2021/11/25

[#106308] [Ruby master Feature#18367] Stop the interpreter from escaping error messages — "mame (Yusuke Endoh)" <noreply@...>

Issue #18367 has been reported by mame (Yusuke Endoh).

13 messages 2021/11/29

[#106314] [Ruby master Feature#18368] Range#step semantics for non-Numeric ranges — "zverok (Victor Shepelev)" <noreply@...>

Issue #18368 has been reported by zverok (Victor Shepelev).

39 messages 2021/11/29

[#106341] [Ruby master Bug#18369] users.detect(:name, "Dorian") as shorthand for users.detect { |user| user.name == "Dorian" } — dorianmariefr <noreply@...>

Issue #18369 has been reported by dorianmariefr (Dorian Mari辿).

14 messages 2021/11/30

[#106347] [Ruby master Feature#18370] Call Exception#full_message to print exceptions reaching the top-level — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18370 has been reported by Eregon (Benoit Daloze).

10 messages 2021/11/30

[ruby-core:106219] [Ruby master Bug#17573] Crashes in profiling tools when signals arrive in non-Ruby threads

From: "nagachika (Tomoyuki Chikanaga)" <noreply@...>
Date: 2021-11-23 06:13:17 UTC
List: ruby-core #106219
Issue #17573 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: REQUIRED to 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: DONE

ruby_3_0 949af69408e44b69cc7437b58e8edbe3cd77c966 merged revision(s) 5680c38c75aeb5cbd219aafa8eb48c315f287d97,f5d20411386ff2552ff27661387ddc4bae1ebc30.

----------------------------------------
Bug #17573: Crashes in profiling tools when signals arrive in non-Ruby threads
https://bugs.ruby-lang.org/issues/17573#change-94829

* Author: jhawthorn (John Hawthorn)
* Status: Closed
* Priority: Normal
* Assignee: ko1 (Koichi Sasada)
* ruby -v: ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin19]
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: DONE
----------------------------------------
Stackprof (and likely similar tools) works by setting up a timer to sends it a unix signal on an interval. From that signal handler it does a small amount of internal bookkeeping and calls `rb_postponed_job_register_one`.

This is a problem because unix signals arrive on an arbitrary thread, and as of Ruby 3.0 the execution context (which `rb_postponed_job_register_one` relies on) is stored as a thread-local.

This reproduction crashes reliably for me on macos. It doesn't seem to on linux, maybe because the timer thread is different or the kernel has a different "arbitrary" choice. It feels like this is just one of the circumstances this crash could happen.

```ruby
require "stackprof"

StackProf.run(interval: 100) do
  1000.times do
    GC.start
  end
end
```

```
$ ruby crash_stackprof.rb
[BUG] Segmentation fault at 0x0000000000000038
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin19]

-- Crash Report log information --------------------------------------------
   See Crash Report log file under the one of following:
     * ~/Library/Logs/DiagnosticReports
     * /Library/Logs/DiagnosticReports
   for more details.
Don't forget to include the above Crash Report log file in bug reports.

-- Machine register context ------------------------------------------------
 rax: 0x0000000000000000 rbx: 0x0000000107fbb780 rcx: 0x0000000000000000
 rdx: 0x0000000000000000 rdi: 0x0000000106982c28 rsi: 0x0000000107fbb780
 rbp: 0x000070000eb47a10 rsp: 0x000070000eb479f0  r8: 0x000070000eb47eb0
  r9: 0xd44931e7344c235f r10: 0x00007fff6ef49501 r11: 0x0000000000000202
 r12: 0xd44931e7344c235f r13: 0x00000000ffffffff r14: 0x0000000000000000
 r15: 0x0000000000000000 rip: 0x00000001068c85fd rfl: 0x0000000000010202

-- C level backtrace information -------------------------------------------
/Users/jhawthorn/.rubies/ruby-3.0.0/bin/ruby(rb_vm_bugreport+0x6cf) [0x1068c2d5f]
/Users/jhawthorn/.rubies/ruby-3.0.0/bin/ruby(rb_bug_for_fatal_signal+0x1d6) [0x1066dc556]
/Users/jhawthorn/.rubies/ruby-3.0.0/bin/ruby(sigsegv+0x5b) [0x10681aa0b]
/usr/lib/system/libsystem_platform.dylib(_sigtramp+0x1d) [0x7fff6efff5fd]
/Users/jhawthorn/.rubies/ruby-3.0.0/bin/ruby(rb_postponed_job_register_one+0x1d) [0x1068c85fd]
/usr/lib/system/libsystem_platform.dylib(0x7fff6efff5fd) [0x7fff6efff5fd]
```

`0x38` is the address of `((rb_execution_context_t *)0)->vm`. 

lldb shows that it comes from a second thread which was running `timer_pthread_fn`

```
$ lldb =ruby -- ./crash_stackprof.rb
(lldb) target create "/Users/jhawthorn/.rubies/ruby-3.0.0/bin/ruby"                                                                                                                                                                     ruCurrent executable set to '/Users/jhawthorn/.rubies/ruby-3.0.0/bin/ruby' (x86_64).
(lldb) settings set -- target.run-args  "./crash_stackprof.rb"                                                                                                                                                                          (lldb) run
Process 92893 launched: '/Users/jhawthorn/.rubies/ruby-3.0.0/bin/ruby' (x86_64)                                                                                                                                                         Process 92893 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGALRM
    frame #0: 0x00000001000dfbcd ruby`rgengc_check_relation [inlined] RVALUE_OLD_P_RAW(obj=4303689480) at gc.c:1419:32
   1416 RVALUE_OLD_P_RAW(VALUE obj)
   1417 {
   1418     const VALUE promoted = FL_PROMOTED0 | FL_PROMOTED1;
-> 1419     return (RBASIC(obj)->flags & promoted) == promoted;
   1420 }
   1421
   1422 static inline int                                                                                                                                                                                                                 thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0x38)
    frame #0: 0x000000010029a5fd ruby`rb_postponed_job_register_one(flags=3492904, func=(stackprof.bundle`stackprof_gc_job_handler at stackprof.c:598), data=0x0000000000000000) at vm_trace.c:1622:19                                     1619 rb_postponed_job_register_one(unsigned int flags, rb_postponed_job_func_t func, void *data)
   1620 {                                                                                                                                                                                                                                  1621     rb_execution_context_t *ec = GET_EC();
-> 1622     rb_vm_t *vm = rb_ec_vm_ptr(ec);
   1623     rb_postponed_job_t *pjob;
   1624     rb_atomic_t i, index;
   1625
Target 0: (ruby) stopped.
(lldb) t 2
* thread #2
    frame #0: 0x000000010029a5fd ruby`rb_postponed_job_register_one(flags=3492904, func=(stackprof.bundle`stackprof_gc_job_handler at stackprof.c:598), data=0x0000000000000000) at vm_trace.c:1622:19
   1619 rb_postponed_job_register_one(unsigned int flags, rb_postponed_job_func_t func, void *data)
   1620 {
   1621     rb_execution_context_t *ec = GET_EC();
-> 1622     rb_vm_t *vm = rb_ec_vm_ptr(ec);
   1623     rb_postponed_job_t *pjob;
   1624     rb_atomic_t i, index;
   1625
(lldb) bt
* thread #2
  * frame #0: 0x000000010029a5fd ruby`rb_postponed_job_register_one(flags=3492904, func=(stackprof.bundle`stackprof_gc_job_handler at stackprof.c:598), data=0x0000000000000000) at vm_trace.c:1622:19
    frame #1: 0x00007fff6efff5fd libsystem_platform.dylib`_sigtramp + 29
    frame #2: 0x00007fff6ef4e3d7 libsystem_kernel.dylib`poll + 11
    frame #3: 0x0000000100238e1e ruby`timer_pthread_fn(p=<unavailable>) at thread_pthread.c:2189:15
    frame #4: 0x00007fff6f00b109 libsystem_pthread.dylib`_pthread_start + 148
    frame #5: 0x00007fff6f006b8b libsystem_pthread.dylib`thread_start + 15
```

Attached is my attempted fix (also available at https://github.com/ruby/ruby/pull/4108) which uses the main-ractor's EC if there is none on the current thread. I *hope* this works (it seems to and fixes the crash) because before Ruby 3.0 there was a global EC, but I'm not entirely sure if this will cause other problems.

If accepted this should be backported to the 3.0 branch.

---Files--------------------------------
use_main_ractor_ec_on_threads_without_ec.patch (3.28 KB)
use_main_ractor_ec_on_threads_without_ec.patch (3.84 KB)


-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread

Prev Next