From: "mbcodeandsound (Mike Bourgeous) via ruby-core"
Date: 2025-04-07T17:33:46+00:00
Subject: [ruby-core:121557] [Ruby Bug#21220] Memory corruption in update_line_coverage() [write at index -1]

Issue #21220 has been updated by mbcodeandsound (Mike Bourgeous).

Something like this should prevent the memory corruption, but may be hiding a deeper issue:

``` diff
--- thread.c	2025-02-14 14:25:54.000000000 -0700
+++ thread_fix.c	2025-04-07 11:32:53.571115993 -0600
@@ -5675,7 +5675,7 @@
                 rb_ary_push(lines, LONG2FIX(line + 1));
                 return;
             }
-            if (line >= RARRAY_LEN(lines)) { /* no longer tracked */
+            if (line < 0 || line >= RARRAY_LEN(lines)) { /* no longer tracked */
                 return;
             }
             num = RARRAY_AREF(lines, line);
```

----------------------------------------
Bug #21220: Memory corruption in update_line_coverage() [write at index -1]
https://bugs.ruby-lang.org/issues/21220#change-112583

* Author: mbcodeandsound (Mike Bourgeous)
* Status: Open
* ruby -v: ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [x86_64-linux]
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
Hello!

I have encountered repeatable memory corruption in Ruby 3.4.2 on Ubuntu 24.04.2 LTS, which I believe is happening in `update_line_coverage()`. I could not reproduce this on Ruby 3.3.x or earlier. My findings follow. I also have detailed step-by-step notes at https://github.com/mike-bourgeous/mb-sound/issues/36

### Summary

`update_line_coverage()` calls `rb_sourceline()`, subtracts one from its return value, and uses this as an index into an Array. Sometimes `rb_sourceline()` returns 0, and when this happens, `update_line_coverage()` will write to index -1 of the array. This corrupts the heap before the Array, resulting in a program crash later during GC.

As I am new to the Ruby codebase, I do not know whether it is normal for `rb_sourceline()` to return 0 (in which case `update_line_coverage()` should handle it), or whether something is wrong in the code that ultimately feeds `rb_sourceline()`.
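To make the index arithmetic in the summary concrete, here is a minimal standalone sketch of the same pattern with the lower-bound check added. It is not Ruby's code: `line_counts` and `bump_line_count` are hypothetical names invented for illustration, and a plain `long` buffer stands in for the coverage Array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the coverage line array; not Ruby's RArray. */
typedef struct {
    long *counts;   /* one hit counter per source line */
    long  len;      /* number of tracked lines */
} line_counts;

/*
 * Mirrors the arithmetic described above: `sourceline` is 1-based, so the
 * array index is sourceline - 1. A sourceline of 0 yields index -1, which
 * must be rejected before any read or write touches the buffer.
 * Returns 1 if a counter was incremented, 0 if the line was skipped.
 */
static int
bump_line_count(line_counts *lc, int sourceline)
{
    assert(lc != NULL && lc->counts != NULL);

    long line = (long)sourceline - 1;

    /* Check both bounds; checking only the upper bound lets -1 through. */
    if (line < 0 || line >= lc->len) {
        return 0;
    }

    lc->counts[line] += 1;
    return 1;
}
```

With a three-element buffer, calling `bump_line_count(&lc, 0)` is skipped instead of writing one slot before the buffer, while source lines 1 through 3 increment their counters. As with the patch above, this only guards the write; it does not explain why a source line of 0 is reported in the first place.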
### Symptom

On Linux, affected processes print one of the following errors and exit:

```
munmap_chunk(): invalid pointer
Aborted (core dumped)
```

or, if preloading `libc_malloc_debug.so`:

```
malloc_check_get_size: memory corruption
Aborted (core dumped)
```

### Reproduction

I have a reduced GitHub project that can reproduce the bug consistently both on my machine and in CI. When I try to reduce the size of this repo further, the bug stops happening. The issue only reproduces locally if the `coverage/` directory has a large `.resultset.json`.

- **Repo:** https://github.com/mike-bourgeous/reproduce-simplecov-ruby34-bug
- **Example of the bug:** https://github.com/mike-bourgeous/reproduce-simplecov-ruby34-bug/actions/runs/14289657889/job/40049195631#step:5:176

``` shell
# Repeatedly running the process increases the likelihood of crashing
# as the SimpleCov result file grows.
for f in `seq 1 100`; do echo $f; ruby -r./spec/simplecov_helper.rb bin/midi_roll.rb -c 40 -r 2 spec/test_data/all_notes.mid > /dev/null || break ; done
```

### Research and reasoning

I initially found the crash during a live stream when I was upgrading a project from Ruby 2.7 to Ruby 3.4. The crash occurred when an RSpec test tried to spawn another Ruby process, while using SimpleCov to measure code coverage in both. I discovered a workaround of disabling SimpleCov in the nested process when running tests on Ruby 3.4. I used a somewhat unusual approach to get coverage metrics for subprocesses.

After the stream I wanted to understand what was really happening and see if I could find a way to re-enable test code coverage for subprocesses.

I used a combination of Valgrind, GDB, and trial and error to narrow down the site of the crash and the original corruption.
I wrote [a GDB script to automate information gathering](https://github.com/mike-bourgeous/reproduce-simplecov-ruby34-bug/blob/master/gdb_ruby_backtrace.gdb) when the GC crash occurred, and used Valgrind+vgdb to identify the original write that appeared to cause the corruption.

I reviewed the Git history of `update_line_coverage()`, `rb_sourceline()` (and the functions it calls), and a few other functions, but did not find any obvious changes between Ruby 3.3.x and Ruby 3.4.x, so the root cause is somewhere beyond my familiarity with the codebase.

Full details of my process are in my issue notes: https://github.com/mike-bourgeous/mb-sound/issues/36

---Files--------------------------------
corruption_c_stack.txt (2.63 KB)
corruption_ruby_stack.txt (948 Bytes)
crash_ruby_stack.txt (4.46 KB)
crash_c_stack.txt (26.2 KB)

--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/lists/ruby-core.ml.ruby-lang.org/