From: Eric Wong <normalperson@...>
Date: 2018-02-06T10:00:00+00:00
Subject: [ruby-core:85442] Re: [Ruby trunk Bug#14357] thread_safe tests suite segfaults

Eric Wong <normalperson@yhbt.net> wrote:
> v.ondruch@tiscali.cz wrote:
> > https://bugs.ruby-lang.org/issues/14357
> >
> > The thread_safe gem is not maintained anymore, but I don't see
> > any reason why its test suite should segfault with Ruby 2.5.
>
> Right, no 3rd-party C exts loaded and I hit this in trunk, too.
> Using -fsanitize=address reveals use-after-free in st.c
> Investigating, but maybe Vladimir can find it sooner.

Maybe my initial investigation was correct, after all.
valgrind takes forever, but indicates the free is caused by
rebuild_table; so it doesn't look like we missed GC marking
during rebuild.

Disabling the free(tab->entries) at line st.c:792 (patch below)
seems to indicate success with the thread_safe test suite
(letting it loop overnight).

Looks like the new_tab != tab case of rebuild is leaving a
hanging reference somewhere.
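
For illustration only (this is not st.c code, and every name in it is
made up): a toy sketch of one way a reader could notice that the
entries array was replaced underneath it. It assumes the only point
where a rebuild can sneak in is the comparison callback, and it uses a
rebuild counter loosely modelled on the rebuilds_num field of
st_table. When the counter changes, the scan restarts instead of
touching the old array. Whether that matches the real cause here is
unverified.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types; NOT the real st_table layout. */
typedef struct { int key; int value; } toy_entry;

typedef struct {
    unsigned rebuilds_num;   /* bumped on every rebuild */
    size_t num_entries;
    size_t capacity;
    toy_entry *entries;
} toy_table;

/* Comparison callback; it receives the table because, in this sketch,
 * it stands in for code that may cause the table to be rebuilt. */
typedef int (*toy_cmp)(toy_table *tab, int a, int b);

static int toy_init(toy_table *tab, size_t capacity) {
    tab->rebuilds_num = 0;
    tab->num_entries = 0;
    tab->capacity = capacity;
    tab->entries = malloc(capacity * sizeof(toy_entry));
    return tab->entries ? 0 : -1;
}

/* Move entries to a new array and free the old one. Any pointer into
 * the old array that was cached before this call is now dangling. */
static int toy_rebuild(toy_table *tab) {
    size_t new_cap = tab->capacity * 2;
    toy_entry *fresh = malloc(new_cap * sizeof(toy_entry));
    if (!fresh) return -1;
    memcpy(fresh, tab->entries, tab->num_entries * sizeof(toy_entry));
    free(tab->entries);
    tab->entries = fresh;
    tab->capacity = new_cap;
    tab->rebuilds_num++;
    return 0;
}

static int toy_add(toy_table *tab, int key, int value) {
    if (tab->num_entries == tab->capacity && toy_rebuild(tab) != 0)
        return -1;
    tab->entries[tab->num_entries].key = key;
    tab->entries[tab->num_entries].value = value;
    tab->num_entries++;
    return 0;
}

/* Linear lookup that re-checks rebuilds_num after every callback.
 * Returns 1 if found, 0 otherwise; *restarts counts discarded scans. */
static int toy_lookup(toy_table *tab, int key, toy_cmp cmp,
                      int *value, int *restarts) {
    assert(value != NULL && restarts != NULL);
    for (;;) {
        unsigned seen = tab->rebuilds_num;
        toy_entry *entries = tab->entries;  /* cached pointer: the hazard */
        size_t n = tab->num_entries;
        int stale = 0;
        for (size_t i = 0; i < n && !stale; i++) {
            int stored_key = entries[i].key;
            int equal = cmp(tab, stored_key, key);  /* may rebuild */
            if (tab->rebuilds_num != seen) {
                stale = 1;          /* do not touch `entries` again */
                (*restarts)++;
            } else if (equal) {
                *value = entries[i].value;
                return 1;
            }
        }
        if (!stale) return 0;
    }
}

/* Callback that forces exactly one rebuild, standing in for another
 * thread inserting into the same table during the comparison. */
static int rebuilt_once = 0;
static int cmp_rebuilds_once(toy_table *tab, int a, int b) {
    if (!rebuilt_once) {
        rebuilt_once = 1;
        toy_rebuild(tab);
    }
    return a == b;
}

/* Returns 0 when the lookup survives the mid-scan rebuild. */
int toy_demo(void) {
    toy_table tab;
    int value = -1, restarts = 0;
    if (toy_init(&tab, 4) != 0) return -1;
    for (int i = 0; i < 4; i++) toy_add(&tab, i, i * 10);
    int found = toy_lookup(&tab, 3, cmp_rebuilds_once, &value, &restarts);
    free(tab.entries);
    return (found && value == 30 && restarts == 1) ? 0 : 1;
}
```

Calling toy_demo() from a main() and running the result under
valgrind or -fsanitize=address is a quick way to check the sketch.
Dropping the rebuilds_num comparison in toy_lookup should make the
same tools flag the read of the stale array.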

==9885== Thread 32 cache_loops_sp*:
==9885== Invalid read of size 8
==9885==    at 0x235622: find_table_entry_ind (st.c:873)
==9885==    by 0x236C95: st_lookup (st.c:1049)
==9885==    by 0x1520CE: rb_hash_aref (hash.c:853)
==9885==    by 0x2A95E0: vm_opt_aref (vm_insnhelper.c:3650)
==9885==    by 0x2A95E0: vm_exec_core (insns.def:1175)
==9885==    by 0x2ACA83: vm_exec (vm.c:1790)
==9885==    by 0x2AD875: invoke_block (vm.c:993)
==9885==    by 0x2AD875: invoke_iseq_block_from_c (vm.c:1045)
==9885==    by 0x2B64A8: invoke_block_from_c_bh (vm.c:1063)
==9885==    by 0x2B64A8: vm_yield (vm.c:1108)
==9885==    by 0x2B64A8: rb_yield_0 (vm_eval.c:970)
==9885==    by 0x2B64A8: rb_yield_1 (vm_eval.c:976)
==9885==    by 0x19238D: int_dotimes (numeric.c:4984)
==9885==    by 0x29F816: vm_call_cfunc_with_frame (vm_insnhelper.c:1921)
==9885==    by 0x29F816: vm_call_cfunc (vm_insnhelper.c:1937)
==9885==    by 0x2A83D9: vm_exec_core (insns.def:719)
==9885==    by 0x2ACA83: vm_exec (vm.c:1790)
==9885==    by 0x2AD875: invoke_block (vm.c:993)
==9885==    by 0x2AD875: invoke_iseq_block_from_c (vm.c:1045)
==9885==  Address 0xbeafe88 is 43,080 bytes inside a block of size 49,152 free'd
==9885==    at 0x4C29E90: free (vg_replace_malloc.c:473)
==9885==    by 0x14C3EC: objspace_xfree (gc.c:7987)
==9885==    by 0x14C3EC: ruby_sized_xfree (gc.c:8082)
==9885==    by 0x14C3EC: ruby_xfree (gc.c:8089)
==9885==    by 0x236472: rebuild_table (st.c:792)
==9885==    by 0x237E85: rebuild_table_if_necessary (st.c:1090)
==9885==    by 0x237E85: st_add_direct_with_hash (st.c:1153)
==9885==    by 0x237E85: st_update (st.c:1431)
==9885==    by 0x150A4E: tbl_update (hash.c:561)
==9885==    by 0x150A4E: rb_hash_aset (hash.c:1654)
==9885==    by 0x2A9687: vm_opt_aset (vm_insnhelper.c:3671)
==9885==    by 0x2A9687: vm_exec_core (insns.def:1189)
==9885==    by 0x2ACA83: vm_exec (vm.c:1790)
==9885==    by 0x2AD875: invoke_block (vm.c:993)
==9885==    by 0x2AD875: invoke_iseq_block_from_c (vm.c:1045)
==9885==    by 0x2B674F: invoke_block_from_c_bh (vm.c:1063)
==9885==    by 0x2B674F: vm_yield (vm.c:1108)
==9885==    by 0x2B674F: rb_yield_0 (vm_eval.c:970)
==9885==    by 0x2B674F: rb_yield (vm_eval.c:983)
==9885==    by 0x131C86: rb_ensure (eval.c:1035)
==9885==    by 0x29F816: vm_call_cfunc_with_frame (vm_insnhelper.c:1921)
==9885==    by 0x29F816: vm_call_cfunc (vm_insnhelper.c:1937)
==9885==    by 0x2A83D9: vm_exec_core (insns.def:719)

Line numbers based on r62184
(git commit 05c18139a1545a61caaaf33d888c8427d346b571).

The following patch hides the problem by introducing a leak:

```
--- a/st.c
+++ b/st.c
@@ -789,7 +789,7 @@ rebuild_table(st_table *tab)
         if (tab->bins != NULL)
             free(tab->bins);
         tab->bins = new_tab->bins;
-        free(tab->entries);
+        /* free(tab->entries); */ /* NOT FOR PRODUCTION USE */
         tab->entries = new_tab->entries;
         free(new_tab);
     }
```

(gdb) up
#17 0x00005604a6dd173d in find_table_entry_ind (tab=tab@entry=0x7f13e4444ac0, hash_value=hash_value@entry=0, key=key@entry=94578030726560) at ../st.c:874
874             && PTR_EQUAL(tab, &entries[bin - ENTRY_BASE], hash_value, key))
(gdb) up
#18 0x00005604a6dd2d26 in st_lookup (tab=0x7f13e4444ac0, key=key@entry=94578030726560, value=value@entry=0x7f132fdfc2f8) at ../st.c:1050
1050        bin = find_table_entry_ind(tab, hash, key);
(gdb) p *tab
$1 = {entry_power = 7 '\a', bin_power = 8 '\b', size_ind = 0 '\000', rebuilds_num = 213, type = 0x5604a71ce210 <objhash>, num_entries = 121, bins = 0x7f13e445a340, entries_start = 0, entries_bound = 121, entries = 0x7f13e445c6b0}

Looks like it's a freshly rebuilt table.

It's pretty easy to reproduce the problem on 2.5; I remember it took
more tries on 2.4 (I didn't run valgrind there).

An extra pair of eyes more experienced with this code than I am
would be appreciated. Thanks.