From: v.ondruch@...
Date: 2018-02-21T20:10:39+00:00
Subject: [ruby-core:85736] [Ruby trunk Bug#14480] miniruby crashing when compiled with -O2 or -O1 on aarch64

Issue #14480 has been updated by vo.x (Vit Ondruch).

It seems they are getting further:

~~~
With -fomit-frame-pointer on *everything*, and hacking out the call to rb_thread_create_timer_thread in Init_Thread (to keep this single-threaded for simplicity), the bug appears to be a problem with setjmp/longjmp.

x29 (the frame pointer) is corrupted deep within the 15th call to vm_exec_core within the 10th call to vm_call_opt_call.

The write of the bogus value to x29 occurs here:

0x000000000057d9bc in rb_ec_tag_jump (ec=0x46b050 , st=RUBY_TAG_NONE) at vm.i:10459
10459     __builtin_longjmp(((ec->tag->buf)), (1));
2: /x $x29 = 0x7fffff8080
(gdb) disassemble
Dump of assembler code for function rb_ec_tag_jump:
   0x000000000057d988 <+0>:     stp     x29, x30, [sp,#-32]!
   0x000000000057d98c <+4>:     mov     x29, sp
   0x000000000057d990 <+8>:     str     x0, [x29,#24]
   0x000000000057d994 <+12>:    str     w1, [x29,#20]
   0x000000000057d998 <+16>:    ldr     x0, [x29,#24]
   0x000000000057d99c <+20>:    ldr     x0, [x0,#24]
   0x000000000057d9a0 <+24>:    ldr     w1, [x29,#20]
   0x000000000057d9a4 <+28>:    str     w1, [x0,#336]
   0x000000000057d9a8 <+32>:    ldr     x0, [x29,#24]
   0x000000000057d9ac <+36>:    ldr     x0, [x0,#24]
   0x000000000057d9b0 <+40>:    add     x0, x0, #0x10
   0x000000000057d9b4 <+44>:    ldr     x1, [x0,#8]
   0x000000000057d9b8 <+48>:    ldr     x29, [x0]
=> 0x000000000057d9bc <+52>:    ldr     x0, [x0,#16]
   0x000000000057d9c0 <+56>:    mov     sp, x0
   0x000000000057d9c4 <+60>:    br      x1

when called from rb_iterate0, where the bogus x29 value has been fetched from the jmp_buf at +48.

A watchpoint on that memory shows it being set to the bogus value here in rb_iterate0:

   0x0000000000595e50 <+96>:    str     x0, [sp,#264]
   0x0000000000595e54 <+100>:   ldr     x0, [sp,#232]
   0x0000000000595e58 <+104>:   ldr     x0, [x0,#24]
   0x0000000000595e5c <+108>:   str     x0, [sp,#592]
   0x0000000000595e60 <+112>:   add     x0, sp, #0x108
   0x0000000000595e64 <+116>:   add     x0, x0, #0x10
   0x0000000000595e68 <+120>:   add     x1, sp, #0x260
   0x0000000000595e6c <+124>:   str     x1, [x0]
=> 0x0000000000595e70 <+128>:   adrp    x1, 0x595000

28300     struct rb_vm_tag _tag;
28301     _tag.state = RUBY_TAG_NONE;
28302     _tag.tag = ((VALUE)RUBY_Qundef);
28303     _tag.prev = _ec->tag;
28304     ;
28305     state = (__builtin_setjmp((_tag.buf)) ? rb_ec_tag_state((_ec))
28306        : ((void)(_ec->tag = &_tag), 0));

If I'm reading this right, the __builtin_longjmp rb_ec_tag_jump (in rb_iterate0) is attempting to restore x29 from the jmp_buf, but the __builtin_setjmp in rb_iterate0 isn't actually saving x29 there, and hence x29 gets corrupted at the longjmp, deep in the callstack, leading to an eventual crash when vm_call_opt_call eventually tries to use x29.
~~~

----------------------------------------
Bug #14480: miniruby crashing when compiled with -O2 or -O1 on aarch64
https://bugs.ruby-lang.org/issues/14480#change-70578

* Author: vo.x (Vit Ondruch)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.5.0p0 (2017-12-25 revision 61468) [aarch64-linux]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
Currently, it is not possible to build Ruby 2.5.0 on aarch64 on Fedora Rawhide, because miniruby fails during the build:

~~~
... snip ...
./miniruby -I./lib -I. -I.ext/common -n \
        -e 'BEGIN{version=ARGV.shift;mis=ARGV.dup}' \
        -e 'END{abort "UNICODE version mismatch: #{mis}" unless mis.empty?}' \
        -e '(mis.delete(ARGF.path); ARGF.close) if /ONIG_UNICODE_VERSION_STRING +"#{Regexp.quote(version)}"/o' \
        10.0.0 ./enc/unicode/10.0.0/casefold.h ./enc/unicode/10.0.0/name2ctype.h
generating encdb.h
./miniruby -I./lib -I. -I.ext/common ./tool/generic_erb.rb -c -o encdb.h ./template/encdb.h.tmpl ./enc enc
generating prelude.c
./miniruby -I./lib -I. -I.ext/common ./tool/generic_erb.rb -I. -c -o prelude.c \
        ./template/prelude.c.tmpl ./prelude.rb ./gem_prelude.rb ./abrt_prelude.rb
*** stack smashing detected ***: terminated
encdb.h updated
... snip ...
~~~

This might be a Ruby issue or a gcc issue; I am not sure yet. However, there is already a lengthy analysis available in Fedora's Bugzilla [1]. Would anybody be able to help resolve this issue?

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1545239

---Files--------------------------------
Dockerfile (573 Bytes)

--
https://bugs.ruby-lang.org/

Unsubscribe:
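
For readers unfamiliar with the pattern discussed in the quoted analysis, here is a minimal sketch in portable C of a setjmp/longjmp tag chain. It is only an illustration under stated assumptions: it uses the standard setjmp/longjmp rather than the __builtin_ variants, it keeps the jump state in a file-scope variable as a simplification, and every name in it (struct tag, current_tag, pending_state, tag_jump, deep_call, run_protected) is hypothetical and not taken from Ruby's source. It shows the contract the analysis suggests is being broken: the jump can only restore what the setjmp site actually saved.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical miniature of a tag chain; not Ruby's real structures. */
struct tag {
    jmp_buf buf;       /* register context saved by setjmp */
    struct tag *prev;  /* enclosing tag, restored on exit */
};

static struct tag *current_tag = NULL;  /* innermost active tag */
static int pending_state = 0;           /* state handed over by the jump */

/* Unwind to the innermost tag, passing `state` to the setjmp site. */
static void tag_jump(int state)
{
    pending_state = state;
    longjmp(current_tag->buf, 1);
}

/* Recurse `depth` frames deep, then jump out if `state` is non-zero. */
static void deep_call(int depth, int state)
{
    if (depth == 0) {
        if (state != 0)
            tag_jump(state);
        return;
    }
    deep_call(depth - 1, state);
}

/* Push a tag, run the nested calls, pop the tag.
   Returns 0 on a normal return, or the state passed to tag_jump. */
int run_protected(int depth, int state)
{
    struct tag t;
    int result;

    assert(depth >= 0);
    t.prev = current_tag;

    if (setjmp(t.buf) == 0) {
        current_tag = &t;          /* tag is now active */
        deep_call(depth, state);
        result = 0;                /* reached only without a jump */
    } else {
        result = pending_state;    /* arrived here via longjmp */
    }

    current_tag = t.prev;          /* pop the tag in both cases */
    return result;
}
```

With this sketch, run_protected(3, 7) unwinds three nested frames and returns 7, while run_protected(3, 0) returns 0 without jumping. If the save at the setjmp site and the restore at the longjmp site disagreed about which registers are stored in the buffer, the caller's frame would be restored from garbage, which is the failure mode described for x29 above.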