From: "alexdowad (Alex Dowad)"
Date: 2012-10-11T05:12:40+09:00
Subject: [ruby-core:47890] [ruby-trunk - Bug #7135] GC bug in Ruby 1.9.3-p194?

Issue #7135 has been updated by alexdowad (Alex Dowad).

Nobu-san,

I don't expect that you (or anyone else) would be able to reproduce this bug. As I said, it doesn't happen when I extract the part which is failing from Prawn, only when I run the tests against the whole thing (which I have modified -- I'm working on performance). This is not strange -- in general, memory corruption/pointer bugs are sensitive to the exact layout of data in memory, and changing small things in a program may randomly turn the bug on or off.

As I said, I can try to dig deeper and diagnose the bug myself, but I need some advice on where to add "debug" code to the Ruby source (so I can recompile, run the code which is failing, and try to get more information on what is actually happening).

To sum up the problem again, I have Ruby Strings which are randomly being overwritten (although nothing at the Ruby level is modifying them), and it only happens when the GC runs. Actually, I just discovered that if I put a call to "GC.start" in the "string.codepoints.inject" loop, the error happens *every time*. UNLESS I freeze the string -- then the error never happens, even with "GC.start" in the loop.

A few questions for someone who knows Ruby internals well:

- When Ruby GCs an unused object, does it zero out the memory used?
- How about when a new object is allocated?
- I've heard that Ruby stores the contents of small strings directly in an RObject (or RValue or whatever it is...) union. The String which is being corrupted has 7 bytes. Will a String like that *always* be embedded, or is it possible that it could still use malloc'd memory for the contents?
- In the tests which I am doing right now, it always seems that byte 0 is untouched, byte 1 is changed to 1, and bytes 2-6 are changed to 0. Do those values seem familiar?
  Is there a different type of object which can go in the same union, which would set those particular bytes?

----------------------------------------
Bug #7135: GC bug in Ruby 1.9.3-p194?
https://bugs.ruby-lang.org/issues/7135#change-30192

Author: alexdowad (Alex Dowad)
Status: Feedback
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux]

I'm just doing some refactoring/performance work on a popular Ruby gem called Prawn (it's used for PDF generation). I'm fighting with a strange, intermittent failure on the spec tests, and from my experimentation so far, it seems very, very likely to be a bug in Ruby's garbage collector. I'll try to keep this as brief as possible, but please be patient...

The code where the intermittent failure comes from measures the width of a string when rendered using a TTF font. This is part of the method body (including my debug print statements):

    # GC.disable
    # string.freeze
    p string.bytes.to_a if $my_debug
    p string.codepoints.to_a if $my_debug
    p scale if $my_debug
    result = string.codepoints.inject(0) do |s,r|
      print r if $my_debug
      print "," if $my_debug
      s + character_width_by_code(r)
    end * scale
    puts if $my_debug
    result

When the tests pass normally (which is about 7/8 of the time), the debug print statements show:

    [104, 101, 108, 108, 111, 194, 173]
    [104, 101, 108, 108, 111, 173]
    0.012
    104,101,108,108,111,173,

...You can see that the "print" calls in the "string.codepoints.inject" loop print the same series of codepoints as "p string.codepoints.to_a". This is what you would expect, because nothing is modifying the string. But about 1/8 of the time I get:

    [104, 101, 108, 108, 111, 194, 173]
    [104, 101, 108, 108, 111, 173]
    0.012
    104,42,0,0,0,0,0,

I have also seen "104,0,0,0,0,0" on occasion. In all cases, "p string.codepoints.to_a" prints the correct sequence of codepoints for the string.
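The comparison described above (an eager snapshot of the codepoints versus what a lazy iteration sees while the GC runs) can be sketched as a standalone script. This is only an illustration under stated assumptions: the WIDTHS table and its values are made up as a stand-in for "character_width_by_code", measure_under_gc is a hypothetical helper name, and the script is not expected to reproduce the corruption outside the full Prawn test run. It uses each_codepoint so that the iteration stays lazy even on Ruby 2.0 and later, where String#codepoints returns an Array.

```ruby
# Hypothetical stand-in for Prawn's character_width_by_code: just a hash
# lookup. The width values are invented for illustration only.
WIDTHS = { 104 => 556, 101 => 556, 108 => 222, 111 => 556, 173 => 333 }

def character_width_by_code(code)
  WIDTHS.fetch(code, 0)
end

# Iterate the string lazily, forcing a full GC on every step, and record
# each codepoint the block actually receives along with the running width.
def measure_under_gc(string)
  seen = []
  total = 0
  string.each_codepoint do |cp|
    GC.start
    seen << cp
    total += character_width_by_code(cp)
  end
  [seen, total]
end

string = "hello\u00AD"               # bytes: 104, 101, 108, 108, 111, 194, 173
expected = string.codepoints.to_a    # eager snapshot taken before iterating
seen, total = measure_under_gc(string)

if seen == expected
  puts "ok: #{seen.inspect}"
else
  puts "MISMATCH: expected #{expected.inspect}, got #{seen.inspect}"
end
```

On a healthy interpreter the two sequences should match. A mismatch here would point at the string's buffer changing during iteration rather than at anything in the width lookup.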

You might think that something in the "string.codepoints.inject" loop is modifying the string, but it's not. I could show the contents of "character_width_by_code", but it would just be wasting your time, because it basically contains nothing but a couple of hash lookups.

If I uncomment "string.freeze", I can run the tests 100 times or more with no failure. (This proves that the string is not being modified by my code, because it would throw an exception otherwise!) Or, if I change the code to "string.codepoints.to_a.inject", again, the failure never happens.

Most revealingly, if I uncomment "GC.disable", I can run the test 100 or more times with no failure. As soon as I comment out "GC.disable", the random failure comes back, for about 1/7 - 1/10 of runs.

Sometimes I also get another random failure from the same place: an "invalid codepoint in UTF-8" exception from "string.codepoints.inject". Again, this proves something inside Ruby is corrupting the string, because the call to "string.codepoints" just 2 lines before prints the correct sequence, with no exception raised.

I'd like to boil this down to a smaller example which demonstrates the failure, but it's a hopeless task. When I take pieces of the code and run them in irb, the failure never happens. Even rebooting the computer may make it go away... but then, when I am working on Prawn again, sooner or later it happens again. (I know because similar intermittent failures have happened before in the past.) Once it starts happening, though, it's pretty consistent at about 1/7 - 1/10 of test runs.

Another clue is that the corrupted codepoints are *always* zero.

I can try to track down the problem, perhaps by adding some logging code to the Ruby interpreter source and recompiling, but I need some guidance on where to look. Can anyone who is familiar with Ruby internals (especially Strings and the GC) give me some ideas how to start?

-- 
http://bugs.ruby-lang.org/