From: Charlie Savage
Date: 2011-09-10T17:00:58+09:00
Subject: [ruby-core:39446] [Ruby 1.9 - Bug #5306] Application Hangs Due to Recent rb_thread_select Changes

Issue #5306 has been updated by Charlie Savage.

File strace_hangs.log added
File strace_completes.log added
File strace_pure.log added
File pmap.log added

OK, the first test gives strange results. Running this command:

    strace -f -v ruby -I.:lib:tests tests/test_epoll.rb -n test_datagrams

hangs the test, as expected. But running this command:

    strace -f -v ruby -I.:lib:tests tests/test_epoll.rb -n test_datagrams &> /tmp/strace1.log

lets the test run to completion. And then, annoyingly enough, that one particular test works after that. If I reboot the machine, the test hangs again.

I have attached two logs, strace_completes.log and strace_hangs.log. strace_hangs.log contains only the last few hundred lines (the rest scrolled off the top), but what I saw matches strace_completes.log up to line 2,271. After that, the two diverge.

The story is different for the second test:

    strace -v -v ruby -I.:lib:tests tests/test_pure.rb -n test_connrefused 2>&1 | tee /tmp/strace_pure.log

That log is attached.

As for your other questions:

> Also, can you extract these tests and run with a hand-picked port?

Sure. The connection-refused test intentionally picks the first unused port, which turns out to be 9001.

> I assume you tried a clean build/install of Ruby to make sure all
> objects got rebuilt and reinstalled?

Yes.

    $ cd /usr/src/ruby
    $ git pull    (on the ruby 193 branch)
    $ git clean -fx
    $ autoconf
    $ ./configure --prefix=/usr --enable-shared=true
    $ make
    $ make install

> Can you also try running `pmap $PID' on the hung processes to make
> sure it's loading the correct libs + versions?

    $ ps -ef | grep ruby
    cfis 16185 15381 4 01:51 pts/1 00:00:00 ruby -I.:lib:tests

    $ pmap 16185
    (see attached log)

Hope this info helps.
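
For anyone who wants to repeat the log comparison above, finding the first line where two strace logs diverge can be sketched in Ruby. This is a minimal, hypothetical helper and not part of eventmachine or the attached files: the names `normalize` and `first_divergence` are made up for illustration, and it assumes both logs were captured in full (for example by redirecting strace output to a file), which is not the case for the truncated strace_hangs.log. PIDs and hex addresses are masked because they legitimately differ between runs.

```ruby
# Mask the parts of an strace line that normally differ between runs.
def normalize(line)
  line.sub(/\A\[pid\s+\d+\]\s*/, "")      # "[pid 1234] " prefix printed by strace -f
      .sub(/\A\d+\s+/, "")                # bare PID prefix (seen when logging to a file)
      .gsub(/0x[0-9a-f]+/i, "0xADDR")     # pointers and mmap addresses
      .strip
end

# Return the 1-based number of the first line where the two logs differ
# after normalization, or nil if they match all the way through.
# A log that simply ends early counts as diverging at the first missing line.
def first_divergence(lines_a, lines_b)
  limit = [lines_a.length, lines_b.length].max
  (0...limit).each do |i|
    a = lines_a[i]
    b = lines_b[i]
    return i + 1 if a.nil? || b.nil? || normalize(a) != normalize(b)
  end
  nil
end

# Command-line use: ruby first_divergence.rb LOG_A LOG_B
if __FILE__ == $PROGRAM_NAME && ARGV.length == 2
  line = first_divergence(File.readlines(ARGV[0]), File.readlines(ARGV[1]))
  puts(line ? "logs diverge at line #{line}" : "logs match after normalization")
end
```

With `strace -f`, output from different threads can interleave differently from run to run, so the reported line is a starting point for reading the logs rather than a definitive answer.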
----------------------------------------
Bug #5306: Application Hangs Due to Recent rb_thread_select Changes
http://redmine.ruby-lang.org/issues/5306

Author: Charlie Savage
Status: Open
Priority: High
Assignee:
Category: core
Target version: 1.9.3
ruby -v: ruby 1.9.3dev (2011-09-09 revision 33236) [x86_64-linux]

This commit:

    4e9438bc9153f7a1f4ea0af85c8dbe359e1a55d8

changed the implementation of rb_thread_select. It causes eventmachine to hang on CentOS 5.5. Not sure what the issue is, but it's easily reproduced by running the test eventmachine/tests/test_epoll.rb. We noticed this because it also causes the tweetstream gem to hang.

The same setup works on Fedora 14 and an up-to-date Arch Linux. Specific version information is included below.

We temporarily fixed this by reverting the commit. Since CentOS is a common production environment (and the one we are using), this seems to us a blocker for 1.9.3.

We are happy to provide any additional information or test fixes.

Thanks - Charlie

--------------

We are running this version of CentOS:

    Linux app1.zerista.com 2.6.18-238.19.1.el5.centos.plus #1 SMP Mon Jul 18 10:05:09 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux

And this version of Fedora:

    Linux ammonite.internal.zerista.com 2.6.35.14-95.fc14.x86_64 #1 SMP Tue Aug 16 21:01:58 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

And this version of eventmachine:

    eventmachine (1.0.0.beta.3)

And this version of tweetstream:

    tweetstream (1.0.4)

--
http://redmine.ruby-lang.org