[#102687] [Ruby master Bug#17666] Sleep in a thread hangs when Fiber.set_scheduler is set — arjundas.27586@...

Issue #17666 has been reported by arjunmdas (arjun das).

16 messages 2021/03/02

[#102776] [Ruby master Bug#17678] Ractors do not restart after fork — knuckles@...

Issue #17678 has been reported by ivoanjo (Ivo Anjo).

8 messages 2021/03/08

[#102797] [Ruby master Feature#17684] Remove `--disable-gems` from release version of Ruby — hsbt@...

Issue #17684 has been reported by hsbt (Hiroshi SHIBATA).

17 messages 2021/03/10

[#102829] [Ruby master Bug#17718] a method paramaters object that can be pattern matched against — dsisnero@...

Issue #17718 has been reported by dsisnero (Dominic Sisneros).

9 messages 2021/03/11

[#102832] [Ruby master Misc#17720] Cirrus CI to check non-x86_64 architecture cases by own machines — jaruga@...

Issue #17720 has been reported by jaruga (Jun Aruga).

19 messages 2021/03/12

[#102850] [Ruby master Bug#17723] autoconf 2.70+ is not working with master branch — hsbt@...

Issue #17723 has been reported by hsbt (Hiroshi SHIBATA).

11 messages 2021/03/14

[#102884] [Ruby master Bug#17725] Prepend Breaks Ability to Alias — josh@...

Issue #17725 has been reported by joshuadreed (Josh Reed).

14 messages 2021/03/16

[#102914] [Ruby master Bug#17728] [BUG] Segmentation fault at 0x0000000000000000 — denthebat@...

Issue #17728 has been reported by meliborn (Denis Denis).

13 messages 2021/03/18

[#102919] [Ruby master Bug#17730] Ruby on macOS transitively links to ~150 dylibs — rickmark@...

Issue #17730 has been reported by rickmark (Rick Mark).

10 messages 2021/03/18

[#103013] [Ruby master Bug#17748] Ruby 3.0 takes a long time to resolv DNS of nonexistent domains — xdmx@...

Issue #17748 has been reported by xdmx (Eric Bloom).

8 messages 2021/03/25

[#103026] [Ruby master Feature#17749] Const source location without name — tenderlove@...

Issue #17749 has been reported by tenderlovemaking (Aaron Patterson).

10 messages 2021/03/25

[#103036] [Ruby master Misc#17751] Do these instructions (<<, +, [0..n]) modify the original string without creating copies? — cart4for1@...

Issue #17751 has been reported by stiuna (Juan Gregorio).

11 messages 2021/03/26

[#103040] [Ruby master Feature#17752] Enable -Wundef for C extensions in repository — eregontp@...

Issue #17752 has been reported by Eregon (Benoit Daloze).

23 messages 2021/03/26

[#103044] [Ruby master Feature#17753] Add Module#outer_scope — tenderlove@...

Issue #17753 has been reported by tenderlovemaking (Aaron Patterson).

31 messages 2021/03/26

[#103088] [Ruby master Feature#17760] Where we should install a header file when `gem install --user`? — muraken@...

Issue #17760 has been reported by mrkn (Kenta Murata).

11 messages 2021/03/30

[#103102] [Ruby master Feature#17762] A simple way to trace object allocation — mame@...

Issue #17762 has been reported by mame (Yusuke Endoh).

18 messages 2021/03/30

[#103105] [Ruby master Feature#17763] Implement cache for cvars — eileencodes@...

Issue #17763 has been reported by eileencodes (Eileen Uchitelle).

18 messages 2021/03/30

[ruby-core:102744] [Ruby master Bug#17671] Possible deadlock with condition variables in Net::IMAP

From: stanhu@...
Date: 2021-03-05 16:41:00 UTC
List: ruby-core #102744
Issue #17671 has been reported by stanhu (Stan Hu).

----------------------------------------
Bug #17671: Possible deadlock with condition variables in Net::IMAP 
https://bugs.ruby-lang.org/issues/17671

* Author: stanhu (Stan Hu)
* Status: Open
* Priority: Normal
* Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN
----------------------------------------
We're using Net::IMAP via the [MailRoom gem](https://github.com/tpitale/mail_room), and quite frequently we are seeing issues with the process not terminating even though we attempt to run `#Thread#join` with a 60-second timeout.

A GDB backtrace shows that this is stuck waiting for a conditional variable:

```
Thread 2 (Thread 0x7f2bf81ff700 (LWP 11)):
#0  futex_wait_cancelable (private=0, expected=0, futex_word=0x7f2bfa3df810) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
#1  __pthread_cond_wait_common (abstime=0x0, mutex=0x7f2bfa3df840, cond=0x7f2bfa3df7e8) at pthread_cond_wait.c:502
#2  __pthread_cond_wait (cond=cond@entry=0x7f2bfa3df7e8, mutex=mutex@entry=0x7f2bfa3df840) at pthread_cond_wait.c:655
#3  0x00007f2bffe343fd in rb_native_cond_wait (mutex=0x7f2bfa3df840, cond=0x7f2bfa3df7e8) at thread_pthread.c:1197
#4  native_cond_sleep (th=th@entry=0x7f2bfa3df780, rel=rel@entry=0x0) at thread_pthread.c:1197
#5  0x00007f2bffe346f4 in native_sleep (th=th@entry=0x7f2bfa3df780, rel=0x0) at thread_pthread.c:2113
#6  0x00007f2bffe376de in sleep_forever (fl=1, th=0x7f2bfa3df780) at thread.c:1252
#7  rb_thread_sleep_deadly_allow_spurious_wakeup () at thread.c:1343
#8  rb_mutex_sleep_forever (time=time@entry=8) at thread_sync.c:451
#9  0x00007f2bffcce134 in rb_ensure (b_proc=b_proc@entry=0x7f2bffe37660 <rb_mutex_sleep_forever>, data1=data1@entry=8, e_proc=e_proc@entry=0x7f2bffe36160 <mutex_lock_uninterruptible>,
    data2=data2@entry=139826840321280) at eval.c:1129
#10 0x00007f2bffe361bd in rb_mutex_sleep (self=139826840321280, timeout=<optimized out>) at thread_sync.c:477
#11 0x00007f2bffe796e1 in vm_call0_cfunc_with_frame (argv=0x7f2bf80fc858, cd=<optimized out>, calling=0x7f2bf80fc700, ec=0x7f2bfa209bd0) at vm_eval.c:91
#12 vm_call0_cfunc (argv=0x7f2bf80fc858, cd=<optimized out>, calling=0x7f2bf80fc700, ec=0x7f2bfa209bd0) at vm_eval.c:105
#13 vm_call0_body (ec=0x7f2bfa209bd0, calling=0x7f2bf80fc700, cd=<optimized out>, argv=0x7f2bf80fc858) at vm_eval.c:140
#14 0x00007f2bffe7bfd8 in rb_funcallv_with_cc (cd=0x7f2bfffa75c0 <rb_funcallv_data>, recv=139826840321280, mid=24193, argc=1, argv=<optimized out>) at vm_core.h:1805
#15 0x00007f2bffcce134 in rb_ensure (b_proc=b_proc@entry=0x7f2bffe2f360 <do_sleep>, data1=data1@entry=139826822105168, e_proc=e_proc@entry=0x7f2bffe2df60 <delete_from_waitq>, data2=data2@entry=139826822105184)
    at eval.c:1129
#16 0x00007f2bffe2f472 in rb_condvar_wait (argc=<optimized out>, argv=0x7f2bf80fcad0, self=139826858411400) at thread_sync.c:1407
#17 0x00007f2bffe796e1 in vm_call0_cfunc_with_frame (argv=0x7f2bf80fcad0, cd=<optimized out>, calling=0x7f2bf80fc980, ec=0x7f2bfa209bd0) at vm_eval.c:91
#18 vm_call0_cfunc (argv=0x7f2bf80fcad0, cd=<optimized out>, calling=0x7f2bf80fc980, ec=0x7f2bfa209bd0) at vm_eval.c:105
#19 vm_call0_body (ec=0x7f2bfa209bd0, calling=0x7f2bf80fc980, cd=<optimized out>, argv=0x7f2bf80fcad0) at vm_eval.c:140
#20 0x00007f2bffe79d5f in rb_vm_call0 (ec=ec@entry=0x7f2bfa209bd0, recv=recv@entry=139826858411400, id=id@entry=24353, argc=<optimized out>, argv=<optimized out>, me=me@entry=0x7f2bfef8b9d0, kw_splat=0)
    at vm_eval.c:52
#21 0x00007f2bffe79ff8 in rb_vm_call_kw (ec=0x7f2bfa209bd0, recv=139826858411400, id=24353, argc=<optimized out>, argc@entry=2, argv=<optimized out>, argv@entry=0x7f2bf80fcab0, me=0x7f2bfef8b9d0,
    kw_splat=<optimized out>) at vm_eval.c:268
#22 0x00007f2bffe7a453 in rb_call0 (ec=<optimized out>, recv=<optimized out>, mid=<optimized out>, argc=argc@entry=2, argv=argv@entry=0x7f2bf80fcab0, call_scope=call_scope@entry=CALL_FCALL,
    self=<optimized out>) at vm_eval.c:392
#23 0x00007f2bffe7ac7e in rb_call (scope=CALL_FCALL, argv=0x7f2bf80fcab0, argc=2, mid=<optimized out>, recv=<optimized out>) at vm_eval.c:718
#24 rb_funcallv (recv=<optimized out>, mid=<optimized out>, argc=argc@entry=2, argv=argv@entry=0x7f2bf80fcad0) at vm_eval.c:958
#25 0x00007f2bff2f62e1 in monitor_wait_for_cond_body (v=v@entry=139826822106112) at monitor.c:152
#26 0x00007f2bffcce134 in rb_ensure (b_proc=b_proc@entry=0x7f2bff2f62a0 <monitor_wait_for_cond_body>, data1=data1@entry=139826822106112, e_proc=e_proc@entry=0x7f2bff2f64a0 <monitor_enter_for_cond>,
    data2=data2@entry=139826822106112) at eval.c:1129
#27 0x00007f2bff2f641e in monitor_wait_for_cond (monitor=139826840321320, cond=<optimized out>, timeout=8) at monitor.c:180
#28 0x00007f2bffe66c59 in vm_call_cfunc_with_frame (empty_kw_splat=<optimized out>, cd=0x7f2bfe9c8b70, calling=<optimized out>, reg_cfp=0x7f2bf81fd268, ec=0x7f2bfa209bd0) at vm_insnhelper.c:2514
#29 vm_call_cfunc (ec=0x7f2bfa209bd0, reg_cfp=0x7f2bf81fd268, calling=<optimized out>, cd=0x7f2bfe9c8b70) at vm_insnhelper.c:2539
#30 0x00007f2bffe72092 in vm_sendish (block_handler=<optimized out>, method_explorer=<optimized out>, cd=<optimized out>, reg_cfp=<optimized out>, ec=<optimized out>) at vm_insnhelper.c:4023
#31 vm_exec_core (ec=0x7f2bfa3df810, initial=128) at insns.def:801
#32 0x00007f2bffe78365 in rb_vm_exec (ec=0x7f2bfa209bd0, mjit_enable_p=1) at vm.c:1929
#33 0x00007f2bffe82130 in invoke_block (captured=0x7f2bf81fd3d0, opt_pc=<optimized out>, type=<optimized out>, cref=<optimized out>, self=139826840321400, iseq=0x7f2bfef91600, ec=0x7f2bfa209bd0) at vm.c:1044
#34 invoke_iseq_block_from_c (me=0x0, is_lambda=0, cref=<optimized out>, passed_block_handler=<optimized out>, kw_splat=<optimized out>, argv=<optimized out>, argc=0, self=139826840321400,
    captured=0x7f2bf81fd3d0, ec=0x7f2bfa209bd0) at vm.c:1116
#35 invoke_block_from_c_bh (argc=<optimized out>, argv=<optimized out>, kw_splat=<optimized out>, passed_block_handler=<optimized out>, cref=<optimized out>, is_lambda=<optimized out>,
    force_blockarg=<optimized out>, block_handler=<optimized out>, ec=<optimized out>) at vm.c:1134
#36 vm_yield (argc=<optimized out>, argv=<optimized out>, kw_splat=<optimized out>, ec=<optimized out>) at vm.c:1179
#37 rb_yield_0 (argv=<optimized out>, argc=<optimized out>) at vm_eval.c:1227
#38 rb_yield_values (n=<optimized out>) at vm_eval.c:1252
#39 0x00007f2bffe82eef in vm_yield (kw_splat=0, argv=<optimized out>, argc=<optimized out>, ec=<optimized out>) at vm.c:1165
#40 rb_yield_0 (argv=<optimized out>, argc=<optimized out>) at vm_eval.c:1227
#41 rb_yield_values2 (argc=<optimized out>, argv=<optimized out>) at vm_eval.c:1273
#42 0x00007f2bffcce134 in rb_ensure (b_proc=0x7f2bff2f6380 <monitor_sync_body>, data1=139826840321320, e_proc=<optimized out>, data2=<optimized out>) at eval.c:1129
#43 0x00007f2bffe66c59 in vm_call_cfunc_with_frame (empty_kw_splat=<optimized out>, cd=0x7f2bfa4f19e0, calling=<optimized out>, reg_cfp=0x7f2bf81fd380, ec=0x7f2bfa209bd0) at vm_insnhelper.c:2514
#44 vm_call_cfunc (ec=0x7f2bfa209bd0, reg_cfp=0x7f2bf81fd380, calling=<optimized out>, cd=0x7f2bfa4f19e0) at vm_insnhelper.c:2539
#45 0x00007f2bffe72150 in vm_sendish (method_explorer=<optimized out>, block_handler=<optimized out>, cd=<optimized out>, reg_cfp=<optimized out>, ec=<optimized out>) at vm_insnhelper.c:4023
#46 vm_exec_core (ec=0x7f2bfa3df810, initial=128) at insns.def:782
#47 0x00007f2bffe78365 in rb_vm_exec (ec=0x7f2bfa209bd0, mjit_enable_p=1) at vm.c:1929
#48 0x00007f2bffe792bc in invoke_iseq_block_from_c (me=0x0, is_lambda=<optimized out>, cref=0x0, passed_block_handler=0, kw_splat=<optimized out>, argv=<optimized out>, argc=0, self=139826857723800,
    captured=<optimized out>, ec=0x7f2bfa209bd0) at vm.c:1116
#49 invoke_block_from_c_proc (me=0x0, is_lambda=<optimized out>, passed_block_handler=0, kw_splat=<optimized out>, argv=<optimized out>, argc=0, self=139826857723800, proc=0x7f2bfa209bd0, ec=0x7f2bfa209bd0)
    at vm.c:1216
#50 vm_invoke_proc (passed_block_handler=0, kw_splat=<optimized out>, argv=<optimized out>, argc=0, self=139826857723800, proc=0x7f2bfa209bd0, ec=0x7f2bfa209bd0) at vm.c:1238
#51 rb_vm_invoke_proc (ec=0x7f2bfa209bd0, proc=proc@entry=0x7f2bf98a4510, argc=0, argv=<optimized out>, kw_splat=<optimized out>, passed_block_handler=passed_block_handler@entry=0) at vm.c:1259
#52 0x00007f2bffe3646d in thread_do_start (th=0x7f2bfa3df780) at thread.c:697
#53 0x00007f2bffe384ef in thread_start_func_2 (th=0x7f2bfa3df780, stack_start=<optimized out>) at thread.c:745
#54 0x00007f2bffe38a34 in thread_start_func_1 (th_ptr=<optimized out>) at thread_pthread.c:969
#55 0x00007f2bff9c5fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#56 0x00007f2bff4574cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f2bff33ff80 (LWP 7)):
#0  0x00007f2bff44c916 in __GI_ppoll (fds=fds@entry=0x7fff52ea2f78, nfds=nfds@entry=1, timeout=<optimized out>, timeout@entry=0x7fff52ea2f80, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
#1  0x00007f2bffe33880 in rb_sigwait_sleep (th=th@entry=0x7f2bfee2d000, sigwait_fd=sigwait_fd@entry=3, rel=rel@entry=0x7fff52ea3088) at hrtime.h:148
#2  0x00007f2bffe34589 in native_sleep (th=0x7f2bfee2d000, rel=0x7fff52ea3088) at thread_pthread.c:2099
#3  0x00007f2bffe37b64 in rb_thread_terminate_all () at thread.c:587
#4  0x00007f2bffccfbaa in rb_ec_cleanup (ec=ec@entry=0x7f2bfee09050, ex=<optimized out>) at eval.c:218
#5  0x00007f2bffccfed3 in ruby_run_node (n=0x7f2bfa503a98) at eval.c:336
#6  0x000055dabf0fe10b in main (argc=<optimized out>, argv=<optimized out>) at ./main.c:50
```

`Net::IMAP` uses several conditional variables:

1. `idle` : https://github.com/ruby/ruby/blob/48f324e92f9b36edc267f9871e35039cbd1c2eb9/lib/net/imap.rb#L965
1. `get_tagged_response`: https://github.com/ruby/ruby/blob/48f324e92f9b36edc267f9871e35039cbd1c2eb9/lib/net/imap.rb#L1215
1. `send_literal`: https://github.com/ruby/ruby/blob/48f324e92f9b36edc267f9871e35039cbd1c2eb9/lib/net/imap.rb#L1368

The last two do NOT have a timeout, so it's possible we're getting stuck in one of those cases.

I noticed that `send_literal` only checks the state of `@exception` *after* the `wait` returns. Do we need to do this?

```diff
diff --git a/lib/net/imap.rb b/lib/net/imap.rb
index 720acbc86d..2e1fd6e7b9 100644
--- a/lib/net/imap.rb
+++ b/lib/net/imap.rb
@@ -1365,6 +1365,7 @@ def send_literal(str, tag = nil)
         @continued_command_tag = tag
         @continuation_request_exception = nil
         begin
+          raise @exception if @exception
           @continuation_request_arrival.wait
           e = @continuation_request_exception || @exception
           raise e if e
```




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread

Prev Next