From: "Eregon (Benoit Daloze)" Date: 2022-08-21T12:13:11+00:00 Subject: [ruby-core:109613] [Ruby master Feature#18965] Further Thread::Queue improvements Issue #18965 has been updated by Eregon (Benoit Daloze). byroot (Jean Boussier) wrote in #note-10: > Is it true for both push and pop? I can see the issue for push, but no so much for pop. Interesting, for `LinkedBlockingQueue` there is `java.util.concurrent.BlockingQueue#drainTo(java.util.Collection output, int maxElements)` (semantics are remove up to maxElements, don't wait for more elements). But `LinkedBlockingQueue` is actually not a non-blocking queue (it uses 2 locks). `ConcurrentLinkedQueue` is a non-blocking queue and more what I was thinking about for non-blocking queue implementations. That doesn't have a batch pop, and is literally just CAS operations for both push and pop. But that's probably not directly usable to implement Ruby's Queue since it doesn't have a blocking pop which Queue#pop needs. I think there can be queue implementations in between, which use CAS(node) for push/pop, but if it fails then goes to a lock or similar to wait. In such a case, it's not possible to implement batch push/pop atomically, or it would force to give up on the CAS fast path and take a lock always (a lock can use CAS internally but basically it's extra work, indirections and most importantly contention). It's still possible to implement batch pop as `N.times { queue.pop }` of course (with no performance benefit). I think the problem is mostly semantics, if there is a batch pop, it might lead to people to think it removes N elements atomically or any better than `N.times { queue.pop }`, but in fact it might not (atomic or better perf). And if a single Ruby implementation actually has atomic semantics there, people might rely on it, just like they rely on Set being ordered (even though it was never documented). And therefore that would basically prevent some queue implementations to be used for compatibility (and e.g. force all Ruby Queue impls to be single-lock or maybe dual-lock depending on the atomicity guarantees). Things like linearizability just don't work for batch push/pop. So I think it's better to keep single push/pop, because those semantics are well-defined and honored by all queue implementations. If the user does `N.times { queue.pop }`, then the semantics are very clear. ---------------------------------------- Feature #18965: Further Thread::Queue improvements https://bugs.ruby-lang.org/issues/18965#change-98797 * Author: byroot (Jean Boussier) * Status: Open * Priority: Normal ---------------------------------------- Following the recent addition of a `timeout` parameter to `Queue#pop`, there are a handful of other improvements I'd like to make. ### Batch insert When using the queue for batch processing, it would be good to be able to push multiple elements at once: Currently you have to call `push` repeatedly ```ruby items.each do |item| queue.push(item) end ``` That's wasteful because on each call we check wether the queue is closed, try to wakeup blocked threads, etc. It would be much better if you could do: ```ruby queue.concat(items) ``` With of course both `nonblock` and `timeout` support. Then there's the question of how `SizedQueue` would behave if it's not full, but still doesn't have space for all the elements. e.g. ```ruby queue = SizedQueue.new(10) queue.concat(6.times.to_a) queue.concat(6.times.to_a) # Block until there is 6 free slots? ``` I think the simplest would be to wait for enough space to append the entire set, because combined with a timeout, it would be awkward if only part of the array was concatenated. ### Batch pop Similarly, sometimes the consumer of a queue is capable of batching, and right now it's not efficient: ```ruby loop do items = [queue.pop] begin 99.times do items << queue.pop(true) # true is for nonblock end rescue ThreadError # empty queue end process_items(items) end ``` It would be much more efficient if `pop` accepted a `count` parameter: ```ruby loop do items = queue.pop(count: 100) process_items(items) end ``` The behavior would be: - Block if the queue is empty - If it's not empty, return **up to** `count` items (Just like `Array#pop`) ### Non blocking mode, without exception As shown above, the current `nonblock` parameter is a bit awkward, because: - It raises an exception, which is very expensive for a construct often used in "low level" code. - The exception is `ThreadError`, so you may have to match the error message for `"queue empty"`, to make sure it doesn't come from a Mutex issue or something like that. I believe that we could introduce a keyword argument: ```ruby Queue.new.pop(nonblock: true) # => nil ``` -- https://bugs.ruby-lang.org/ Unsubscribe: