From: "Eregon (Benoit Daloze)"
Date: 2022-08-18T18:59:09+00:00
Subject: [ruby-core:109563] [Ruby master Feature#18965] Further Thread::Queue improvements

Issue #18965 has been updated by Eregon (Benoit Daloze).

chrisseaton (Chris Seaton) wrote in #note-3:
> I was going to comment that adding or removing multiple items from a queue is likely not great for implementation, as we'd need a lock to make that atomic (or some good ideas of how to do it otherwise.) But then I looked at TruffleRuby and even we're using a lock anyway, so nothing here is a problem.

It could be though.
At some point TruffleRuby used Java's `LinkedBlockingQueue` (removed in this [PR](https://github.com/oracle/truffleruby/pull/13) because it's hard to implement `Queue#num_waiting` without poking in JVM internals) and JRuby currently uses a [variant of it](https://github.com/jruby/jruby/blob/master/core/src/main/java/org/jruby/ext/thread/Queue.java), which are non-blocking queue implementations, so they (typically) cannot do multiple operations (push, pop, ...) atomically.
(there are also trade-offs with a non-blocking queue as it might be slower in single-threaded/low contention cases, e.g. https://github.com/oracle/truffleruby/issues/595)

Java's `LinkedBlockingQueue` also doesn't seem to implement `addAll()` so it's just the default implementation adding one by one, hence `addAll()` is not atomic and the added elements might be interleaved with another thread's pushed elements.
That might be an indication a batch push/pop is not that big a gain or too problematic for contention.

----------------------------------------
Feature #18965: Further Thread::Queue improvements
https://bugs.ruby-lang.org/issues/18965#change-98734

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
Following the recent addition of a `timeout` parameter to `Queue#pop`, there are a handful of other improvements I'd like to make.

### Batch insert

When using the queue for batch processing, it would be good to be able to push multiple elements at once. Currently you have to call `push` repeatedly:

```ruby
items.each do |item|
  queue.push(item)
end
```

That's wasteful because on each call we check whether the queue is closed, try to wake up blocked threads, etc.

It would be much better if you could do:

```ruby
queue.concat(items)
```

With of course both `nonblock` and `timeout` support.

Then there's the question of how `SizedQueue` would behave if it's not full, but still doesn't have space for all the elements, e.g.:

```ruby
queue = SizedQueue.new(10)
queue.concat(6.times.to_a)
queue.concat(6.times.to_a) # Block until there are 6 free slots?
```

I think the simplest option would be to wait for enough space to append the entire set, because combined with a timeout, it would be awkward if only part of the array was concatenated.

### Batch pop

Similarly, sometimes the consumer of a queue is capable of batching, and right now it's not efficient:

```ruby
loop do
  items = [queue.pop]
  begin
    99.times do
      items << queue.pop(true) # true is for nonblock
    end
  rescue ThreadError # empty queue
  end
  process_items(items)
end
```

It would be much more efficient if `pop` accepted a `count` parameter:

```ruby
loop do
  items = queue.pop(count: 100)
  process_items(items)
end
```

The behavior would be:

  - Block if the queue is empty
  - If it's not empty, return **up to** `count` items (just like `Array#pop`)

### Non-blocking mode, without exception

As shown above, the current `nonblock` parameter is a bit awkward, because:

  - It raises an exception, which is very expensive for a construct often used in "low level" code.
  - The exception is `ThreadError`, so you may have to match the error message for `"queue empty"`, to make sure it doesn't come from a Mutex issue or something like that.

I believe that we could introduce a keyword argument:

```ruby
Queue.new.pop(nonblock: true) # => nil
```
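
For comparison, the batch-pop and non-raising behaviors described above can be approximated with the existing API. This is only a minimal sketch, not the proposed implementation: `QueueBatch`, `try_pop`, and `pop_up_to` are hypothetical names, and it assumes the empty-queue `ThreadError` message contains `"queue empty"`.

```ruby
# Hypothetical helpers emulating the proposed semantics with the existing
# Queue API. None of these names are part of Ruby.
module QueueBatch
  # Assumption: the message used by ThreadError when the queue is empty.
  EMPTY_MESSAGE = "queue empty"

  # Non-blocking pop that returns nil instead of raising on an empty queue.
  # Any other ThreadError is re-raised.
  def self.try_pop(queue)
    queue.pop(true) # true is for nonblock
  rescue ThreadError => e
    raise unless e.message.include?(EMPTY_MESSAGE)
    nil
  end

  # Blocks until one item is available, then takes up to `count - 1` more
  # without blocking. Returns an array of 1..count items.
  def self.pop_up_to(queue, count)
    items = [queue.pop]
    while items.size < count
      begin
        items << queue.pop(true)
      rescue ThreadError => e
        raise unless e.message.include?(EMPTY_MESSAGE)
        break # queue drained before reaching `count`
      end
    end
    items
  end
end

queue = Queue.new
5.times { |i| queue << i }
first_batch = QueueBatch.pop_up_to(queue, 3)
rest = QueueBatch.pop_up_to(queue, 100)
```

Note that `try_pop` cannot tell an empty queue apart from a queued `nil`, and `pop_up_to` still raises and rescues one exception whenever the queue drains before `count` is reached, which is the overhead a built-in `count:` or `nonblock:` keyword would avoid.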