[ruby-core:109588] [Ruby master Feature#18965] Further Thread::Queue improvements
From: "Eregon (Benoit Daloze)" <noreply@...>
Date: 2022-08-20 11:13:12 UTC
List: ruby-core #109588
Issue #18965 has been updated by Eregon (Benoit Daloze).
Thank you for the benchmark.
Given the results, I think it's currently not worth it to add batch push/pop, because it does not seem to improve performance much over a loop of single push/pop in realistic workloads and the semantics are quite a bit more complicated.
Also, depending on the Queue/SizedQueue implementation (this is typical of non-blocking queues), there might simply be no way to do a batch push/pop, making the performance benefit effectively zero for those implementations (well, slightly fewer Ruby calls, but JITs are good at removing that cost).
I'd suggest focusing on `Queue.new.pop(nonblock: true) # => nil`.
Using the keyword as a way to deprecate the positional argument makes sense to me, so I agree with `pop(nonblock: true/false)`.
----------------------------------------
Feature #18965: Further Thread::Queue improvements
https://bugs.ruby-lang.org/issues/18965#change-98763
* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
Following the recent addition of a `timeout` parameter to `Queue#pop`, there are a handful of other improvements I'd like to make.
### Batch insert
When using the queue for batch processing, it would be good to be able to push multiple elements at once.
Currently you have to call `push` repeatedly:
```ruby
items.each do |item|
  queue.push(item)
end
```
That's wasteful because on each call we check whether the queue is closed, try to wake up blocked threads, etc.
It would be much better if you could do:
```ruby
queue.concat(items)
```
With of course both `nonblock` and `timeout` support.
Then there's the question of how `SizedQueue` would behave if it's not full, but still doesn't have space for all the elements. e.g.
```ruby
queue = SizedQueue.new(10)
queue.concat(6.times.to_a)
queue.concat(6.times.to_a) # Block until there are 6 free slots?
```
I think the simplest behavior would be to wait until there is enough space to append the entire set, because combined with a timeout, it would be awkward if only part of the array were concatenated.
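To make the "wait for enough space" semantics concrete, here is a minimal sketch of an all-or-nothing batch append. `BatchSizedQueue` is a hypothetical stand-alone class written for illustration, not `Thread::SizedQueue`, and `concat` is only the method name proposed above, not an existing API. Returning `false` on timeout and raising for oversized batches are assumptions of the sketch too.

```ruby
# Hypothetical illustration of all-or-nothing batch insertion.
# This is NOT Thread::SizedQueue; `concat` does not exist on the real class.
class BatchSizedQueue
  def initialize(max)
    @max = max
    @items = []
    @mutex = Mutex.new
    @space_available = ConditionVariable.new
    @item_available = ConditionVariable.new
  end

  # Appends the whole batch at once, waiting until all of it fits.
  # Returns false if `timeout` (seconds) expires first; nothing is added then.
  def concat(batch, timeout: nil)
    raise ArgumentError, "batch larger than queue capacity" if batch.size > @max

    deadline = timeout && Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @mutex.synchronize do
      while @items.size + batch.size > @max
        remaining = deadline && deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        return false if remaining && remaining <= 0
        @space_available.wait(@mutex, remaining)
      end
      @items.concat(batch)
      @item_available.broadcast
      true
    end
  end

  # Blocking pop of a single item.
  def pop
    @mutex.synchronize do
      @item_available.wait(@mutex) while @items.empty?
      item = @items.shift
      @space_available.broadcast
      item
    end
  end

  def size
    @mutex.synchronize { @items.size }
  end
end
```

With a capacity of 10 and 6 items already queued, a second `concat` of 6 items would block (or time out) without inserting anything, which is the behavior argued for above.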
### Batch pop
Similarly, sometimes the consumer of a queue is capable of batching, and right now it's not efficient:
```ruby
loop do
  items = [queue.pop]
  begin
    99.times do
      items << queue.pop(true) # true is for nonblock
    end
  rescue ThreadError # empty queue
  end
  process_items(items)
end
```
It would be much more efficient if `pop` accepted a `count` parameter:
```ruby
loop do
  items = queue.pop(count: 100)
  process_items(items)
end
```
The behavior would be:
- Block if the queue is empty
- If it's not empty, return **up to** `count` items (just like `Array#pop` with an argument)
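For comparison, the two rules above can be approximated with the existing API. `pop_batch` is a hypothetical helper name used only for this sketch. It blocks for the first item and then drains without blocking, so it still pays the per-call overhead that a native `count:` parameter is meant to avoid.

```ruby
# Hypothetical helper emulating the proposed `pop(count:)` with today's API.
def pop_batch(queue, count)
  items = [queue.pop]            # blocks until at least one item is available
  while items.size < count
    begin
      items << queue.pop(true)   # positional `true` means non-blocking
    rescue ThreadError           # queue drained before reaching `count`
      break
    end
  end
  items
end

queue = Queue.new
5.times { |i| queue << i }
first = pop_batch(queue, 3)   # three oldest items, in FIFO order
rest  = pop_batch(queue, 100) # whatever is left, fewer than 100
```

Note that on a closed and empty queue the blocking `pop` returns `nil`, so this sketch would return `[nil]` in that case.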
### Non blocking mode, without exception
As shown above, the current `nonblock` parameter is a bit awkward, because:
- It raises an exception, which is very expensive for a construct often used in "low level" code.
- The exception is `ThreadError`, so you may have to match the error message for `"queue empty"`, to make sure it doesn't come from a Mutex issue or something like that.
I believe that we could introduce a keyword argument:
```ruby
Queue.new.pop(nonblock: true) # => nil
```
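Until such a keyword exists, the same result can be approximated by wrapping the positional form. `try_pop` is a hypothetical helper name. The sketch assumes the `"queue empty"` message mentioned above and re-raises any other `ThreadError`.

```ruby
# Hypothetical wrapper: non-blocking pop that returns nil instead of raising.
def try_pop(queue)
  queue.pop(true)                          # positional `true` means non-blocking
rescue ThreadError => e
  raise unless e.message == "queue empty"  # let unrelated ThreadErrors propagate
  nil
end

queue = Queue.new
try_pop(queue)   # nothing queued, so no exception reaches the caller
queue << :job
try_pop(queue)   # returns the queued item
```

This still raises and rescues internally, so it does not remove the cost of the exception; it only hides it from the caller. A `nil` result is also ambiguous if `nil` itself can be pushed onto the queue.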
--
https://bugs.ruby-lang.org/