From: "byroot (Jean Boussier)" Date: 2022-08-18T19:59:35+00:00 Subject: [ruby-core:109564] [Ruby master Feature#18965] Further Thread::Queue improvements Issue #18965 has been updated by byroot (Jean Boussier). > That would mean basically ignoring the timeout, not great. Maybe my writing wasn't clear. I don't suggest to ignore the timeout, but to either append all items or none. If there isn't enough space available, wait for as long as the timeout allows you to. > I think for these batch push/pop, they should be motivated with benchmarks (ideally based on some real use-case) and a proof-of-concept PR. Like the previous changes I requested on queue, this is motivated by a real world use case, basically I'd like to use a SizedQueue in this code: https://github.com/Shopify/statsd-instrument/blob/da8a74c7fbe8fdd421aa1df7eda2a996bc7c3f11/lib/statsd/instrument/batched_udp_sink.rb#L93-L123 This library has a background thread that monitors a queue, everytime the queue has X items or that Y seconds have passed, it flushes the queue and emit `statsd` events. I can try implementing the suggested feature and then benchmark that code using it. > The kwarg seems a bit confusing given the existing positional arg (Queue#pop(non_block=false)). Indeed, in my mind the positional argument would be somewhat deprecated. > Is there any core method currently with a nonblock keyword argument? I don't think so. > Why not exception: false which seems more standard and established, i.e., queue.pop(true, exception: false)? Too verbose/inconvenient? I think the main reason is that I don't like that positional argument. But yes `exception: false` would be consistent with `IO#read_nonblock`, except that if passed alone it does nothing, which is awkward. > so it's just the default implementation adding one by one, hence addAll() is not atomic I don't think that's really a problem, at least not for the use cases I envision, of a background thread collecting work and processing it in batches. Worst case if it's not atomic, the background thread will get the first element, and then get the rest on its next pop. ---------------------------------------- Feature #18965: Further Thread::Queue improvements https://bugs.ruby-lang.org/issues/18965#change-98735 * Author: byroot (Jean Boussier) * Status: Open * Priority: Normal ---------------------------------------- Following the recent addition of a `timeout` parameter to `Queue#pop`, there are a handful of other improvements I'd like to make. ### Batch insert When using the queue for batch processing, it would be good to be able to push multiple elements at once: Currently you have to call `push` repeatedly ```ruby items.each do |item| queue.push(item) end ``` That's wasteful because on each call we check wether the queue is closed, try to wakeup blocked threads, etc. It would be much better if you could do: ```ruby queue.concat(items) ``` With of course both `nonblock` and `timeout` support. Then there's the question of how `SizedQueue` would behave if it's not full, but still doesn't have space for all the elements. e.g. ```ruby queue = SizedQueue.new(10) queue.concat(6.times.to_a) queue.concat(6.times.to_a) # Block until there is 6 free slots? ``` I think the simplest would be to wait for enough space to append the entire set, because combined with a timeout, it would be awkward if only part of the array was concatenated. ### Batch pop Similarly, sometimes the consumer of a queue is capable of batching, and right now it's not efficient: ```ruby loop do items = [queue.pop] begin 99.times do items << queue.pop(true) # true is for nonblock end rescue ThreadError # empty queue end process_items(items) end ``` It would be much more efficient if `pop` accepted a `count` parameter: ```ruby loop do items = queue.pop(count: 100) process_items(items) end ``` The behavior would be: - Block if the queue is empty - If it's not empty, return **up to** `count` items (Just like `Array#pop`) ### Non blocking mode, without exception As shown above, the current `nonblock` parameter is a bit awkward, because: - It raises an exception, which is very expensive for a construct often used in "low level" code. - The exception is `ThreadError`, so you may have to match the error message for `"queue empty"`, to make sure it doesn't come from a Mutex issue or something like that. I believe that we could introduce a keyword argument: ```ruby Queue.new.pop(nonblock: true) # => nil ``` -- https://bugs.ruby-lang.org/ Unsubscribe: