From: "kyanagi (Kouhei Yanagita) via ruby-core" Date: 2023-09-13T01:23:20+00:00 Subject: [ruby-core:114710] [Ruby master Feature#19075] Binary searching for the last element Issue #19075 has been updated by kyanagi (Kouhei Yanagita). (In the following, searching for the last element will be referred to as "LAST".) There are often cases where searching for the last element would be useful. A typical example that immediately comes to mind is as follows: * When we have a list of events with timestamps, find the last event that occurred up to a specific time ``` events.bsearch(target: :last) { |event| event.timestamp <= time } ``` Of course, this can be written without LAST, but I believe using LAST makes it more natural and less prone to errors. ``` index = events.bsearch { |event| event.timestamp > time } last_event = if index.nil? events.last elsif index == 0 nil else events[index-1] end ``` A similar example is the case of finding the most expensive item available in a shop. ``` highest = item_prices.bsearch(target: :last) { |price| price <= your_money } ``` Let's see another example. Given a sorted list of words, find the last word with a particular prefix in lexicographic order. ``` words = %w(pen pi pin pine ping png) prefix = 'pin' # => want to get 'ping' ``` With LAST and determistic find-any mode (what I call find-by-number mode in my pull request), the solution to this problem can be written as follows. ``` words.bsearch(target: :last) { |w| prefix <=> w[0, prefix.size] } ``` Using the current `bsearch_index`, it would be like this. ``` index = words.bsearch_index { |w| w[0, prefix.size] > prefix } w = if index.nil? words.last elsif index == 0 nil else words[index-1] end ans = w if w[0, prefix.size] == prefix ``` The difference is obvious. When it comes to names, I thought `target: :last` was easy to understand. However, the fact that several people have said that it is confusing means that this may not be an appropriate name. I hope we can find a good name. As mame-san says, `bsearch` is a confusing method in the first place. The condition that false must precede true in the value returned by the block, or that positive must precede zero and zero must precede negative, is difficult to imagine from the appearance of the method. Rather than trying to interpret the meaning of behavior, a formal name which represents "the reverse of `bsearch`" might be easier to understand. ---------------------------------------- Feature #19075: Binary searching for the last element https://bugs.ruby-lang.org/issues/19075#change-104544 * Author: kyanagi (Kouhei Yanagita) * Status: Open * Priority: Normal ---------------------------------------- My latest proposal is https://bugs.ruby-lang.org/issues/19075#note-6. I will leave the initial proposal below. --- PR: https://github.com/ruby/ruby/pull/6611 (I'm going to talk about `Array` here, but the same argument can be made for `Range`. If `Array#bsearch_last` is acceptable, I will work also for `Range`.) Ruby's bsearch returns the first element which satisfies the given block. ```ruby # Search the first element greater than 18 array = [10, 15, 20, 25] array.bsearch { |x| x > 18 } # => 20 ``` If we want the last element, we need to invert the condition and step backward. ```ruby # Search the last element less than 18 array = [10, 15, 20, 25] index = array.bsearch_index { |x| !(x < 18) } array[index-1] # => 15 ``` Of course, we need to consider `nil` and the boundary. ```ruby # Search the last element less than 100 index = array.bsearch_index { |x| !(x < 100) } # => nil if index.nil? array.last # => 25 else array[index-1] end ``` ```ruby # Search the last element less than 0 index = array.bsearch_index { |x| !(x < 0) } # => 0 if index.nil? array.last elsif index == 0 nil else array[index-1] end ``` This is where mistakes can easily be made, so I propose `Array#bsearch_last` and `Array#bsearch_last_index`. `Array#bsearch_last` returns the last element which satisfies the given block. `Array#bsearch` requires that all false-evaluating elements precede all true-evaluating elements. As is clear from the meaning of the method, conversely to `bsearch`, `bsearch_last` requires that all true-evaluating elements precede all false-evaluating elements. (If `bsearch_last` is acceptable, the name "find-minimum mode" should be changed.) ```ruby array = [10, 15, 20, 25] array.bsearch_last { |x| x < 18 } # => 15 array.bsearch_last { |x| x < 100 } # => 25 array.bsearch_last { |x| x < 0 } # => nil ``` There are several possible options for find-any mode. (1) `bsearch_last` does not support find-any mode. A block for `bsearch_last` must return `true`, `false` or `nil`. ``` [1, 2, 3].bsearch_last { 0 } # => TypeError ``` My pull request tentatively includes this implementation. (2) `bsearch_last` supports find-any mode and it behaves like `bsearch`. `bsearch` with find-any mode returns an element, for which the block returns zero. If multiple elements satisfy the condition, it is not determined which of them will be returned. It is conceivable that `bsearch_last` behaves in the same way as `bsearch`. ``` # current behavior # It is not specified whether `:b`, `:c`, or `:d` is returned. [[1,:a], [2, :b], [2, :c], [2, :d], [3, :e]].bsearch { |a, b| 2 <=> a } # => [2, :c] ``` (3) `bsearch_last` supports find-any mode and returns the last element. Make `bsearch` return the first element. Change the behavior of `bsearch` to return the first element for which the block returns zero. `bsearch_last` returns the last element for which the block returns zero. ``` # Change it like this: [[1,:a], [2, :b], [2, :c], [2, :d], [3, :e]].bsearch { |a, b| 2 <=> a } # => [2, :b] [[1,:a], [2, :b], [2, :c], [2, :d], [3, :e]].bsearch_last { |a, b| 2 <=> a } # => [2, :d] ``` (If this option is adopted, the name "find-any mode" should be renamed.) -- https://bugs.ruby-lang.org/ ______________________________________________ ruby-core mailing list -- ruby-core@ml.ruby-lang.org To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/