From: hanmac@...
Date: 2018-03-08T16:52:01+00:00
Subject: [ruby-core:86056] [Ruby trunk Feature#14593] Add `Enumerator#concat`

Issue #14593 has been updated by Hanmac (Hans Mackowiak).

The size argument of `Enumerator.new` can be a proc too. Would that make the call lazy?

----------------------------------------
Feature #14593: Add `Enumerator#concat`
https://bugs.ruby-lang.org/issues/14593#change-70919

* Author: skalee (Sebastian Skalacki)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
I propose introducing an `Enumerator#concat(other_enum1, other_enum2, ...)` method which returns an enumerator that is the concatenation of `self` and the passed arguments.

Expectation
-----------

~~~ ruby
enum1 = [1, 2, 3].each
enum2 = %w[a b c].each
enum3 = %i[X Y Z].each
concatenation = enum1.concat(enum2, enum3)
concatenation.kind_of?(Enumerator) #=> true
concatenation.to_a #=> [1, 2, 3, "a", "b", "c", :X, :Y, :Z]
concatenation.size #=> 9

enum_without_size = Enumerator.new {}
enum_without_size.size #=> nil
concatenation2 = enum1.concat(enum2, enum_without_size, enum3)
concatenation2.kind_of?(Enumerator) #=> true
concatenation2.to_a #=> [1, 2, 3, "a", "b", "c", :X, :Y, :Z]
concatenation2.size #=> nil
~~~

Reasoning
---------

Enumerators are generally useful. They allow iterating over a data set without loading it fully into memory. They help separate data generation from its consumption. If enumerators are desirable, then enumerator concatenation is desirable as well.

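
On the question raised in the comment at the top of this message, here is a minimal sketch of a proc-as-size variant. It is only an illustration, not part of the proposal. It relies on `Enumerator.new` accepting a callable as its size argument and invoking it only when `#size` is called. The name `lazy_concat` is hypothetical and exists only for this example.

```ruby
# Hypothetical helper (not part of the proposal): concatenates enumerators,
# deferring the size computation until #size is actually requested.
def lazy_concat(*enums)
  # The lambda runs only when #size is called on the resulting enumerator.
  size = lambda do
    enums.reduce(0) do |acc, enum|
      s = enum.size
      break nil unless s # unknown size for any part means unknown total
      acc + s
    end
  end

  Enumerator.new(size) do |y|
    enums.each do |enum|
      enum.each { |item| y << item }
    end
  end
end

combined = lazy_concat([1, 2, 3].each, %w[a b c].each)
combined.to_a #=> [1, 2, 3, "a", "b", "c"]
combined.size #=> 6
```

With this shape, building the concatenation does not call `#size` on any of the source enumerators; that happens only if the caller asks for the size.
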
Reference implementation
------------------------

~~~ ruby
class Enumerator
  def concat(*enums)
    enumerators = [self, *enums]

    # Total size is nil as soon as any enumerator has an unknown size.
    size = enumerators.reduce(0) do |acc, enum|
      s = enum.size
      break nil unless s
      acc + s
    end

    Enumerator.new(size) do |y|
      enumerators.each do |enum|
        enum.each { |item| y << item }
      end
    end
  end
end
~~~

Flat map one-liner
------------------

There's an answer on Stack Overflow suggesting a neat one-liner: https://stackoverflow.com/a/38962951/304175

~~~ ruby
enums.lazy.flat_map{|enum| enum.lazy }
~~~

It yields items correctly. However, it is not very idiomatic, nor does it implement the `#size` method properly (see the example below). For these reasons, I think that implementing `Enumerator#concat` is a better option.

~~~ ruby
enums = [enum1, enum2, enum3] #=> [#<Enumerator: ...>, #<Enumerator: ...>, #<Enumerator: ...>]
concatenation3 = enums.lazy.flat_map{|enum| enum.lazy } #=> #<Enumerator::Lazy: #<Enumerator::Lazy: [#<Enumerator: ...>, #<Enumerator: ...>, #<Enumerator: ...>]>:flat_map>
concatenation3.to_a #=> [1, 2, 3, "a", "b", "c", :X, :Y, :Z]
concatenation3.size #=> nil
~~~

Example use cases
-----------------

Process 20 tweets/posts without fetching more than needed. Generate some example posts if fewer than 20 are available.

~~~ ruby
enum_tweets = lazy_fetch_tweets_from_twitter(count: 20) #=> Enumerator
enum_fb_posts = lazy_fetch_posts_from_facebook(count: 20) #=> Enumerator
enum_example_posts = Enumerator.new { |y| loop { y << generate_random_post } } #=> Enumerator
posts = enum_tweets.concat(enum_fb_posts).concat(enum_example_posts).take(20)
process(posts)
~~~

Perform a table union on large CSV files.

~~~ ruby
require "csv"

csv1_enum = CSV.foreach("path/to/1.csv")
csv2_enum = CSV.foreach("path/to/2.csv")
csv1_enum.concat(csv2_enum).detect { |row| is_what_we_are_looking_for?(row) }
~~~

-- 
https://bugs.ruby-lang.org/

Unsubscribe: