From: "matz (Yukihiro Matsumoto)"
Date: 2012-04-06T08:53:37+09:00
Subject: [ruby-core:44153] [ruby-trunk - Feature #6261][Rejected] Enumerable#emap and Enumerable#egrep

Issue #6261 has been updated by matz (Yukihiro Matsumoto).

Status changed from Open to Rejected

use Enumerable#lazy.

Matz.

----------------------------------------
Feature #6261: Enumerable#emap and Enumerable#egrep
https://bugs.ruby-lang.org/issues/6261#change-25674

Author: yimutang (Joey Zhou)
Status: Rejected
Priority: Normal
Assignee:
Category:
Target version:

I was inspired by Ruby 1.9.x's Enumerable#chunk and #slice_before, which both take a block and return an enumerator.

I wish to introduce two new methods into the core Enumerable module, which can be implemented in Ruby like this:

    module Enumerable
      def emap
        # return an enumerator
        raise ArgumentError, 'no block given' unless block_given?
        Enumerator.new do |yielder|
          self.each do |elem|
            mapped = yield elem
            yielder << mapped
          end
        end
      end

      def egrep
        # return an enumerator of the elements the block accepts
        raise ArgumentError, 'no block given' unless block_given?
        Enumerator.new do |yielder|
          self.each do |elem|
            allowed = yield elem
            yielder << elem if allowed
          end
        end
      end
    end

#emap + #to_a is just like #map / #collect, and #egrep + #to_a is just like #select.

Why do I think it is necessary to introduce these methods? Because #collect and #select are sometimes not efficient.

Here's a contrived example:

    lines = File.foreach('a_very_large_file')
      .egrep {|line| line.length < 10 }
      .emap {|line| line.chomp!; line }
      .each_slice(3)
      .emap {|lines| lines.join(';').downcase }
      .take_while {|line| line.length > 20 }

The above code means: take each line from 'a_very_large_file', keep only the lines whose length is less than 10, chomp each kept line, group them three at a time and join each group, and finally stop at the first joined line whose length is not greater than 20.

If you replace #egrep with #select and #emap with #collect, you must iterate over every line of 'a_very_large_file' and create a temporary array three times!
It is not efficient in this situation, because the #take_while means 'I do not want to check all lines'.

If you want to avoid #select and #collect, you can write it like this:

    File.foreach('a_very_large_file') do |line|
      # blah blah to achieve the same goal
    end

But I'm afraid it is hard to make such code clear at a glance. So you can see that #egrep and #emap are very useful.

As another example, I want to make a class FreqDist, which records the frequency distribution of a population of samples.

    class FreqDist
      def initialize(samples)
        @sample_dict = Hash.new(0)
        samples.each {|sample| @sample_dict[sample] += 1 }
      end
    end

I want to use FreqDist to store the frequency distribution of a list of words, but there is a case problem: 'When' and 'when' should not be regarded as two different samples.

I can do it like this:

    fd = FreqDist.new(words.emap {|w| w.downcase })

This passes an enumerator instead of an array as the argument, so the words are iterated once and no temporary array is created.

In my opinion, #emap and #egrep are very powerful. Although I can implement them in Ruby and put them in a custom gem, I think it is better to introduce them into the core Enumerable module.

Please consider the suggestion. Thank you!

-- 
http://bugs.ruby-lang.org/
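
For reference, the pipeline from the proposal can probably be written with Enumerable#lazy, as the rejection suggests. The following is only a minimal sketch, assuming Ruby 2.0 or later (where Enumerable#lazy is available). The method name short_line_groups, the sample data, and the counting enumerator that stands in for File.foreach('a_very_large_file') are all hypothetical and not part of the original message.

```ruby
# Hypothetical helper: the proposal's pipeline, with #lazy.select in place
# of #egrep and #lazy.map in place of #emap.
def short_line_groups(lines)
  lines.lazy
       .select { |line| line.length < 10 }          # length includes the newline
       .map { |line| line.chomp }                   # non-mutating chomp
       .each_slice(3)                               # stays lazy on a lazy enumerator
       .map { |group| group.join(';').downcase }
       .take_while { |joined| joined.length > 20 }
end

# Hypothetical sample data standing in for the large file.
raw = [
  "ALPHA-01\n",
  "this line is far too long to keep\n",
  "BRAVO-02\n",
  "DELTA-03\n",
  "a\n",
  "b\n",
  "c\n",
  "ECHO-004\n",
  "GOLF-005\n",
  "HOTEL-06\n"
]

# Count how many lines are actually pulled from the source.
lines_read = 0
source = Enumerator.new do |yielder|
  raw.each do |line|
    lines_read += 1
    yielder << line
  end
end

result = short_line_groups(source).to_a
puts result.inspect
puts "read #{lines_read} of #{raw.length} lines"
```

Nothing is read until #to_a forces the chain, and iteration stops as soon as #take_while rejects a joined group, so the trailing lines of the source are never visited. The FreqDist case should work the same way, since the constructor only calls #each: FreqDist.new(words.lazy.map {|w| w.downcase }).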