[#43120] [ruby-trunk - Bug #6124][Open] What is the purpose of "fake" gems in Ruby — Vit Ondruch <v.ondruch@...>

27 messages 2012/03/07

[#43142] Questions about thread performance (with benchmark included) — Rodrigo Rosenfeld Rosas <rr.rosas@...>

A while ago I've written an article entitled "How Nokogiri and JRuby

10 messages 2012/03/08

[#43148] [ruby-trunk - Feature #6126][Open] Introduce yes/no constants aliases for true/false — Egor Homakov <homakov@...>

16 messages 2012/03/09

[#43238] [ruby-trunk - Feature #6130][Open] inspect using to_s is pain — Thomas Sawyer <transfire@...>

21 messages 2012/03/11

[#43313] [ruby-trunk - Feature #6150][Open] add Enumerable#grep_v — Suraj Kurapati <sunaku@...>

17 messages 2012/03/15

[#43325] [ruby-trunk - Bug #6154][Open] Eliminate extending WaitReadable/Writable at runtime — Charles Nutter <headius@...>

25 messages 2012/03/16

[#43334] [ruby-trunk - Bug #6155][Open] Enumerable::Lazy#flat_map raises an exception when an element does not respond to #each — Dan Kubb <dan.kubb@...>

9 messages 2012/03/16

[#43370] [ruby-trunk - Feature #6166][Open] Enumerator::Lazy#pinch — Thomas Sawyer <transfire@...>

15 messages 2012/03/17

[#43373] [ruby-trunk - Bug #6168][Open] Segfault in OpenSSL bindings — Nguma Abojo <git.email.address@...>

14 messages 2012/03/17

[#43454] [ruby-trunk - Bug #6174][Open] Fix collision of ConditionVariable#wait timeout and #signal (+ other cosmetic changes) — "funny_falcon (Yura Sokolov)" <funny.falcon@...>

10 messages 2012/03/18

[#43497] [ruby-trunk - Bug #6179][Open] File::pos broken in Windows 1.9.3p125 — "jmthomas (Jason Thomas)" <jmthomas@...>

24 messages 2012/03/20

[#43502] [ruby-trunk - Feature #6180][Open] to_b for converting objects to a boolean value — "AaronLasseigne (Aaron Lasseigne)" <aaron.lasseigne@...>

17 messages 2012/03/20

[#43529] [ruby-trunk - Bug #6183][Open] Enumerator::Lazy performance issue — "gregolsen (Innokenty Mikhailov)" <anotheroneman@...>

36 messages 2012/03/21

[#43543] [ruby-trunk - Bug #6184][Open] [BUG] Segmentation fault ruby 1.9.3p165 (2012-03-18 revision 35078) [x86_64-darwin11.3.0] — "Gebor (Pierre-Henry Frohring)" <frohring.pierrehenry@...>

8 messages 2012/03/21

[#43672] [ruby-trunk - Feature #6201][Open] do_something then return :special_case (include "then" operator) — "rosenfeld (Rodrigo Rosenfeld Rosas)" <rr.rosas@...>

12 messages 2012/03/26

[#43678] [ruby-trunk - Bug #6203][Open] Array#values_at does not handle ranges with end index past the end of the array — "ferrous26 (Mark Rada)" <markrada26@...>

15 messages 2012/03/26

[#43794] [ruby-trunk - Feature #6216][Open] SystemStackError backtraces should not be reduced to one line — "postmodern (Hal Brodigan)" <postmodern.mod3@...>

15 messages 2012/03/28

[#43814] [ruby-trunk - Feature #6219][Open] Return value of Hash#store — "MartinBosslet (Martin Bosslet)" <Martin.Bosslet@...>

20 messages 2012/03/28

[#43858] [ruby-trunk - Feature #6222][Open] Use ++ to connect statements — "gcao (Guoliang Cao)" <gcao99@...>

12 messages 2012/03/29

[#43904] [ruby-trunk - Feature #6225][Open] Hash#+ — "trans (Thomas Sawyer)" <transfire@...>

36 messages 2012/03/29

[#43951] [ruby-trunk - Bug #6228][Open] [mingw] Errno::EBADF in ruby/test_io.rb on ruby_1_9_3 — "jonforums (Jon Forums)" <redmine@...>

28 messages 2012/03/30

[#43996] [ruby-trunk - Bug #6236][Open] WEBrick::HTTPServer swallows Exception — "regularfry (Alex Young)" <alex@...>

13 messages 2012/03/31

[ruby-core:43450] [ruby-trunk - Feature #4151] Enumerable#categorize

From: "trans (Thomas Sawyer)" <transfire@...>
Date: 2012-03-18 14:01:28 UTC
List: ruby-core #43450
Issue #4151 has been updated by trans (Thomas Sawyer).


I'm sorry, but #categorize has big code smell. Seems like too much functionality is being shoved into one method. It is still not clear to me after reading over explanation many times how #categorize works. So either it is relatively straight forward but the explanation is overly complexed, or the method itself is too complex. The later seems more likely given the number of options, especially lambda options, the method takes.

Let's look at a given example:

* [ruby-talk:288931]

  [["1", "01-02-2008", 5],
   ["1", "01-03-2008", 10],
   ["2", "12-25-2007", 5],
   ["1", "01-04-2008", 15]]
  to
  {"1" => {"01-02-2008" => 5, "01-03-2008" => 10, "01-04-2008" => 15},
   "2" => {"12-25-2007" => 5}}

Implemented as:

  orig.categorize(:op=>lambda {|x,y| y}) {|e| e }

Traditionally, the above would be something like:

  hash = Hash.new{|h,k| h[k]={}}
  list.each do |group, date, size|
    hash[group][date] ||= 0
    hash[group][date] += size
  end

This may be longer but it is very readable. Actually, if Ruby would *finally* offer some convenience method for the first line, e.g. `Hash.auto{{}}` instead of `Hash.new{|h,k| h[k]={}}`, then

  hash = Hash.auto{Hash.auto(0)}
  list.each{ |group, date, size| hash[group][date] += size }

I suspect almost every case for using #categorize will be able to be treated in much the same manner.

This is not to say however that I don't think some form(s) of `Enumerable -> Hash` would not be useful. I think it would, in fact I sometimes use Enumerable#mash (alias #graph).

    [1,2,3].mash{ |v| [v.to_s, v] }  #=> {'1'=>1, '2'=>2, '3'=>3}

But I would rather see a few methods that cover basic use cases that can work together and with other methods to build up more complex solutions then to create a single method that tries to cover every complex possibility in one go.


----------------------------------------
Feature #4151: Enumerable#categorize
https://bugs.ruby-lang.org/issues/4151#change-24918

Author: akr (Akira Tanaka)
Status: Open
Priority: Low
Assignee: akr (Akira Tanaka)
Category: 
Target version: 


=begin
 Hi.
 
 How about a method for converting enumerable to hash?
 
   enum.categorize([opts]) {|elt| [key1, ..., val] } -> hash
 
 categorizes the elements in _enum_ and returns a hash.
 
 The block is called for each elements in _enum_.
 The block should return an array which contains
 one or more keys and one value.
 
   p (0..10).categorize {|e| [e % 3, e % 5] }
   #=> {0=>[0, 3, 1, 4], 1=>[1, 4, 2, 0], 2=>[2, 0, 3]}
 
 The keys and value are used to construct the result hash.
 If two or more keys are provided
 (i.e. the length of the array is longer than 2),
 the result hash will be nested.
 
   p (0..10).categorize {|e| [e&4, e&2, e&1, e] }
   #=> {0=>{0=>{0=>[0, 8],
   #            1=>[1, 9]},
   #        2=>{0=>[2, 10],
   #            1=>[3]}},
   #    4=>{0=>{0=>[4],
   #            1=>[5]},
   #        2=>{0=>[6],
   #            1=>[7]}}}
 
 The value of innermost hash is an array which contains values for
 corresponding keys.
 This behavior can be customized by :seed, :op and :update option.
 
 This method can take an option hash.
 Available options are follows:
 
 - :seed specifies seed value.
 - :op specifies a procedure from seed and value to next seed.
 - :update specifies a procedure from seed and block value to next seed.
 
 :seed, :op and :update customizes how to generate
 the innermost hash value.
 :seed and :op behavies like Enumerable#inject.
 
 If _seed_ and _op_ is specified, the result value is generated as follows.
   op.call(..., op.call(op.call(seed, v0), v1), ...)
 
 :update works as :op except the second argument is the block value itself
 instead of the last value of the block value.
 
 If :seed option is not given, the first value is used as the seed.
 
   # The arguments for :op option procedure are the seed and the value.
   # (i.e. the last element of the array returned from the block.)
   r = [0].categorize(:seed => :s,
                      :op => lambda {|x,y|
                        p [x,y]               #=> [:s, :v]
                        1
                      }) {|e|
     p e #=> 0
     [:k, :v]
   }
   p r #=> {:k=>1}
 
   # The arguments for :update option procedure are the seed and the array
   # returned from the block.
   r = [0].categorize(:seed => :s,
                      :update => lambda {|x,y|
                        p [x,y]               #=> [:s, [:k, :v]]
                        1
                      }) {|e|
     p e #=> 0
     [:k, :v]
   }
   p r #=> {:k=>1}
 
 The default behavior, array construction, can be implemented as follows.
   :seed => nil
   :op => lambda {|s, v| !s ? [v] : (s << v) }
 
 Note that matz doesn't find satisfact in the method name, "categorize".
 [ruby-dev:42681]
 
 Also note that matz wants another method than this method,
 which the hash value is the last value, not an array of all values.
 This can be implemented by enum.categorize(:op=>lambda {|x,y| y}) { ... }.
 But good method name is not found yet.
 [ruby-dev:42643]
 -- 
 Tanaka Akira
 
 Attachment: enum-categorize.patch
=end



-- 
http://bugs.ruby-lang.org/

In This Thread

Prev Next