ruby-core

Mailing list archive

[ruby-core:33869] Re: [feature:trunk] Enumerable#categorize

From: Tanaka Akira <akr@...>
Date: 2010-12-25 08:45:07 UTC
List: ruby-core #33869
2010/12/20 "Martin J. Dürst" <duerst@it.aoyama.ac.jp>:

>> Enumerable#categorize is more general than Enumerable#group_by.
>
> I know. I think it's way too general and difficult to understand.

I don't think it is too difficult.

When we create a hash, it is natural that we can specify both keys and values.

But with Enumerable#group_by we can specify only the keys.
I feel that is restrictive.
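
As an illustration of that restriction, here is a minimal sketch using made-up [department, name] records (the data and variable names are assumptions, not from the thread). group_by chooses only the key, so choosing the values as well takes a second pass:

```ruby
# Made-up sample records: [department, name].
records = [["dev", "alice"], ["ops", "bob"], ["dev", "carol"]]

# group_by lets the block choose only the key; each value is the whole element.
grouped = records.group_by { |dept, _name| dept }
# grouped == {"dev" => [["dev", "alice"], ["dev", "carol"]], "ops" => [["ops", "bob"]]}

# Choosing the values as well needs a second pass over the grouped hash.
names_by_dept = {}
grouped.each do |dept, recs|
  names_by_dept[dept] = recs.map { |_dept, name| name }
end
# names_by_dept == {"dev" => ["alice", "carol"], "ops" => ["bob"]}
```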

> This is one example. I don't deny that such examples exist. But I think they
> are not so frequent, and quite varied (i.e. there are many examples where
> one needs "almost something like this, but not quite exactly the same).
>
> I think that unless we can boil down your proposal to something simple that
> the average advanced Ruby programmer can understand and use without having
> to look it up or try it out in irb all the time, we have to invest some more
> time to find the right method.
>
> After all, group_by was adapted by Ruby 1.9 after a lot of field experience
> in Rails. And I have personally used it many times, as I think others have,
> too.
>
> On the other hand, do you have that much field experience? How many others
> have told you that they would have used categorize a few times already if it
> existed? (As opposed to those who have said that it's too complicated to
> remember what it does, and too easy to write the equivalent by hand if
> needed.)

I have transformed CSV tables in various ways over the last several months in my work.
Enumerable#categorize is very useful for that.
(Unfortunately, I cannot show you the exact examples.)

To split a CSV table: enum.categorize {|rec| [split-key, ...] }.

To merge CSV tables: Enumerable#categorize can be used for the
hash-join algorithm, as enum.categorize {|rec| [join-key, ...] }
http://en.wikipedia.org/wiki/Hash_join

To count the elements in each category: enum.categorize(:op=>:+) {|e| [key, 1] }
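
The three uses above can be sketched with a stand-in for the proposed method. Enumerable#categorize is not part of core Ruby, so the `categorize` helper below is a hypothetical plain method. It assumes that the block returns [key, ..., value], that values are collected into arrays by default, and that :op and :seed combine values the way the examples in this thread suggest. The sample data is made up.

```ruby
# Hypothetical stand-in for the proposed Enumerable#categorize, written as a
# plain method so that Enumerable itself is not modified.
def categorize(enum, opts = {})
  op = opts[:op]
  result = {}
  enum.each do |elem|
    # The block returns [key1, ..., keyN, value].
    *keys, value = yield(elem)
    last_key = keys.pop
    # Walk (and create) the nested hashes for every key except the last.
    bucket = keys.inject(result) { |h, k| h[k] ||= {} }
    if op.nil?
      (bucket[last_key] ||= []) << value        # default: collect into an array
    elsif bucket.key?(last_key)
      bucket[last_key] = combine(op, bucket[last_key], value)
    elsif opts.key?(:seed)
      bucket[last_key] = combine(op, opts[:seed], value)
    else
      bucket[last_key] = value                  # no seed: first value starts the fold
    end
  end
  result
end

# op may be a callable (lambda) or a method name such as :+.
def combine(op, x, y)
  op.respond_to?(:call) ? op.call(x, y) : x.send(op, y)
end

# Made-up CSV-like tables.
sales  = [["east", "pen", 3], ["west", "pen", 5], ["east", "ink", 2]]
prices = [["pen", 100], ["ink", 250]]

# Split: one bucket of rows per region.
by_region = categorize(sales) { |region, product, amount| [region, [product, amount]] }

# Hash join: build an index on the join key, then probe it for each row.
price_index = categorize(prices) { |product, price| [product, price] }
joined = sales.flat_map do |region, product, amount|
  (price_index[product] || []).map { |price| [region, product, amount, price] }
end

# Count: fold a 1 per row with :+.
counts = categorize(sales, :op => :+) { |region, _product, _amount| [region, 1] }
```

With this sketch, by_region maps each region to its [product, amount] rows, joined holds each sales row extended by its price, and counts maps each region to its number of rows.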

Also, various people ask about similar hash creation.

* [ruby-talk:351947]

  {day1 => [[name_person1, room1], [name_person2, room2], [name_person3, room2]],
   day2 => [[name_person1, room1], [name_person3, room1], [name_person2, room2]]}
  to
  [{room1 => [{day1 => [name_person1]},
              {day2 => [name_person1, name_person3]}]},
   {room2 => [{day1 => [name_person2, name_person3]},
              {day2 => [name_person2]}]}]

  This can be implemented as:
  a = orig.map {|k, aa| aa.map {|e| [k, *e] }}.flatten(1)
  pp a.categorize {|e| [e[2], e[0], e[1]] }

* [ruby-talk:347364]

  [["2efa4ba470", "00000005"],
   ["2efa4ba470", "00000004"],
   ["02adecfd5c", "00000002"],
   ["c0784b5de101", "00000006"],
   ["68c4bf10539", "00000003"],
   ["c0784b5de101", "00000001"]]
  to
  {"2efa4ba470" => ["00000005", "00000004"],
   "02adecfd5c" => ["00000002"],
   "c0784b5de101" => ["00000006", "00000001"],
   "68c4bf10539" => ["00000003"]}

  This can be implemented as:
  orig.categorize {|e| e }

* [ruby-talk:372481]

  [["A", "a", 1], ["A", "b", 2], ["B", "a", 1]] to
  {{"A" => {"a" => 1, "b" => 2}},
   {"B" => {"a" => 1}}}

  This can be implemented as:
  orig.categorize(:op=>lambda {|x,y| y }) {|e| e }

* [ruby-talk:288931]

  [["1", "01-02-2008", 5],
   ["1", "01-03-2008", 10],
   ["2", "12-25-2007", 5],
   ["1", "01-04-2008", 15]]
  to
  {"1" => {"01-02-2008" => 5, "01-03-2008" => 10, "01-04-2008" => 15},
   "2" => {"12-25-2007" => 5}}

  This can be implemented as:
  orig.categorize(:op=>lambda {|x,y| y}) {|e| e }

* [ruby-talk:354519]

  [["200912-829", 9],
   ["200912-893", 3],
   ["200912-893", 5],
   ["200912-829", 1],
   ["200911-818", 6],
   ["200911-893", 1],
   ["200911-827", 2]]
  to
  [["200912-829", 10],
   ["200912-893", 8],
   ["200911-818", 6],
   ["200911-893", 1],
   ["200911-827", 2]]

  This can be implemented as:
  orig.categorize(:op=>:+) {|e| e }.to_a

* [ruby-talk:344723]

  a=[1,2,5,13]
  b=[1,1,2,2,2,5,13,13,13]
  to
  [[0, 0], [0, 1], [1, 2], [1, 3], [1, 4], [2, 5], [3, 6], [3, 7], [3, 8]]

  This can be implemented as:
  h = a.categorize.with_index {|e, i| [e,i] }
  b.map.with_index {|e, j| h[e] ? h[e].map {|i| [i,j] } : [] }.flatten(1)

* [ruby-talk:327908]

  [["377", "838"],
   ["377", "990"],
   ["377", "991"],
   ["377", "992"],
   ["378", "840"],
   ["378", "841"],
   ["378", "842"],
   ["378", "843"],
   ["378", "844"]]
  to
  [["377", "838 990 991 992"],
   ["378", "840 841 842 843 844"]]

  This can be implemented as:
  orig.categorize(:seed=>nil,
                  :op=>lambda {|x,y| !x ? y.dup : (x << " " << y) }) {|e| e }

* [ruby-talk:347700]

  ["a", "b", "a", "b", "b"]
  to
  ["a", "b"] [2, 3]

  This can be implemented as:
  h = orig.categorize(:op=>:+) {|e| [e, 1] }
  p h.keys, h.values

* [ruby-talk:343511]

  [1, 2, 3, 3, 3, 3, 4, 4, 5]
  to
  {"3"=>4, "4"=>2}

  This can be implemented as:
  h = orig.categorize(:op=>:+) {|e| [e, 1] }
  p h.reject {|k,v| v == 1 }

I feel many people need this kind of hash creation.
Enumerable#categorize supports them.
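
For comparison, two of the cases above can also be written by hand with core methods only. This is a sketch, not part of the proposal; the data is taken from the [ruby-talk:354519] and [ruby-talk:288931] examples, and the variable names are assumptions.

```ruby
# [ruby-talk:354519]: sum the numbers per id.
pairs = [["200912-829", 9], ["200912-893", 3], ["200912-893", 5],
         ["200912-829", 1], ["200911-818", 6], ["200911-893", 1],
         ["200911-827", 2]]
sums = Hash.new(0)                      # missing ids start at 0
pairs.each { |id, n| sums[id] += n }
summed_pairs = sums.to_a                # back to an array of [id, total]

# [ruby-talk:288931]: nest by the first two columns, keeping the last value.
rows = [["1", "01-02-2008", 5], ["1", "01-03-2008", 10],
        ["2", "12-25-2007", 5], ["1", "01-04-2008", 15]]
nested = {}
rows.each { |id, date, n| (nested[id] ||= {})[date] = n }
```

Each hand-written version needs its own accumulator and its own update rule, which is the part that categorize would express through the block and the :op option.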

> An additional thought: The example above starts with two-element arrays.
> Such two- or multi-element arrays are often used, but in many cases they are
> just an intermediate step, before creating objects. group_by seems more
> close to using objects (that may be why it is used a lot in Rails, where the
> basics of model classes are almost free). On the other hand, with
> multi-element arrays, I think that part of what "categorize" would do will
> often be handled before or after. Anyway, while we should not change Ruby so
> that it is too difficult to use multi-element arrays instead of objects,
> there is also no reason to create more methods that work better for
> multi-element arrays.

Ruby doesn't force us to create a class for programming.

I think this is a good property for scripting.

> That may be true. But even if the average number of options for a Ruby
> method has slightly increased recently, your proposals still is way over
> average on the number of options, especially in an area (iterators on
> Enumerable) where options are few and far between.

That is natural, because Enumerable has existed for a long time.
I don't see any problem.
-- 
Tanaka Akira
