[ruby-core:31762] [Backport #3715] Enumerator#size and #size=

From: Marc-Andre Lafortune <redmine@...>
Date: 2010-08-18 20:02:31 UTC
List: ruby-core #31762
Backport #3715: Enumerator#size and #size=
http://redmine.ruby-lang.org/issues/show/3715

Author: Marc-Andre Lafortune
Status: Open, Priority: Normal
Category: core, Target version: 1.9.3

It would be useful to be able to ask an Enumerator for the number of times it will yield, without having to actually iterate it.

For example:

  (1..1000).to_a.permutation(4).size # => 994010994000  (instantly)

It would allow nice features like:

  class Enumerator
    def with_progress
      return to_enum :with_progress unless block_given?
      out_of = size || "..."
      each_with_index do |obj, i|
        puts "Progress: #{i + 1} / #{out_of}"
        yield obj
      end
      puts "Done"
    end
  end

  # To display the progress of any iterator, one can daisy-chain with_progress:
  20.times.with_progress.map do
    # do stuff here...
  end

This would print "Progress: 1 / 20" and so on while the block does its work.

*** Proposed changes ***

* Enumerator#size *

call-seq:
  e.size          -> int, Float::INFINITY or nil
  e.size {block}  -> int

Returns the size of the enumerator.
The form with no block evaluates the size lazily, without going through the enumeration. If the size cannot be determined then +nil+ is returned.
The form with a block always iterates through the enumerator and returns the number of times it yielded.

  (1..100).to_a.permutation(4).size # => 94109400
  loop.size # => Float::INFINITY

  a = [1, 2, 3]
  a.keep_if.size         # => 3
  a                      # => [1, 2, 3]
  a.keep_if.size{false}  # => 3
  a                      # => []

  [1, 2, 3].drop_while.size             # => nil
  [1, 2, 3].drop_while.size{|i| i < 3}  # => 3 (the third yield returns false and stops the iteration)
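
The block form can be approximated in plain Ruby by counting the yields while handing each block result back to the underlying method. This is only a rough sketch: `yield_count` is a hypothetical helper invented for illustration and is not part of the patch.

```ruby
# Hypothetical helper, for illustration only: it mimics the proposed
# block form of Enumerator#size by iterating and counting the yields.
def yield_count(enum)
  count = 0
  enum.each do |*args|
    count += 1
    yield(*args) # the block's result is returned to the underlying method
  end
  count
end

a = [1, 2, 3]
kept = yield_count(a.keep_if) { |i| i.odd? }
# keep_if yields every element once and still filters the array in place.
```

As with the proposed block form, the iteration has its usual side effects: here the array is modified while the yields are being counted.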


* Enumerator#size= *

call-seq:
  e.size = sz

Sets the size of the enumerator. If +sz+ is a Proc or a Method, it will be called each time +size+ is requested, otherwise +sz+ is returned.

  first = [1, 2, 3]
  second = [4, 5]
  enum = Enumerator.new do |y|
    first.each{|o| y << o}
    second.each{|o| y << o}
  end
  enum.size    # => nil
  enum.size = ->(e){first.size + second.size}
  enum.size    # => 5
  first << 42
  enum.size    # => 6
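
To make the "called each time" behaviour concrete, here is a minimal pure-Ruby sketch. `SizableEnumerator` is a hypothetical wrapper written only for this illustration (the patch implements the setter on Enumerator itself), and passing the wrapper as the callable's single argument is an assumption of the sketch.

```ruby
# Hypothetical stand-in for the proposed behaviour, for illustration only.
class SizableEnumerator
  include Enumerable

  # Accepts a plain value or a callable (Proc, Method, lambda).
  attr_writer :size

  def initialize(&block)
    @enum = Enumerator.new(&block)
    @size = nil
  end

  # A callable is re-evaluated on every request; anything else is returned as is.
  # Assumption: the callable receives this wrapper as its only argument.
  def size
    @size.respond_to?(:call) ? @size.call(self) : @size
  end

  def each(&block)
    @enum.each(&block)
  end
end

first  = [1, 2, 3]
second = [4, 5]
enum = SizableEnumerator.new do |y|
  first.each  { |o| y << o }
  second.each { |o| y << o }
end

sizes = [enum.size]                           # nothing set yet
enum.size = ->(_e) { first.size + second.size }
sizes << enum.size
first << 42                                   # the lambda sees the change
sizes << enum.size
enum.size = 10                                # a plain value is returned unchanged
sizes << enum.size
```

Storing the callable rather than its result is what keeps the reported size in step with later changes to `first` and `second`.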

* Kernel#to_enum / enum_for *

The only other API change is for #to_enum/#enum_for, which can accept a block for size calculation:

  class Date
    def step(limit, step=1)
      unless block_given?
        return to_enum(:step, limit, step){|date| (limit - date).div(step) + 1}
      end
      # ...
    end
  end
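
For a self-contained variant of the same idea: as far as I know, Ruby 2.0 and later accept a size block on to_enum, although the arguments passed to that block may differ from what is proposed here. The sketch below avoids depending on them by closing over the receiver and the step. `NumberSpan` is a hypothetical class used only for illustration.

```ruby
# Hypothetical class, for illustration only.
class NumberSpan
  def initialize(first, last)
    @first = first
    @last = last
  end

  # Assumes first <= last and a positive step.
  def step(by = 1)
    unless block_given?
      # The size block closes over the receiver and the step, so it does
      # not depend on whatever arguments are passed to it.
      return to_enum(:step, by) { (@last - @first).div(by) + 1 }
    end
    n = @first
    while n <= @last
      yield n
      n += by
    end
    self
  end
end

stepper = NumberSpan.new(1, 10).step(3)
lazy_size = stepper.size # computed by the size block, without iterating
values = stepper.to_a
```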

*** Implementation ***

I implemented support for #size in most built-in enumerator-producing methods (63 in all).

It is broken down into about 20 commits: http://github.com/marcandre/ruby/commits/enum_size

It begins with the implementation of Enumerator#size{=}: http://github.com/marcandre/ruby/commit/a92feb0

A combined patch is available here: http://gist.github.com/535974

Still missing are Dir#each, Dir.foreach, ObjectSpace.each_object, Range#step, Range#each, String#upto, String#gsub, String#each_line.

The enumerators whose #size returns +nil+ are:
  Array#{r}index, {take|drop}_while
  Enumerable#find{_index}, {take|drop}_while
  IO: all methods

*** Notes ***
* Returning +nil+ *

I feel it is best if IO.each_line.size and similar return +nil+ to avoid side effects.

We could have Array#find_index.size return the size of the array, with the understanding that this is the maximum number of times the enumerator will yield. Since a block can always contain a break statement, size can only ever be understood as a maximum anyway, so it can be argued that size should be defined as the maximum number of times the enumerator yields.
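
A small sketch of the "size as a maximum" reading, assuming a Ruby where Enumerator#size is available (2.0 and later, as far as I know): the reported size is an upper bound, and a break in the block makes the actual number of yields smaller.

```ruby
enum = [10, 20, 30, 40].each
upper_bound = enum.size # reported without iterating

yields = 0
enum.each do |value|
  yields += 1
  break if value == 20 # leave early, after the second element
end
```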

* Arguments to size proc/lambda *

My implementation currently passes the object that the enumerator will call, followed by any arguments given when building the enumerator.

If Enumerator had getters (say Enumerator#base, Enumerator#call, Enumerator#args, see feature request #3714), passing the enumerator itself might be a better idea.

* Does not dispatch through name *

It might be worth noting that the size dispatch is decided when the enumerator is created, not looked up afterwards based on the class and method name:

  [1,2,3].permutation(2).size           # => 6
  [1,2,3].to_enum(:permutation, 2).size # => nil

* Size setter *

Although I personally like the idea that #size= can accept a Proc/Lambda to be called later, this has the downside that there is no getter, i.e. no way to get the Proc/Lambda back. I feel this is not an issue, but an alternative would be to add #size_proc and #size_proc= accessors too (like Hash#default_proc and Hash#default_proc=).
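
For comparison, this is how the existing Hash pair behaves. It only illustrates the analogy; #size_proc and #size_proc= remain hypothetical names from the paragraph above.

```ruby
# Hash keeps its default-value proc behind a getter/setter pair.
table = Hash.new { |_hash, key| key * 2 }

stored = table.default_proc        # the getter hands the proc back
via_getter = stored.call({}, 4)    # it can be called directly
via_lookup = table[21]             # or implicitly, on a missing key

table.default_proc = proc { |_hash, key| key.to_s } # the setter replaces it
after_swap = table[7]
```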

I believe this addresses feature request #2673, although maybe in a different fashion. http://redmine.ruby-lang.org/issues/show/2673


----------------------------------------
http://redmine.ruby-lang.org
