[ruby-core:109525] [Ruby master Feature#18934] Proposal: Introduce method results memoization API in the core

From: "matz (Yukihiro Matsumoto)" <noreply@...>
Date: 2022-08-18 06:27:11 UTC
List: ruby-core #109525
Issue #18934 has been updated by matz (Yukihiro Matsumoto).

Status changed from Open to Rejected

I reject this proposal to make the feature built-in for several reasons:

* I still think it should be done in a library (gem)
* The term `memoize` in this proposal is misused. The canonical `memoize` process records the function result according to its arguments. But in this proposal, it's restricted to functions (methods) without arguments. Since it differs from the canonical definition, it is hard to make it built-in.

Matz.


----------------------------------------
Feature #18934: Proposal: Introduce method results memoization API in the core
https://bugs.ruby-lang.org/issues/18934#change-98695

* Author: zverok (Victor Shepelev)
* Status: Rejected
* Priority: Normal
----------------------------------------
**Abstract:** I propose to introduce a simple core API for memoizing argument-less method return values.

```ruby
class Test
  def foo
    puts "call!"
    5
  end

  memoized :foo
end

o = Test.new
o.foo # prints "call!", returns 5
o.foo # returns 5 immediately
```

The full proposal is below.

## Intro

For further reasoning, I'll be using the following class. It is simplified/invented for demonstrative purposes, so I'd prefer to discuss problems/solutions described in general and not focus on "you could rewrite this class that way".

```ruby
class Sentence
  attr_reader :text

  def initialize(text) = @text = text

  def tokens() = SomeTokenizer.new(text).call

  def size() = tokens.size
  def empty?() = tokens.all?(&:whitespace?)

  def words() = tokens.select(&:word?)
  def service?() = words.empty?
end
```

## The problem

The class above is nice, clean, and easy to read. The problem with it is efficiency: if we imagine that `SomeTokenizer#call` is not cheap (it probably isn't), then creating a few sentences and then processing them with some algorithm will be, with the class defined as shown, much less efficient than it could be. Every statement like...
```ruby
many_sentences.reject(&:empty?).select { _1.words.include?('Ruby') }.map { _1.words.count / _1.tokens.count }
```
...is much less efficient than it "intuitively" should be because tokenization happens again and again.
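
To make the repetition visible, here is a minimal sketch that counts tokenizer invocations. `Token` and this `SomeTokenizer` are hypothetical stand-ins (the proposal does not define them), and the pipeline is slightly adapted so that it compares token texts:

```ruby
# Hypothetical stand-ins: neither Token nor this SomeTokenizer comes from the proposal.
Token = Struct.new(:text) do
  def whitespace?() = text.strip.empty?
  def word?() = text.match?(/\A\w+\z/)
end

class SomeTokenizer
  @calls = 0
  class << self
    attr_accessor :calls # counts how often tokenization actually runs
  end

  def initialize(text) = @text = text

  def call
    self.class.calls += 1
    @text.scan(/\w+|\s+|[^\w\s]/).map { Token.new(_1) }
  end
end

class Sentence
  attr_reader :text

  def initialize(text) = @text = text

  def tokens() = SomeTokenizer.new(text).call
  def empty?() = tokens.all?(&:whitespace?)
  def words() = tokens.select(&:word?)
end

sentences = ['Ruby is cool', '   ', 'I like Ruby'].map { Sentence.new(_1) }

result = sentences
  .reject(&:empty?)
  .select { |s| s.words.map(&:text).include?('Ruby') }
  .map { |s| s.words.count.fdiv(s.tokens.count) }

puts "tokenizer calls: #{SomeTokenizer.calls}" # tokenizer calls: 9
```

Three sentences need at most three tokenizations, yet this pipeline triggers nine.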

Caching just `tokens` would probably not be enough for a complex algorithm working on a notable amount of data: `whitespace?` and `word?` might also be non-trivial, and even trivial methods like `select` and `empty?`, when needlessly repeated thousands of times, would show up in profiling.

So, can we stop recalculating them constantly?

## Existing solutions

### Just pre-calculate everything in `initialize`

We could create a lot of instance variables in `initialize`:
```ruby
def initialize(text)
  @text = text
  @tokens = SomeTokenizer.new(@text).call
  @size = @tokens.size
  @empty = @tokens.all?(&:whitespace?)
  @words = @tokens.select(&:word?)
  @service = @words.empty?
end
```
It will work, of course, but it obscures which data is the object's main state and which is derived, and it is more code (now we need to define `attr_reader`s and predicate methods for all of that!). And adding every new small service method (like `def question?() = tokens.last&.text == '?'`) would require rethinking: "is it efficient enough to be a method, or should I add one more instance var?"

### `||=` idiom

The common idiom for caching is `||=`:
```ruby
def words()= @words ||= tokens.select(&:word?)
```

It has its drawbacks, though:
1. doesn't suit methods that can return `false` or `nil` (like `service?`)
2. harder to use with methods that need several statements to calculate the end result
3. it mixes the concerns of "how it is calculated" and "it is memoized" (looking at the method's code, you don't immediately know whether the variable is used only for memoization, or whether it could have been set elsewhere and we are just providing a default value here)
4. it pollutes the object's data representation

Problems 1-2 are typically solved with a less elegant but future-proof approach (which also rules out one-line method definitions, even if the main code is short):

```ruby
def empty?
  return @empty if defined?(@empty)

  @empty = tokens.all?(&:whitespace?)
end
```
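
The difference between the two idioms is easiest to see when the cached value is falsy. The following is a small illustrative sketch; the `Probe` class and its method names are invented for the demonstration:

```ruby
# Hypothetical class used only to compare the two caching idioms.
class Probe
  attr_reader :computations

  def initialize
    @computations = 0
  end

  # Memoized with ||=: a false result is never treated as cached.
  def naive_flag
    @naive_flag ||= compute
  end

  # Memoized with defined?: any result, including false or nil, is cached.
  def safe_flag
    return @safe_flag if defined?(@safe_flag)

    @safe_flag = compute
  end

  private

  # Stand-in for an expensive check that happens to return false.
  def compute
    @computations += 1
    false
  end
end

naive = Probe.new
3.times { naive.naive_flag }
puts naive.computations # 3

safe = Probe.new
3.times { safe.safe_flag }
puts safe.computations # 1
```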

About 4: with this solution, we now have a lot of instance vars (that are derived and logically not part of the object's state) visible in the default `#inspect` and in serialization:
```ruby
s = Sentence.new('Ruby is cool')
p s
# #<Sentence:0x00007fe21d8fd138 @text="Ruby is cool">
puts s.to_yaml
# --- !ruby/object:Sentence
# text: Ruby is cool
p s.empty?
# false
p s
# #<Sentence:0x00007fe21d8fd138 @text="Ruby is cool", @empty=false>
puts s.to_yaml
# --- !ruby/object:Sentence
# text: Ruby is cool
# empty: false
```

### Existing memoization libraries

There are several well-known memoization libraries out there, to name a couple: old and reliable [memoist](https://github.com/matthewrudy/memoist), new and efficient [memo_wise](https://github.com/panorama-ed/memo_wise).

They solve problems 1-3 of `||=`, and also add several cool features (like argument-dependent memoization) with a macro (this is `memo_wise`; `memoist` behaves the same, just with a different macro name):

```ruby
class Sentence
  prepend MemoWise
  # ...
  memo_wise def empty?() = tokens.all?(&:whitespace?)
end
```

Now we have a nice declarative and decoupled statement "it is memoized", which also supports booleans and `nil`s and multi-statement methods.

The problem of "detail leaking" isn't solved, though:
```ruby
p s.empty?
# false
p s
# #<Sentence:0x00007f0f474eb418 @_memo_wise={:empty?=>false}, @text="Ruby is cool">
puts s.to_yaml
# --- !ruby/object:Sentence
# _memo_wise:
#   :empty?: false
# text: Ruby is cool
```

Also, using third-party gems introduces a few new problems:
1. Performance penalty. However well it is optimized, Ruby-land "redefine method, then check it is there, then calculate" has non-zero overhead.
2. Dependency penalty. If the memoizing gem is not in the project yet, someone has to decide whether to introduce it, and for small no-dependency gems or strictly controlled codebases that might be a problem. Also, doing `prepend MemoWise` (or `extend Memoist`) is another point where the question "should I introduce this dependency?" arises (in a small class with exactly one method to memoize, for example!)
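
For illustration, the "redefine method, then check, then calculate" pattern can be sketched in a few lines of plain Ruby. `TinyMemo` and its `memo` macro are hypothetical names invented here; real gems such as `memo_wise` are considerably more elaborate and optimized, so this shows only the general shape, not their actual implementation:

```ruby
# Hypothetical module, not the code of any real gem.
module TinyMemo
  def memo(name)
    original = instance_method(name) # keep the unmemoized implementation
    raise ArgumentError, "#{name} takes arguments" unless original.arity.zero?

    # Redefine the method: every call now pays for an ivar read and a hash lookup.
    define_method(name) do
      cache = (@_tiny_memo ||= {})
      cache.fetch(name) { cache[name] = original.bind_call(self) }
    end
    name
  end
end

class Sentence
  extend TinyMemo

  def initialize(text) = @text = text

  memo def empty?() = @text.strip.empty?
end

s = Sentence.new('Ruby is cool')
p s.empty?             # false
p s.instance_variables # [:@text, :@_tiny_memo]
```

Even this tiny version shows both costs discussed above: extra work on every call, and a cache that becomes part of the object's visible state.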

## Feature proposal & Design decisions

I propose to introduce the `Module#memoized(*symbols)` method in the core, implemented in C.

1. Name: `memoize` is a typical name that the community is used to. I want the new method to look uniform with other existing "macros" that have wording suitable for the phrase "this method is {word}": `private` or `module_function`; that's why I propose the name `memoized`
2. I believe that the memoization should be fully opaque: not visible on `#inspect` or serialization; no settings or API to interact with the internal state of memoization.
3. Only argument-less methods are memoizable; `memoized def foo(any, args)` should raise an exception.
4. (Not sure about that one) The "memoized" state of the method should be inheritable. We might need a symmetric `unmemoized :foo` to override that in descendants.
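
To make points 1-3 concrete, here is a rough pure-Ruby approximation of the proposed behavior. It is only a sketch under stated assumptions: `MemoizedSketch` and its `STORE` side table are hypothetical, the module must be `extend`ed explicitly rather than living on `Module`, the inheritance question from point 4 is not addressed, and the side table holds strong references (so memoized objects are never garbage-collected), which a C implementation in core would presumably avoid:

```ruby
# Hypothetical approximation; the real proposal is a C-level Module#memoized.
module MemoizedSketch
  # Side table keyed by object identity, so nothing is stored in instance variables.
  # Limitation of this sketch: keys are held strongly and never released.
  STORE = Hash.new { |table, object| table[object] = {} }.compare_by_identity

  def memoized(*names)
    names.each do |name|
      original = instance_method(name)
      unless original.arity.zero?
        raise ArgumentError, "cannot memoize #{name}: it accepts arguments"
      end

      define_method(name) do
        results = STORE[self]
        results.fetch(name) { results[name] = original.bind_call(self) }
      end
    end
  end
end

class Test
  extend MemoizedSketch

  def foo
    puts "call!"
    5
  end

  memoized :foo
end

o = Test.new
o.foo                  # prints "call!", returns 5
o.foo                  # returns 5 without printing
p o.instance_variables # []

begin
  Test.class_eval do
    def bar(x) = x
    memoized :bar
  end
rescue ArgumentError => e
  puts e.message # cannot memoize bar: it accepts arguments
end
```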

### Non-features

There are several more features typically seen in memoization gems that I consider unsuitable for core functionality:

* No arguments-dependent memoization. I believe it is a "business logic" concern: how exactly the arguments should be stored, cache growth control (with too many argument-result pairs memoized), cache cleanup, etc. Third-party libraries can handle that.
* No cache presetting/resetting API. If "it is memoized in general, but sometimes reset", it is again a business-layer concern and shouldn't be solved by a language-level declaration. Third-party libraries can handle that.
* No extra API to memoize class methods, like we don't have a specific API for making class methods private.




