From: merch-redmine@...
Date: 2020-08-02T15:38:31+00:00
Subject: [ruby-core:99445] [Ruby master Feature#17055] Allow suppressing uninitialized instance variable and method redefined verbose mode warnings

Issue #17055 has been updated by jeremyevans0 (Jeremy Evans).


Eregon (Benoit Daloze) wrote in #note-9:
> jeremyevans0 (Jeremy Evans) wrote in #note-6:
> > Yes.  When you initialize an instance variable to nil, it slows things down, and there is no benefit because the trying to access an uninitialized instance variable returns nil anyway (plus warning in verbose mode)
> 
> And when you don't initialize, reads might become slower, because they have to check if warnings are enabled, and it might create polymorphism (see the end of my reply).
> It's a two-sides blade.

This is correct.  Whether or not to initialize instance variables depends on the object.  For long lived objects (classes, constants), it definitely makes sense.  For ephemeral objects, it can hurt performance.

> I'd argue allocating without ever accessing or storing objects is an unrealistic benchmark.
> 
> Can you show a significant overhead on a realistic benchmark? (some Sequel real-world usage maybe?)

The last time I did testing on this in Sequel, the performance decrease from initializing instance variables to nil was around 5% for Sequel::Model instance creation depending on the plugins in use. One of the reasons it was around 5% was that many plugins had to override initialization methods just to set an instance variable and call `super`.  5% may not sound like a lot, but I can't justify a 5% performance decrease (or even a 1% performance decrease) just to avoid verbose warnings.

> Here is the result on `truffleruby 20.3.0-dev-b7a9b466` with your microbenchmark and 10 instead of 1000.
> It shows the benchmark is bad mostly (i.e., it optimizes away and does nothing useful).
> 
> ```
> truffleruby bench_ivar_set.rb 
> Warming up --------------------------------------
>          initialized   423.378M i/100ms
>        uninitialized   191.211M i/100ms
>          initialized   209.219M i/100ms
>        uninitialized    60.855M i/100ms
> Calculating -------------------------------------
>          initialized      2.093B (� 0.2%) i/s -     10.670B in   5.098008s
>        uninitialized      2.093B (� 0.4%) i/s -     10.467B in   5.000214s
>          initialized      2.019B (�14.7%) i/s -      9.624B in   5.036843s
>        uninitialized      2.122B (� 2.3%) i/s -     10.650B in   5.021645s
> 
> Comparison:
>        uninitialized: 2122273181.8 i/s
>          initialized: 2018820267.6 i/s - same-ish: difference falls within error
> ```
> Here is the file used: [bench_ivar_set.rb](https://gist.github.com/eregon/282070e6f6686740a0c8e41243fb598b).

Interesting.  If it is not too much trouble, what are the results with 1000 instead of 10, if I may ask?  Alternatively, if you modify the benchmark to access each instance variable once, what the the results of that?  I'd test myself, but it appears TruffleRuby is not yet ported to the operating systems I use (OpenBSD/Windows).

> > The warning gem has always supported this.  It was the primary reason I worked on adding the warning support to Ruby 2.4.
> 
> Isn't it good enough of a solution already? (can easily narrow by file & ivar name)
> Adding a dependency on it doesn't seem a big deal and even deserved if you want such fine control over warnings and purposefully suppress warnings.
> One can also `prepend` a module to `Warning` to handle this without an extra dependency.

In my opinion, a ruby gem should never modify the behavior (e.g. add/override methods) of core classes, unless that is the purpose of the library.  As such, modifying the Warning class is not something I would consider doing by default, and therefore people that use Sequel and want to run tests/development in verbose mode have to filter the warnings themselves.  With the feature I am proposing, each library has control over their own code.

I believe verbose warning for uninitialized instance variables in code you have no control over is actively harmful to the user, because it can make it more difficult to find warnings in code you do have control over.

> > > Calling more methods when doing warnings is adding more boilerplate, complexity and edge cases to these code paths.
> > 
> > In the uninitialized instance variable case, I was actually able to reduce three separate code paths for issuing the warning to a single code path, plus I found a case that should produce a warning that did not and fixed that.  So overall complexity could be lower for the uninitialized variable case, at least for MRI.
> 
> There might be other good things in the PR.
> It's irrelevant to the general added complexity of a new callback.
> Imagine if we wanted to add such methods for more warnings, I think that would quickly become a mess.

Certainly this approach is not appropriate for all warnings.  However, in my 15+ years of Ruby experience, I've seen that these two verbose warnings are by far the most common, and both of them have valid reasons for using the behavior that produces the verbose warnings (performance and safety).

> > The proactive approach is substantially less flexible (e.g. can't use a regexp), and would require storing the values and a significantly more complex implementation.  Considering you just complained about the complexity of my patch, I find it strange that you would propose an approach that would definitely require even greater internal complexity.
> 
> I think well-designed and Ruby-like API is more important than complexity here.

I fail to see how this is not a Ruby-like API.  It's similar to other callbacks, such as `method_added`.  It's also the simplest thing I can think of that would work.  Additionally, as @byroot mentioned, it may be possible to use this approach with did_you_mean for even more helpful warnings.

> > The advantage of this approach is that it allows you to get the maximum possible performance in production, while suppressing unnecessary warnings in development or testing when you may run with verbose warnings.  Without this, you either need to give up some production performance, or you have to require the user install a separate library to filter out the verbose warnings.
> 
> My view is it encourages less readable code by trying to squeak a tiny bit of performance when running on current CRuby.

This is only half about performance.  You haven't mentioned anything about the method redefinition warning yet.  Can you provide your thoughts on that?

> In other words, lazily initializing @ivars causes reads to need some branching because they need to handle both the initialized and uninitialized cases.
> So @ivars reads can no longer be straight-line code, which can impact performance as shown above.

As I stated above, whether you want to initialize instance variables lazily or eagerly for performance depends on the object in question.  Not all situations are the same.  It is faster in some situations and slower in others.

> Also if the natural default is not `nil` but say `0` then `@foo = 0` in `initialize` is useful information (notably it gives the type), and it can be a simple `attr_reader :foo` to read it instead of the convoluted:
> ```ruby
> def foo
>   @foo || 0
> end
> ```

I think we can all agree that there are cases where eagerly initializing the object can improve performance.  This says nothing about the cases where eagerly initializing the object hurts performance.

----------------------------------------
Feature #17055: Allow suppressing uninitialized instance variable and method redefined verbose mode warnings
https://bugs.ruby-lang.org/issues/17055#change-86900

* Author: jeremyevans0 (Jeremy Evans)
* Status: Open
* Priority: Normal
----------------------------------------
These two verbose mode warnings are both fairly common and have good reasons why you would not want to warn about them in specific cases.  Not initializing instance variables to nil can be much better for performance, and redefining methods without removing the method first is the only safe approach in multi-threaded code.

There are reasons that you may want to issue verbose warnings by default in these cases.  For uninitialized instance variables, it helps catch typos. For method redefinition, it could alert you that a method already exists when you didn't expect it to, such as when a file is loaded multiple times when it should only be loaded once.

I propose we keep the default behavior the same, but offer the ability to opt-out of these warnings by defining methods.  For uninitialized instance variables in verbose mode, I propose we call `expected_uninitialized_instance_variable?(iv)` on the object.  If this method doesn't exist or returns false/nil, we issue the warning.  If the method exists and returns true, we suppress the warning.  Similarly, for redefined methods, we call `expected_redefined_method?(method_name)` on the class or module.  If the method doesn't exist or returns false/nil, we issue the warning.  If the method exists and returns true, we suppress the warning.

This approach allows high performance code (uninitialized instance variables) and safe code (redefining methods without removing) to work without verbose mode warnings.

I have implemented this support in a pull request: https://github.com/ruby/ruby/pull/3371



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>