From: "jeremyevans0 (Jeremy Evans)"
Date: 2022-10-08T18:30:07+00:00
Subject: [ruby-core:110239] [Ruby master Feature#19024] Proposal: Import Modules

Issue #19024 has been updated by jeremyevans0 (Jeremy Evans).

@shioyama Thank you for that explanation, I now have a better understanding of the motivation for this proposal.

In terms of loading code, Ruby has two methods, `load` and `require`. `load` can take a wrapping module, `require` cannot. One reason for this is that `load` is not designed to be idempotent (it loads the file every time), while `require` is designed to be idempotent (it does not load the same file more than once). Since `load` is not designed to be idempotent, it can take a wrapping module, as the behavior can vary per-call. This does not apply to `require`, because `require` must be idempotent.

Fundamentally, you cannot support `require` with a wrapping module without losing the idempotency. The following code could not work and be idempotent:

```ruby
MyModule1 = Module.new
MyModule2 = Module.new
require 'foo', MyModule1
require 'foo', MyModule2
```

For similar reasons, making `require` implicitly support the currently wrapping module would break idempotency and therefore I do not think it should be considered.

In terms of purely reducing the amount of namespace boilerplate, you can use `load` for internal code loading. I think you would still want to use `require` for files in gems, since you do not control that code (internally, those gems could use a similar approach to this):

```ruby
# payments/api_clients/foo_client.rb
require "my_client_gem/api_client"
class FooClient < MyClientGem::ApiClient
  # ...
end

# payments/api_clients/bar_client.rb
require "my_client_gem/api_client"
class BarClient < MyClientGem::ApiClient
  # ...
end

# payments.rb
module Payments
  load File.expand_path("api_clients/foo_client.rb", __dir__), self
  load File.expand_path("api_clients/bar_client.rb", __dir__), self
  # do something with Payments::FooClient and Payments::BarClient
end
```

Note that the wrapping module for `load` currently supports only a single namespace, not multiple namespaces. Maybe your patch adds that, but I couldn't tell because it doesn't include tests.

Note that you can currently support multiple namespaces using `eval`. The approach seems kind of ugly, but it's basically what you seem to want in terms of implicit nesting:

```ruby
module Payments
  class Nested
    # Top level constant lookup in the *_client files uses Payments::Nested, Payments, Object
    foo_path = File.expand_path("api_clients/foo_client.rb", __dir__)
    eval File.read(foo_path), binding, foo_path
    bar_path = File.expand_path("api_clients/bar_client.rb", __dir__)
    eval File.read(bar_path), binding, bar_path
  end
  # do something with Payments::Nested::FooClient and Payments::Nested::BarClient
end
```

While I understand the goal of reducing namespace "boilerplate", I think it is important to understand that removing explicit namespaces is a tradeoff. If you do not leave the namespaces in the file, but instead let them be implicit, the code likely becomes more difficult to understand. You state that programmers would naturally prefer implicit namespaces over explicit namespaces, but I'm not sure that is true. Implicit code is not necessarily better than explicit code. What you consider "irrelevant" may be very relevant to someone who isn't familiar with the code and all of the implicit namespaces being dealt with. You describe the current state of affairs as a "terrible tradeoff", but that seems hyperbolic to me. At most, having to use explicit namespaces should be mildly annoying, even if you have full understanding of the code and can deal with implicit namespaces.
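To make the idempotency difference between `load` and `require` described earlier in this reply concrete, here is a minimal sketch. The method name `load_vs_require_counts`, the global `$idempotency_demo_runs`, and the temporary file name are hypothetical and exist only for this illustration. The sketch writes a throwaway file that bumps a counter each time it runs, then loads it twice and requires it twice.

```ruby
require "tmpdir"

# Hypothetical helper: reports how many times a throwaway file is executed
# when it is loaded twice and then required twice.
def load_vs_require_counts
  Dir.mktmpdir do |dir|
    path = File.join(dir, "idempotency_demo.rb")
    # The file increments a global counter every time it is executed.
    File.write(path, "$idempotency_demo_runs += 1\n")

    $idempotency_demo_runs = 0
    2.times { load path }              # load executes the file on every call
    load_runs = $idempotency_demo_runs

    $idempotency_demo_runs = 0
    require_results = 2.times.map { require path } # only the first call executes it
    {
      load_runs: load_runs,
      require_runs: $idempotency_demo_runs,
      require_results: require_results
    }
  end
end
```

Calling `load_vs_require_counts` should report two executions for `load` and one for `require`, with the second `require` returning `false` because the path is already recorded in `$LOADED_FEATURES`.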
In terms of encapsulation, Ruby allows trivial breaking of encapsulation even in code that uses wrapped modules. `::Foo` always refers to `Object::Foo`. You could not use wrapping module support to enforce encapsulation in Ruby.

Note that both the `load` and `eval` approaches I've shown are unlikely to work well if you have optional parts of the codebase that you would like to load in different places. In situations like that, you really need the idempotency that `require` offers, to make sure the related code is only loaded once.

@shioyama I think it would be helpful if, for each of the patches you are proposing, you include tests to make it easier to see what each patch allows and how the behavior changes. To the extent that the patches are independent, a separate pull request with tests for each would be helpful and aid review. Even though I don't think the current state of `load`/`require` is an issue worth fixing, I think each patch could be considered on its own merits.

----------------------------------------
Feature #19024: Proposal: Import Modules
https://bugs.ruby-lang.org/issues/19024#change-99526

* Author: shioyama (Chris Salzberg)
* Status: Open
* Priority: Normal
----------------------------------------
There is no general way in Ruby to load code outside of the globally-shared namespace. This makes it hard to isolate components of an application from each other and from the application itself, leading to complicated relationships that can become intractable as applications grow in size.

The growing popularity of a gem like [Packwerk](https://github.com/shopify/packwerk), which provides a new concept of "package" to enforce boundaries statically in CI, is evidence that this is a real problem. But introducing a new packaging concept and CI step is at best only a partial solution, with downsides: it adds complexity and cognitive overhead that wouldn't be necessary if Ruby provided better packaging itself (as Matz has suggested [it should](https://youtu.be/Dp12a3KGNFw?t=2956)).

There is _one_ limited way in Ruby currently to load code without polluting the global namespace: `load` with the `wrap` parameter, which as of https://bugs.ruby-lang.org/issues/6210 can now be a module. However, this option does not apply transitively to `require` calls within the loaded file, so its usefulness is limited.

My proposal here is to enable module imports by doing the following:

1. apply the `wrap` module namespace transitively to `require`s inside the loaded code, including native extensions (or provide a new flag or method that would do this),
2. make the `wrap` module the toplevel context for code loaded under it, so `::Foo` resolves to `<top_wrapper>::Foo` in loaded code (or, again, provide a new flag or method that would do this). _Also make this apply when code under the wrapper module is called outside of the load process (when `top_wrapper` is no longer set); this may be quite hard to do_.
3. resolve `name` on anonymous modules under the wrapped module to their names without the top wrapper module, so `<top_wrapper>::Foo.name` evaluates to `"Foo"`. There may be other ways to handle this problem, but a gem like Rails uses `name` to resolve filenames and fails when anonymous modules return something like `#<Module:0x...>::ActiveRecord` instead of just `ActiveRecord`.

I have roughly implemented these three things in [this patch](https://github.com/ruby/ruby/compare/master...shioyama:ruby:import_modules). This implementation is incomplete (it does not cover the last highlighted part of 2) but provides enough of a basis to implement an `import` method, which I have done in a gem called [Im](https://github.com/shioyama/im).
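The limitation mentioned above, that the `wrap` module is not applied transitively to `require` calls inside the loaded file, can be observed with a small sketch. It assumes a Ruby version in which `load` accepts a module as its second argument (the change from issue 6210). The method name `wrap_transitivity_demo` and the `WrapDemoOuter`/`WrapDemoInner` constants are hypothetical names used only for this illustration.

```ruby
require "tmpdir"

# Hypothetical helper: loads a file under a wrapper module, where that file
# in turn requires a second file, and reports where each constant ends up.
def wrap_transitivity_demo
  Dir.mktmpdir do |dir|
    inner = File.join(dir, "wrap_demo_inner.rb")
    outer = File.join(dir, "wrap_demo_outer.rb")
    File.write(inner, "class WrapDemoInner; end\n")
    # The outer file requires the inner file and defines its own class.
    File.write(outer, "require #{inner.dump}\nclass WrapDemoOuter; end\n")

    wrapper = Module.new
    load outer, wrapper # assumes `load` accepts a module as the wrap argument

    {
      outer_in_wrapper: wrapper.const_defined?(:WrapDemoOuter, false),
      outer_at_toplevel: Object.const_defined?(:WrapDemoOuter, false),
      inner_in_wrapper: wrapper.const_defined?(:WrapDemoInner, false),
      inner_at_toplevel: Object.const_defined?(:WrapDemoInner, false)
    }
  end
end
```

Under that assumption, the class defined directly in the loaded file lands inside the wrapper module, while the class defined in the required file is still defined on `Object`, which is the behavior point 1 of the proposal aims to change.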
Im provides an `import` method which can be used to import gem code under a namespace:

```ruby
require "im"
extend Im

active_model = import "active_model"
#=> <#Im::Import root: active_model>

ActiveModel
#=> NameError

active_model::ActiveModel
#=> ActiveModel

active_record = import "active_record"
#=> <#Im::Import root: active_record>

# Constants defined in the same file under different imports point to the same objects
active_record::ActiveModel == active_model::ActiveModel
#=> true
```

With the constants all loaded under an anonymous namespace, any code importing the gem can name constants however it likes:

```ruby
class Post < active_record::ActiveRecord::Base
end

AR = active_record::ActiveRecord

Post.superclass
#=> AR::Base
```

Note that this enables the importer to completely determine the naming for every constant it imports. So gems can opt to hide their dependencies by "anchoring" them inside their own namespace, like this:

```ruby
# in lib/my_gem.rb
module MyGem
  dep = import "my_gem_dependency"

  # my_gem_dependency is "anchored" under the MyGem namespace, so not exposed to users
  # of the gem unless they also require it.
  MyGemDependency = dep

  #...
end
```

There are a couple of important implementation decisions in the gem:

1. _Only load code once._ When the same file is imported again (either directly or transitively), "copy" constants from the previously imported namespace to the new namespace, using a registry that maps which namespace (import) was used to load which file (as shown above with activerecord/activemodel). This is necessary to ensure that different imports can "see" shared files. A similar registry is used to track autoloads so that they work correctly when used from imported code.
2. Toplevel core types (`NilClass`, `TrueClass`, `FalseClass`, `String`, etc) are "aliased" to constants under each import module to make them available. Thus there can be side-effects of importing code, but this allows a gem like Rails to monkeypatch core classes which it needs to do for it to work.
3. `Object.const_missing` is patched to check the caller location and resolve to the constant defined under an import, if there is an import defined for that file.

To be clear: **I think 1) should be implemented in Ruby, but not 2) and 3).** The last one (`Object.const_missing`) is a hack to support the case where a toplevel constant is referenced from a method called in imported code (at which point the `top_wrapper` is not active.)

I know this is a big proposal, and there are strong opinions held. I would really appreciate constructive feedback on this general idea.

See also similar discussion in: https://bugs.ruby-lang.org/issues/10320

--
https://bugs.ruby-lang.org/