From: "austin (Austin Ziegler)" <noreply@...>
Date: 2022-10-04T14:29:34+00:00
Subject: [ruby-core:110183] [Ruby master Feature#19024] Proposal: Import Modules

Issue #19024 has been updated by austin (Austin Ziegler).


shioyama (Chris Salzberg) wrote in #note-11:
> That said, to be clear, the Ruby patch does not actually hit this conflict, it's the gem that does. The patch only requires a file once, _in whatever context it was required in_. If you require it in a wrap context, that's where the code is required, period. If you try to require the same file again from toplevel, or from a different wrap context, you get `false` and nothing happens.

I'm still very much against this concept, because this rule will *absolutely* cause code to break. What you're describing is something that *only* really has value for applications, but because you're extending transitivity to `require`, you will end up hiding shared dependencies without the "benefit" of being able to load the same code (or multiple versions of the same code) more than once like in JavaScript. Worse, given the existence of `autoload`, tracking down these issues would itself be a bit of a heisenbug-hunt.

```ruby
# a.rb
require 'faraday'

# b.rb
require 'faraday'

# app.rb
api1 = import "a"
api2 = import "b"

require 'faraday'

api1::Faraday # => Faraday
api2::Faraday # => NameError: uninitialized constant Faraday
Faraday # => NameError: uninitialized constant Faraday
```

Yes, the fix is easy: `require 'faraday'` before doing any `import`s, but that breaks with autoload and without eager loading (not every Ruby application is using Rails with its use of Bundler eager loading).

The *only* ways that you can make any of this work with the reality of Ruby's ecosystem are: (a) allow dependencies to opt out of being wrapped (which makes this misfeature less useful), (b) make it something that gems and app code can both opt into (e.g., something like a `package_constant`), or (c) make `$LOADED_FEATURES` unique per context (thereby allowing the same code to be loaded into memory more than once, which is one of JS's biggest misfeatures).
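For reference, a minimal sketch of the behaviour that option (c) would change: today `require` is deduplicated through the single global `$LOADED_FEATURES` list, so a second `require` of the same file returns `false` and evaluates nothing, no matter who asks. The method name `require_twice` and the file name are made up for this illustration.

```ruby
require "tmpdir"

# Requires the same throwaway file twice and reports what happened.
# `require_twice` and "loaded_once_demo.rb" are hypothetical names.
def require_twice
  Dir.mktmpdir do |dir|
    path = File.join(File.realpath(dir), "loaded_once_demo.rb")
    File.write(path, "# intentionally empty\n")

    first  = require(path) # true: the file is evaluated and recorded
    second = require(path) # false: already recorded, nothing is evaluated
    listed = $LOADED_FEATURES.any? { |f| f.end_with?("loaded_once_demo.rb") }

    [first, second, listed]
  end
end

p require_twice # => [true, false, true]
```

Because the list is global, whichever context requires a file first "wins"; every later `require` only sees `false`.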

> > Would there be things that would work in one mode and not in the other within Nokogiri?
> 
> I'd need to look more closely at Nokogiri to answer that, so far I've been focusing on Rails. But I'd be glad to do that.

The problem here isn't so much Nokogiri on its own, but the fact that Nokogiri is a compiled extension. Any dependency that loads a compiled extension is going to have assumptions baked into the compiled code, and this would absolutely break those assumptions. And compiled extensions cannot typically be loaded more than once regardless of anything else.

> Putting my cards on the table, I personally have always found the assumption that all Ruby code is loaded from toplevel to be one of Ruby's biggest weaknesses. That's my view, and I'm happy to elaborate on it, but my focus right now will be objectively on whether this toplevel-centric design is inevitable or not.

Please elaborate on this, as I can only think of a handful of languages (most descended from JavaScript) where code is *not* referenced from the top-level, and they all have the much bigger weakness of being able to load the same code multiple times in multiple contexts such that you cannot be certain whether two related pieces of code are running the same version. Barring some absolute trickery (which I've done before) and (to some degree) refinements (which I still haven't used), you can be guaranteed that if you're calling a method, all calls to that method will be the *same* method.

----------------------------------------
Feature #19024: Proposal: Import Modules
https://bugs.ruby-lang.org/issues/19024#change-99458

* Author: shioyama (Chris Salzberg)
* Status: Open
* Priority: Normal
----------------------------------------
There is no general way in Ruby to load code outside of the globally-shared namespace. This makes it hard to isolate components of an application from each other and from the application itself, leading to complicated relationships that can become intractable as applications grow in size.

The growing popularity of a gem like [Packwerk](https://github.com/shopify/packwerk), which provides a new concept of "package" to enforce boundaries statically in CI, is evidence that this is a real problem. But introducing a new packaging concept and CI step is at best only a partial solution, with downsides: it adds complexity and cognitive overhead that wouldn't be necessary if Ruby provided better packaging itself (as Matz has suggested [it should](https://youtu.be/Dp12a3KGNFw?t=2956)).

There is _one_ limited way in Ruby currently to load code without polluting the global namespace: `load` with the `wrap` parameter, which as of https://bugs.ruby-lang.org/issues/6210 can now be a module. However, this option does not apply transitively to `require` calls within the loaded file, so its usefulness is limited.
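A minimal sketch of that limitation, using only the long-standing boolean form `load(path, true)` so that it does not depend on the module form of `wrap`. The file and class names (`WrapDemoInner`, `WrapDemoOuter`) and the helper `wrap_leak_demo` are invented for the illustration.

```ruby
require "tmpdir"

# Loads an "outer" file under an anonymous wrapper; the outer file in turn
# requires an "inner" file. All names here are hypothetical.
def wrap_leak_demo
  Dir.mktmpdir do |dir|
    inner = File.join(dir, "wrap_demo_inner.rb")
    outer = File.join(dir, "wrap_demo_outer.rb")
    File.write(inner, "class WrapDemoInner; end\n")
    File.write(outer, "require #{inner.inspect}\nclass WrapDemoOuter; end\n")

    load(outer, true) # wrap: constants defined directly in `outer` stay off Object

    {
      outer_at_toplevel: Object.const_defined?(:WrapDemoOuter), # wrapped, so false
      inner_at_toplevel: Object.const_defined?(:WrapDemoInner)  # leaked via require, so true
    }
  end
end

p wrap_leak_demo
```

The class defined directly in the wrapped file stays out of the global namespace, while the class pulled in through `require` lands at toplevel anyway.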

My proposal here is to enable module imports by doing the following:

1. apply the `wrap` module namespace transitively to `require`s inside the loaded code, including native extensions (or provide a new flag or method that would do this),
2. make the `wrap` module the toplevel context for code loaded under it, so `::Foo` resolves to `<top_wrapper>::Foo` in loaded code (or, again, provide a new flag or method that would do this). _Also make this apply when code under the wrapper module is called outside of the load process (when `top_wrapper` is no longer set); this may be quite hard to do_.
3. resolve `name` on anonymous modules under the wrapped module to their names without the top wrapper module, so `<top_wrapper>::Foo.name` evaluates to `"Foo"`. There may be other ways to handle this problem, but a gem like Rails uses `name` to resolve filenames and fails when anonymous modules return something like `#<Module: ...>::ActiveRecord` instead of just `ActiveRecord`.
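To illustrate the naming problem in point 3 on an unpatched Ruby (2.7 or later is assumed), the sketch below nests a module under an anonymous module and inspects its `name`. The helper `strip_wrapper_prefix` is hypothetical and only shows the kind of result the proposal is after; it is not how the patch implements it.

```ruby
# Returns the name Ruby currently reports for a module nested under an
# anonymous wrapper. `ActiveRecordLike` is a made-up constant name.
def anonymous_nested_name
  wrapper = Module.new
  wrapper.const_set(:ActiveRecordLike, Module.new)
  wrapper::ActiveRecordLike.name
end

# Hypothetical helper: drops a leading anonymous-module prefix from a name.
def strip_wrapper_prefix(name)
  name.sub(/\A#<Module:0x\h+>::/, "")
end

full_name = anonymous_nested_name
puts full_name                       # something like "#<Module:0x...>::ActiveRecordLike"
puts strip_wrapper_prefix(full_name) # "ActiveRecordLike"
```

Code that turns `name` into a file path breaks on the first form, which is why the proposal wants the second form to be returned directly.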

I have roughly implemented these three things in [this patch](https://github.com/ruby/ruby/compare/master...shioyama:ruby:import_modules). This implementation is incomplete (it does not cover the last highlighted part of 2) but provides enough of a basis to implement an `import` method, which I have done in a gem called [Im](https://github.com/shioyama/im).

Im provides an `import` method which can be used to import gem code under a namespace:

```ruby
require "im"
extend Im

active_model = import "active_model"
#=> <#Im::Import root: active_model>

ActiveModel
#=> NameError

active_model::ActiveModel
#=> ActiveModel

active_record = import "active_record"
#=> <#Im::Import root: active_record>

# Constants defined in the same file under different imports point to the same objects
active_record::ActiveModel == active_model::ActiveModel
#=> true
```

With the constants all loaded under an anonymous namespace, any code importing the gem can name constants however it likes:

```ruby
class Post < active_record::ActiveRecord::Base
end

AR = active_record::ActiveRecord

Post.superclass
#=> AR::Base
```

Note that this enables the importer to completely determine the naming for every constant it imports. So gems can opt to hide their dependencies by "anchoring" them inside their own namespace, like this:

```ruby
# in lib/my_gem.rb
module MyGem
  dep = import "my_gem_dependency"

  # my_gem_dependency is "anchored" under the MyGem namespace, so not exposed to users
  # of the gem unless they also require it.
  MyGemDependency = dep

  #...
end
```

There are a couple of important implementation decisions in the gem:

1. _Only load code once._ When the same file is imported again (either directly or transitively), "copy" constants from previously imported namespace to the new namespace using a registry which maps which namespace (import) was used to load which file (as shown above with activerecord/activemodel). This is necessary to ensure that different imports can "see" shared files. A similar registry is used to track autoloads so that they work correctly when used from imported code.
2. Toplevel core types (`NilClass`, `TrueClass`, `FalseClass`, `String`, etc) are "aliased" to constants under each import module to make them available. Thus there can be side-effects of importing code, but this allows a gem like Rails to monkeypatch core classes which it needs to do for it to work.
3. `Object.const_missing` is patched to check the caller location and resolve to the constant defined under an import, if there is an import defined for that file.
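The "only load code once" idea in point 1 can be sketched roughly as follows. This is not Im's actual code: `ImportRegistry` is a hypothetical class, and a block stands in for "load this file under the given namespace", so the sketch runs on an unpatched Ruby.

```ruby
# Hypothetical registry: remembers which namespace first "loaded" a path and
# copies its constants into any later namespace that imports the same path.
class ImportRegistry
  def initialize
    @loaded = {} # path => namespace that first loaded it
  end

  # The block is a stand-in for loading `path` under `into`; it runs only the
  # first time a given path is imported.
  def import(path, into: Module.new)
    source = @loaded[path]
    if source.nil?
      yield into
      @loaded[path] = into
    elsif !source.equal?(into)
      source.constants(false).each do |name|
        next if into.const_defined?(name, false)
        into.const_set(name, source.const_get(name, false))
      end
    end
    into
  end
end

registry = ImportRegistry.new
loads = 0
definer = lambda do |ns|
  loads += 1
  ns.const_set(:ActiveModelLike, Module.new) # made-up constant
end

first  = registry.import("active_model_like.rb", &definer)
second = registry.import("active_model_like.rb", &definer)

puts loads                                                    # 1
puts first::ActiveModelLike.equal?(second::ActiveModelLike)   # true
```

Both namespaces end up pointing at the same object, which matches the `active_record::ActiveModel == active_model::ActiveModel` behaviour shown earlier.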

To be clear: **I think 1) should be implemented in Ruby, but not 2) and 3).** The last one (`Object.const_missing`) is a hack to support the case where a toplevel constant is referenced from a method called in imported code (at which point the `top_wrapper` is not active.)

I know this is a big proposal, and there are strong opinions held. I would really appreciate constructive feedback on this general idea.

See also similar discussion in: https://bugs.ruby-lang.org/issues/10320



-- 
https://bugs.ruby-lang.org/
