From: "Eregon (Benoit Daloze) via ruby-core" <ruby-core@...>
Date: 2023-08-29T20:15:59+00:00
Subject: [ruby-core:114584] [Ruby master Misc#19772] API naming for YARP compiler

Issue #19772 has been updated by Eregon (Benoit Daloze).


> Does this approach sound good to everyone, or are there any other requirements that I have missed?

These may not be requirements but they seem very important considerations for YARP adoption & usability:

* Inside a single application with a Gemfile, one may want to parse with two different non-current Ruby grammar versions. That is not possible with YARP as proposed, but it is possible with the parser gem. Given enough tooling gems in a Gemfile this does not seem unlikely: suppose RuboCop parses as 3.2 while some lint tool uses 3.1 because it doesn't handle new syntax/nodes from 3.2 yet. Another example: testing with ruby-head, with RuboCop targeting 3.1 and ruby-lsp/solargraph targeting 3.2 in the Gemfile.
* To make RuboCop fast, I think it is well understood that it needs to use YARP, most likely by having the parser gem use YARP. But if YARP can only parse current Ruby + one specific version, it is rather unlikely that the YARP version will match the RuboCop TargetRubyVersion. Having to specify `gem "yarp", "~> 3.2.0"` seems very inconvenient. I would think most Rubyists would not understand this requirement (e.g. why is it suddenly slow after updating yarp in Gemfile.lock?) or would find it excessively annoying (e.g. if the parser gem warns about it and requires changing the Gemfile manually, which is yet another place to hardcode the Ruby version, and it would work poorly for gems supporting multiple Ruby versions).

The solution to both of these is to support multiple Ruby grammar versions in a single gem release, like the parser gem.
Then it is always safe to update YARP, and the Ruby interpreter just ships with one version of YARP as a bundled gem (+ the C code for the parser used by the interpreter).
My feeling is that this is actually not much more work than 3 release branches plus some hacky way to have both some version + the current Ruby version working together.
What are your concerns with supporting multiple Ruby versions inside one YARP version?
And what is your vision for RuboCop & parser gem to use YARP?
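
To make the idea concrete, here is a minimal sketch of what "multiple grammar versions in a single gem release" could look like from the caller's side. Everything in it is hypothetical: `MultiGrammar`, `SUPPORTED`, `Result` and the `version:` keyword are made up for illustration and are not YARP's or the parser gem's API, and the "parse" step is a stub.

```ruby
# Hypothetical sketch only: none of these names exist in YARP or the parser gem.
module MultiGrammar
  # Grammar versions bundled in this one (imaginary) gem release.
  SUPPORTED = %w[3.1 3.2 3.3].freeze

  # Stand-in for a parse result; a real parser would return an AST.
  Result = Struct.new(:grammar_version, :source)

  # Each caller picks the grammar version it wants, per call.
  def self.parse(source, version:)
    unless SUPPORTED.include?(version)
      raise ArgumentError, "unsupported grammar version: #{version}"
    end
    Result.new(version, source)
  end
end

# Two tools in the same process, each using a different grammar version.
lint_result = MultiGrammar.parse("foo(1)", version: "3.1")
lsp_result  = MultiGrammar.parse("foo(1)", version: "3.2")
```

The point is only that the version is chosen by each caller rather than by which gem version happens to be in Gemfile.lock.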

Regarding having both some version + current Ruby version working together:

* About adding fields: that may work for Ruby nodes, but it will not and cannot work for adding/removing/renaming/reordering fields in C structs, or for serializing more/fewer fields or serializing them differently. For example, suppose there is some new attribute on DefNode, or some new IntegerNode int32_value field, or some extra location information is desired: it is impossible to add that without changing C structs & serialization. The same applies if serialization switches to a different, more optimized format. As I said before, I believe it cannot possibly work to have e.g. the yarp 3.5.0 gem installed in CRuby 3.3.0 and hope that the serialization will be compatible between both (and vice versa). It will break at the first field or node change between these 2 versions.
* The only way to support current + another version is two full copies of all files (or IOW the version of yarp "integrated with the interpreter" is just completely separate from the yarp gem version, they would share nothing), and different namespaces, as I wrote in https://bugs.ruby-lang.org/issues/19772#note-20. That feels a bit hacky and less convenient for users but it would work.
It has the advantage of really using the same parser for the Ruby API as the interpreter uses. But if there is an issue with the Ruby API for the "current version", it cannot be fixed except with a CRuby patch release (i.e. installing a newer yarp gem has no effect on the current-version parser namespace, called Ruby::Parser above).
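
A small illustration of the serialization concern, using `Array#pack` / `String#unpack`. The byte layouts are invented for this example and are not YARP's actual serialization format; the method names are hypothetical too. It only shows that a reader built for one layout silently misreads data written with an extra field.

```ruby
# Invented layouts for illustration; not YARP's real serialization format.

# "Version A" writer: type tag (1 byte), start offset and length (32-bit LE each).
def serialize_version_a(start, length)
  [1, start, length].pack("CVV")
end

# "Version B" writer: adds an int32 value field right after the tag.
def serialize_version_b(value, start, length)
  [1, value, start, length].pack("Cl<VV")
end

# Reader that only knows version A's layout.
def read_version_a(bytes)
  _tag, start, length = bytes.unpack("CVV")
  { start: start, length: length }
end

# Matching writer and reader: the location round-trips.
matching = read_version_a(serialize_version_a(10, 2))

# Mismatched writer and reader: the value field is read as the start offset,
# and the start offset is read as the length.
mismatched = read_version_a(serialize_version_b(42, 10, 2))
```

Here `matching` is `{ start: 10, length: 2 }`, while `mismatched` is `{ start: 42, length: 10 }`: no error is raised, the data is just wrong.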

> Those factory methods will check the target Ruby version and handle the fields appropriately for that version of Ruby.

That sounds rather expensive and slow, and it will be an overhead on every YARP.parse. Extra method calls and indirections have a non-trivial overhead, at least on CRuby.
Plus having to maintain that manually sounds quite error-prone and verbose, and difficult to test properly.
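
For reference, this is roughly the shape I imagine for such a factory method. It is only my guess at the idea: `NodeFactory`, `def_node`, `extra_attribute` and the `target:` keyword are hypothetical, and the `DefNode` struct here is a stand-in, not YARP's node class.

```ruby
# Hypothetical sketch of a version-checking factory; not YARP's API.

# Stand-in node type. Suppose extra_attribute only exists from 3.3 on.
DefNode = Struct.new(:name, :location, :extra_attribute)

module NodeFactory
  # target is a [major, minor] pair, e.g. [3, 2].
  # The version comparison runs on every node that gets created.
  def self.def_node(name, location, extra_attribute, target:)
    if (target <=> [3, 3]) >= 0
      DefNode.new(name, location, extra_attribute)
    else
      # Older target: the field is dropped.
      DefNode.new(name, location, nil)
    end
  end
end

node_32 = NodeFactory.def_node(:foo, 0...10, :some_flag, target: [3, 2])
node_33 = NodeFactory.def_node(:foo, 0...10, :some_flag, target: [3, 3])
```

Every node type with a version-dependent field would need a branch like this, written and tested by hand, and the check sits on the path of every node creation.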

> The YARP name is not good, since we don't want YA- prefixes in production code.

I like YARP and I think it should stay since it's already established.
I see YARP as just an abbreviation (I pronounce it "YARP" not "Yet Another Ruby Parser", same for YARV, YJIT, etc).
Similarly, whenever I read CSV in code I don't think "Comma-Separated Values", it's just "CSV".
And `Yet Another` is really a good fit here: we have had so many parser projects and attempts, and I think this is the right one to unify them all.

----------------------------------------
Misc #19772: API naming for YARP compiler
https://bugs.ruby-lang.org/issues/19772#change-104399

* Author: jemmai (Jemma Issroff)
* Status: Open
* Priority: Normal
----------------------------------------
We are working on the YARP compiler, and have [the first PR ready](https://github.com/ruby/ruby/pull/8042) which introduces the YARP compile method. Our only outstanding question before merging it is about naming. How should we expose the public API for YARP's compile method?

Potential suggestions:

1. YARP.compile
2. RubyVM::InstructionSequence.compile(yarp: true)
3. RubyVM::InstructionSequence.compile_yarp
4. Any of the above options, with a name other than yarp (please suggest an alternative)

Regarding option 1, which would mirror `YARP.parse`, is the top level constant `YARP` acceptable?

cc @matz @ko1 


-- 
https://bugs.ruby-lang.org/