From: "Eregon (Benoit Daloze) via ruby-core"
Date: 2025-12-15T19:55:11+00:00
Subject: [ruby-core:124217] [Ruby Bug#21783] {Method,UnboundMethod,Proc}#source_location returns columns in bytes and not in characters

Issue #21783 has been updated by Eregon (Benoit Daloze).

From https://bugs.ruby-lang.org/issues/6012#note-25 @matz said adding column was OK, but not byte offsets.
I'm not sure what his reasons were, but maybe it's that byte offsets are too low-level for `source_location`?
If so, I would think byte columns are also too low-level and it should be character columns instead.

From a user POV character columns seem better and more expected.

OTOH, I understand the reservation from @kddnewton and I share it as a Ruby implementer: it's much simpler to return byte columns.
For example in TruffleRuby we currently save location information by having `int32_t start_offset; int32_t length;` in every Truffle AST node, i.e. byte offset and byte length.
Returning byte columns from that is easy and only requires the "newline offsets" array, and not the actual source code.
To return character columns, TruffleRuby would need to read from the beginning of the line to the byte offset to count how many characters that is, and it would need to keep the source code in memory (currently TruffleRuby does keep it in memory, but it might not in the future).

I have also seen this in the context of [adding Prism.node_for](https://github.com/ruby/prism/pull/3808), and for that usage having byte columns is actually easier than character columns. OTOH it's not hard to convert from character columns to byte columns in that case, and I already wrote the logic for that (because I expected `source_location` would return character columns, even before reading the docs).

It is of course possible to convert from character column to byte column and vice versa, but it requires access to the source code, which is not always available (e.g. `eval`).
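To make the trade-off concrete, here is a minimal Ruby sketch of the two conversions. All helper names are hypothetical (they are not TruffleRuby, Prism, or CRuby API), and the sample source with the identifier `été` is an assumption chosen because it contains two 2-byte UTF-8 characters. The first step needs only the array of line-start byte offsets; the character/byte column conversions need the text of the line.

```ruby
# Hypothetical helpers for illustration only; not TruffleRuby, Prism, or CRuby API.

# Byte offset at which each line starts (the "newline offsets" array).
def line_start_offsets(source)
  offsets = [0]
  source.each_byte.with_index do |byte, index|
    offsets << index + 1 if byte == 0x0A # "\n"
  end
  offsets
end

# Byte offset -> [1-based line, 0-based byte column].
# Needs only the offsets array, not the source text.
def line_and_byte_column(offsets, byte_offset)
  next_line = offsets.bsearch_index { |start| start > byte_offset } || offsets.size
  line_index = next_line - 1
  [line_index + 1, byte_offset - offsets[line_index]]
end

# Byte column -> character column. Needs the text of the line.
def byte_to_char_column(line_text, byte_column)
  line_text.byteslice(0, byte_column).length
end

# Character column -> byte column. Also needs the text of the line.
def char_to_byte_column(line_text, char_column)
  line_text[0, char_column].bytesize
end

source = "x = 1\ndef été; end\n"
offsets = line_start_offsets(source)                   # => [0, 6, 21]
line, byte_column = line_and_byte_column(offsets, 20)  # => [2, 14]
line_text = source.lines[line - 1]                     # this step requires the source
byte_to_char_column(line_text, byte_column)            # => 12
char_to_byte_column(line_text, 12)                     # => 14
```

Note that `byte_to_char_column` assumes the byte column falls on a character boundary; a column pointing into the middle of a multibyte character would produce an invalid partial string and a misleading count.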
----------------------------------------
Bug #21783: {Method,UnboundMethod,Proc}#source_location returns columns in bytes and not in characters
https://bugs.ruby-lang.org/issues/21783#change-115700

* Author: Eregon (Benoit Daloze)
* Status: Open
* ruby -v: ruby 4.0.0dev (2025-12-14T07:11:02Z master 711d14992e) +PRISM [x86_64-linux]
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
The documentation says:
```
= Proc.source_location

(from ruby core)
------------------------------------------------------------------------
  prc.source_location -> [String, Integer, Integer, Integer, Integer]

------------------------------------------------------------------------

Returns the location where the Proc was defined. The returned Array contains:
  (1) the Ruby source filename
  (2) the line number where the definition starts
  (3) the column number where the definition starts
  (4) the line number where the definition ends
  (5) the column number where the definitions ends

This method will return nil if the Proc was not defined in Ruby (i.e. native).
```

So it talks about column numbers, so it should be a number of characters and not of bytes.

But currently it's a number of bytes:
```
$ ruby --parser=prism -ve 'def été; end; p method(:été).source_location'
ruby 4.0.0dev (2025-12-14T07:11:02Z master 711d14992e) +PRISM [x86_64-linux]
["-e", 1, 0, 1, 14]

$ ruby --parser=parse.y -ve 'def été; end; p method(:été).source_location'
ruby 4.0.0dev (2025-12-14T07:11:02Z master 711d14992e) [x86_64-linux]
["-e", 1, 0, 1, 14]
```

The last number should be 12 because `"def été; end".size` is 12 characters.

This is a Ruby-level API so I would never expect "byte columns" here, I think it's clear it should be a number of "editor columns" i.e. a number of characters.
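As a rough illustration of what a caller has to do today to get "editor columns", the sketch below converts the byte columns of a 5-element location array into character columns. The helper `character_columns` is hypothetical, the location array is hard-coded from the output reported above rather than obtained by calling `source_location`, and the method name `été` in the sample line is an assumption (any name with two 2-byte UTF-8 characters gives the same numbers). It only works when the source lines are available.

```ruby
# Hypothetical helper, not part of Ruby: rewrite the byte columns of a
# [file, start_line, start_column, end_line, end_column] array as
# character columns, given the lines of the source file.
def character_columns(location, source_lines)
  file, start_line, start_column, end_line, end_column = location
  [
    file,
    start_line,
    source_lines[start_line - 1].byteslice(0, start_column).length,
    end_line,
    source_lines[end_line - 1].byteslice(0, end_column).length
  ]
end

# Location array as reported in the issue (byte columns).
location = ["-e", 1, 0, 1, 14]
# Assumed content of the -e script.
source_lines = ["def été; end; p method(:été).source_location"]

character_columns(location, source_lines) # => ["-e", 1, 0, 1, 12]
```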