[ruby-core:118313] [Ruby master Feature#20576] Add MatchData#bytebegin and MatchData#byteend
From: "shugo (Shugo Maeda) via ruby-core" <ruby-core@...>
Date: 2024-06-13 06:22:47 UTC
List: ruby-core #118313
Issue #20576 has been updated by shugo (Shugo Maeda).
matz (Yukihiro Matsumoto) wrote in #note-3:
> I understand the use-case. I agree with the addition of the feature, but I don't like the name. The names `bytebegin`, `byteend` are follow the `byteindex` tradition, but it is very hard to read (especially `byteend`). Any other name suggestions?
I came up with names `begin_in_bytes` and `end_in_bytes`, but `byte_begin` / `byte_end` suggested by Eregon may be better.
----------------------------------------
Feature #20576: Add MatchData#bytebegin and MatchData#byteend
https://bugs.ruby-lang.org/issues/20576#change-108822
* Author: shugo (Shugo Maeda)
* Status: Open
* Target version: 3.4
----------------------------------------
I'd like to propose MatchData#bytebegin and MatchData#byteend.
These methods are similar to MatchData#begin and MatchData#end, but return offsets in bytes instead of codepoints.
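To illustrate the difference between codepoint offsets and byte offsets, here is a minimal sketch. The helpers `byte_begin_of` and `byte_end_of` are hypothetical names used only for this example, not part of the proposal. They emulate byte-based offsets using MatchData#begin, MatchData#end and String#bytesize, and assume that the requested group actually matched.

```ruby
# Hypothetical helpers (illustration only, not the proposed implementation):
# derive byte offsets from the codepoint offsets MatchData already provides.
def byte_begin_of(match, n = 0)
  # Take the characters before the start of group n and count their bytes.
  match.string[0, match.begin(n)].bytesize
end

def byte_end_of(match, n = 0)
  # Take the characters up to the end of group n and count their bytes.
  match.string[0, match.end(n)].bytesize
end

# Three hiragana characters, each 3 bytes long in UTF-8.
text = "\u3042\u3044\u3046"
m = /\u3044/.match(text)

char_offsets = [m.begin(0), m.end(0)]              # offsets in codepoints
byte_offsets = [byte_begin_of(m), byte_end_of(m)]  # offsets in bytes

puts "codepoints: #{char_offsets.inspect}"
puts "bytes:      #{byte_offsets.inspect}"
```

For this input the codepoint offsets are 1 and 2, while the byte offsets are 3 and 6. Unlike the helpers above, the proposed methods would presumably return the byte offsets directly, without slicing the string.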
Pull request: https://github.com/ruby/ruby/pull/10973
One of the use cases is scanning strings: https://github.com/ruby/net-imap/pull/286/files
MatchData#byteend is faster than MatchData#byteoffset because there is no need to allocate an Array.
Here's a benchmark result:
```
voyager:ruby$ cat b.rb
require "benchmark"
require "strscan"

text = "あ" * 100000

Benchmark.bmbm do |b|
  b.report("byteoffset(0)[1]") do
    pos = 0
    while text.byteindex(/\G./, pos)
      pos = $~.byteoffset(0)[1]
    end
  end

  b.report("byteend(0)") do
    pos = 0
    while text.byteindex(/\G./, pos)
      pos = $~.byteend(0)
    end
  end
end
voyager:ruby$ ./tool/runruby.rb b.rb
Rehearsal ----------------------------------------------------
byteoffset(0)[1]   0.020558   0.000393   0.020951 (  0.020963)
byteend(0)         0.018149   0.000000   0.018149 (  0.018151)
------------------------------------------- total: 0.039100sec

                       user     system      total        real
byteoffset(0)[1]   0.020821   0.000000   0.020821 (  0.020822)
byteend(0)         0.017455   0.000000   0.017455 (  0.017455)
```
-- 
https://bugs.ruby-lang.org/