[ruby-core:79006] [Ruby trunk Bug#13110] Byte-based operations for String
From: shugo@...
Date: 2017-01-06 23:54:16 UTC
List: ruby-core #79006
Issue #13110 has been updated by Shugo Maeda.
Eric Wong wrote:
> For reading and parsing operations, I'm not sure they're needed
> because IO#read/read_nonblock/etc all return binary strings when
> passed explicit length arg; and //n exists for Regexp. (And any
> socket server reading without a length arg would be dangerous)
Let me clarify my intention.
I'd like to handle not only single-byte characters but also multibyte
characters efficiently with byte-based operations.
Once a string has been scanned, we have a byte offset, so we shouldn't
need to scan the string from the beginning again, but the current API
forces us to do so.
In the following example, the byteindex version is much faster than
the index version.
```
lexington:ruby$ cat bench.rb
require "benchmark"

s = File.read("README.ja.md") * 10

Benchmark.bmbm do |x|
  x.report("index") do
    pos = 0
    n = 0
    loop {
      break unless s.index(/\p{Han}/, pos)
      n += 1
      _, pos = Regexp.last_match.offset(0)
    }
  end
  x.report("byteindex") do
    pos = 0
    n = 0
    loop {
      break unless s.byteindex(/\p{Han}/, pos)
      n += 1
      _, pos = Regexp.last_match.byteoffset(0)
    }
  end
end
lexington:ruby$ ./ruby bench.rb
Rehearsal ---------------------------------------------
index 1.060000 0.010000 1.070000 ( 1.116932)
byteindex 0.000000 0.010000 0.010000 ( 0.004501)
------------------------------------ total: 1.080000sec
user system total real
index 1.050000 0.000000 1.050000 ( 1.080099)
byteindex 0.000000 0.000000 0.000000 ( 0.003814)
```
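The gap between the two results comes down to character offsets versus byte offsets. Here is a minimal sketch of that distinction, using only methods that already exist on String (`[]`, `bytesize`, `byteslice`, `length`). The helper names `char_to_byte_offset` and `byte_to_char_offset` are hypothetical and are not part of the patch. In a variable-width encoding such as UTF-8, both conversions have to look at the prefix of the string, which is the repeated work the character-based loop pays for.

```ruby
# Hypothetical helpers for illustration only; they are not part of the patch.

# Character index -> byte offset: the prefix is extracted and its bytes counted.
def char_to_byte_offset(str, char_index)
  str[0, char_index].bytesize
end

# Byte offset -> character index: the prefix is sliced by bytes and its
# characters counted. Assumes byte_offset falls on a character boundary.
def byte_to_char_offset(str, byte_offset)
  str.byteslice(0, byte_offset).length
end

s = "あああいいいあああ"          # 9 characters, 3 bytes each in UTF-8
p char_to_byte_offset(s, 6)   #=> 18
p byte_to_char_offset(s, 18)  #=> 6
```

A byte-based API lets the caller keep the byte offset from the previous match and skip these conversions entirely.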
----------------------------------------
Bug #13110: Byte-based operations for String
https://bugs.ruby-lang.org/issues/13110#change-62409
* Author: Shugo Maeda
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v:
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
How about adding byte-based operations to String?
```
s = "あああいいいあああ"
p s.byteindex(/ああ/, 4) #=> 18
x, y = Regexp.last_match.byteoffset(0) #=> [18, 24]
s.bytesplice(x...y, "おおお")
p s #=> "あああいいいおおおあ"
```
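For comparison, the splice step above can be roughly approximated with the existing `String#byteslice`. This is only a sketch: `byte_splice` is a hypothetical helper, not the `bytesplice` from the attached patch. It returns a new string instead of modifying the receiver, and it assumes both ends of the range fall on character boundaries.

```ruby
# Hypothetical helper for illustration; not the bytesplice from the patch.
# Returns a new string rather than modifying str in place.
def byte_splice(str, byte_range, replacement)
  first = byte_range.begin
  # Normalize the range end to an exclusive byte offset.
  last = byte_range.exclude_end? ? byte_range.end : byte_range.end + 1
  str.byteslice(0, first) + replacement + str.byteslice(last, str.bytesize - last)
end

s = "あああいいいあああ"
puts byte_splice(s, 18...24, "おおお")  # prints あああいいいおおおあ
```

Unlike the proposed method, this version allocates three intermediate strings, so it illustrates the semantics rather than the performance.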
---Files--------------------------------
byteindex.diff (2.83 KB)