[ruby-core:79025] [Ruby trunk Feature#13110] Byte-based operations for String
From: duerst@...
Date: 2017-01-09 09:35:28 UTC
List: ruby-core #79025

Issue #13110 has been updated by Martin Dürst.

Shugo Maeda wrote:

> Let me clarify my intention.
>
> I'd like to handle not only singlebyte characters but multibyte
> characters efficiently by byte-based operations.

What about using UTF-32? It will use some additional memory, but give you the speed you want.

> Once a string is scanned, we have a byte offset, so we don't need
> to scan the string from the beginning, but we are forced to do it by
> the current API.

One way to improve this is to somehow cache the last used character and byte index for a string. I think Perl does something like this. This could be expanded to a string with several character index/byte index pairs cached, which could be searched by binary search. All this could (should!) be totally opaque to the Ruby programmer (except for the speedup).

Another way would be to return an Index object that keeps the character and byte indices opaque, but can be used in a general way where speedups are needed.

> In the following example, the byteindex version is much faster than
> the index version.

Of course it is. (Usually programs in C are faster than programs in Ruby, and this is just moving closer to C, and thus getting faster.)

But I wonder whether using a single string for the data in an editor buffer may still be quite inefficient. Adding or deleting a character in the middle of the buffer will be slow, even if you know the exact position in bytes. Changing the representation, e.g. to an array of lines, will make the inefficiency mostly go away. (After all, editors need only be as fast as humans can type :-).

More generally, what I'm afraid of is that with this, we start to expose more and more String internals. That can easily lead to problems. Some people may copy a Ruby snippet using byteindex, then add 1 to that index because they think that's how to get to the next character. Others may start to use byteindex everywhere, even if it's absolutely not necessary. Others may demand byte- versions of more and more operations on strings. We have seen all of this in other contexts.

----------------------------------------
Feature #13110: Byte-based operations for String
https://bugs.ruby-lang.org/issues/13110#change-62433

* Author: Shugo Maeda
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
How about adding byte-based operations for String?

```ruby
s = "あああいいいあああ"
p s.byteindex(/ああ/, 4) #=> 18
x, y = Regexp.last_match.byteoffset(0) #=> [18, 24]
s.bytesplice(x...y, "おおお")
p s #=> "あああいいいおおおあ"
```

---Files--------------------------------
byteindex.diff (2.83 KB)
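
The caching idea in the reply above (several character index/byte index pairs, searched by binary search) could be sketched roughly as follows. This is only an illustration in plain Ruby, not how any Ruby implementation or the attached patch works: the class name `CharByteIndex`, its `step` parameter, and the method `byte_offset` are all hypothetical, and a real cache would live inside the String implementation and be invisible to the programmer.

```ruby
# Hypothetical sketch: map character indices to byte offsets using
# cached checkpoints, so a lookup does not rescan from the start.
class CharByteIndex
  # str  - the string to index (assumed not to be mutated afterwards)
  # step - record one checkpoint every `step` characters (assumed value)
  def initialize(str, step = 4)
    @str = str
    @checkpoints = [[0, 0]] # pairs of [character index, byte offset]
    byte = 0
    str.each_char.with_index do |ch, i|
      @checkpoints << [i, byte] if i > 0 && (i % step).zero?
      byte += ch.bytesize
    end
  end

  # Returns the byte offset at which character `char_index` starts.
  def byte_offset(char_index)
    raise ArgumentError, "negative index" if char_index < 0

    # Binary search for the first checkpoint beyond char_index,
    # then step back to the last checkpoint at or before it.
    pos = @checkpoints.bsearch_index { |ci, _| ci > char_index } ||
          @checkpoints.size
    ci, bi = @checkpoints[pos - 1]

    # Scan at most `step` characters forward from the checkpoint.
    tail = @str.byteslice(bi, @str.bytesize - bi)
    bi + tail[0, char_index - ci].bytesize
  end
end

s = "aあbいcうdえe"
idx = CharByteIndex.new(s)
p idx.byte_offset(5) #=> 9
```

The trade-off is the usual one for checkpointing: a smaller `step` means more memory for the cached pairs but a shorter forward scan per lookup.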