From: "ioquatix (Samuel Williams)"
Date: 2022-09-26T12:50:24+00:00
Subject: [ruby-core:110088] [Ruby master Feature#14900] Extra allocation in String#byteslice

Issue #14900 has been updated by ioquatix (Samuel Williams).

By the way, ideally, I think you can implement this:

```ruby
buffer = String.new # allocation

while true
  # Efficiently read into the buffer:
  if buffer.empty?
    io.read(1024, buffer)
  else
    buffer << io.read(1024)
  end

  # Consume the buffer in chunks:
  while size = consume(buffer)
    buffer.byteslice!(size..-1) # shared root string - no memcpy or allocation
  end
end
```

----------------------------------------
Feature #14900: Extra allocation in String#byteslice
https://bugs.ruby-lang.org/issues/14900#change-99343

* Author: janko (Janko Marohnić)
* Status: Open
* Priority: Normal
----------------------------------------
When executing `String#byteslice` with a range, I noticed that sometimes the original string is allocated again. When I run the following script:

~~~ ruby
require "objspace"

string = "a" * 100_000

GC.start
GC.disable
generation = GC.count

ObjectSpace.trace_object_allocations do
  string.byteslice(50_000..-1)

  ObjectSpace.each_object(String) do |string|
    p string.bytesize if ObjectSpace.allocation_generation(string) == generation
  end
end
~~~

it outputs

~~~
50000
100000
6
5
~~~

The one with 50000 bytes is the result of `String#byteslice`, but the one with 100000 bytes is the duplicated original string. I expected only the result of `String#byteslice` to be amongst new allocations.

If instead of the last 50000 bytes I slice the *first* 50000 bytes, the extra duplication doesn't occur.

~~~ ruby
# ...
string.byteslice(0, 50_000)
# ...
~~~

~~~
50000
5
~~~

It's definitely ok if the implementation of `String#byteslice` allocates extra strings as part of the implementation, but it would be nice if they were deallocated before returning the result.

EDIT: It seems that `String#slice` has the same issue.

--
https://bugs.ruby-lang.org/
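
For reference, the buffering loop proposed at the top of this message relies on `String#byteslice!`, which, as far as I know, is not available in current Ruby. Below is a minimal sketch of the same read-and-consume pattern using only the existing non-destructive `String#byteslice`. The `consume` and `drain` helpers, the newline-delimited record format, and the `StringIO` input are all assumptions made for illustration and do not come from the thread. Whether each `byteslice` call here avoids a copy depends on the interpreter version, which is exactly what this issue is about.

```ruby
require "stringio"

# Hypothetical consumer (not from the thread): treats the buffer as
# newline-delimited records. Returns the number of bytes consumed, or nil
# when the buffer does not yet hold a complete record.
def consume(buffer, records)
  newline = buffer.index("\n") # byte offset, because the buffer is binary
  return nil unless newline

  records << buffer.byteslice(0, newline)
  newline + 1
end

# Hypothetical driver: reads fixed-size chunks and consumes complete records.
def drain(io, chunk_size = 1024)
  records = []
  buffer = String.new # binary (ASCII-8BIT) buffer

  loop do
    chunk = io.read(chunk_size)
    break if chunk.nil? # EOF

    buffer << chunk

    while (size = consume(buffer, records))
      # Stand-in for the proposed byteslice!: rebind the buffer to its tail.
      buffer = buffer.byteslice(size..-1)
    end
  end

  # Keep any trailing bytes that were not newline-terminated.
  records << buffer unless buffer.empty?
  records
end

records = drain(StringIO.new("alpha\nbeta\ngamma"), 4)
```

Slicing from an offset to the end of the string (`size..-1`) is the case the issue reports as causing the extra allocation, so this loop is a reasonable shape to profile with the `ObjectSpace` script above.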