[ruby-core:87075] [Ruby trunk Bug#14745] High memory usage when using String#replace with IO.copy_stream
From: janko.marohnic@...
Date: 2018-05-16 09:33:28 UTC
List: ruby-core #87075
Issue #14745 has been updated by janko (Janko Marohnić).
> Yes, this is an unfortunate side effect because of copy-on-write
> semantics of String#replace. If the arg (other_str) for
> String#replace is non-frozen, a new frozen string is created with
> using the existing malloc-ed pointer. Both the receiver string
> and other_str point to that new, shared string.
>
> So yeah; a combination of well-intentioned optimizations hurt
> when combined together.
That makes sense, thanks for the explanation.
> The other part could be anything using IO#write could create
> massive amounts of garbage before 2.5
This is on Ruby 2.5.1, so I'm guessing it doesn't apply here.
> Finally, I always assumed your example is a contrived case and
> you're dealing with an interface somewhere (not StringIO) which
> doesn't accept a destination buffer for .read.
The example was simplified for reproduction purposes. I discovered this in https://github.com/rocketjob/symmetric-encryption/pull/98 (I eventually figured out that `String#replace` was causing the high memory usage, so I switched to `String#clear`).
In short, the `SymmetricEncryption::Reader` object wraps an IO object with encrypted content; calling `#read` reads data from the underlying IO object, decrypts it, and returns the decrypted data. So it isn't working around a missing outbuf argument (the underlying IO object *should* accept one); rather, it provides an `IO#read` interface over incrementally decrypted IO content.
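A minimal sketch of that kind of wrapper, to make the shape concrete. This is not the actual `SymmetricEncryption::Reader` code: the `TransformingReader` class is hypothetical, and the block passed to it (here just `upcase`) stands in for decryption. It fills the output buffer with `String#clear` + `String#<<` rather than `String#replace`.

~~~ ruby
require "stringio"

# Hypothetical wrapper (not the real SymmetricEncryption::Reader):
# exposes read(length, outbuf) over an IO whose chunks are passed
# through a transform block, which stands in for decryption here.
class TransformingReader
  def initialize(io, &transform)
    @io = io
    @transform = transform
  end

  def read(length, outbuf = String.new)
    outbuf.clear
    chunk = @io.read(length)
    return nil if chunk.nil? # end of input

    # Append into the caller's buffer instead of using String#replace.
    outbuf << @transform.call(chunk)
    chunk.clear
    outbuf
  end
end

source = StringIO.new("secret " * 1000)
reader = TransformingReader.new(source) { |bytes| bytes.upcase }
sink = StringIO.new
IO.copy_stream(reader, sink)
puts sink.string.bytesize
~~~

`IO.copy_stream` calls `read(length, outbuf)` on the source when it is not a real IO, so the wrapper only needs that one method.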
If this behaviour is a necessary consequence of the copy-on-write optimization, feel free to close this ticket.
----------------------------------------
Bug #14745: High memory usage when using String#replace with IO.copy_stream
https://bugs.ruby-lang.org/issues/14745#change-72040
* Author: janko (Janko Marohnić)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v: ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-darwin17]
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
I'm using custom IO-like objects that implement #read as the first argument to IO.copy_stream, and I noticed odd memory behaviour when using String#replace on the output buffer versus String#clear. Here is an example of a "fake IO" object where #read uses String#clear on the output buffer:
~~~ ruby
GC.disable
require "stringio"
class FakeIO
  def initialize(content)
    @io = StringIO.new(content)
  end

  def read(length, outbuf)
    chunk = @io.read(length)
    if chunk
      outbuf.clear
      outbuf << chunk
      chunk.clear
    else
      outbuf.clear
    end
    outbuf unless outbuf.empty?
  end
end
io = FakeIO.new("a" * 50*1024*1024) # 50MB
IO.copy_stream(io, File::NULL)
system "top -pid #{Process.pid}"
~~~
This program outputs memory usage of 50MB at the end, as expected – 50MB was loaded into memory at the beginning and any new strings are deallocated. However, if I modify the #read implementation to use String#replace instead of String#clear:
~~~ ruby
def read(length, outbuf)
  chunk = @io.read(length)
  if chunk
    outbuf.replace chunk
    chunk.clear
  else
    outbuf.clear
  end
  outbuf unless outbuf.empty?
end
~~~
the memory usage has now doubled to 100MB at the end of the program, indicating that some string bytes weren't successfully deallocated. So, it seems that String#replace has different behaviour compared to String#clear + String#<<.
I was *only* able to reproduce this with `IO.copy_stream`; the following program shows 50MB memory usage regardless of whether the String#clear or the String#replace approach is used:
~~~ ruby
GC.disable
buffer = "a" * 50*1024*1024
chunk = "b" * 50*1024*1024
if ARGV[0] == "clear"
  buffer.clear
  buffer << chunk
else
  buffer.replace chunk
end
chunk.clear
system "top -pid #{Process.pid}"
~~~
With this program I also noticed one interesting thing. If I remove `chunk.clear`, then the "clear" version uses 100MB as expected (because both buffer and chunk strings are 50MB large), but the "replace" version uses only 50MB, which makes it appear that the `buffer` string doesn't use any memory when in fact it should use 50MB just like the `chunk` string. I found that odd, and I think it might be a clue to the memory bug with String#replace I experienced when using `IO.copy_stream`.
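One way to look at this more closely, sketched under assumptions: `ObjectSpace.memsize_of` from the `objspace` extension reports an estimate of the memory attributed to a single object. Its values are implementation-specific and differ between Ruby versions, so no expected numbers are given here. The strings are 1MB instead of 50MB to keep the run cheap.

~~~ ruby
require "objspace"

# 1MB instead of the 50MB used above, to keep the run cheap.
chunk  = "b" * (1024 * 1024)
buffer = String.new

buffer.replace(chunk)

# memsize_of is an estimate; the values are implementation-specific.
puts "buffer: #{ObjectSpace.memsize_of(buffer)} bytes"
puts "chunk:  #{ObjectSpace.memsize_of(chunk)} bytes"

# chunk.clear empties chunk only; buffer still holds the data.
chunk.clear
buffer_intact = buffer.bytesize == 1024 * 1024 && buffer.start_with?("b")
puts "buffer intact after chunk.clear: #{buffer_intact}"
~~~

If the two strings really share one allocation after `String#replace`, the per-object numbers would not add up to twice the data size, which would match the 50MB reading from `top`.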