From: normalperson@...
Date: 2017-01-05T23:09:56+00:00
Subject: [ruby-core:78983] [Ruby trunk Bug#13085] io.c io_fwrite creates garbage

Issue #13085 has been updated by Eric Wong.

File 0001-v2-io.c-io_fwrite-copy-to-hidden-buffer-when-writing.patch added

OK, different strategy; not as fast, but still better than what
we currently have.

[PATCH v2] io.c (io_fwrite): copy to hidden buffer when writing

This avoids garbage from IO#write for [Bug #13085] when called
in a read-write loop while protecting the VM from race conditions
forced by the user.

Memory usage from benchmark/bm_io_copy_stream_write.rb is reduced
greatly:

target 0: a (ruby 2.5.0dev (2017-01-05 trunk 57270) [x86_64-linux])
target 1: b (ruby 2.5.0dev (2017-01-05) [x86_64-linux])

Memory usage (last size) (B)

name                    a               b
io_copy_stream_write    81899520.000    6561792.000

Memory consuming ratio (size) with the result of `a' (greater is better)

name                    b
io_copy_stream_write    12.481

Despite the extra deep data copy, there is a small speedup in
execution time due to GC avoidance:

Execution time (sec)

name                    a        b
io_copy_stream_write    0.393    0.296

Speedup ratio: compare with the result of `a' (greater is better)

name                    b
io_copy_stream_write    1.328

This patch increases memory bandwidth use by pessimistically
copying the data into a temporary hidden buffer. Using a
lightweight frozen copy (as before this patch) is ineffective in
read-write loops, since the read operation will trigger a heavy
copy behind our back due to the CoW operation.

It is also impossible to safely release memory from the
lightweight CoW string, because we have no idea how many
lightweight duplicates exist by the time we reacquire GVL.

So, we now make a heavy copy up front which we recycle
immediately upon completion.

Ideally, Ruby should have a way of detecting Strings which are
not visible to other threads and be able to optimize away the
internal copy. Or, we give up on the idea of implicit data
sharing between threads, since it's dangerous anyway.

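A rough Ruby-level analogy of the copy-then-recycle strategy described above may help. The actual patch is C code inside io_fwrite and uses a hidden String that Ruby code never sees; `ScratchWriter` below is a hypothetical name invented for this sketch and is not part of the patch. It only shows the shape of the idea: deep-copy the caller's bytes into a private buffer, write from that buffer, and empty it as soon as the write returns, so the caller is free to clobber its own buffer on the next read.

```ruby
require 'stringio'

# Hypothetical illustration only; the real change lives in io.c.
class ScratchWriter
  def initialize(io)
    @io = io
    @scratch = ''.b            # private buffer, never handed to callers
  end

  def write(buf)
    @scratch << buf            # deep copy of the caller's bytes
    @io.write(@scratch)        # write from the private copy; returns byte count
  ensure
    @scratch.clear             # recycle the copy as soon as the write finishes
  end
end

data   = 'abc' * 20_000        # 60,000 bytes of sample input
input  = StringIO.new(data)
output = StringIO.new(''.b)
writer = ScratchWriter.new(output)

# The read-write loop from the issue: `buf` is overwritten on every read.
buf = ''.b
while input.read(16384, buf)
  writer.write(buf)
end
```

The trade-off is the one the commit message names: every write pays for a full copy of the data, in exchange for not leaving a frozen duplicate behind for the GC to collect.
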
----------------------------------------
Bug #13085: io.c io_fwrite creates garbage
https://bugs.ruby-lang.org/issues/13085#change-62398

* Author: Eric Wong
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
* ruby -v:
* Backport: 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN
----------------------------------------
Relying on rb_str_new_frozen for unconverted strings does not
save memory because copy-on-write is always triggered in
read-write I/O loops where subsequent IO#read calls will
clobber the given write buffer.

```ruby
buf = ''.b
while input.read(16384, buf)
  output.write(buf)
end
```

This generates a lot of garbage starting with Ruby 2.2 (r44471).
For my use case, even `IO.copy_stream` generates garbage, since
I wrap "write" to do Digest calculation in a single pass.

I tried using rb_str_replace and reusing the string as a hidden
`(klass == 0)` thread-local, but `rb_str_replace` attempts CoW
optimization by creating new frozen objects, too:

https://80x24.org/spew/20161229004417.12304-1-e@80x24.org/raw

So, I'm not sure what to do; temporal locking seems wrong for
writing strings (I guess it's for reading?). I get
`test_threaded_flush` failures with the following:

https://80x24.org/spew/20161229005701.9712-1-e@80x24.org/raw

`IO#syswrite` has the same problem with garbage. I can use
`IO#write_nonblock` on fast filesystems while holding GVL,
I guess...

---Files--------------------------------
0001-io.c-io_fwrite-temporarily-freeze-string-when-writin.patch (2.6 KB)
0001-v2-io.c-io_fwrite-copy-to-hidden-buffer-when-writing.patch (2.9 KB)

--
https://bugs.ruby-lang.org/
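
For readers unfamiliar with the use case mentioned in the issue description (wrapping "write" to compute a digest in a single pass with `IO.copy_stream`), here is a minimal sketch of what such a wrapper might look like. `DigestWriter` is a hypothetical name and the choice of SHA-256 is an assumption; the issue does not show the author's actual code. It relies on `IO.copy_stream` accepting any destination object that responds to `write`.

```ruby
require 'digest'
require 'stringio'

# Hypothetical wrapper: hashes each chunk, then forwards it to the real IO.
class DigestWriter
  attr_reader :digest

  def initialize(io)
    @io = io
    @digest = Digest::SHA256.new   # assumed algorithm, for illustration
  end

  # IO.copy_stream calls this for every chunk it reads from the source.
  def write(buf)
    @digest.update(buf)
    @io.write(buf)                 # must return the number of bytes written
  end
end

data   = 'payload ' * 10_000
src    = StringIO.new(data)
dst    = StringIO.new(''.b)
writer = DigestWriter.new(dst)

copied = IO.copy_stream(src, writer)
```

Because neither end is a real file descriptor here, `IO.copy_stream` takes its generic read-then-write path, which is the same loop shape as the one in the issue description.
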