[#63592] [ruby-trunk - Bug #10009] IO operation is 10x slower in multi-thread environment — normalperson@...
Issue #10009 has been updated by Eric Wong.
3 messages
2014/07/08
[#63682] [ruby-trunk - Feature #10030] [PATCH] reduce rb_iseq_struct to 296 bytes — ko1@...
Issue #10030 has been updated by Koichi Sasada.
3 messages
2014/07/13
[#63703] [ruby-trunk - Feature #10030] [PATCH] reduce rb_iseq_struct to 296 bytes — ko1@...
Issue #10030 has been updated by Koichi Sasada.
3 messages
2014/07/14
[#63743] [ruby-trunk - Bug #10037] Since r46798 on Solaris, "[BUG] rb_vm_get_cref: unreachable" during make — ngotogenome@...
Issue #10037 has been updated by Naohisa Goto.
3 messages
2014/07/15
[#64136] Ruby 2.1.2 (and 2.1.1 and probably others) assumes a libffi with 3 version numbers in extconf.rb — "Jeffrey 'jf' Lim" <jfs.world@...>
As per subject.
4 messages
2014/07/31
[#64138] Re: Ruby 2.1.2 (and 2.1.1 and probably others) assumes a libffi with 3 version numbers in extconf.rb
— "Jeffrey 'jf' Lim" <jfs.world@...>
2014/07/31
On Thu, Jul 31, 2014 at 6:03 PM, Jeffrey 'jf' Lim <jfs.world@gmail.com>
[ruby-core:63490] [ruby-trunk - Bug #9676] String#gsub shouldn't allocate so many Strings in its loop
From:
usa@...
Date:
2014-07-02 06:40:50 UTC
List:
ruby-core #63490
Issue #9676 has been updated by Usaku NAKAMURA.
Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN to 2.0.0: WONTFIX, 2.1: UNKNOWN
----------------------------------------
Bug #9676: String#gsub shouldn't allocate so many Strings in its loop
https://bugs.ruby-lang.org/issues/9676#change-47533
* Author: Sam Rawlins
* Status: Closed
* Priority: Normal
* Assignee:
* Category:
* Target version:
* ruby -v: trunk
* Backport: 2.0.0: WONTFIX, 2.1: UNKNOWN
----------------------------------------
`rb_reg_search()` allocates (dups) a String to attach to the backreference object ( `RMATCH(match)->str = rb_str_new4(str);` ). If #gsub has been passed 2 arguments (not Enumerator form) and the second argument is a String, then it shouldn't make these allocations when calling `rb_reg_search()` inside it's loop.
Here's an example:
# gsub-allocates-too-much.rb
require File.join(__dir__, "lib", "allocation_stats")
def puts_object_list(name, stats)
objects = stats.allocations.group_by(:sourcefile, :sourceline, :class).all.
values.flatten.map(&:object).
map {|o| o.is_a?(String) ? "#{o.inspect}<#{o.encoding.to_s}>" : o.inspect }
puts "#{name} #{objects.flatten.size} new objects:"
objects.group_by(&:hash).values.each { |ary| puts "#{ary.join(", ")}" }
end
slash = '/'; underscore = '_'; colon = ':' # allocate before the trace
str = "12:34:45:67"
stats = AllocationStats.trace { str.gsub(colon, underscore) }
puts '> "12:34:45:67".gsub(":", "_")'
puts_object_list("gsub substitutes 3x times:", stats)
$ ruby ../allocation_stats/gsub-allocates-too-much.rb
> "12:34:45:67".gsub(":", "_")
gsub substitutes 3x times: 12 new objects:
"12:34:45:67"<UTF-8>, "12:34:45:67"<UTF-8>, "12:34:45:67"<UTF-8>, "12:34:45:67"<UTF-8>
"12_34_45_67"<UTF-8>
":"<ASCII-8BIT>, ":"<ASCII-8BIT>
":"<US-ASCII>, ":"<US-ASCII>
#<MatchData ":">
#<MatchData nil>
/:/
The Strings that are copies of the original String are all unnecessary (except one, the last).
I have a fix (attached and at [1]) that involves allocating the str attribute of the backreference object only when necessary. In order to do this without changing the signature of `rb_reg_search()`, this patch changes `rb_reg_search()` to wrap a new function `rb_reg_search0()`. So no calls to `rb_reg_search()` need to change, and `str_gsub()` changes two calls into `rb_reg_search0()` to avoid the allocations. (I believe String#split suffers from the same extra allocations, and can make a similar call to `rb_reg_search0()`.)
The impact of this fix is primarily faster garbage collection. I have two "real world" examples:
* ActiveRecord sqlite3 specs: total time in GC reduced from 11.2s to 10.4s (7% savings).
* Mail gem specs: total time in GC reduced from 0.220s to 0.215s (2% savings).
These numbers bounced around a lot though. I'm open to better benchmarking suggestions. I used ActiveRecord and Mail for real world examples of #gsub, where realistic Strings are gsubbed.
[1] https://github.com/ruby/ruby/pull/578
---Files--------------------------------
ruby-578.diff (2.56 KB)
--
https://bugs.ruby-lang.org/