From: Michael Selig
Date: 2010-09-22T15:34:50+09:00
Subject: [ruby-core:32498] [Ruby 1.9-Bug#3855][Open] String#rindex extremely slow on long UTF8 strings

--mimepart_4c99a38235188_eedcdd796a12342
Content-Type: text/plain
Content-Transfer-Encoding: Quoted-printable
Content-Disposition: inline

Bug #3855: String#rindex extremely slow on long UTF8 strings
http://redmine.ruby-lang.org/issues/show/3855

Author: Michael Selig
Status: Open, Priority: Normal
ruby -v: ruby 1.9.3dev (2010-09-21 trunk 29308) [i686-linux]

Not really a bug. I think this issue was raised a few months ago, but I have written a very simple patch to string.c that fixes the problem.

Example:

  ruby -e 'p String.new("XXX\u0639" + "X" * 100000).rindex("\u0639")'

takes approx 2.7 secs on my old AMD Athlon system, but only approx 0.02 sec with the attached patch. The problem is worst when the search string is either not found or is near the beginning of the string.

The issue is the call to "str_nth()" inside the search loop: on multibyte encodings it has to scan the string from the beginning on every iteration just to locate where to start comparing. The patch calls str_nth() once and then steps back one character at a time with rb_enc_prev_char().

I hope that you will consider applying the patch.
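
To illustrate the difference, here is a minimal Ruby sketch that models the two strategies on raw UTF-8 bytes and counts how many bytes each one walks over. It is not the string.c code: nth_char_offset and prev_char_offset are hypothetical stand-ins for str_nth() and rb_enc_prev_char(), and the sketch assumes valid UTF-8 input.

```ruby
# Sketch only: hypothetical helpers modelling the two search strategies.
# Assumes valid UTF-8; continuation bytes have the bit pattern 10xxxxxx.

# Byte offset of character index `pos`, found by walking from the start.
# Returns [offset, bytes_visited]; offset is nil if pos is past the end.
def nth_char_offset(bytes, pos)
  offset = 0
  visited = 0
  pos.times do
    return [nil, visited] if offset >= bytes.size
    offset += 1
    visited += 1
    while offset < bytes.size && (bytes[offset] & 0xC0) == 0x80
      offset += 1
      visited += 1
    end
  end
  [offset, visited]
end

# Byte offset of the character immediately before `offset`, or nil at the start.
def prev_char_offset(bytes, offset)
  return nil if offset <= 0
  offset -= 1
  offset -= 1 while offset > 0 && (bytes[offset] & 0xC0) == 0x80
  offset
end

# Old behaviour: recompute the byte offset from the start on every iteration.
def rindex_rescan(str, sub)
  bytes = str.bytes
  t = sub.bytes
  pos = str.length - sub.length
  return [nil, 0] if pos < 0
  visited = 0
  loop do
    s, n = nth_char_offset(bytes, pos)
    visited += n
    return [nil, visited] if s.nil?
    return [pos, visited] if bytes[s, t.size] == t
    break if pos == 0
    pos -= 1
  end
  [nil, visited]
end

# Patched behaviour: locate the start once, then step back one character at a time.
def rindex_step_back(str, sub)
  bytes = str.bytes
  t = sub.bytes
  pos = str.length - sub.length
  return [nil, 0] if pos < 0
  s, visited = nth_char_offset(bytes, pos)
  while s
    return [pos, visited] if bytes[s, t.size] == t
    break if pos == 0
    pos -= 1
    before = s
    s = prev_char_offset(bytes, s)
    visited += before - s if s
  end
  [nil, visited]
end

# Same shape as the example above, with a shorter string to keep the slow path quick.
str = "XXX\u0639" + "X" * 2000
sub = "\u0639"
p rindex_rescan(str, sub)     # [index, bytes visited] with rescanning
p rindex_step_back(str, sub)  # [index, bytes visited] with backward stepping
```

Both methods return the same index; only the number of bytes visited differs, growing roughly with the square of the string length in the first case and linearly in the second.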
Mike

----------------------------------------
http://redmine.ruby-lang.org

--mimepart_4c99a38235188_eedcdd796a12342
Content-Type: application/octet-stream; name=rindex.pat
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=rindex.pat

Index: string.c
===================================================================
--- string.c	(revision 29315)
+++ string.c	(working copy)
@@ -2488,14 +2488,14 @@
     e = RSTRING_END(str);
     t = RSTRING_PTR(sub);
     slen = RSTRING_LEN(sub);
-    for (;;) {
-	s = str_nth(sbeg, e, pos, enc, singlebyte);
-	if (!s) return -1;
+    s = str_nth(sbeg, e, pos, enc, singlebyte);
+    while (s) {
 	if (memcmp(s, t, slen) == 0) {
 	    return pos;
 	}
 	if (pos == 0) break;
 	pos--;
+	s = rb_enc_prev_char(sbeg, s, e, enc);
     }
     return -1;
 }

--mimepart_4c99a38235188_eedcdd796a12342--