From: ko1@...
Date: 2014-08-03T10:13:28+00:00
Subject: [ruby-core:64172] [ruby-trunk - Feature #10096] [PATCH] use khash for fstring and id_str tables

Issue #10096 has been updated by Koichi Sasada.

Eric Wong wrote:
> frozen_strings and global_symbols.id_str hashes are two of the bigger
> hashes in Ruby.  They are needlessly ordered and incur malloc overhead
> in every st_table_entry.

We will change global_symbols.id_str (to a simple array) soon, so please do not touch it.

An ID now includes a serial number, so we will use that. There are several issues with applying this simple approach; for example, some IDs share the same serial number. But Nobu and I discussed these issues and came up with solutions.

> Use an unordered open-addressing table which incurs no additional malloc
> overhead besides the (larger) table itself.
>
> Reduces "ruby -e exit" (w/RubyGems) by ~200K on eglibc malloc on
> amd64 Debian stable due to having fewer allocations

Please state the size it was reduced from. Going from 100MB to (100MB - 200KB) is trivial; going from 300KB to 100KB is great work.

> I chose khash because it is flexible and has (IMHO) a good API.
> The API should also be familiar to mruby hackers, as mruby uses
> a version of khash.

Did you check performance? I know mruby uses it (and maybe other products do too), so I assume it is sophisticated. What is your impression?

----------------------------------------
Feature #10096: [PATCH] use khash for fstring and id_str tables
https://bugs.ruby-lang.org/issues/10096#change-48173

* Author: Eric Wong
* Status: Open
* Priority: Normal
* Assignee: Koichi Sasada
* Category: core
* Target version: current: 2.2.0
----------------------------------------
frozen_strings and global_symbols.id_str hashes are two of the bigger hashes in Ruby. They are needlessly ordered and incur malloc overhead in every st_table_entry.

Use an unordered open-addressing table which incurs no additional malloc overhead besides the (larger) table itself.

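To make the open-addressing idea above concrete, here is a minimal sketch in C. It is not khash and not the code from khash.patch: the names (oa_table, oa_put, oa_get), the hash mixer, the 0.75 load limit, and the choice of linear probing are all assumptions made for illustration, and khash itself may probe differently. The point it shows is that every entry lives inside one flat array, so inserting a key costs no per-entry malloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical illustration only: not khash, not the code in khash.patch. */
typedef struct {
    uint64_t key;        /* 0 is reserved to mean "empty slot" */
    const char *val;
} oa_slot;

typedef struct {
    oa_slot *slots;      /* one flat allocation; no per-entry malloc */
    size_t cap;          /* always a power of two */
    size_t len;          /* number of occupied slots */
} oa_table;

/* Simple 64-bit mixer (an assumption; any reasonable hash would do). */
static size_t oa_hash(uint64_t key) {
    key ^= key >> 33;
    key *= 0xff51afd7ed558ccdULL;
    key ^= key >> 33;
    return (size_t)key;
}

int oa_init(oa_table *t, size_t cap) {
    assert(cap >= 2 && (cap & (cap - 1)) == 0);
    t->slots = calloc(cap, sizeof(oa_slot));
    t->cap = cap;
    t->len = 0;
    return t->slots ? 0 : -1;
}

void oa_free(oa_table *t) {
    free(t->slots);
    t->slots = NULL;
    t->cap = t->len = 0;
}

/* Linear probing: walk forward until the key or an empty slot is found.
 * Returns 1 if a new slot was occupied, 0 if an existing key was updated. */
static int oa_place(oa_slot *slots, size_t cap, uint64_t key, const char *val) {
    size_t i = oa_hash(key) & (cap - 1);
    while (slots[i].key != 0 && slots[i].key != key)
        i = (i + 1) & (cap - 1);
    int is_new = (slots[i].key == 0);
    slots[i].key = key;
    slots[i].val = val;
    return is_new;
}

int oa_put(oa_table *t, uint64_t key, const char *val) {
    assert(key != 0);
    if ((t->len + 1) * 4 > t->cap * 3) {    /* keep load factor <= 0.75 */
        size_t ncap = t->cap * 2;
        oa_slot *n = calloc(ncap, sizeof(oa_slot));
        if (!n) return -1;
        for (size_t i = 0; i < t->cap; i++)  /* rehash into the larger array */
            if (t->slots[i].key != 0)
                oa_place(n, ncap, t->slots[i].key, t->slots[i].val);
        free(t->slots);
        t->slots = n;
        t->cap = ncap;
    }
    t->len += (size_t)oa_place(t->slots, t->cap, key, val);
    return 0;
}

const char *oa_get(const oa_table *t, uint64_t key) {
    size_t i = oa_hash(key) & (t->cap - 1);
    while (t->slots[i].key != 0) {
        if (t->slots[i].key == key) return t->slots[i].val;
        i = (i + 1) & (t->cap - 1);
    }
    return NULL;                             /* hit an empty slot: not present */
}
```

The trade-off is the one described above: the array has to be kept partly empty (hence "larger"), and iteration order is whatever the hash produces, so this layout only suits tables whose ordering is never exposed.
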
Reduces "ruby -e exit" (w/RubyGems) by ~200K on eglibc malloc on amd64 Debian stable due to having fewer allocations.

global_symbols.str_id is left unchanged (for now) because it is used for Symbol.all_symbols, where ordering is expected.

This introduces no user-visible changes or incompatibility (unless I added a bug :x).

I chose khash because it is flexible and has (IMHO) a good API. The API should also be familiar to mruby hackers, as mruby uses a version of khash.

Future changes:

* convert smaller internal hashes where ordering is not exposed to Ruby users
* (hopefully) other hashes (methods/constants/ivars) [Feature #9614]

(Note: tried a few times in lynx with 503 errors, never had this problem before in lynx, trying clunky browser now)

---Files--------------------------------
khash.patch (35.5 KB)

--
https://bugs.ruby-lang.org/