[#70257] [Ruby trunk - Feature #11420] [Open] Introduce ID key table into MRI — ko1@...

Issue #11420 has been reported by Koichi Sasada.

11 messages 2015/08/06

[ruby-core:70357] Re: [Ruby trunk - Feature #11420] [Open] Introduce ID key table into MRI

From: Юрий Соколов <funny.falcon@...>
Date: 2015-08-12 21:28:29 UTC
List: ruby-core #70357
First btw is:

struct entry {
   uint32 hash, next, prev, forw;
   st_data_t key, value;
 } entries[capa];
uint32 bins[numbins];
#=>
 bins   : [ ... ]
 entries: [[hash1, next, prev, forw, key1, val1], ...]

Second btw (php7-like) is (as you correctly assumed):

struct entry {
   uint hash, next;
   st_data_t key, value;
 } entries[capa];
uint32 bins[numbins];
#=>
 bins   : [ ... ]
 entries: [[hash1, next, key1, val1], ...]

I think, incremental compaction can fix worst case of php7-like hash
(longliving continiously mutated huge hash), so i'll try to implement it.

> struct hash_id_table {
>     int capa;
>     int num;
>     int used;
>     item_t *items;
> };
> Can we eliminate `used'? If we can remove it, table size will be 16B.
> I understand used and num is not same (used counts delete empty,
> collited entries), but `used' is used only for extension. How about to
> use num for this purpose? Sorry if I misunderstood.

With presence of deletions, 'used' is neccessary for most of
open-addressing algorithms.

Two algos which don't need 'used' are "Coockoo Hashing" and "Robin-Hood
with backshifting". Coockoo limits number of possible slots for an element,
and Robin-Hood do more job on deletion (in case of collisions). Both will
do much more job on insertion than simple openaddressing.

Coalesce chained hash could be also made without free_pos with searching
for free position like with open-adressing. It will increase insertion time
a bit, but otherwise, it is ok. Should I implement it?

With regards,
Sokolov Yura aka funny_falcon

In This Thread