[#70257] [Ruby trunk - Feature #11420] [Open] Introduce ID key table into MRI — ko1@...

Issue #11420 has been reported by Koichi Sasada.

11 messages 2015/08/06

[ruby-core:70355] Re: [Ruby trunk - Feature #11420] [Open] Introduce ID key table into MRI

From: Юрий Соколов <funny.falcon@...>
Date: 2015-08-12 19:29:14 UTC
List: ruby-core #70355
I thought to suggest to embed hash_id_table directly into places when it is
used (instead of pointer to).
It will use 24byte on 64bit platform, so not too big.
It has benefit when used with jemalloc/tcmalloc which are good at
allocating sizes = 2**x.
With flexible array size will be not equal to power of 2, so that they will
consume a bit more memory on big tables.

But I think flexible array are also very good idea.
Let me know, if I need to fix hash variant for this.

btw, I'm trying to improve st_table itself by allocating items in a
continuous array and using indices instead of pointers
between them. It should decrease allocation pressure and improve cache
locality, but has drawback of size limit
(2 billion items) and not releasing space on every deletion.

'btw take 2' Dit you see new PHP7 `array` implementation ?
https://nikic.github.io/2014/12/22/PHPs-new-hashtable-implementation.html
PHP array is the same logical data-structure as Ruby Hash, so we could try
their implementation decision as well.
Most hashes are only filled, with no deletion called, so that this
construction is really great in term of speed,
memory locality and memory consumption.
It has drawback for huge long-living hash with continuous insertions and
deletions (it should be periodically reordered).


2015-08-12 16:32 GMT+03:00 Eric Wong <normalperson@yhbt.net>:

> Btw, did you consider using flexible array to avoid extra malloc
> and improve cache locality?
>
> Since we introduce new API anyways, we may change the tbl argument to **
> so rb_id_table_insert can modify tbl as needed.  Something like:
>
> int rb_id_table_insert(struct rb_id_table **tbl, ID id, VALUE val);
> struct hash_id_table {
>     int capa;
>     int num;
>     int used;
>     item_t items[1]; /* flexible array */
> };
>

In This Thread