ruby-core

2013/6/5 Tanaka Akira <akr@fsij.org>:
>
> At first time I see that, I guess there is no application of "nails" argument.
>
> Now I think I know several applications:
>
> * Integer#to_s(radix) for radix is power of 2
>   For example, i.to_s(2) can be implemented using that with size=1 and
> nails=7 and
>   0 and 1 are converted to '0' and '1' later.
>
> * BER-compressed integer for pack (size=1 and nails=1)

I implemented rb_integer_pack and rb_integer_unpack in ruby trunk.
(It is declared in internal.h.  So it is not public now.)

I designed them as follows:

  /* "MS" in MSWORD and MSBYTE means "most significant" */
  /* "LS" in LSWORD and LSBYTE means "least significant" */
  /* For rb_integer_pack and rb_integer_unpack: */
  #define INTEGER_PACK_MSWORD_FIRST       0x01
  #define INTEGER_PACK_LSWORD_FIRST       0x02
  #define INTEGER_PACK_MSBYTE_FIRST       0x10
  #define INTEGER_PACK_LSBYTE_FIRST       0x20
  #define INTEGER_PACK_NATIVE_BYTE_ORDER  0x40
  /* For rb_integer_unpack: */
  #define INTEGER_PACK_FORCE_BIGNUM       0x100
  /* Combinations: */
  #define INTEGER_PACK_LITTLE_ENDIAN \
      (INTEGER_PACK_LSWORD_FIRST | \
       INTEGER_PACK_LSBYTE_FIRST)
  #define INTEGER_PACK_BIG_ENDIAN \
      (INTEGER_PACK_MSWORD_FIRST | \
       INTEGER_PACK_MSBYTE_FIRST)

  /*
   * Export an integer into a buffer.
   *
   * This function fills the buffer specified by _words_ and _numwords_ as
   * abs(val) in the format specified by _wordsize_, _nails_ and _flags_.
   *
   * [val] Fixnum, Bignum or another integer like object which has
to_int method.
   * [words] buffer to export abs(val).
   * [numwords] the size of given buffer as number of words.
   * [wordsize] the size of word as number of bytes.
   * [nails] number of padding bits in a word.
   *   Most significant nails bits of each word are filled by zero.
   * [flags] bitwise or of constants which name starts "INTEGER_PACK_".
   *   It specifies word order and byte order.
   *
   * This function returns the signedness and overflow condition as follows:
   *   -2 : negative overflow.  val <= -2**(numwords*(wordsize*CHAR_BIT-nails))
   *   -1 : negative without overflow.
-2**(numwords*(wordsize*CHAR_BIT-nails)) < val < 0
   *   0 : zero.  val == 0
   *   1 : positive without overflow.  0 < val <
2**(numwords*(wordsize*CHAR_BIT-nails))
   *   2 : positive overflow.  2**(numwords*(wordsize*CHAR_BIT-nails)) <= val
   *
   * The least significant words of abs(val) are filled in the buffer
when overflow occur.
   */
  int rb_integer_pack(VALUE val, void *words, size_t numwords, size_t
wordsize, size_t nails, int flags);

  /*
   * Import an integer into a buffer.
   *
   * [sign] signedness of the value.
   *   -1 for non-positive.  0 or 1 for non-negative.
   * [words] buffer to import.
   * [numwords] the size of given buffer as number of words.
   * [wordsize] the size of word as number of bytes.
   * [nails] number of padding bits in a word.
   *   Most significant nails bits of each word are ignored.
   * [flags] bitwise or of constants which name starts "INTEGER_PACK_".
   *   It specifies word order and byte order.
   *   Also, INTEGER_PACK_FORCE_BIGNUM specifies that the result will
be a Bignum
   *   even if it is representable as a Fixnum.
   *
   * This function returns the imported integer as Fixnum or Bignum.
   */
  VALUE rb_integer_unpack(int sign, const void *words, size_t
numwords, size_t wordsize, size_t nails, int flags);

I also implemented two functions to calculate the require buffer size.

  /*
   * Calculate a number of bytes to be required to represent
   * the absolute value of the integer given as _val_.
   *
   * [val] an integer.
   * [nlz_bits_ret] number of leading zero bits in the most
significant byte is returned if not NULL.
   *
   * This function returns ((val_numbits * CHAR_BIT + CHAR_BIT - 1) / CHAR_BIT)
   * where val_numbits is the number of bits of abs(val).
   * This function should not overflow.
   *
   * If nlz_bits_ret is not NULL,
   * (return_value * CHAR_BIT - val_numbits) is stored in *nlz_bits_ret.
   * In this case, 0 <= *nlz_bits_ret < CHAR_BIT.
   *
   */
  size_t rb_absint_size(VALUE val, int *nlz_bits_ret);

  /*
   * Calculate a number of words to be required to represent
   * the absolute value of the integer given as _val_.
   *
   * [val] an integer.
   * [word_numbits] number of bits in a word.
   * [nlz_bits_ret] number of leading zero bits in the most
significant word is returned if not NULL.
   *
   * This function returns ((val_numbits * CHAR_BIT + word_numbits -
1) / word_numbits)
   * where val_numbits is the number of bits of abs(val).
   * If it overflows, (size_t)-1 is returned.
   *
   * If nlz_bits_ret is not NULL and overflow is not occur,
   * (return_value * word_numbits - val_numbits) is stored in *nlz_bits_ret.
   * In this case, 0 <= *nlz_bits_ret < word_numbits.
   *
   */
  size_t rb_absint_numwords(VALUE val, size_t word_numbits, size_t
*nlz_bits_ret);

Any opinions about the API?

Note that packing/unpacking BER-compressed integer uses them now and
they are much faster (for very big integers).

  200% time ./ruby -e '[3**200000].pack("w")'
  ./ruby -e '[3**200000].pack("w")'  3.14s user 0.00s system 99% cpu 3.147 total
  trunk% time ./ruby -e '[3**200000].pack("w")'
  ./ruby -e '[3**200000].pack("w")'  0.02s user 0.01s system 95% cpu 0.029 total

  200% time ./ruby -e '("\x81"*100000+"\x00").unpack("w")'
  ./ruby -e '("\x81"*100000+"\x00").unpack("w")'  4.21s user 0.03s
system 99% cpu 4.251 total
  trunk% time ./ruby -e '("\x81"*100000+"\x00").unpack("w")'
  ./ruby -e '("\x81"*100000+"\x00").unpack("w")'  0.02s user 0.00s
system 88% cpu 0.023 total

Integer#to_s(power-of-2) is also bit faster now.

  200% time ./ruby -e 'v = 3**100000; 1000.times { v.to_s(16) }'
  ./ruby -e 'v = 3**100000; 1000.times { v.to_s(16) }'  4.77s user
0.01s system 99% cpu 4.787 total
  trunk% time ./ruby -e 'v = 3**100000; 1000.times { v.to_s(16) }'
  ./ruby -e 'v = 3**100000; 1000.times { v.to_s(16) }'  3.16s user
0.01s system 99% cpu 3.184 total

Versions:
  200% ./ruby -v
  ruby 2.0.0p214 (2013-06-09 revision 41193) [x86_64-linux]
  trunk% ./ruby -v
  ruby 2.1.0dev (2013-06-10 trunk 41214) [x86_64-linux]
-- 
Tanaka Akira

Thread

Prev Next

In This Thread

Prev Next