[#8566] Visions for 2001/1.7.x development? — Robert Feldt <feldt@...>

Hi matz and other Ruby developers,

18 messages 2001/01/03
[#8645] Re: Visions for 2001/1.7.x development? — matz@... (Yukihiro Matsumoto) 2001/01/04

Hi,

[#8580] bug?? — jmichel@... (Jean Michel)

I don't understand the following behaviour:

19 messages 2001/01/03

[#8633] Interesting Language performance comparisons - Ruby, OCAML etc — "g forever" <g24ever@...>

13 messages 2001/01/04

[#8774] No :<, :>, etc. methods for Array — "Brian F. Feldman" <green@...>

So, why not include Comparable in Array by default? It shouldn't have any

28 messages 2001/01/07
[#8779] Re: No :<, :>, etc. methods for Array — matz@... (Yukihiro Matsumoto) 2001/01/07

Hi,

[#8780] Re: No :<, :>, etc. methods for Array — "Brian F. Feldman" <green@...> 2001/01/07

matz@zetabits.com (Yukihiro Matsumoto) wrote:

[#8781] Re: No :<, :>, etc. methods for Array — gotoken@... (GOTO Kentaro) 2001/01/07

In message "[ruby-talk:8780] Re: No :<, :>, etc. methods for Array"

[#8782] Re: No :<, :>, etc. methods for Array — "Brian F. Feldman" <green@...> 2001/01/07

gotoken@math.sci.hokudai.ac.jp (GOTO Kentaro) wrote:

[#8829] Sandbox (again) — wys@... (Clemens Wyss)

Hi,

20 messages 2001/01/08
[#8864] Re: Sandbox (again) — Clemens Hintze <c.hintze@...> 2001/01/08

On 8 Jan, Clemens Wyss wrote:

[#8931] String confusion — Anders Bengtsson <ndrsbngtssn@...>

Hello everyone,

21 messages 2001/01/09
[#8937] Re: String confusion — matz@... (Yukihiro Matsumoto) 2001/01/09

Hi,

[#8953] Please remove account from files — "Thomas Daniels" <westernporter@...>

Please take my e-mail address from your files and "CANCEL" my =

14 messages 2001/01/09
[#8983] Re: Please remove account from files — John Rubinubi <rubinubi@...> 2001/01/10

On Wed, 10 Jan 2001, Thomas Daniels wrote:

[#9020] time to divide -talk? (was: Please remove account from files) — Yasushi Shoji <yashi@...> 2001/01/10

At Wed, 10 Jan 2001 14:23:30 +0900,

[#9047] Re: time to divide -talk? (was: Please remov e account from files) — Aleksi Niemel<aleksi.niemela@...>

Yasushi Shoji:

27 messages 2001/01/10
[#9049] Re: time to divide -talk? — Yasushi Shoji <yashi@...> 2001/01/10

At Thu, 11 Jan 2001 00:20:45 +0900,

[#9153] what about this begin? — Anders Strandl Elkj誡 <ase@...> 2001/01/11

[#9195] Re: Redefining singleton methods — ts <decoux@...>

>>>>> "H" == Horst Duch=EAne?= <iso-8859-1> writes:

10 messages 2001/01/12

[#9242] polymorphism — Maurice Szmurlo <maurice@...>

hello

73 messages 2001/01/13

[#9279] Can ruby replace php? — Jim Freeze <jim@...>

When I read that ruby could be used to replace PHP I got really

15 messages 2001/01/14

[#9411] The Ruby Way — "Conrad Schneiker" <schneiker@...>

As a member of the "Big 8" newsgroups, "The Ruby Way" (of posting) is to

15 messages 2001/01/17

[#9462] Re: reading an entire file as a string — ts <decoux@...>

>>>>> "R" == Raja S <raja@cs.indiana.edu> writes:

35 messages 2001/01/17
[#9465] Re: reading an entire file as a string — Dave Thomas <Dave@...> 2001/01/17

raja@cs.indiana.edu (Raja S.) writes:

[#9521] Larry Wall INterview — ianm74@...

Larry was interviewed at the Perl/Ruby conference in Koyoto:

20 messages 2001/01/18
[#10583] Re: Larry Wall INterview — "greg strockbine" <gstrock@...> 2001/02/08

Larry Wall's interview is how I found out

[#9610] Re: 101 Misconceptions About Dynamic Languages — "Ben Tilly" <ben_tilly@...>

"Christian" <christians@syd.microforte.com.au> wrote:

13 messages 2001/01/20

[#9761] Re: 101 Misconceptions About Dynamic Languages — ts <decoux@...>

>>>>> "C" == Christoph Rippel <crippel@primenet.com> writes:

16 messages 2001/01/23

[#9792] Ruby 162 installer available — Dave Thomas <Dave@...>

15 messages 2001/01/24

[#9958] Re: Vim syntax files again. — "Conrad Schneiker" <schneik@...>

Hugh Sasse wrote:

14 messages 2001/01/26
[#10065] Re: Vim syntax files again. — Hugh Sasse Staff Elec Eng <hgs@...> 2001/01/29

On Sat, 27 Jan 2001, Conrad Schneiker wrote:

[#9975] line continuation — "David Ruby" <ruby_david@...>

can a ruby statement break into multiple lines?

18 messages 2001/01/27
[#9976] Re: line continuation — Michael Neumann <neumann@...> 2001/01/27

On Sat, 27 Jan 2001, David Ruby wrote:

[#9988] Re: line continuation — harryo@... (Harry Ohlsen) 2001/01/28

>A statement break into mutliple lines if it is not complete,

[ruby-talk:9633] Re: Strange Hash behavior

From: "Ben Tilly" <ben_tilly@...>
Date: 2001-01-21 06:17:30 UTC
List: ruby-talk #9633
Aleksi Niemel<aleksi.niemela@cinnober.com> wrote:
>
>matz says:
> > >Hash checks equality of keys by `hash' and `eql?'.
> > >Try to define both.
> > >
>
>Agnot asks:
> > Ok, thanks. But for me its a violation of "Principle of Least
> > Surprise".
> > Can someone give a rationale?
>
>It's pretty common for Hash to require these two things. With the method
>hash you can ask "unique" code for each object, thus achieve O(n) searches.
>OTOH, you seldom can generate quickly a meaningful hash code which doesn't
>give duplicates; those are called collisions. I guess it's quite common to
>first find a right list of key-value pairs through a hash code, and then 
>the
>right pair inside the list through some equality check.
>
>IMHO it's also wise to define eql? instead of blindly use ==, as you might
>want the hash comparison to be different from normal comparison.

The above explanation makes more sense if you understand
what a hash is.  Probably most of this list does, but for
those who don't here is an explanation.  (If you understand
hashes then there is no point reading on.  If you never
really knew what they do and how they work...)

Suppose that I want to be able to keep track of things that
I have seen and be able to find them again.  How would I do
it?

Well the simplest approach is to just keep a list of what I
have seen.  Whenever I want to look something up I just
scan my list and see if I have it.  If I don't I add it to
the list.  This is simple but quickly becomes slow.

So what do I do?

Well the obvious solution is have some sort of structured
way to hold the data which makes lookup fast.  For instance
I might file it away alphabetically.  I might keep it in a
sorted list and do a binary lookup.  I might store the
data in a tree.  I might store the data in a tree and then
do some smart stuff to make sure that the tree is fairly
well balanced.  etc.

All of these solutions run into various trade-offs.  In
general most of the really obvious structured approaches
involve either making lots of assumptions, or they
involve making O(log(n)) comparisons.  If you have a
hundred thousand things I far prefer making 18
comparisons than 50,000 on average, but we can still do
better on most datasets.

Hashes are an example of, "Do something random and then
depend on chance".  In a hash you have buckets you throw
your data in.  Generally you try to have somewhat more
buckets than data.  (Too much more and you waste space.
Too little and you get too much in each bucket.  What
most people do is every so often resize the hash and move
things to their real buckets.)  You have a hashing
function that produces a number.  Take that number mod
the number of buckets, and you know which bucket to toss
it in.  Toss it.

When you want to know if you have seen something before, you
just figure out which bucket it would have been tossed in,
and rummage around in that bucket.  The laws of probability
tell us that most of the time that bucket has little in it.
(Actually it tells us that you have a Poisson distribution.)
So usually you only have to do something like 1-2
comparisons.  (Actually it is better than that.  Keep track
of the hash value you had, and compare hash values first.
Most of the time that is a very fast comparison.  So the
full equality check is usually just a final sanity check.)

But the catch is that if your hashing function has lots of
collisions, then you slow down.  Oops.  But there are
well-known hash functions that tend to work very well in
practice.  So if your hash function doesn't take too long
to compute, this is a good strategy.

I don't know what hash function Ruby uses.  The one that
Perl uses works something like this:

  def hash (string)
    hash_val = 0
    for byte in string.each_byte do
      hash_val = 33*hash_val + byte
    end
    hash_val += hash_val/32
  end

(Except using integer operations done in C.)

Cheers,
Ben
_________________________________________________________________
Get your FREE download of MSN Explorer at http://explorer.msn.com

In This Thread

Prev Next