[#6954] Why isn't Perl highly orthogonal? — Terrence Brannon <brannon@...>

27 messages 2000/12/09

[#7022] Re: Ruby in the US — Kevin Smith <kevinbsmith@...>

> Is it possible for the US to develop corporate

36 messages 2000/12/11
[#7633] Re: Ruby in the US — Dave Thomas <Dave@...> 2000/12/19

tonys@myspleenklug.on.ca (tony summerfelt) writes:

[#7636] Re: Ruby in the US — "Joseph McDonald" <joe@...> 2000/12/19

[#7704] Re: Ruby in the US — Jilani Khaldi <jilanik@...> 2000/12/19

> > first candidates would be mysql and postgressql because source is

[#7705] Code sample for improvement — Stephen White <steve@...> 2000/12/19

During an idle chat with someone on IRC, they presented some fairly

[#7750] Re: Code sample for improvement — "Guy N. Hurst" <gnhurst@...> 2000/12/20

Stephen White wrote:

[#7751] Re: Code sample for improvement — David Alan Black <dblack@...> 2000/12/20

Hello --

[#7755] Re: Code sample for improvement — "Guy N. Hurst" <gnhurst@...> 2000/12/20

David Alan Black wrote:

[#7758] Re: Code sample for improvement — Stephen White <steve@...> 2000/12/20

On Wed, 20 Dec 2000, Guy N. Hurst wrote:

[#7759] Next amusing problem: talking integers (was Re: Code sample for improvement) — David Alan Black <dblack@...> 2000/12/20

On Wed, 20 Dec 2000, Stephen White wrote:

[#7212] New User Survey: we need your opinions — Dave Thomas <Dave@...>

16 messages 2000/12/14

[#7330] A Java Developer's Wish List for Ruby — "Richard A.Schulman" <RichardASchulman@...>

I see Ruby as having a very bright future as a language to

22 messages 2000/12/15

[#7354] Ruby performance question — Eric Crampton <EricCrampton@...>

I'm parsing simple text lines which look like this:

21 messages 2000/12/15
[#7361] Re: Ruby performance question — Dave Thomas <Dave@...> 2000/12/15

Eric Crampton <EricCrampton@worldnet.att.net> writes:

[#7367] Re: Ruby performance question — David Alan Black <dblack@...> 2000/12/16

On Sat, 16 Dec 2000, Dave Thomas wrote:

[#7371] Re: Ruby performance question — "Joseph McDonald" <joe@...> 2000/12/16

[#7366] GUIs for Rubies — "Conrad Schneiker" <schneik@...>

Thought I'd switch the subject line to the subject at hand.

22 messages 2000/12/16

[#7416] Re: Ruby IDE (again) — Kevin Smith <kevins14@...>

>> >> I would contribute to this project, if it

17 messages 2000/12/16
[#7422] Re: Ruby IDE (again) — Holden Glova <dsafari@...> 2000/12/16

-----BEGIN PGP SIGNED MESSAGE-----

[#7582] New to Ruby — takaoueda@...

I have just started learning Ruby with the book of Thomas and Hunt. The

24 messages 2000/12/18

[#7604] Any corrections for Programming Ruby — Dave Thomas <Dave@...>

12 messages 2000/12/18

[#7737] strange border-case Numeric errors — "Brian F. Feldman" <green@...>

I haven't had a good enough chance to familiarize myself with the code in

19 messages 2000/12/20

[#7801] Is Ruby part of any standard GNU Linux distributions? — "Pete McBreen, McBreen.Consulting" <mcbreenp@...>

Anybody know what it would take to get Ruby into the standard GNU Linux

15 messages 2000/12/20

[#7938] Re: defined? problem? — Kevin Smith <sent@...>

matz@zetabits.com (Yukihiro Matsumoto) wrote:

26 messages 2000/12/22
[#7943] Re: defined? problem? — Dave Thomas <Dave@...> 2000/12/22

Kevin Smith <sent@qualitycode.com> writes:

[#7950] Re: defined? problem? — Stephen White <steve@...> 2000/12/22

On Fri, 22 Dec 2000, Dave Thomas wrote:

[#7951] Re: defined? problem? — David Alan Black <dblack@...> 2000/12/22

On Fri, 22 Dec 2000, Stephen White wrote:

[#7954] Re: defined? problem? — Dave Thomas <Dave@...> 2000/12/22

David Alan Black <dblack@candle.superlink.net> writes:

[#7975] Re: defined? problem? — David Alan Black <dblack@...> 2000/12/22

Hello --

[#7971] Hash access method — Ted Meng <ted_meng@...>

Hi,

20 messages 2000/12/22

[#8030] Re: Basic hash question — ts <decoux@...>

>>>>> "B" == Ben Tilly <ben_tilly@hotmail.com> writes:

15 messages 2000/12/24
[#8033] Re: Basic hash question — "David A. Black" <dblack@...> 2000/12/24

On Sun, 24 Dec 2000, ts wrote:

[#8178] Inexplicable core dump — "Nathaniel Talbott" <ntalbott@...>

I have some code that looks like this:

12 messages 2000/12/28

[#8196] My first impression of Ruby. Lack of overloading? (long) — jmichel@... (Jean Michel)

Hello,

23 messages 2000/12/28

[#8198] Re: Ruby cron scheduler for NT available — "Conrad Schneiker" <schneik@...>

John Small wrote:

14 messages 2000/12/28

[#8287] Re: speedup of anagram finder — "SHULTZ,BARRY (HP-Israel,ex1)" <barry_shultz@...>

> -----Original Message-----

12 messages 2000/12/29

[ruby-talk:8364] Re: speedup of anagram finder

From: "SHULTZ,BARRY (HP-Israel,ex1)" <barry_shultz@...>
Date: 2000-12-30 18:31:28 UTC
List: ruby-talk #8364
Hi David,

> -----Original Message-----
> From: David Alan Black [mailto:dblack@candle.superlink.net]
> Sent: Saturday, December 30, 2000 2:58 PM

> Disadvantage #2: it choked on "Bhagavad-Gita".  Hmmmm.... we never did
> establish guidelines for things like non-alphabetical characters (not
> a *huge* problem, but hey, I didn't write /usr/dict/words :-)  And

I think that's a minor problem, fixed with the obvious adjustment ( for all
ascii, just use the same 
idea using the first 256 primes instead of the first 26 ). Of course, that
would make the 
hash even uglier!

> hard-coding the ASCII set could be regarded as a further disadvantage,
> if one ever wanted to internationalize the code.
> 

I'll have to think about that one. 

> Anyway -- the following slightly tweaked version of the unpack-based
> implementation:
> 
>    def unpack(words)
>      anagrams, keys, word, key = {}, {}
>      for word in words do
>        word.chomp!
>        key = word.dup
>        key.downcase!
>        chars = key.unpack('c*')
>        chars.sort!
>        key = chars.pack('c*')
>        if anagrams[key]
> 	 anagrams[key] << (keys[key] = word)
>        else
> 	 anagrams[key] = [ word ]
>        end
>      end
>    end
> 
> benchmarks faster than the primes version.  Note: I have removed the
> final output loop, as it's identical in both cases.  (I've also
> removed the first runs, to compensate for First Report Time-Distension
> Syndrome.)
> 
> 		   user     system      total        real
>    primes      3.520000   0.000000   3.520000 (  3.789767)
>    unpack      2.490000   0.000000   2.490000 (  2.485712)
> 

My benchmarks are different; they still show primes as faster. Below you'll
find what I've done.
Can someone tell me what's wrong with it?? 
In the meantime, I'll try to make "primes" a little faster still. This is
fun and interesting,
but what's even nicer for me, from the mathematical side, is that I can now
make 
statements like the following ( please excuse the mathematical jargon :-) ):

Let S be the set of all ascii strings, x,y be in S and s(x) =  some
"asc2prime-like" function.
Let [x] be the class of x in S modulo the "anagram" equivalence relation.
Then s(x) = s(y) iff [x] = [y].

Sorry for the digression. It is for gotoken and anyone else who likes this
stuff.

Barry


require 'benchmark'
include Benchmark
  
 def barry(words)
     asc2prime = { 65 => 2 , 66 => 3 , 67 => 5 , 68 => 7 , 69 => 11 ,
               70 => 13, 71 => 17 , 72 => 19 , 73 => 23 , 74 => 29 ,
               75 => 31, 76 => 37 , 77 => 41 , 78 => 43 , 79 => 47 , 80 =>
53 ,
               81 => 59 , 82 => 61 , 83 => 67 , 84 => 71 , 85 => 73 , 86 =>
79 ,
               87 => 83 , 88 => 89 , 89 => 97 , 90 => 101 ,
               97 => 2 , 98 => 3 , 99 => 5 , 100 => 7 , 101 => 11 ,
               102 => 13, 103 => 17 , 104 => 19 , 105 => 23 , 106 => 29 ,
               107 => 31, 108 => 37 , 109 => 41 , 110 => 43 , 111 => 47 ,
112 => 53 ,
               113 => 59 , 114 => 61 , 115 => 67 , 116 => 71 , 117 => 73 ,
118 => 79 ,
               119 => 83 , 120 => 89 , 121 => 97 , 122 => 101
                }

     anagrams = {}
     keys = {}
     word = nil
     key = 0
     total = 0

     for word in words do
       word.chomp!
       key = 1
       word.each_byte {|s| key *= asc2prime[s]}
       if anagrams[key]
	 anagrams[key] << word
	 keys[key] = 1
       else
	 anagrams[key] = [ word ]
       end
     end
#     for key in keys.keys
#       puts anagrams[key].join(' ')
#     end
   end
   
   def unpack(words)
     anagrams, keys, word, key = {}, {}
     for word in words do
       word.chomp!
       key = word.dup
       key.downcase!
       chars = key.unpack('c*')
       chars.sort!
       key = chars.pack('c*')
       if anagrams[key]
	 anagrams[key] << (keys[key] = word)
       else
	 anagrams[key] = [ word ]
       end
     end
   end
 
wordlist = File.open("c:/temp/wordlist.txt").read


bm(10) do |x|
  GC.start
  x.report("primes"){barry(wordlist)}
  GC.start
  x.report("unpack"){unpack(wordlist)  }
  GC.start
  x.report("primes"){barry(wordlist)}
  GC.start
  x.report("unpack"){unpack(wordlist)  }
end


                user     system      total        real
primes      1.943000   0.000000   1.943000 (  1.933000)
unpack      1.952000   0.000000   1.952000 (  1.953000)
primes      1.682000   0.000000   1.682000 (  1.682000)
unpack      1.722000   0.000000   1.722000 (  1.723000)

In This Thread

Prev Next