[#6954] Why isn't Perl highly orthogonal? — Terrence Brannon <brannon@...>

27 messages 2000/12/09

[#7022] Re: Ruby in the US — Kevin Smith <kevinbsmith@...>

> Is it possible for the US to develop corporate

36 messages 2000/12/11
[#7633] Re: Ruby in the US — Dave Thomas <Dave@...> 2000/12/19

tonys@myspleenklug.on.ca (tony summerfelt) writes:

[#7636] Re: Ruby in the US — "Joseph McDonald" <joe@...> 2000/12/19

[#7704] Re: Ruby in the US — Jilani Khaldi <jilanik@...> 2000/12/19

> > first candidates would be mysql and postgressql because source is

[#7705] Code sample for improvement — Stephen White <steve@...> 2000/12/19

During an idle chat with someone on IRC, they presented some fairly

[#7750] Re: Code sample for improvement — "Guy N. Hurst" <gnhurst@...> 2000/12/20

Stephen White wrote:

[#7751] Re: Code sample for improvement — David Alan Black <dblack@...> 2000/12/20

Hello --

[#7755] Re: Code sample for improvement — "Guy N. Hurst" <gnhurst@...> 2000/12/20

David Alan Black wrote:

[#7758] Re: Code sample for improvement — Stephen White <steve@...> 2000/12/20

On Wed, 20 Dec 2000, Guy N. Hurst wrote:

[#7759] Next amusing problem: talking integers (was Re: Code sample for improvement) — David Alan Black <dblack@...> 2000/12/20

On Wed, 20 Dec 2000, Stephen White wrote:

[#7212] New User Survey: we need your opinions — Dave Thomas <Dave@...>

16 messages 2000/12/14

[#7330] A Java Developer's Wish List for Ruby — "Richard A.Schulman" <RichardASchulman@...>

I see Ruby as having a very bright future as a language to

22 messages 2000/12/15

[#7354] Ruby performance question — Eric Crampton <EricCrampton@...>

I'm parsing simple text lines which look like this:

21 messages 2000/12/15
[#7361] Re: Ruby performance question — Dave Thomas <Dave@...> 2000/12/15

Eric Crampton <EricCrampton@worldnet.att.net> writes:

[#7367] Re: Ruby performance question — David Alan Black <dblack@...> 2000/12/16

On Sat, 16 Dec 2000, Dave Thomas wrote:

[#7371] Re: Ruby performance question — "Joseph McDonald" <joe@...> 2000/12/16

[#7366] GUIs for Rubies — "Conrad Schneiker" <schneik@...>

Thought I'd switch the subject line to the subject at hand.

22 messages 2000/12/16

[#7416] Re: Ruby IDE (again) — Kevin Smith <kevins14@...>

>> >> I would contribute to this project, if it

17 messages 2000/12/16
[#7422] Re: Ruby IDE (again) — Holden Glova <dsafari@...> 2000/12/16

-----BEGIN PGP SIGNED MESSAGE-----

[#7582] New to Ruby — takaoueda@...

I have just started learning Ruby with the book of Thomas and Hunt. The

24 messages 2000/12/18

[#7604] Any corrections for Programming Ruby — Dave Thomas <Dave@...>

12 messages 2000/12/18

[#7737] strange border-case Numeric errors — "Brian F. Feldman" <green@...>

I haven't had a good enough chance to familiarize myself with the code in

19 messages 2000/12/20

[#7801] Is Ruby part of any standard GNU Linux distributions? — "Pete McBreen, McBreen.Consulting" <mcbreenp@...>

Anybody know what it would take to get Ruby into the standard GNU Linux

15 messages 2000/12/20

[#7938] Re: defined? problem? — Kevin Smith <sent@...>

matz@zetabits.com (Yukihiro Matsumoto) wrote:

26 messages 2000/12/22
[#7943] Re: defined? problem? — Dave Thomas <Dave@...> 2000/12/22

Kevin Smith <sent@qualitycode.com> writes:

[#7950] Re: defined? problem? — Stephen White <steve@...> 2000/12/22

On Fri, 22 Dec 2000, Dave Thomas wrote:

[#7951] Re: defined? problem? — David Alan Black <dblack@...> 2000/12/22

On Fri, 22 Dec 2000, Stephen White wrote:

[#7954] Re: defined? problem? — Dave Thomas <Dave@...> 2000/12/22

David Alan Black <dblack@candle.superlink.net> writes:

[#7975] Re: defined? problem? — David Alan Black <dblack@...> 2000/12/22

Hello --

[#7971] Hash access method — Ted Meng <ted_meng@...>

Hi,

20 messages 2000/12/22

[#8030] Re: Basic hash question — ts <decoux@...>

>>>>> "B" == Ben Tilly <ben_tilly@hotmail.com> writes:

15 messages 2000/12/24
[#8033] Re: Basic hash question — "David A. Black" <dblack@...> 2000/12/24

On Sun, 24 Dec 2000, ts wrote:

[#8178] Inexplicable core dump — "Nathaniel Talbott" <ntalbott@...>

I have some code that looks like this:

12 messages 2000/12/28

[#8196] My first impression of Ruby. Lack of overloading? (long) — jmichel@... (Jean Michel)

Hello,

23 messages 2000/12/28

[#8198] Re: Ruby cron scheduler for NT available — "Conrad Schneiker" <schneik@...>

John Small wrote:

14 messages 2000/12/28

[#8287] Re: speedup of anagram finder — "SHULTZ,BARRY (HP-Israel,ex1)" <barry_shultz@...>

> -----Original Message-----

12 messages 2000/12/29

[ruby-talk:8152] Re: speedup of anagram finder

From: David Alan Black <dblack@...>
Date: 2000-12-28 01:21:37 UTC
List: ruby-talk #8152
Hello --

That was a great lesson.  I have a couple of follow-up questions.

On Thu, 28 Dec 2000, GOTO Kentaro wrote:

> Hi,
> 
> In message "[ruby-talk:8142] speedup of anagram finder"
>     on 00/12/28, "Joseph McDonald" <joe@vpop.net> writes:
> 
> >any ideas how to speed this up?:
> 
> General hint: 
>    1. Initialize block variables before the block.
>    2. Use destructive method!
>    3. `x.size > 0' ==> `not x.empty?'
>    4. `x.size > n' ==> `x[n]'
>    5. `a = x[0]'   ==> `a, = x'

I assume that #2 is because many non-destructive methods call their
destructive equivalents, so for instance Array#flatten makes a copy of
the object and then calls Array#flatten! on it.  Or is there another
reason?  And is this something that can be depended on permanently?

About #4: it breaks on this:

   a = [1,2,false]
   a.size > 2       => true
   a[2]             => false

> def fa4(words, out = STDOUT)
>   anagrams = {}
>   word, key, lst = nil
>   words.each do |word|

I keep finding that "for x in y" benchmarks faster than #each.
I wish it didn't, because I like #each.  Further research
required....

>     word.chomp!
>     key = word.downcase.split('')
>     key.sort!
>     anagrams[key] ||= Array.new
>     anagrams[key].push(word)

Would there be any disadvantage here to using "[]" instead of
Array.new?  It seems to be a lot faster.

Speaking of which....  I found that storing the hash keys as
strings, in the following manner:

      (anagrams[key.join] ||= []) .push word

sped things up quite a bit.  I'm not sure why that would be, exactly,
but it did seem to be the case.  (Combining it into one line is just
to avoid doing two joins or creating a new variable.)  Yet more
further research....

For what it's worth, here's a tweaked/hacked version of your code.
Some of the changes are things mentioned above; others are just little
experiments, some of which didn't seem to matter much for speed.

   def david(words, out = STDOUT)
     anagrams, word, key, lst = {}    # :-)
     words.each do |word|
       word.chomp!
       key = word.dup
       key.downcase!                  # as many ! as possible
       key = key.split('')
       key.sort!
       (anagrams[key.join] ||= []) .push word
     end
     anagrams.values.each do |lst|
       out.puts lst.join(" ") if lst[1]
     end
   end

Here's some benchmarking, where gotoken is your fa4, and david is the
above.  (I've cut-and-pasted a bit, as I was getting weird results
on first runs, so these are actually second runs.)

			 user     system      total        real
   david             4.600000   0.000000   4.600000 (  4.591789)
   gotoken           6.290000   0.030000   6.320000 (  6.329997)


Much of the speed david() picked up seems to be from "key.join".


David

-- 
David Alan Black
home: dblack@candle.superlink.net
work: blackdav@shu.edu
Web:  http://pirate.shu.edu/~blackdav


In This Thread