[#8566] Visions for 2001/1.7.x development? — Robert Feldt <feldt@...>

Hi matz and other Ruby developers,

18 messages 2001/01/03
[#8645] Re: Visions for 2001/1.7.x development? — matz@... (Yukihiro Matsumoto) 2001/01/04

Hi,

[#8580] bug?? — jmichel@... (Jean Michel)

I don't understand the following behaviour:

19 messages 2001/01/03

[#8633] Interesting Language performance comparisons - Ruby, OCAML etc — "g forever" <g24ever@...>

13 messages 2001/01/04

[#8774] No :<, :>, etc. methods for Array — "Brian F. Feldman" <green@...>

So, why not include Comparable in Array by default? It shouldn't have any

28 messages 2001/01/07
[#8779] Re: No :<, :>, etc. methods for Array — matz@... (Yukihiro Matsumoto) 2001/01/07

Hi,

[#8780] Re: No :<, :>, etc. methods for Array — "Brian F. Feldman" <green@...> 2001/01/07

matz@zetabits.com (Yukihiro Matsumoto) wrote:

[#8781] Re: No :<, :>, etc. methods for Array — gotoken@... (GOTO Kentaro) 2001/01/07

In message "[ruby-talk:8780] Re: No :<, :>, etc. methods for Array"

[#8782] Re: No :<, :>, etc. methods for Array — "Brian F. Feldman" <green@...> 2001/01/07

gotoken@math.sci.hokudai.ac.jp (GOTO Kentaro) wrote:

[#8829] Sandbox (again) — wys@... (Clemens Wyss)

Hi,

20 messages 2001/01/08
[#8864] Re: Sandbox (again) — Clemens Hintze <c.hintze@...> 2001/01/08

On 8 Jan, Clemens Wyss wrote:

[#8931] String confusion — Anders Bengtsson <ndrsbngtssn@...>

Hello everyone,

21 messages 2001/01/09
[#8937] Re: String confusion — matz@... (Yukihiro Matsumoto) 2001/01/09

Hi,

[#8953] Please remove account from files — "Thomas Daniels" <westernporter@...>

Please take my e-mail address from your files and "CANCEL" my =

14 messages 2001/01/09
[#8983] Re: Please remove account from files — John Rubinubi <rubinubi@...> 2001/01/10

On Wed, 10 Jan 2001, Thomas Daniels wrote:

[#9020] time to divide -talk? (was: Please remove account from files) — Yasushi Shoji <yashi@...> 2001/01/10

At Wed, 10 Jan 2001 14:23:30 +0900,

[#9047] Re: time to divide -talk? (was: Please remov e account from files) — Aleksi Niemel<aleksi.niemela@...>

Yasushi Shoji:

27 messages 2001/01/10
[#9049] Re: time to divide -talk? — Yasushi Shoji <yashi@...> 2001/01/10

At Thu, 11 Jan 2001 00:20:45 +0900,

[#9153] what about this begin? — Anders Strandl Elkj誡 <ase@...> 2001/01/11

[#9195] Re: Redefining singleton methods — ts <decoux@...>

>>>>> "H" == Horst Duch=EAne?= <iso-8859-1> writes:

10 messages 2001/01/12

[#9242] polymorphism — Maurice Szmurlo <maurice@...>

hello

73 messages 2001/01/13

[#9279] Can ruby replace php? — Jim Freeze <jim@...>

When I read that ruby could be used to replace PHP I got really

15 messages 2001/01/14

[#9411] The Ruby Way — "Conrad Schneiker" <schneiker@...>

As a member of the "Big 8" newsgroups, "The Ruby Way" (of posting) is to

15 messages 2001/01/17

[#9462] Re: reading an entire file as a string — ts <decoux@...>

>>>>> "R" == Raja S <raja@cs.indiana.edu> writes:

35 messages 2001/01/17
[#9465] Re: reading an entire file as a string — Dave Thomas <Dave@...> 2001/01/17

raja@cs.indiana.edu (Raja S.) writes:

[#9521] Larry Wall INterview — ianm74@...

Larry was interviewed at the Perl/Ruby conference in Koyoto:

20 messages 2001/01/18
[#10583] Re: Larry Wall INterview — "greg strockbine" <gstrock@...> 2001/02/08

Larry Wall's interview is how I found out

[#9610] Re: 101 Misconceptions About Dynamic Languages — "Ben Tilly" <ben_tilly@...>

"Christian" <christians@syd.microforte.com.au> wrote:

13 messages 2001/01/20

[#9761] Re: 101 Misconceptions About Dynamic Languages — ts <decoux@...>

>>>>> "C" == Christoph Rippel <crippel@primenet.com> writes:

16 messages 2001/01/23

[#9792] Ruby 162 installer available — Dave Thomas <Dave@...>

15 messages 2001/01/24

[#9958] Re: Vim syntax files again. — "Conrad Schneiker" <schneik@...>

Hugh Sasse wrote:

14 messages 2001/01/26
[#10065] Re: Vim syntax files again. — Hugh Sasse Staff Elec Eng <hgs@...> 2001/01/29

On Sat, 27 Jan 2001, Conrad Schneiker wrote:

[#9975] line continuation — "David Ruby" <ruby_david@...>

can a ruby statement break into multiple lines?

18 messages 2001/01/27
[#9976] Re: line continuation — Michael Neumann <neumann@...> 2001/01/27

On Sat, 27 Jan 2001, David Ruby wrote:

[#9988] Re: line continuation — harryo@... (Harry Ohlsen) 2001/01/28

>A statement break into mutliple lines if it is not complete,

[ruby-talk:8544] Re: speedup of anagram finder

From: jmichel@... (Jean Michel)
Date: 2001-01-02 22:10:05 UTC
List: ruby-talk #8544
In article <m2lmstvna4.fsf@zip.local.thomases.com>,
Dave Thomas  <Dave@PragmaticProgrammer.com> wrote:
>jmichel@schur.institut.math.jussieu.fr (Jean Michel) writes:
>
...
>That's a good approach.
>
>I was wondering if there's a totally different way of doing
>this. Right now, all solutions (including the original based on Perl)
>generated a hash from each word, such that hash(A) == hash(B) => A == B.
>
>How about going statistical? Anyone fancy writing a Bloom filter based 
>finder (perhaps using #sum as the hash)? Let's think more radically!

I  guess I  got  bitten  by the  ruby  benchmarking  bug. The  following
version  which sidesteps  unpack  and sort  seems  slighly faster  under
'benchmark' that  the unpack-sort  version, although  it is  much slower
under 'profile'.  It seems 'profile' does  a very bad job  of accurately
reflecting time  spent in various  places, since  its main effect  is to
enormously  increase the  interpretive  overhead. I  think  ruby needs a
better 'profile' (kernel help seems necessary).

def histogram(words) # assumes lower-case words using only a..z
  anagrams, keys, word, key = {}, {}
  for word in words do 
    word.chomp!
    hist=Array.new(26,0)
    word.each_byte{|b| hist[b-?a]+=1}
    key = hist.pack('c*')
    if anagrams[key] then anagrams[key] << (keys[key] = word)
    else anagrams[key] = [ word ]
    end
  end
end

As to  possible algorithms  for the  problem: I think  one needs  to use
*some* function with  the property that words which are  anagrams map to
the  same value.  This  function however  does not  need  to be  exactly
injective for anagrams, it may map a few words which are not anagrams of
each other  to the same  value; these would then  be separated out  in a
second (inexpensive) pass.

 Also, hashing is a good way to bring together anagrams, but another way
would  be just  sorting:  in a  first  step, build  an  array of  pairs
[anagram,word], and then  sort it. The efficiency of  this would depend
if ruby is very efficient or not at growing an array.

 Best regards,
    Jean MICHEL

In This Thread