[#18974] Perl/Python/Ruby common backend (Perl6) — ptkwt@...1.aracnet.com (Phil Tomson)

There is a thread about using .NET's CLR as a backend for Ruby, but how

17 messages 2001/08/01

[#19064] ANN: Code Amelioration Contest (presented by Ruby Conference 2001) — David Alan Black <dblack@...>

17 messages 2001/08/03
[#19184] Re: ANN: Code Amelioration Contest (presented by Ruby Conference 2001) — John Carter <john.carter@...> 2001/08/06

On Fri, 3 Aug 2001, David Alan Black wrote:

[#19185] Re: ANN: Code Amelioration Contest (presented by Ruby Conference 2001) — David Alan Black <dblack@...> 2001/08/06

Hello --

[#19186] Re: ANN: Code Amelioration Contest (presented by Ruby Conference 2001) — John Carter <john.carter@...> 2001/08/06

On Mon, 6 Aug 2001, David Alan Black wrote:

[#19125] My 1st look @ ruby: No prototypes and problem with String#gsub — stesch@... (Stefan Scholl)

My first ruby program:

23 messages 2001/08/04

[#19192] Some remarks from a nembie in Ruby — Renaud HEBERT <renaud.hebert@...>

After having read the book "Programming Ruby: The Pragmatic Programmer's

38 messages 2001/08/06

[#19269] Re: Perl/Python/Ruby common backend (Parrot, can Ruby play too?) — ptkwt@...1.aracnet.com (Phil Tomson)

In article <72X97.12093$9i1.972452@e420r-atl1.usenetserver.com>,

50 messages 2001/08/07
[#19349] Re: Perl/Python/Ruby common backend (Parrot, can Ruby play too?) — Mathieu Bouchard <matju@...> 2001/08/08

[#19456] Re: Perl/Python/Ruby common backend (Parrot, can Ruby play too?) — Harry Ohlsen <harryo@...> 2001/08/09

Ned Konz wrote:

[#19451] Re: Help! I'm still confused about threadin g in the ML — "Morris, Chris" <chris.morris@...>

> Is there an Outlook option to turn on In-Reply-To or References

14 messages 2001/08/09
[#19453] Re: Help! I'm still confused about threadin g in the ML — Dave Thomas <Dave@...> 2001/08/09

"Morris, Chris" <chris.morris@snelling.com> writes:

[#19506] the way class variables work — David Alan Black <dblack@...>

Hello --

51 messages 2001/08/10
[#19511] Re: the way class variables work — Chris Uzdavinis <chris@...> 2001/08/11

David Alan Black <dblack@candle.superlink.net> writes:

[#19524] order and freedom in Ruby (was: Re: Re: the way class variables work) — David Alan Black <dblack@...> 2001/08/11

Hello --

[#19517] Why not?: Assigning to self — furufuru@... (Ryo Furue)

Hi there,

55 messages 2001/08/11
[#19689] Re: Why not?: Assigning to self — Ron Jeffries <ronjeffries@...> 2001/08/14

On 13 Aug 2001 20:59:54 -0700, furufuru@ccsr.u-tokyo.ac.jp (Ryo Furue)

[#19694] Re: Why not?: Assigning to self — Ned Konz <ned@...> 2001/08/14

On Tuesday 14 August 2001 05:09 am, Ron Jeffries wrote:

[#19695] Re: Why not?: Assigning to self — ts <decoux@...> 2001/08/14

>>>>> "N" == Ned Konz <ned@bike-nomad.com> writes:

[#19696] Re: Why not?: Assigning to self — Ned Konz <ned@...> 2001/08/14

On Tuesday 14 August 2001 07:51 am, you wrote:

[#19697] Re: Why not?: Assigning to self — ts <decoux@...> 2001/08/14

>>>>> "N" == Ned Konz <ned@bike-nomad.com> writes:

[#19700] Re: Why not?: Assigning to self — Ned Konz <ned@...> 2001/08/14

On Tuesday 14 August 2001 08:27 am, you wrote:

[#19701] Re: Why not?: Assigning to self — ts <decoux@...> 2001/08/14

>>>>> "N" == Ned Konz <ned@bike-nomad.com> writes:

[#19703] Re: Why not?: Assigning to self — Ned Konz <ned@...> 2001/08/14

On Tuesday 14 August 2001 09:05 am, Guy Decoux wrote:

[#19704] Re: Why not?: Assigning to self — ts <decoux@...> 2001/08/14

>>>>> "N" == Ned Konz <ned@bike-nomad.com> writes:

[#19708] Re: Why not?: Assigning to self — Ned Konz <ned@...> 2001/08/14

On Tuesday 14 August 2001 09:27 am, you wrote:

[#19709] Re: Why not?: Assigning to self — ts <decoux@...> 2001/08/14

>>>>> "N" == Ned Konz <ned@bike-nomad.com> writes:

[#19713] Re: Why not?: Assigning to self — Ned Konz <ned@...> 2001/08/14

On Tuesday 14 August 2001 09:45 am, you wrote:

[#19750] Re: Why not?: Assigning to self — matz@... (Yukihiro Matsumoto) 2001/08/15

Hi,

[#19819] Re: Why not?: Assigning to self — Ned Konz <ned@...> 2001/08/15

On Tuesday 14 August 2001 08:14 pm, matz wrote:

[#19852] Re: Why not?: Assigning to self — matz@... (Yukihiro Matsumoto) 2001/08/16

Hi,

[#19857] Re: Why not?: Assigning to self — "Florian G. Pflug" <fgp@...> 2001/08/16

On Thu, Aug 16, 2001 at 11:05:59AM +0900, Yukihiro Matsumoto wrote:

[#19858] Re: Why not?: Assigning to self — matz@... (Yukihiro Matsumoto) 2001/08/16

Hi,

[#19867] Re: Why not?: Assigning to self — "Pit Capitain" <pit@...> 2001/08/16

Just a followup at (my) current end of the thread:

[#19550] Forced garbage collection — Lars Christensen <larsch@...>

14 messages 2001/08/11
[#19562] Re: Forced garbage collection — "Nat Pryce" <nat.pryce@...13media.com> 2001/08/12

From: "Lars Christensen" <larsch@cs.auc.dk>

[#19551] /.ed again — Tobias Reif <tobiasreif@...>

Ruy gets slasdotted again ;)

19 messages 2001/08/11

[#19650] Ruby Newbie mailing list — Michael Pence <mikepence@...>

Hello all.

14 messages 2001/08/13
[#19656] RE: Ruby Newbie mailing list — "Louis Brothers" <lcb134@...> 2001/08/13

We had a similar discussion on the OmniWeb Objective-C mailing list not to

[#19659] Re: Ruby Newbie mailing list — Michael Pence <mikepence@...> 2001/08/13

I appreciate your references to Objectionable-C ;-)

[#19685] Compiling Ruby with cygwin and Tk support — Manuel Zabelt <ng@...>

Hello!

13 messages 2001/08/14

[#19718] General (GUI/license) questions — Ryan Tarpine <rtarpine@...>

First: Kero commented in the description of his new Ruby Agenda program

18 messages 2001/08/14

[#19755] "new" returning nil: how to report the failure of object creation — furufuru@... (Ryo Furue)

Hi there,

14 messages 2001/08/15

[#19758] The GUI poll is in, and the results are surprising — Dave Thomas <Dave@...>

40 messages 2001/08/15
[#19774] Re: The GUI poll is in, and the results are surprising — Lars Christensen <larsch@...> 2001/08/15

On Wed, 15 Aug 2001, Dave Thomas wrote:

[#19784] Re: The GUI poll is in, and the results aresurprising — "Lyle Johnson" <ljohnson@...> 2001/08/15

> Please don't forget what Ruby is all about in this discussion! I think

[#19824] Ruby GUI — "Hal E. Fulton" <hal9000@...>

The concept of a new GUI is somewhat appealing,

16 messages 2001/08/15

[#20033] Ruby Article — Joshua Drake <jd.nospam@...>

Hello,

38 messages 2001/08/20

[#20127] Another Possible RCR - Wrappers via Mixins — Stephen White <spwhite@...>

The main difference between mix-ins and multiple inheritence is (to my understanding) that parent classes do not call child code, but mix-ins do.

15 messages 2001/08/22

[#20135] Bruce Eckel's criticism of Ruby — Ned Konz <ned@...>

Python.org links to http://www.mindview.net/Etc/notes.html#Ruby , saying

24 messages 2001/08/22

[#20183] ++ Operator — kamphausen@... (SKa)

Dear Community,

35 messages 2001/08/23
[#20234] Re: ++ Operator — Dave Thomas <Dave@...> 2001/08/24

matz@ruby-lang.org (Yukihiro Matsumoto) writes:

[#20236] Re: ++ Operator — matz@... (Yukihiro Matsumoto) 2001/08/24

Hi,

[#20209] In Ruby 0 is true but nil is false.. or how to shoot yourself?.. — Guillaume Cottenceau <gc@...>

I have a simple Audio-CD database (using CSV format). I was writing a

11 messages 2001/08/23

[#20254] File.readline(s) — Michael Husmann <michael.husmann@...>

I am reading a 55MB ASCII file by using File.readline(s) which takes on

14 messages 2001/08/24

[#20303] New Windows InstallShield version of Ruby — Andrew Hunt <andy@...>

19 messages 2001/08/24

[#20307] Backwards language — "Sean Middleditch" <elanthis@...>

Greetings,

30 messages 2001/08/24

[ruby-talk:19002] How to optimize a Ruby program?

From: Ned Konz <ned@...>
Date: 2001-08-02 03:52:28 UTC
List: ruby-talk #19002
I've come up with my first Ruby program (below). Actually, it's a
library module that implements the Hunt/McIlroy "diff" algorithm. It was
pretty fast to write (I more or less translated it from my Perl CPAN
module), but it may not have  been a great choice. (I'm going to post it
to RAA, so I'll have more questions about the right way to package it).

I'm a bit frustrated with the speed. On a pair of 4000 line arrays of
text, it runs on my machine in 35 seconds. My Perl code, on the other
hand, does the same job in 25 seconds. Of course, it's fast enough on
smaller arrays.

I've looked at it quite a bit, and even wrote a line profiler, but can't
manage to make it much faster.

The single biggest chunk of time is spent in the binary search routine
in replaceNextLargerWith(). I found that I could make it a tiny bit
faster by doing a single <=> and then using equal? to compare with
literal -1 and 1.

Anyway, I'd appreciate it if someone could give me some advice as to how
to make it more competitive with Perl's speed. I'd rather keep it pure
Ruby, instead of making a C extension.

Another question is the API. Note that there is a keyGen and a compare
method (that call hash() and eql?() respectively by default); I'd expect
people using this module with specialized data types  to override these
if necessary.

But in traverseSequences, there are up to 5 callbacks so that user code
can be called at the right times. Should this require inheritance or
overriding, or is the Ruby Way to pass in multiple Method or Proc
instances?

Thanks,
-- 
Ned Konz
currently: Stanwood, WA
email:     ned@bike-nomad.com
homepage:  http://bike-nomad.com




Attachments (2)

difftest.rb (355 Bytes, text/x-ruby)
require 'diff'

$, = ' '
diff = Diff.new
a = %w(a b c e h j l m n p)
b = %w(b c d e f j k l m r s t)
correctResult = %w(b c e j l m)

result = diff.lcs(a, b)
print "a: ", a, "\n"
print "b: ", b, "\n"
print "correctResult: ", correctResult, "\n"
print "result: ", result, "\n"

result = diff.diff(a, b)
print "diff output:", result.inspect
diff.rb (4.88 KB, text/x-ruby)
# Hunt/McIlroy diff algorithm for Ruby
# by Ned Konz, ned@bike-nomad.com
class Diff
  private
  # return a hash mapping keys to arrays of line numbers
  def withPositionsOfInInterval(aCollection, istart, iend)
    d = {}
    (istart .. iend).each do |index|
      element = aCollection[index]
      key = keyGen(element) 
      if d.has_key?(key)
        d[key].unshift(index)
      else
        d[key] = [index]
      end
    end
    return d
  end

  # This is where the bulk of the time (40%) is spent in this module,
  # so try to make it fast!
  def replaceNextLargerWith(thresh, aValue, high)
    high = thresh.length - 1  if high.nil? or high == 0
    # off the end?
    if high == -1 or aValue > thresh[-1]
      thresh.push(aValue)
      return high + 1
    end

    # binary search for insertion point...
    low = 0
    while low <= high
      index = (high + low) >> 1
      result = aValue <=> thresh[index]
      if result.equal?(1)
        low = index + 1
      elsif result.equal?(-1)
        high = index - 1
      else
        return nil 
      end
    end

    # now insertion point is in low.
    thresh[low] = aValue  # overwrite next larger
    return low
  end

  # This method computes the longest common subsequence in a and b.
  def longestCommonSubsequence(a, b)
    aStart = 0
    aFinish = a.length - 1
    bStart = 0
    bFinish = b.length - 1
    matchVector = []

    # First we prune off any common elements at the beginning
    while aStart <= aFinish and
      bStart <= bFinish and
      compare(a[aStart], b[bStart])
      matchVector[aStart] = bStart
      aStart += 1
      bStart += 1
    end

    # now the end
    while aStart <= aFinish and
      bStart <= bFinish and
      compare(a[aFinish], b[bFinish])
      matchVector[aFinish] = bFinish
      aFinish = aFinish - 1
      bFinish = bFinish - 1
    end

    # Now compute the equivalence classes of positions of elements
    bMatches = withPositionsOfInInterval(b, bStart, bFinish)
    thresh = []
    links = []
    (aStart .. aFinish).each do |i|
      ai = keyGen(a[i])
      if bMatches.has_key?(ai)
        k = 0
        bMatches[ai].each do |j|
          # optimization: most of the time this will be true
          if k and k > 0 and thresh[k] > j and thresh[k - 1] < j
            thresh[k] = j
          else
            k = replaceNextLargerWith(thresh, j, k)
          end
          unless k.nil?
            links[k] = [(k > 0 ? links[k - 1] : nil), i, j]
          end
        end
      end
    end

    if thresh.length
      link = links[thresh.length - 1]
      until link.nil?
        matchVector[link[1]] = link[2]
        link = link[0]
      end
    end

    return matchVector
  end

  protected
  # override these two methods to provide
  # different comparison semantics

  def keyGen(s)
    s.hash
  end

  def compare(aa,ab)
    aa.eql?(ab)
  end

  public

  def traverse_sequences(a, b, callbacks={})
    # these should default to nil
    finishedACallback  = callbacks['finished_a']
    finishedBCallback  = callbacks['finished_b']

    # and these to a no-op
    callbacks.default = Proc.new { |args| }
    matchCallback     = callbacks['match']
    discardACallback  = callbacks['discard_a']
    discardBCallback  = callbacks['discard_b']

    matchVector = longestCommonSubsequence(a, b)

    bi = 0; ai = 0
    (0 ... matchVector.length).each do |ai|
      bLine = matchVector[ai]
      if bLine.nil?
        discardACallback.call(ai, bi)
      else
        while bi < bLine
          discardBCallback.call(ai, bi)
          bi += 1
        end
        matchCallback.call(ai, bi)
        bi += 1
      end
    end
    ai += 1

    # the last entry (if any) processed was a match.

    if not finishedBCallback.nil? and ai < a.length
      finishedBCallback.call(bi)
    else
      while ai < a.length
        discardACallback.call(ai, bi)
        ai += 1
      end
    end

    if not finishedACallback.nil? and bi < b.length
      finishedACallback.call(ai)
    else
      while bi < b.length
        discardBCallback.call(ai, bi)
        bi += 1
      end
    end
  end

  def lcs(a, b)
    matchVector = longestCommonSubsequence(a, b)
    retval = []
    i = 0
    while i < matchVector.length
      retval.push(a[i]) unless matchVector[i].nil?
      i += 1
    end
    return retval
  end

  def diff(a, b)
    retval = []
    hunk = []
    discard = Proc.new { |ai,bi|
      hunk.push(['-', ai, a[ai]])
    }
    add = Proc.new { |ai,bi|
      hunk.push(['+', bi, b[bi]])
    }
    match = Proc.new { |ai,bi|
      if hunk.length > 0
        retval.push(hunk)
        hunk = []
      end
    }
    traverse_sequences(a, b, {
      'match' => match,
      'discard_a' => discard,
      'discard_b' => add
    })
    match.call(0, 0)
    return retval
  end
end

In This Thread

Prev Next