Re: Database speed issues

From: Robert Klemme <shortcutter@...>
Date: 2007-11-04 11:30:02 UTC
List: ruby-talk #277451
On 03.11.2007 01:53, JeremyWoertink@gmail.com wrote:
> Cool. Ok I will explain the best I can the whole entire situation from
> the beginning.
> 
> I work for a company that produces plastic cards (i.e. credit cards,
> debit cards, gift cards, id cards etc..)
> 
> I receive files from customers containing anywhere from 1..100000000
> records. These files get formatted and put into a network drive in
> their respective job number folders. One customer is [...].

First of all a bit of general advice: in my experience customers are not 
happy reading their names somewhere public without their prior 
permission.  So you probably should omit them in future postings to 
avoid trouble.  Company names do not actually help in resolving a 
technical issue.

> About a year ago there was an idiot programmer who formatted the data
> incorrectly causing 150,000 duplicate cards. So because of this, the
> company wants to put in place a system of checking for duplicates.

Even in the absence of "idiot programmers" duplicate checking is a good
idea.  There is so much that can go wrong, and you indicated yourself
that you might get buggy input data.  It is always a good idea to verify
that data you receive from somewhere else complies with your
requirements or otherwise meets your expectations.  And this holds true
on all levels (e.g. input files obtained from somewhere, method
arguments etc.).  So you might as well be thankful for this guy's
mistake because the checking you are doing might save your company a lot
of hassle in another situation. :-)

> The
> way it has to work (because I don't have a say in this even though i'm
> the developer >.<) is that the past 18 months of jobs will be loaded
> into a database (or something useful) and when we (the programmers)
> get a new file, we will generate our data from these files, then run
> our application to see if there are any duplicate records in what we
> created compared to what we have created in the past 18 months
> according to their respective companies. (i.e. we get a starbucks job,
> program it and then test it against last 18 months of starbucks
> records.)

Do you actually reload the past 18 months' job data for every new job?
This would be a major source of inefficiency.  If you do, I would rather
have a large table that stays there and add entries with timestamps.
Then with a proper index (or with a table partitioned by month) you can
efficiently delete old data (i.e. data from jobs older than 18 months).
If your DB product supports partitioning you should definitely use
it as it makes deletions much more efficient.
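
As a rough sketch of the "table that stays there" idea (not your actual schema): the table name job_records, its columns, and the 548-day cutoff are all made up for illustration, and I am using SQLite through Python's sqlite3 module only because it runs anywhere.  Partitioning is not shown, since SQLite does not have it.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job_records (
        card_key  TEXT NOT NULL,     -- hypothetical key column
        job_id    INTEGER NOT NULL,  -- hypothetical job number
        loaded_at TEXT NOT NULL      -- ISO-8601 timestamp of the load
    )
""")
# Index the timestamp so the purge does not have to scan the whole table.
conn.execute("CREATE INDEX idx_job_records_loaded_at ON job_records (loaded_at)")

now = datetime(2007, 11, 4)
rows = [
    ("A-001", 1, (now - timedelta(days=700)).isoformat()),  # older than 18 months
    ("A-002", 2, (now - timedelta(days=100)).isoformat()),
    ("A-003", 3, now.isoformat()),
]
conn.executemany("INSERT INTO job_records VALUES (?, ?, ?)", rows)


def purge_older_than(conn, cutoff):
    """Delete rows loaded before the cutoff and return how many were removed."""
    cur = conn.execute(
        "DELETE FROM job_records WHERE loaded_at < ?", (cutoff.isoformat(),)
    )
    return cur.rowcount


cutoff = now - timedelta(days=548)  # roughly 18 months; an assumption
deleted = purge_older_than(conn, cutoff)
remaining = [
    r[0] for r in conn.execute("SELECT card_key FROM job_records ORDER BY card_key")
]
print(deleted, remaining)
```

New jobs are simply appended with their load timestamp, and the purge runs as a periodic housekeeping step instead of a full reload.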

Btw, I can't find a reference to the DB vendor that you are using. 
*That* would be an interesting brand name. :-)  Seriously, this can have 
a major impact on your options.

> If there are no duplicate files, then we will load that job into the
> database (or whatever we need) and continue on from there. If there
> are duplicates, then we have to re-program that job, and generate a
> report stating how many duplicates, and what exactly it is that was
> duplicated (i.e. account number, mag stripe, pin, customer name). The
> report is then automatically e-mailed to our boss who then walks next
> door to our office to yell at us and tell us to re-program the job.
> 
> So for a figure we did 100 million starbucks cards in the past 12
> months. We need to check the last 18 months. Starbucks jobs come in
> from a company called Valuelink, who generates the data for us. CVS
> pharmacy and Disney and McDonalds roughly equal another 100 million
> records combined. These also come from Valuelink.
> This is important, because if we program it correctly, but Valulink
> sends over a duplicate card number, then we need to be able to report
> this. So when they send over another job of 5 million cards, now we
> are checking that against 200 million + cards to see if anything is
> duplicate.
> 
> We program roughly 20-40 jobs in a day, so we need a quick way to go
> through all this and not lose time or else we will end up working
> longer days, and not be compensated for it :(

You do not give details about your jobs and what it is that gets 
duplicated.  I will just assume that it is some kind of "key", which 
might be just a single character sequence or a number of fields.  Also, 
we do not know what other information comprises a "job", so take the 
following with a grain of salt.

There are a few ways you could tackle this.  I will try to sketch some.

1. Avoid invalid jobs

You could do this by having a table with job definitions (maybe one per
customer if keys are different) where the key has a UNIQUE constraint on
it.  Then you prepare your data (possibly in CSV files) and load it into
the DB.  The unique constraint will prevent duplicate jobs.
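
To make this concrete, here is a minimal sketch of the UNIQUE constraint at work.  It assumes a two-column key (key1, key2) and a payload column, all invented for the example, and uses SQLite via Python's sqlite3 module.  Inserting row by row is slow for millions of records; a real bulk loader would be the tool of choice, this only shows the principle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job_table (
        key1    TEXT NOT NULL,   -- hypothetical key fields
        key2    TEXT NOT NULL,
        payload TEXT,
        UNIQUE (key1, key2)      -- the DB itself rejects duplicate keys
    )
""")


def load_records(conn, records):
    """Insert records one by one; return those rejected by the UNIQUE constraint."""
    rejected = []
    for rec in records:
        try:
            conn.execute(
                "INSERT INTO job_table (key1, key2, payload) VALUES (?, ?, ?)", rec
            )
        except sqlite3.IntegrityError:
            # Duplicate key: keep it for the report instead of inserting it.
            rejected.append(rec)
    return rejected


old_job = [("4000", "0001", "card a"), ("4000", "0002", "card b")]
new_job = [("4000", "0002", "card c"), ("4000", "0003", "card d")]

rejected_old = load_records(conn, old_job)
rejected_new = load_records(conn, new_job)
total = conn.execute("SELECT COUNT(*) FROM job_table").fetchone()[0]
print(rejected_new, total)
```

The list of rejected records is exactly what would go into the duplicate report.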

Now it depends on the DB vendor you are using.  IIRC with Oracle you can
use SQL*Loader and have it report duplicate records, i.e. records that
were not inserted.  Same for SQL Server: there is an index option,
IGNORE_DUP_KEY, which will prevent duplicate insertion.  I am not sure
about duplicate reporting though.

If your DB vendor does not support this, you can load data into a 
temporary table and insert from there.  You can then use an approach 
from 2 to detect duplicates:

2. Detect duplicates

Approach as above, but do not create a UNIQUE constraint; instead
index the key (if your key contains multiple fields you just have a
composite index over several columns).  Now you can check for duplicates
with a query like this:

select key1, key2, ... keyn, count(*) occurrences
from job_table
group by key1, key2, ... keyn
having count(*) > 1

Now this query will return all key fields which occur more than once. 
If you also need other info you can do this:

select *
from job_table jt join (
  select key1, key2, ... keyn
  from job_table
  group by key1, key2, ... keyn
  having count(*) > 1
) jt_dup on jt.key1 = jt_dup.key1
  and jt.key2 = jt_dup.key2
  and ...
  and jt.keyn = jt_dup.keyn

(The joined table is an "inline view" in case you want to look further 
into this concept.)
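
If you want to try both queries on a toy data set, the following sketch runs them against an in-memory SQLite database through Python's sqlite3 module.  The two key columns, the job_id column and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_table (key1 TEXT, key2 TEXT, job_id INTEGER)")
# Non-unique index over the key columns, as described above.
conn.execute("CREATE INDEX idx_job_keys ON job_table (key1, key2)")
conn.executemany(
    "INSERT INTO job_table VALUES (?, ?, ?)",
    [
        ("4000", "0001", 1),
        ("4000", "0002", 1),
        ("4000", "0002", 2),  # same key as the row above, from a later job
        ("4000", "0003", 2),
    ],
)

# Query 1: which keys occur more than once, and how often?
dup_keys = conn.execute("""
    SELECT key1, key2, COUNT(*) AS occurrences
    FROM job_table
    GROUP BY key1, key2
    HAVING COUNT(*) > 1
""").fetchall()

# Query 2: the full rows behind those keys, via the inline view.
dup_rows = conn.execute("""
    SELECT jt.key1, jt.key2, jt.job_id
    FROM job_table jt
    JOIN (
        SELECT key1, key2
        FROM job_table
        GROUP BY key1, key2
        HAVING COUNT(*) > 1
    ) jt_dup ON jt.key1 = jt_dup.key1
            AND jt.key2 = jt_dup.key2
    ORDER BY jt.job_id
""").fetchall()

print(dup_keys)
print(dup_rows)
```

The second result tells you which jobs the colliding records came from, which is probably what the report needs.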

Using the index you can also delete duplicates pretty efficiently. 
Alternatively delete all entries from the new job, modify the data 
outside and load again.  It depends on the nature of your other 
processing which approach is better.  Either way you could also generate 
your job data from this table even with duplicates in the table by using 
SELECT DISTINCT or GROUP BY.
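
Both options can be sketched like this, again with invented column names and SQLite via Python's sqlite3 module.  Note that the delete relies on SQLite's implicit rowid column to decide which copy to keep; other products need a different row identifier (a surrogate key, Oracle's ROWID, etc.).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_table (key1 TEXT, key2 TEXT)")
conn.execute("CREATE INDEX idx_job_keys ON job_table (key1, key2)")
conn.executemany(
    "INSERT INTO job_table VALUES (?, ?)",
    [("4000", "0001"), ("4000", "0002"), ("4000", "0002"), ("4000", "0003")],
)

# Option A: leave the duplicates in place and read the data de-duplicated.
distinct_keys = conn.execute(
    "SELECT DISTINCT key1, key2 FROM job_table ORDER BY key1, key2"
).fetchall()

# Option B: delete every copy except the first one inserted for each key.
# rowid is SQLite-specific.
deleted = conn.execute("""
    DELETE FROM job_table
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM job_table GROUP BY key1, key2
    )
""").rowcount
remaining = conn.execute("SELECT COUNT(*) FROM job_table").fetchone()[0]

print(distinct_keys, deleted, remaining)
```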

I hope I have given you some food for thought and this gets you started 
digging deeper.

Kind regards

	robert


PS: I'm traveling the next three days so please do not expect further 
replies too early.
