[ruby-talk:04407] Real world performance problems
I thought I'd share my problem in case you have ideas for major improvements.
Here's some sample code that shows how my current version works, and that
it performs unacceptably slowly. This is my first "real-world" application
in Ruby, so I have to say I'm quite surprised, and I wouldn't describe my
feelings with phrases like "wow, that's awesome!" or "cooool!" :). I'm
quite sure there's a lot I can do about it, I just don't know what.
I think the major problem is that I'm creating tons of objects, and most of
the time is spent on garbage collection. The real data does not actually get
lost at any point; there are probably just too many intermediate objects
before the final form.
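One rough way to check the garbage collection guess is to run the same kind of loop twice, once normally and once with the collector switched off through GC.disable, and compare the two times. This is only a sketch: the helper names build_store and elapsed, the shortened data string and the item count of 2000 are made up for illustration and are not part of the program further down.

```ruby
# Sketch only: build_store and elapsed are hypothetical helper names.
# build_store parses n generated entries into a hash of hashes,
# in the same way as the real loop, but with a shortened data string.
def build_store(n)
  store = {}
  mod = "aaaa"
  n.times do
    str = "kkkkkk=ABCDE&rrrr=abcd1234#{mod}&eeee=5.90"
    mod = mod.succ
    entry = {}
    str.split(/&/).each do |field|
      name, value = field.split(/=/)
      entry[name] = value
    end
    store[[entry["kkkkkk"], entry["rrrr"]]] = entry
  end
  store
end

# Wall-clock seconds spent in the given block.
def elapsed
  start = Time.now
  yield
  Time.now - start
end

with_gc = elapsed { build_store(2000) }

GC.disable                      # no collections during the second run
without_gc = elapsed { build_store(2000) }
GC.enable

printf("with GC: %.4f s, without GC: %.4f s\n", with_gc, without_gc)
```

If the two numbers come out close to each other, garbage collection is probably not where the time goes, and the slowdown has to be looked for elsewhere.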
Anyway, I was expecting (roughly) linear progression for creating and
indexing my 24 000 items. The reality was striking and surprising. The
processing time keeps growing, by more than reallocation of the store hash
alone would explain, and that should be quite a rare event if I'm right.
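To see whether the store itself is what slows down, the insertions can be timed on their own, with the string splitting left out. The following is a minimal sketch under that assumption; batch_times and its parameters are hypothetical names, and the keys are built the same way as in the real data (an array of two strings).

```ruby
# Sketch: time key creation and store insertion only, batch by batch.
# batch_times, total and batch are hypothetical names.
def batch_times(total, batch)
  store = {}
  times = []
  mod = "aaaa"
  start = Time.now
  total.times do |i|
    key = ["ABCDE", "abcd1234#{mod}"]
    mod = mod.succ
    store[key] = { "kkkkkk" => key[0], "rrrr" => key[1] }
    if (i + 1) % batch == 0
      now = Time.now
      times << now - start      # seconds spent on this batch
      start = now
    end
  end
  times
end

batch_times(5000, 1000).each_with_index do |secs, n|
  printf("%6d: %.4f\n", (n + 1) * 1000, secs)
end
```

If these per-batch times stay flat while the full program's times grow, the growth comes from somewhere other than the store insertions.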
clock time of measurement
|             processed
|             |       time spent on processing these
|             |       |
21:49:20.692       0:
21:49:21.696    1000:  1.0033
21:49:26.177    2000:  4.4810
21:49:31.275    3000:  5.0985
21:49:38.313    4000:  7.0378
21:49:46.680    5000:  8.3666
21:49:56.060    6000:  9.3803
21:50:08.024    7000: 11.9637
21:50:20.632    8000: 12.6087
21:50:36.306    9000: 15.6736
21:50:52.299   10000: 15.9934
21:51:09.934   11000: 17.6344
21:51:32.223   12000: 22.2889
21:51:51.977   13000: 19.7542
21:52:15.868   14000: 23.8907
21:52:40.387   15000: 24.5198
21:53:05.641   16000: 25.2538
21:53:35.478   17000: 29.8363
21:54:02.646   18000: 27.1685
21:54:34.499   19000: 31.8530
21:55:06.679   20000: 32.1800
21:55:39.463   21000: 32.7836
21:56:18.285   22000: 38.8225
21:56:53.563   23000: 35.2782
21:57:34.268   24000: 40.7044
I thought it would be a snap to read (generate) 24 000 entries, collect the
keys and store them. Well, on a fairly average Intel computer it took over 7
minutes (of CPU time).
Before you start tweaking my program, please keep a few things in mind:
1) In reality there are about this many entries in total, but this
code is just for initialization. The real program will
overwrite existing entries many, many times.
2) I'm getting the data in the format used in the example.
3) The field names and values, and their count, cannot be determined in
advance. Which fields compose the key is not known either.
4) The store will be searched often and the data inside will be heavily
operated on. This rules out a 'store' hash in the following format:
{ "FieldValue1/FieldValue2" => "original data string", ... }
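To make point 4 concrete, here is a small sketch of the two shapes side by side. The data string is shortened, and the names flat_store and wanted_store are used only for this illustration.

```ruby
# Illustration only: shortened sample data, hypothetical variable names.
str = "rrrr=abcd1234aaaa&eeee=5.90&kkkkkk=ABCDE"

# Ruled out: the value stays an unparsed string, so every access to a
# single field would have to split the string again.
flat_store = { "ABCDE/abcd1234aaaa" => str }

# Wanted: each entry is parsed once into a name => value hash, so fields
# can be read and overwritten directly.
entry = {}
str.split(/&/).each do |field|
  name, value = field.split(/=/)
  entry[name] = value
end
wanted_store = { [entry["kkkkkk"], entry["rrrr"]] => entry }

wanted_store[["ABCDE", "abcd1234aaaa"]]["eeee"] = "6.10"
puts wanted_store[["ABCDE", "abcd1234aaaa"]]["eeee"]   # prints 6.10
```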
Ok, I guess I've said enough, or even too much. One more thing, though: the
code. Thanks in advance.
- Aleksi
# Helper routines for time printing
def time(earlier=nil)
  t = Time.now
  $stderr.printf("%02d:%02d:%02d.%03d",
                 t.hour, t.min, t.sec, t.usec.to_i/1000)
  if earlier
    printf(" %6.4f", t - earlier )
  end
  puts
  t
end

store = {}                    # where we stuff items
key_fields = %w(kkkkkk rrrr)  # what 'names' provide the uniq id
mod = "aaaa"                  # something random for the entry (and key)
t = nil                       # for time difference

24001.times do |i|
  # our real world data, the content varies actually, we modify
  # each entry only to get unique key for 'store'
  str = ("ddddddddddd=0.00&oooooooooooo=0&aaaaaaa=0.00&llll=0.00&"+
         "yyyyyy=0.00&pppp=0.00&rrrr=abcd1234#{mod}&vvvvvv=0&"+
         "eeee=5.90&bbb=0.00&sss=0.00&ttttt=NIL&mmmmmmm=0&"+
         "kkkkkk=ABCDE&nnnnnnnnnn=0.00&qqqqqq=12170305")
  mod.succ!

  # transfer input data into nice datastructure
  entry = {}
  str.split(/&/).each do |field|
    name, value = field.split(/=/)
    entry[name] = value
  end

  # find out the key
  key = []
  key_fields.each do |kf|
    key << entry[kf]
  end

  # keep each entry easily accessible
  store[key] = entry

  # let's print time for progressing for each
  # 1000 items we process
  if i%1000 == 0
    printf(" %6d: ", i)
    t = time(t)
  end
end

puts "done!"
sleep 120