Re: Webcrawler that become enormous

From: Robert Klemme <shortcutter@...>
Date: 2011-11-17 16:40:03 UTC
List: ruby-talk #390268

On Thu, Nov 17, 2011 at 5:14 PM, Lucas Panthe <panthe@libero.it> wrote:
> Hi,
> I've made a script in Ruby 1.8 that use the gems mechanize, nokogiri and
> open-uri and run under Linux.
>
> This script is a webcrawler that scan a bigger site that contains a big
> amount of data of international firm.
>
> I'm interesting in create a db with only some data and not the full
> data.
> Mine script run perfectly and grab all data in the correct order but
> after 6-8 hour that the script run the amount of memory that use is
> enormous (1gb).
>
> I save in a file the results of scraping and empty the buffer of data
> every 10 firm collect.

It seems that either you are not freeing the memory (and have thus
created a leak yourself) or you are hitting the bug mentioned on the
stackoverflow page you link to below.

> I've follow this post ofr obtain this results because before the script
> used this amount of memory just after 4hour.
> http://stackoverflow.com/questions/181406/ruby-memory-management
>
> Someone can help me to reduce this problem and optimize this script?
> Exist an IDE that make an efficient debug for ruby?
> I think that there is something that I've missed.

The first thing I'd do is update Ruby to a more recent 1.9.* version.
That will also be faster and likely has a fix for the leakage bug
mentioned on the stackoverflow page.  If your problem persists, you
need to look into your code.

A simple test would be to write out statistics per class on a regular
basis, e.g.

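# count live objects per class (BasicObject needs Ruby 1.9+)
cnt = Hash.new 0
ObjectSpace.each_object(BasicObject) {|o| cnt[o.class] += 1}
# print the counts sorted by class name
cnt.sort_by {|cl,c| cl.to_s}.each {|cl,c| printf "%10d %s\n", c, cl}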

Then compare the counts per class between runs: a class whose count
keeps growing is a likely candidate for the leak.  Of course, you can
get a bit fancier and calculate deltas etc.  But then there are better
tools around, I guess.
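
The delta idea could be sketched roughly as follows. This is only a
minimal illustration, not a tested part of the crawler: the helper
names class_counts, count_deltas and print_growth are made up for this
example, and it walks Object rather than BasicObject so that every
yielded object responds to #class.

```ruby
# Hypothetical helpers (names are assumptions, not from the original post).

# Snapshot: number of live objects per class.
def class_counts
  GC.start                      # collect garbage first so dead objects are not counted
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |o| counts[o.class] += 1 }
  counts
end

# Difference between two snapshots; classes whose count did not change are omitted.
def count_deltas(before, after)
  deltas = {}
  (before.keys | after.keys).each do |klass|
    diff = after.fetch(klass, 0) - before.fetch(klass, 0)
    deltas[klass] = diff unless diff.zero?
  end
  deltas
end

# Print the classes that grew the most, largest growth first.
def print_growth(deltas, limit = 10)
  deltas.sort_by { |_klass, diff| -diff }.first(limit).each do |klass, diff|
    printf "%+10d %s\n", diff, klass
  end
end
```

One way to use it would be to take a snapshot with class_counts before
the crawl loop, take another one every few hundred firms, and pass both
to count_deltas and print_growth.  Classes that show up with a steadily
increasing positive delta are the ones worth looking at in your code.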

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
