[#368826] ANN: home_run 0.9.1 Released — Jeremy Evans <code@...>

= home_run

14 messages 2010/09/01

[#368894] uninitialized constant NArray (Name Error) — Abder-Rahman Ali <abder.rahman.ali@...>

Following section (2) here:

12 messages 2010/09/02

[#368914] p vs. print — Abder-Rahman Ali <abder.rahman.ali@...>

I wrote the following scripts from section (2) here:

24 messages 2010/09/02
[#368915] Re: p vs. print — Alex Stahl <astahl@...5.com> 2010/09/02

Ruby uses "puts", not "print". "p" is short for "puts".

[#368977] Read cookie — Pål Bergström <pal@...>

How can I get the value of a browser cookie with Ruby?

29 messages 2010/09/03
[#368978] Re: Read cookie — Brian Candler <b.candler@...> 2010/09/03

Pål Bergström wrote:

[#368984] Re: Read cookie — Pål Bergström <pal@...> 2010/09/03

Brian Candler wrote:

[#369036] ruby_archive 0.1.0 released — Jonathan Nielsen <jonathan@...>

After a summer of working on various ways to implement it, I'm happy to

10 messages 2010/09/03

[#369106] A better idiomatic way of doing this?! — Tim Romberg <tim.jakobsson@...>

Hi Im new at Ruby and been struggling with this lab I have for a course

12 messages 2010/09/05

[#369113] unable to open X server `' (Magick::ImageMagickError) — Abder-Rahman Ali <abder.rahman.ali@...>

I have written a Ruby script "dicom_info.rb", and when I try running

16 messages 2010/09/06
[#369119] Re: unable to open X server `' (Magick::ImageMagickError) — Brian Candler <B.Candler@...> 2010/09/06

On Mon, Sep 06, 2010 at 12:42:11PM +0900, Abder-Rahman Ali wrote:

[#369132] Re: unable to open X server `' (Magick::ImageMagickError) — Abder-Rahman Ali <abder.rahman.ali@...> 2010/09/06

Thanks a lot Brian.

[#369139] Re: unable to open X server `' (Magick::ImageMagickError) — Brian Candler <B.Candler@...> 2010/09/06

On Mon, Sep 06, 2010 at 11:55:32PM +0900, Abder-Rahman Ali wrote:

[#369140] Re: unable to open X server `' (Magick::ImageMagickError) — Abder-Rahman Ali <abder.rahman.ali@...> 2010/09/06

So, do you suggest installing "cygwin"?

[#369159] Re: unable to open X server `' (Magick::ImageMagickError) — Roger Pack <rogerpack2005@...> 2010/09/06

> you suggest installing "cygwin"?

[#369124] Odd functional programming question — Peter Hickman <peterhickman386@...>

Ok this is probably not really functional programming but I was just

10 messages 2010/09/06

[#369169] How do I request a HTTPS page? — Samuel Sternhagen <samatoms@...>

I would like to access a https page from irb

14 messages 2010/09/06

[#369226] What OS do you use for Ruby development? — Nick Hird <nrhird@...>

I don't want to start any OS wars. I was just curious as to what OS

67 messages 2010/09/07

[#369301] Nokogiri and LibXML — unbewusst.sein@... (Une Bévue)

Each time i launch a script using Nokogiri i get :

12 messages 2010/09/08

[#369302] Receiving array naturally? — Terry Michaels <spare@...>

As I learn Ruby, I find a lot of flexibility in the syntax. I was

14 messages 2010/09/08

[#369389] Key Associated w/ Maximum Value in Hash — Timothy Baron <timothy.baron@...>

Simple question: what's the cleanest way to retrieve a key associated

11 messages 2010/09/09

[#369477] How to do foo.chomp!.rstrip!.downcase! ? — Geometric Patterns <geometric.patterns@...>

15 messages 2010/09/10

[#369623] Ruby packaging in Debian and Ubuntu: Mythbusting and FAQ — Lucas Nussbaum <lucas@...>

Hi,

11 messages 2010/09/12

[#369638] Declarative relations between object attributes — Knut Franke <knut.franke@...>

Some time ago I stumbled over Cells[1], a Common Lisp extension allowing

16 messages 2010/09/12

[#369710] Encoding error while installing rails on ruby 1.9.2 — Bek Bek <bekis3@...>

Hello everybody,

11 messages 2010/09/14

[#369796] Ruby Multi-threading? — Terry Michaels <spare@...>

I read a Ruby e-book recently that indicated that although Ruby has

12 messages 2010/09/15

[#369952] Developing for Ruby on Windows? — Tom Wardrop <tom@...>

I've heard a lot of criticism about developing for Ruby on Windows, but

11 messages 2010/09/17

[#370039] Ruby-based data language — Intransition <transfire@...>

Has anyone ever endeavored to create a data/configuration file format

14 messages 2010/09/19

[#370053] Getting GUI for ruby for Linux running (QT or wxWidget)? — Markus Fischer <markus@...>

Hi,

23 messages 2010/09/19
[#370054] Re: Getting GUI for ruby for Linux running (QT or wxWidget)? — Markus Fischer <markus@...> 2010/09/19

On 20.09.2010 01:14, Markus Fischer wrote:

[#370116] Re: Getting GUI for ruby for Linux running (QT or wxWidget)? — Quintus <sutniuq@...> 2010/09/20

-----BEGIN PGP SIGNED MESSAGE-----

[#370164] Re: Getting GUI for ruby for Linux running (QT or wxWidget)? — Ryan Melton <ryanmelt@...> 2010/09/21

qt does have a new gem I put together:

[#370205] QT works! (was: Re: Getting GUI for ruby for Linux running (QT or wxWidget)) — Markus Fischer <markus@...> 2010/09/21

Hi,

[#370127] An elegant way... — "F. Senault" <fred@...>

Hello everybody.

23 messages 2010/09/20

[#370210] The Great Ruby GUI Toolkit Roundup — Ed Howland <ed.howland@...>

Hi,

15 messages 2010/09/21

[#370257] having problems with open4 and stuck forked processes — Tim Uckun <timuckun@...>

I am running a batch process which uses the wkhtmltoimage-i386 binary

13 messages 2010/09/22
[#370268] Re: having problems with open4 and stuck forked processes — Robert Klemme <shortcutter@...> 2010/09/22

On Wed, Sep 22, 2010 at 2:31 PM, Tim Uckun <timuckun@gmail.com> wrote:

[#370294] Re: having problems with open4 and stuck forked processes — Tim Uckun <timuckun@...> 2010/09/22

> What do you mean by that?  Goes the timeout undetected?  Can't you

[#370309] Re: having problems with open4 and stuck forked processes — Robert Klemme <shortcutter@...> 2010/09/23

On 23.09.2010 01:59, Tim Uckun wrote:

[#370289] Sorting problem with an Array of Arrays — Paul <tester.paul@...>

Hi there, I have an array of arrays that looks like the following:

15 messages 2010/09/22

[#370296] Ruby Installation Error — Tridib Bandopadhyay <tridib04@...>

I am trying this command to build the ruby interpreter

20 messages 2010/09/23
[#370689] Re: Ruby Installation Error — Brian Candler <b.candler@...> 2010/09/29

Tridib Bandopadhyay wrote:

[#370319] to make dot method dot method work? — Pen Ttt <myocean135@...>

here is the class

14 messages 2010/09/23

[#370373] how do i force ruby to release memory — Amit Tomar <amittomer25@...>

Hi all,

19 messages 2010/09/24
[#370374] Re: how do i force ruby to release memory — Robert Klemme <shortcutter@...> 2010/09/24

On Fri, Sep 24, 2010 at 7:36 AM, Amit Tomar <amittomer25@yahoo.com> wrote:

[#370379] Re: how do i force ruby to release memory — Amit Tomar <amittomer25@...> 2010/09/24

Robert Klemme wrote:

[#370380] Re: how do i force ruby to release memory — Jesús Gabriel y Galán <jgabrielygalan@...> 2010/09/24

On Fri, Sep 24, 2010 at 10:31 AM, Amit Tomar <amittomer25@yahoo.com> wrote:

[#370383] Re: how do i force ruby to release memory — Amit Tomar <amittomer25@...> 2010/09/24

Jesús Gabriel y Galán wrote:

[#370388] How to delete the browser cache through ruby — Arihan Sinha <arihan_sinha@...>

Hi All,

11 messages 2010/09/24

[#370590] Point me to help w/ multithreading in 1.9.2-p0 — Alex Stahl <astahl@...5.com>

Hi Folks - A week or two ago, I pinged this list for recommendations on

11 messages 2010/09/28
[#370593] Re: Point me to help w/ multithreading in 1.9.2-p0 — Alex Stahl <astahl@...5.com> 2010/09/28

Nevermind... figured it out.

[#370640] puts and return — Jim Haungs <jhaungs@...>

10.times do |i|

14 messages 2010/09/28

[#370661] Color sequences in ri on Windows — Eric Christopherson <echristopherson@...>

After installing some gems, the system recommended that I refresh ri

11 messages 2010/09/28

[#370721] The beauty of Ruby through examples — Adriano Ferreira <adrfer@...>

Hey all,

33 messages 2010/09/29

[#370740] Can't upgrade ruby on Snow Leopard — Ast Jay <azzzz@...>

I've followed the instructions here:

13 messages 2010/09/29

[#370796] How to prevent overwriting methods by accident? — Stefan Salewski <mail@...>

In Ruby we can add new methods to existing classes.

13 messages 2010/09/30
[#370797] Re: How to prevent overwriting methods by accident? — Jeremy Bopp <jeremy@...> 2010/09/30

On 9/30/2010 2:15 PM, Stefan Salewski wrote:

[#370800] Re: How to prevent overwriting methods by accident? — Alex Stahl <astahl@...5.com> 2010/09/30

But is there a way to call the original method instead of just quitting

Re: Using Nokogiri to scrape multiple websites

From: Ryan Mckenzie <ryan@...>
Date: 2010-09-07 09:42:39 UTC
List: ruby-talk #369221
Jesús Gabriel y Galán wrote:
> On Mon, Sep 6, 2010 at 5:01 PM, Ryan Mckenzie <ryan@souliss.com> wrote:
>> Hi Jesús,
>
>> I'm looking to output the information to an .html document (using the
>> Rails framework) and I'm getting the following error: can't convert
>> Fixnum into Array
>>
>> Also, what I'm actually trying to do is scrape each of the websites
>> to see if they contain a specific url, so I would need to pass in a list
>> of about 3-4 keywords for each of the domains.
>>
>> So something like
>>
>> def index
>>   keywords = %w{accounts resources membership}
>>   sites = %w{http://www.google.com http://www.yahoo.com}
>>   links = []
>>   sites.each {|site| links.concat(scrape(site, keywords[]))}
>> end
>>
>> def scrape(website,inputtext)
>>   require 'open-uri'
>>   require 'nokogiri'
>>
>>   doc = Nokogiri::HTML(open(website))
>>
>>   for sample in doc.xpath('//a')
>>     if sample.text == inputtext
>>       keywords = doc.xpath('//a')
>>     else
>>       keywords = "MISSING"
>>     end
>>   end
>> end
>>
>> Thanks for your time.
> 
> So you want to iterate twice, in each site search for a link that
> contains the specified word? Do you want to also organize for which
> word and site each result comes from? If so, I'd do something like:
> 
> def index
>   keywords = %w{accounts resources membership}
>   sites = %w{http://www.google.com http://www.yahoo.com}
>   links_by_site = Hash.new {|h,k| h[k] = {}}
>   sites.each do |site|
>     keywords.each do |keyword|
>       links_by_site[site][keyword] = scrape(site, keyword)
>     end
>   end
>   links_by_site
> end
> 
> def scrape(website,inputtext)
>   require 'open-uri'  #these could maybe go at the start of the script
>   require 'nokogiri'
> 
>   regex = /#{inputtext}/
>   links_that_match = []
>   doc = Nokogiri::HTML(open(website))
>   doc.xpath('//a').each do |link|
>     if regex =~ link.inner_text
>       links_that_match << link.to_html
>     end
>   end
>   links_that_match
> end
> 
> Untested, but it can give you some ideas. The resulting hash will have
> something like:
> 
> {"http://www.google.com" => {"accounts" => [<some links containing the
> word accounts>], "resources" => [<idem for resources>]
> ...
> }
> 
> Jesus.

That works great! Thank you.

Instead of pulling the items from a hash, though, I would really like to
pull them from a database, for when the list gets extremely large. I've
tried using the hash to pull from a variable, but it produces an error
saying the hash is an odd length. It is only going to be a flat-table
database, so all of the data will be available as @backlinks.title (the
keyword(s)) and @backlinks.permalink (the site).

def index
  @links = Hash.new { |ha,lnk| ha[lnk] = {} }
  @backlinks = Backlink.find(:all)
  keywords = %w{@backlinks.concat(title)}
  sites = %w{@backlinks.concat(permalink)}
  links_by_site = Hash.new {|h,k| h[k] = {}}
  sites.each do |site|
    keywords.each do |keyword|
      @links[site][keyword] = scrape(site, keyword)
    end
  end
end

Thanks again.

McKenzie

-- 
Posted via http://www.ruby-forum.com/.
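
One note on the `index` method above: a `%w{...}` literal only splits its contents into literal strings, so `%w{@backlinks.concat(title)}` evaluates to `["@backlinks.concat(title)"]` and never touches the records. A minimal sketch of one way to build the two lists from the records themselves follows. It assumes each record responds to `title` and `permalink`; the `Backlink` Struct is a hypothetical stand-in for the real model, `links_for` is a made-up helper name, and `scrape` is stubbed so the example runs without Rails, a database, or network access. It keeps the thread's approach of trying every keyword against every site.

```ruby
# Hypothetical stand-in for the Backlink model, so the sketch runs
# without Rails or a database.
Backlink = Struct.new(:title, :permalink)

# Stub for the scrape(website, inputtext) method from the thread. The real
# one fetches the page with Nokogiri; this one only echoes its arguments.
def scrape(website, inputtext)
  ["<a href=\"#{website}\">#{inputtext}</a>"]
end

# Build a nested hash of site => keyword => links from a list of records.
def links_for(backlinks)
  keywords = backlinks.map { |b| b.title }.uniq      # distinct keywords
  sites    = backlinks.map { |b| b.permalink }.uniq  # distinct sites
  links = Hash.new { |h, k| h[k] = {} }              # inner hash created on first access
  sites.each do |site|
    keywords.each do |keyword|
      links[site][keyword] = scrape(site, keyword)
    end
  end
  links
end

# Sample rows standing in for what the database query would return.
backlinks = [
  Backlink.new("accounts",  "http://www.google.com"),
  Backlink.new("resources", "http://www.yahoo.com"),
]
result = links_for(backlinks)
puts result.keys.inspect
```

In the controller, the records returned by `Backlink.find(:all)` would be passed in place of the sample array, and the result assigned to `@links`.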
