Re: Help with net/http

From: Alex Stahl <astahl@...5.com>
Date: 2010-12-09 21:02:59 UTC
List: ruby-talk #375273
Nokogiri provides a great interface for accessing the data trapped
inside markup.

Try something like:

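require 'nokogiri'

# res is the Net::HTTP response from the code quoted below.
page = Nokogiri::HTML(res.body)
data = []
# Both XPath expressions are placeholders; adjust them to the real page.
page.xpath("//xpath/to/table").each do |node|
  # Keep the matched text nodes as plain strings.
  data << node.xpath("./rel/xpath/to/data/text()").map(&:text)
end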
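
The question quoted below mentions four array elements per record and a CSV file as the goal. As a rough sketch of that last step only, and assuming the cell text has already been pulled out as plain strings, the flat list could be grouped and written with Ruby's standard csv library. The sample values and the helper name cells_to_csv are made up for illustration; they are not from the original page.

```ruby
require 'csv'

# Hypothetical sample: cell text as it might come out of the XPath loop,
# four consecutive entries per listing (all values are invented).
cells = [
  "Maple Court", "12 Elm St", "Hartford, CT 06101",  "(860) 555-0100",
  "Harbor View", "9 Dock Rd", "New Haven, CT 06510", "(203) 555-0199"
]

# Group the flat list into records of per_record entries and
# build one CSV row per record. Returns the CSV as a String.
def cells_to_csv(cells, per_record = 4)
  CSV.generate do |csv|
    cells.map(&:strip).each_slice(per_record) { |record| csv << record }
  end
end

csv_text = cells_to_csv(cells)
puts csv_text
```

Fields that contain a comma are quoted by the library, so they survive a round trip through CSV.parse. To write a file instead of a string, CSV.open(path, "w") accepts the same csv << record calls.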




________________________________________________________________________

Alex Stahl | Sr. Quality Engineer | hi5 Networks, Inc. | astahl@hi5.com

On Thu, 2010-12-09 at 14:43 -0600, Atomic Bomb wrote:

> I am trying to screen scrape a webpage and pull out the name, address,
> city, state, zip and phone on a site that lists apartments for rent.
> 
> Here is my code:
> ------------------------
> require 'net/http'
> require 'uri'
>
> temparray = Array.new
>
> url = URI.parse("http://www.apartment-directory.info")
> res = Net::HTTP.start(url.host, url.port) {|http|
>   http.get('/connecticut/0')
> }
> # puts res.body
>
> # Strip double quotes, then keep only lines containing <td valign=top
> res.body.each_line {|line|
>   line.gsub!(/\"/, '')
>   temparray.push(line) if line =~ /<td\svalign=top/
> }
>
> # Remove the unwanted markup from each kept line
> temparray.each do |j|
>   # j.gsub!(/<a\shref=\/map.*<\/a>/,'')
>   j.gsub!(/\shref=\/map\//,'')
>   j.gsub!(/\d+\sclass=map>Map\&nbsp\;It!/,'')
>   j.gsub!(/<\/td>/,'')
>   j.gsub!(/<td\svalign=top>/, '')
>   j.gsub!(/<td\svalign=top\snowrap>/, '')
>   j.gsub!(/<tr\sbgcolor=white>/, '<br>')
>   j.gsub!(/MapIt!/, ', ')
>   j.gsub!(/\(/, ', (')
>   j.gsub!(/<\/tr>/,'')
>
>   puts j
> end
> ----------------------
> I am able to grab the HTML from the page. I then gsub! out the "
> characters and push each line that contains <td valign=top onto an
> array. I then iterate through the array and try to remove what I don't
> want with more gsub! commands. The output from this still has HTML tags
> in it and looks good if I output it to an HTML page (you can see the
> output here: http://www.holy-name.org/ct.html), but I really need to
> remove the HTML tags and get just the important facts into a CSV file.
> Since there are 4 elements in the array for each record, the only way I
> could get it to work on a web page was to add a <br> between records.
> 
> Is there a better way to pull out the pertinent info and avoid all the
> HTML tags?
> 
> thanks
> 
> atomic
> 
