Jan 19

I couldn't find a good, simple explanation of this online, so I looked at some examples and worked it out. I'm posting it here for myself and others. Here is how to get all the links on a page using Hpricot:


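  # Returns the href value of every <a> element in an Hpricot document.
  def get_links(doc)
    urls = []
    unfiltered_links = (doc/"a")   # all <a> elements on the page
    unfiltered_links.each { |alink|
      urls << alink.attributes['href']
    }
    return urls
  end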
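
One thing to watch: the values that come back are raw href attributes, so the list may contain nil (for anchors without an href), relative paths, and duplicates. Below is a rough sketch of one way to tidy that up with Ruby's standard URI library. The helper name `absolute_links` and its `base_url` parameter are made up for illustration; they are not part of Hpricot, and the example assumes you already know the URL of the page the links came from.

```ruby
require 'uri'

# Hypothetical helper (not part of Hpricot): takes the array returned by
# get_links plus the URL of the page it came from (base_url, an assumed
# parameter) and returns unique absolute URLs as strings.
def absolute_links(hrefs, base_url)
  resolved = hrefs.compact.map do |href|   # compact drops nil hrefs
    begin
      # URI.join resolves relative paths against the page URL and
      # leaves hrefs that are already absolute unchanged.
      URI.join(base_url, href.strip).to_s
    rescue URI::Error
      nil  # skip hrefs that cannot be parsed as URIs
    end
  end
  resolved.compact.uniq
end
```

Something like `absolute_links(get_links(doc), page_url)` should then give a list that is ready to fetch or store, where `page_url` stands for whatever URL the document was loaded from.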


3 Responses to “How to get all links on a page with Hpricot”

  1. devJ Says:

    do you have any examples to find all forms (GET and POST) in a web page

  2. p3t0r Says:

    If you use the Enumeration# (or collect) method the code would be much easier:

    def get_links(doc)
      (doc/"a").map { |alink| alink.attributes['href'] }
    end

  3. p3t0r Says:

    I meant to say 'Enumerable#map'
