Wednesday 15 April 2015

php - Scrape links from HTML


I have always used preg_match to scrape URLs from HTML files, but this time I only want to extract the URLs that have mp3 as their extension. I was told to try DOM, and I have been trying to fix some code, but it does not work. I get a blank page whatever I do.

What am I doing wrong?

  <?php
  $url = 'http://www.mp3olimp.net/miley-cyrus-when-i-look-at-you/';
  $html = @file_get_html($url);
  $dom = new DOMDocument();
  $doc->loadHTML($html);
  $xpath = new DOMXPath($doc);
  $links = $xpath->query('//a[ends-with(@href, "mp3")]/@href');
  echo $links;
  ?>

There are some problems!

  • As mentioned, remove the @ in front of file_get_html() so that you can see the errors. To get the HTML content to work on, use file_get_contents($url).
  • Typo: $dom = should be $doc =.
  • Another point: the HTML source is quite malformed, which causes errors later on when it is parsed.
  • ends-with() is only supported in XPath 2.0, and PHP uses XPath 1.0, so you have to find another way to check the end of the href. A bit of a trick should do it.
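
One way to put the points above together is sketched below. It is only an illustration under a few assumptions: the helper name extract_mp3_links and the sample markup are made up for this example, and the substring()/string-length() comparison is one common XPath 1.0 stand-in for ends-with(), not necessarily the trick the answer had in mind. The sketch parses an inline HTML string so that it runs without network access.

```php
<?php
// Hypothetical helper (not from the original post):
// returns every href value that ends in ".mp3".
function extract_mp3_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world pages are often malformed. Collect the parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // XPath 1.0 has no ends-with(), so compare the last four
    // characters of @href with ".mp3" instead.
    $query = '//a[substring(@href, string-length(@href) - 3) = ".mp3"]/@href';

    $links = [];
    // query() returns a DOMNodeList, which cannot be echoed directly,
    // so loop over it and read each attribute's value.
    foreach ($xpath->query($query) as $attr) {
        $links[] = $attr->nodeValue;
    }
    return $links;
}

// Made-up sample markup, used only so the sketch is self-contained.
$sample = '<html><body>'
    . '<a href="http://example.com/song.mp3">Song</a>'
    . '<a href="http://example.com/page.html">Page</a>'
    . '<a href="/files/other.mp3">Other</a>'
    . '</body></html>';

foreach (extract_mp3_links($sample) as $link) {
    echo $link, PHP_EOL;
}
```

To try it on the real page, the string returned by file_get_contents($url) could be passed in place of $sample. Note that the comparison is case-sensitive, so a link ending in ".MP3" would not match.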
