Scraper scripts often need to extract all links on a given page. This can be done in a number of ways, for example with regular expressions or with DOMDocument.
Here is a simple code snippet that does this using DOMDocument.
/* Function to get all links on a given URL using DOMDocument */
function get_links($link)
{
    // Array of href => link text to return
    $ret = array();

    // Create a new DOM object
    $dom = new DOMDocument();

    // Ignore whitespace-only text nodes (must be set before loading)
    $dom->preserveWhiteSpace = false;

    // Get the HTML (suppress warnings about malformed markup)
    @$dom->loadHTML(file_get_contents($link));

    // Get the <a> elements from the HTML
    $links = $dom->getElementsByTagName('a');

    // Loop over the links, mapping each href to the link text
    foreach ($links as $tag) {
        $ret[$tag->getAttribute('href')] = trim($tag->textContent);
    }

    return $ret;
}

// Link to open and search for links
$link = "http://www.php.net";

// Get the links
$urls = get_links($link);

// Check for results
if (count($urls) > 0) {
    foreach ($urls as $key => $value) {
        echo $key . ' - ' . $value . '<br>';
    }
} else {
    echo "No links found at $link";
}
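If you only want anchors that actually carry an href attribute, a DOMXPath query can filter them while extracting. The sketch below is a minimal variation on the same DOMDocument approach; the function name get_links_xpath and the example URL are just placeholders for illustration.

/* Variation: use DOMXPath to select only <a> elements that have an href */
function get_links_xpath($link)
{
    $ret = array();

    $dom = new DOMDocument();
    @$dom->loadHTML(file_get_contents($link));

    // Query for anchor tags that carry an href attribute
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//a[@href]') as $tag) {
        $ret[$tag->getAttribute('href')] = trim($tag->textContent);
    }

    return $ret;
}

// Example usage (placeholder URL)
print_r(get_links_xpath("http://www.php.net"));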
Hello, I ran into the problem of extracting links from a web site. I understand how to pull links out of the HTML, but I don't understand how to pull links that are loaded dynamically. Please tell me how to pull a page link that redirects through Google advertising.