php - Parse URLs from one page via cURL, then loop over them and extract a div block
I need to parse one page with cURL, collect the URLs on it, then follow each collected URL and extract a div block from the page it points to.
For example:
- optnow.ru/catalog is the page with the catalog URLs. Get each link with the 'cat-name' class and append '?page=0' to show all products without pagination.
- Go through each catalog URL and collect the product URLs with the class 'link-pv-name'.
- Go through each product URL and parse the '.description div p' element.
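The three steps above could be sketched roughly as follows. This is only an outline under several assumptions: the helper names (fetchHtml, xpathFor, hasClass, linksByClass, descriptionParagraphs, absolute, crawl) are made up for illustration, the hrefs are assumed to be absolute or root-relative, the category URLs are assumed to have no query string of their own, and the one-second pause is a guess at a polite request rate.

```php
<?php
// Download one page; returns '' on failure so callers can skip it.
function fetchHtml(string $url): string
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    $html = curl_exec($curl);
    return is_string($html) ? $html : '';
}

// Build a DOMXPath over an HTML string, asking libxml not to report parse errors.
function xpathFor(string $html): ?DOMXPath
{
    if (trim($html) === '') {
        return null; // loadHTML() rejects empty input
    }
    $doc = new DOMDocument();
    $doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);
    return new DOMXPath($doc);
}

// XPath predicate for "the class attribute contains this class name".
function hasClass(string $class): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
}

// Collect the href of every <a> element carrying the given class.
function linksByClass(string $html, string $class): array
{
    $xpath = xpathFor($html);
    if ($xpath === null) {
        return [];
    }
    $hrefs = [];
    foreach ($xpath->query('//a[' . hasClass($class) . ']') as $a) {
        $hrefs[] = $a->getAttribute('href');
    }
    return $hrefs;
}

// Collect the text of every <p> matching the CSS selector ".description div p".
function descriptionParagraphs(string $html): array
{
    $xpath = xpathFor($html);
    if ($xpath === null) {
        return [];
    }
    $texts = [];
    foreach ($xpath->query('//*[' . hasClass('description') . ']//div//p') as $p) {
        $texts[] = trim($p->textContent);
    }
    return $texts;
}

// Assumption: hrefs are either absolute or relative to the site root.
function absolute(string $base, string $href): string
{
    return preg_match('#^https?://#i', $href) ? $href : $base . '/' . ltrim($href, '/');
}

// Walk catalog -> categories -> products; returns descriptions keyed by product URL.
function crawl(string $base): array
{
    $result = [];
    foreach (linksByClass(fetchHtml($base . '/catalog/'), 'cat-name') as $category) {
        $categoryHtml = fetchHtml(absolute($base, $category) . '?page=0');
        foreach (linksByClass($categoryHtml, 'link-pv-name') as $product) {
            $url = absolute($base, $product);
            $result[$url] = descriptionParagraphs(fetchHtml($url));
            sleep(1); // assumption: a short pause between requests
        }
    }
    return $result;
}
```

Calling crawl('http://optnow.ru') would return an array keyed by product URL, each value holding the description paragraphs found on that page.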
I would prefer to use cURL rather than 'simple html dom', because I tried that library and after a number of requests I started getting 503 or 504 errors.
When I run this code:
$curl = curl_init('http://optnow.ru/catalog/');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($curl);
curl_close($curl);
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$elements = $xpath->query("/div[@class='cat-name']/a");
print_r($elements);
I get these warnings:
Warning: DOMDocument::loadHTML(): Tag header invalid in Entity, line: 100
Warning: DOMDocument::loadHTML(): Tag figure invalid in Entity, line: 102
Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 107
Warning: DOMDocument::loadHTML(): Tag footer invalid in Entity, line: 268
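For context, warnings of this kind typically come from libxml's HTML parser not recognising HTML5 tags such as header, figure and footer, and from a bare '&' that is not written as an entity; the document is still parsed. Separately, an XPath that starts with a single '/' only matches a div at the document root, so '//' is probably what is wanted. A minimal sketch of both points, where loadHtmlQuietly and categoryLinks are hypothetical helper names and the markup is assumed to look like div.cat-name > a:

```php
<?php
// Parse HTML while collecting libxml messages instead of emitting PHP warnings.
function loadHtmlQuietly(string $html, array &$messages = []): DOMDocument
{
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message); // keep them for optional inspection
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the earlier setting
    return $doc;
}

// Return the href of every link directly inside <div class="cat-name">.
function categoryLinks(DOMDocument $doc): array
{
    $xpath = new DOMXPath($doc);
    $hrefs = [];
    // '//' searches the whole document; '/div' would only match a root-level div.
    foreach ($xpath->query("//div[@class='cat-name']/a") as $a) {
        $hrefs[] = $a->getAttribute('href');
    }
    return $hrefs;
}
```

With the $html string from the code above, print_r(categoryLinks(loadHtmlQuietly($html))) would print the collected href values rather than the DOMNodeList object itself.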