php - Parse URLs from one page via curl, loop over them with curl_init() and get a div block from each


I need to parse one page via curl, collect the URLs on it, then loop over those URLs and extract a div block from each page.

For example:

  1. optnow.ru/catalog is the page with the catalog URLs. Get each link with the 'cat-name' class and append '?page=0' to show all products without pagination.

  2. Go through each catalog URL and parse the product URLs with the class 'link-pv-name'.

  3. Go through each parsed product URL and parse the '.description div p' element.
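The three steps above can be sketched roughly as follows. This is only an illustration under assumptions: `fetchPage`, `hasClass`, `xpathValues` and `crawlDescriptions` are hypothetical helper names, the hrefs are assumed to be absolute URLs (relative ones would need resolving against the page URL first), and the class may sit either on the link itself or on an element wrapping it.

```php
<?php
// Download a page body with curl; returns null on failure (hypothetical helper).
function fetchPage(string $url): ?string
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    $html = curl_exec($curl);
    curl_close($curl);
    return is_string($html) && $html !== '' ? $html : null;
}

// XPath predicate that matches one class name inside a space-separated class attribute.
function hasClass(string $class): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
}

// Run an XPath expression on an HTML string and return the trimmed text of each match.
function xpathValues(string $html, string $expression): array
{
    // Keep libxml parse errors internal instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $values = [];
    foreach ((new DOMXPath($doc))->query($expression) as $node) {
        $values[] = trim($node->textContent);
    }
    return $values;
}

// Walk catalog -> categories -> products; returns [productUrl => [paragraph texts]].
// $fetch is any callable taking a URL and returning HTML or null, e.g. 'fetchPage'.
function crawlDescriptions(string $catalogUrl, callable $fetch): array
{
    // Matches <a class="..."> as well as <a> nested inside an element with that class.
    $linksIn = function (string $class): string {
        return '//*[' . hasClass($class) . ']/descendant-or-self::a/@href';
    };

    $descriptions = [];
    $catalogHtml = $fetch($catalogUrl);
    if ($catalogHtml === null) {
        return $descriptions;
    }
    // Step 1: category links; '?page=0' shows all products on one page.
    foreach (xpathValues($catalogHtml, $linksIn('cat-name')) as $categoryUrl) {
        $categoryHtml = $fetch($categoryUrl . '?page=0');
        if ($categoryHtml === null) {
            continue;
        }
        // Step 2: product links inside the category page.
        foreach (xpathValues($categoryHtml, $linksIn('link-pv-name')) as $productUrl) {
            $productHtml = $fetch($productUrl);
            if ($productHtml === null) {
                continue;
            }
            // Step 3: XPath equivalent of the CSS selector '.description div p'.
            $descriptions[$productUrl] = xpathValues(
                $productHtml,
                '//*[' . hasClass('description') . ']//div//p'
            );
        }
    }
    return $descriptions;
}
```

A call such as `crawlDescriptions('http://optnow.ru/catalog/', 'fetchPage')` would run the whole chain. Passing the fetcher as a callable keeps the parsing logic separate from the network code, so it can be tried against saved HTML first.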

I would prefer to use curl rather than 'simple html dom', because I tried that library and after some number of requests I started getting 503 or 504 errors.

When I run this code:

$curl = curl_init('http://optnow.ru/catalog/');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($curl);
curl_close($curl);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$elements = $xpath->query("/div[@class='cat-name']/a");
print_r($elements);

I get these warnings:

Warning: DOMDocument::loadHTML(): Tag header invalid in Entity, line: 100
Warning: DOMDocument::loadHTML(): Tag figure invalid in Entity, line: 102
Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity, line: 107
Warning: DOMDocument::loadHTML(): Tag footer invalid in Entity, line: 268
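For context, these warnings come from libxml's HTML parser, which does not recognise HTML5 tags such as header, figure and footer and also complains about an unescaped '&'. They are warnings only, and the document is still parsed. Separately, the XPath expression "/div[...]" only matches a div that is the root element, so it returns nothing; "//div[...]" searches the whole document. A minimal sketch of both adjustments, where `extractCategoryLinks` is a hypothetical helper name and the markup is assumed to match the query in the code above:

```php
<?php
// Hypothetical helper: returns the href of every <a> directly inside <div class="cat-name">.
function extractCategoryLinks(string $html): array
{
    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected errors and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    // "//" searches at any depth; a single leading "/" would only match a root-level <div>.
    foreach ($xpath->query("//div[@class='cat-name']/a") as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}
```

With the HTML from `curl_exec()` in `$html`, `print_r(extractCategoryLinks($html));` would then list the link targets rather than an opaque DOMNodeList object.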
