php - Looping Through Multiple HTML Elements with DOMDocument


I have a page that looks something like this:

...
<div class="container">

<div class="info">
<h3>Info 1</h3>
<span class="title">Title for Info 1</span>
<a href="http://www.example.com/1">Link to Example 1</a>
</div> <!-- /info -->

<div class="info">
<h3>Info 2</h3>
<span class="title">Title for Info 2</span>
<a href="http://www.example.com/2">Link to Example 2</a>
</div> <!-- /info -->

<div class="info">
<h3>Info 3</h3>
<span class="title">Title for Info 3</span>
<a href="http://www.example.com/3">Link to Example 3</a>
</div> <!-- /info -->

</div> <!-- /container -->
...

Each div with the class info has the same structure. I'd like to loop through the document and, for each of these divs, parse its components into either an array or individual variables, so that I can output the data in a human-readable format such as a CSV file or an HTML table.

I've tried using DOMDocument with getElementsByTagName to extract the contents of each tag, but because each div contains multiple tag types (h3, span, a), I haven't figured out how to accomplish what I'm looking to do.

In the end, I want to be able to put the data in a format like this:

divclass, h3, spanclass, spantitle, ahref, a
info, Info 1, title, Title for Info 1, http://www.example.com/1, Link to Example 1
...

Thanks!


Answer

Solution:

<?php
$html = '
<div class="container">

<div class="info">
<h3>Info 1</h3>
<span class="title">Title for Info 1</span>
<a href="http://www.example.com/1">Link to Example 1</a>
</div> <!-- /info -->

<div class="info">
<h3>Info 2</h3>
<span class="title">Title for Info 2</span>
<a href="http://www.example.com/2">Link to Example 2</a>
</div> <!-- /info -->

<div class="info">
<h3>Info 3</h3>
<span class="title">Title for Info 3</span>
<a href="http://www.example.com/3">Link to Example 3</a>
</div> <!-- /info -->

</div> <!-- /container -->
';


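$dom_document = new DOMDocument();

// Set parser options before loading; setting them afterwards cannot affect the parse
$dom_document->preserveWhiteSpace = false;
$dom_document->loadHTML($html);

// Use DOMXPath to navigate the HTML through the DOM
$dom_xpath = new DOMXPath($dom_document);

// Select every element whose class attribute is exactly "info"
$elements = $dom_xpath->query("//*[@class='info']");

// query() returns false (not null) when the expression is invalid
if ($elements !== false) {

 foreach ($elements as $element) {
  echo "\n[". $element->nodeName. "]";

  // Print the text of each child node (h3, span, a, plus any text nodes between them)
  $nodes = $element->childNodes;
  foreach ($nodes as $node) {
   echo $node->nodeValue. "\n";
  }

 }
}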
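
The loop above only prints the text of each child node. To get closer to the CSV layout asked for in the question, one option is to run a second XPath query relative to each matched div and collect the attributes and text into a row. The following is a minimal sketch, not part of the original answer: the function name extractInfoRows is hypothetical, the sample markup assumes the class names info and title from the question, and it assumes each div has at most one h3, span, and a as direct children.

```php
<?php
// Hypothetical helper (not from the original answer):
// returns one row per <div class="info"> in the given HTML.
function extractInfoRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    foreach ($xpath->query("//div[@class='info']") as $div) {
        // Passing $div as the context node makes "./h3" relative to this div.
        $h3   = $xpath->query('./h3', $div)->item(0);
        $span = $xpath->query('./span', $div)->item(0);
        $a    = $xpath->query('./a', $div)->item(0);

        // item(0) is null when the child is missing, so fall back to ''.
        $rows[] = [
            $div->getAttribute('class'),
            $h3 ? trim($h3->textContent) : '',
            $span ? $span->getAttribute('class') : '',
            $span ? trim($span->textContent) : '',
            $a ? $a->getAttribute('href') : '',
            $a ? trim($a->textContent) : '',
        ];
    }

    return $rows;
}

// Sample input, shortened to two blocks from the question's markup.
$html = '
<div class="container">
<div class="info">
<h3>Info 1</h3>
<span class="title">Title for Info 1</span>
<a href="http://www.example.com/1">Link to Example 1</a>
</div>
<div class="info">
<h3>Info 2</h3>
<span class="title">Title for Info 2</span>
<a href="http://www.example.com/2">Link to Example 2</a>
</div>
</div>';

$rows = extractInfoRows($html);

// Write a header line and one CSV line per row to standard output.
$out = fopen('php://output', 'w');
fputcsv($out, ['divclass', 'h3', 'spanclass', 'spantitle', 'ahref', 'a'], ',', '"', '\\');
foreach ($rows as $row) {
    fputcsv($out, $row, ',', '"', '\\');
}
fclose($out);
```

Because each row is a plain array, the same $rows could just as well be rendered as an HTML table. Note that the predicate @class='info' only matches when the attribute is exactly info; a div with several classes would need a different expression.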
