PHP / HTML: replace HTML closing tags with newlines


I am crawling the web for HTML, and when I run PHP's strip_tags on it, the whole document collapses into a single line and all structure is lost.

I would like to preserve the structure by replacing closing heading (h1 to h6) and p tags, as well as br tags, with newlines.

Would preg_replace be the best solution for this?

After replacing those tags I would run strip_tags on the result, so the text would keep a basic line structure.




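$str = 'some html';
// Closing block tags and line-break tags that should become newlines
$tags = array('</p>','<br />','<br/>','<br>','<hr />','<hr/>','<hr>','</h1>','</h2>','</h3>','</h4>','</h5>','</h6>');
// str_ireplace is case-insensitive, so </P> and <BR> are replaced as well
$str = str_ireplace($tags,"\n",$str);

// then strip the remaining tags
$str = strip_tags($str);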
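
Since the question asks about preg_replace specifically, here is a minimal sketch of a regex-based variant of the same idea. It is not the code from the answer above: the function name html_to_text_with_breaks and the pattern are illustrative assumptions. Compared with a fixed list of strings, the pattern also catches tags with attributes or unusual spacing, such as <br class="x"> or </p >.

```php
<?php
// Hypothetical helper, named here for illustration only.
function html_to_text_with_breaks(string $html): string
{
    // Matches closing </p> and </h1>..</h6> tags, plus any <br> or <hr> variant.
    // "i" makes the match case-insensitive; [^>]* tolerates attributes and a trailing "/".
    $pattern = '~</(?:p|h[1-6])\s*>|<(?:br|hr)\b[^>]*>~i';
    $text = preg_replace($pattern, "\n", $html);

    // Remove whatever tags are left, then trim leading and trailing whitespace.
    return trim(strip_tags($text));
}

echo html_to_text_with_breaks('<h1>Title</h1><p>First<br/>line</p><P class="x">Second</P>');
```

With that sample input the heading, both lines of the first paragraph, and the second paragraph each end up on their own line. For the fixed set of tags in the question, the plain string replacement above is enough; the regex is mainly useful when the crawled markup is inconsistent.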



Why not just run it through tidy afterwards to get the structure back?
