xml - Parsing Atom Feed in PHP
I've written the following PHP to parse RSS and Atom feeds. This approach differs from others I've seen in that it simply checks a couple of places in the parsed XML for the item array.
function LoadItems($id, $feed)
{
    /* Collect this feed's items into a local $rssItems array */
    $rssItems = array();

    /* simplexml_load_file() returns false on failure; it does not throw */
    $rss = simplexml_load_file($feed);
    if ($rss === false) {
        echo "<div>Load failed \"" . $feed . "\"</div>\n";
        return $rssItems;
    }

    if (!($rss->channel->item))
        /* This appears to be where Atom feed item lists are parsed to */
        $items = $rss->item;
    else
        $items = $rss->channel->item;

    foreach ($items as $item) {
        /* Append directly instead of overwriting the loop variable */
        $rssItems[] = array(
            "id" => $id,
            "feedTitle" => $rss->channel->title,
            "feedLink" => $rss->channel->link,
            "itemTitle" => $item->title,
            "itemPubDate" => $item->pubDate,
            "itemLink" => $item->link,
            "itemDesc" => RemoveLinks($item->description));
    }

    /* Sort the items in reverse chronological order */
    usort($rssItems, 'RSS_CMP');
    return $rssItems;
}
My question is this: I discovered this by accident. I was just looking at the print_r output of the parsed XML files and noticed the different structures for RSS and Atom. Is this kosher to do? That is, will simplexml_load_file() put the items in this place for all Atom feeds, so that this solution works for every Atom feed parsed with simplexml_load_file()?
Answer
Solution:
As you already noticed, RSS and Atom are two different standards. Both use XML, but they differ in their structure.
simplexml_load_file() in PHP is quite easy to use, and you can check for specific parts of the feed (e.g. the channel element in RSS, or a root element named feed in Atom). But if you want to do a bit more with the feed data, I would recommend using a library that supports both RSS and Atom, so you don't need to deal with the differences yourself.
For example, with SimplePie it's quite easy to access feed data, and it handles both standards (for help, see its wiki).
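If you would rather stay with SimpleXML, here is a minimal sketch of the "check specific parts of the feed" approach. It branches on the root element name instead of probing for an item list. Note that items sitting directly under the root (the `$rss->item` case in the question) is how RSS 1.0 (RDF) lays out its documents, whereas Atom uses `entry` elements under a `feed` root. The function name `parseFeedItems`, the output keys, and the inline sample feeds are all made up for illustration; real feeds vary a lot, so treat this as a starting point rather than a complete parser.

```php
<?php
/**
 * Hypothetical helper (not from the question): normalise RSS 2.0,
 * RSS 1.0 (RDF) and Atom documents into one list of plain arrays.
 */
function parseFeedItems(SimpleXMLElement $xml): array
{
    $items = [];

    // getName() gives the root element's name without its prefix:
    // "rss" (RSS 2.0), "RDF" (RSS 1.0) or "feed" (Atom).
    switch ($xml->getName()) {
        case 'feed': // Atom: entries are <entry> children of the root
            foreach ($xml->entry as $entry) {
                $summary = (string) $entry->summary;
                $items[] = [
                    'title' => (string) $entry->title,
                    // Atom keeps the URL in the href attribute of <link>;
                    // this sketch simply takes the first <link> element.
                    'link'  => (string) $entry->link['href'],
                    'date'  => (string) $entry->updated,
                    'desc'  => $summary !== '' ? $summary : (string) $entry->content,
                ];
            }
            break;

        case 'rss': // RSS 2.0: items are nested inside <channel>
        case 'RDF': // RSS 1.0: items are siblings of <channel>
            $source = $xml->getName() === 'rss' ? $xml->channel->item : $xml->item;
            foreach ($source as $item) {
                $items[] = [
                    'title' => (string) $item->title,
                    'link'  => (string) $item->link,
                    // Stays empty for RSS 1.0, which usually uses dc:date instead
                    'date'  => (string) $item->pubDate,
                    'desc'  => (string) $item->description,
                ];
            }
            break;
    }

    return $items;
}

// Tiny inline sample feeds (assumed data, for demonstration only)
$atom = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Atom feed</title>
  <entry>
    <title>First entry</title>
    <link href="http://example.com/1"/>
    <updated>2020-01-01T00:00:00Z</updated>
    <summary>Hello</summary>
  </entry>
</feed>
XML;

$rss = <<<XML
<rss version="2.0">
  <channel>
    <title>Example RSS feed</title>
    <item>
      <title>First item</title>
      <link>http://example.com/a</link>
      <pubDate>Wed, 01 Jan 2020 00:00:00 GMT</pubDate>
      <description>World</description>
    </item>
  </channel>
</rss>
XML;

$atomItems = parseFeedItems(simplexml_load_string($atom));
$rssItems  = parseFeedItems(simplexml_load_string($rss));

echo $atomItems[0]['title'], ' | ', $atomItems[0]['link'], "\n";
echo $rssItems[0]['title'], ' | ', $rssItems[0]['link'], "\n";
```

Casting every field to a string up front avoids carrying SimpleXMLElement objects around in the result array, which makes sorting and comparing the items later less surprising.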