Displaying an RSS Feed
One of the "secrets" of successful web sites is that they keep users
coming back. So what can you do that will attract users and keep them coming
back? You attract them by providing information that is useful to the user
and you keep them coming back by continually updating that information so that
it's fresh each time the user visits.
Creating a web site is the easy part. Keeping it fresh is an ongoing challenge
that can be a lot of work. Fortunately, there is a tool that can make that
job a lot easier. That tool is called RSS. RSS is an acronym meaning Really
Simple Syndication. An RSS feed is simply an XML document with a small, well-defined set of elements.
There are a lot of free "RSS feeds" covering a multitude
of different topics. You can display that RSS data in a specialized application
called a news aggregator, or (as we want) you can display it in a web page.
While, as I say, there are a lot of free feeds, they are not totally benevolent
exercises. Normally, an RSS feed will provide a short description of an item
and provide a link to the complete item on the provider's web site. In other
words, you can display a short description of a news story, for example, but
the complete story will be displayed on the provider's site, which you can
link to.
This is a simple example of using RSS to get a news feed from wired.com.
To begin, the URL to get the raw XML file for the RSS feed is http://www.wired.com/news/feeds/rss2/0,2610,,00.xml.
If you click that link, you'll see the XML that is served by the RSS feed.
Note that it's just a raw XML file, so you'll have to use your browser's Back
button to get back to this page.
So, how do we turn that XML into a presentable page that you can display
on your web site? It's not as hard as you might think! PHP has built-in functions
that will parse an XML file. I've used those functions to build a generic wrapper
that will accept a URL to an RSS feed and parse it into a PHP object. The usage
is hopefully easy.
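To give a feel for how PHP's built-in XML parser works before diving into the full class, here is a minimal sketch. The function name collectItemTitles() and the sample feed are made up for illustration, and it assumes PHP 5.3 or later because it uses closures. It only pulls out the title of each item, nothing more.

```php
<?php
// Minimal sketch: collect the <title> of every <item> in an RSS string.
// collectItemTitles() and the sample feed below are hypothetical examples.
function collectItemTitles($xml) {
    $titles = array();
    $insideItem = false;
    $currentTag = "";
    $buffer = "";

    $parser = xml_parser_create();
    // By default the parser upper-cases element names (case folding).
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$insideItem, &$currentTag, &$buffer) {
            $currentTag = $name;
            if ($name == "ITEM") { $insideItem = true; }
            if ($name == "TITLE") { $buffer = ""; }
        },
        function ($p, $name) use (&$insideItem, &$currentTag, &$buffer, &$titles) {
            if ($name == "TITLE" && $insideItem) { $titles[] = trim($buffer); }
            if ($name == "ITEM") { $insideItem = false; }
            $currentTag = "";
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$insideItem, &$currentTag, &$buffer) {
            // Character data can arrive in several chunks, so append it.
            if ($insideItem && $currentTag == "TITLE") { $buffer .= $data; }
        }
    );
    // Returns an empty array if the XML is not well formed.
    return xml_parse($parser, $xml, true) ? $titles : array();
}

$sample = '<rss version="2.0"><channel><title>Sample Feed</title>' .
          '<item><title>First story</title><link>http://example.com/1</link></item>' .
          '<item><title>Second story</title></item></channel></rss>';
foreach (collectItemTitles($sample) as $t) {
    echo $t . "\n";
}
```

The class below follows the same start/end/character-data pattern, but uses named global functions as handlers and keeps track of more elements.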
UPDATE: The rssFeed class was updated on 3 May 2005.
The original version used the PHP function file_get_contents() to get the
content of the remote XML file. Recently, however, my web host disabled the
allow_url_fopen setting in PHP. That prevented the file_get_contents() function
from getting the content of any remote files.
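If you're not sure what your own host allows, a quick check along these lines may help. This is only a sketch; remoteFetchOptions() is a name I made up and is not part of rssreader.php.

```php
<?php
// Sketch: report which remote-fetch options this PHP installation offers.
// remoteFetchOptions() is a hypothetical helper, not part of rssreader.php.
function remoteFetchOptions() {
    return array(
        // ini_get() returns the setting as a string (or false), so normalise it
        'allow_url_fopen' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        'curl'            => extension_loaded('curl'),
        'sockets'         => function_exists('fsockopen'),
    );
}

foreach (remoteFetchOptions() as $name => $available) {
    echo $name . ": " . ($available ? "available" : "not available") . "\n";
}
```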
At first, I was a bit puzzled and perhaps a little angry about that. Then
I saw what some people were doing with those functions. Some people
were actually grabbing code from remote sites and executing it, without
the slightest idea what was in the files! This was creating a huge problem
for the host. Note that the previous version of the rssFeed class did
not execute foreign code. It simply parsed the XML data. There was never
a security problem with it. The problem was in the way some people were
using some of the functions the class relied upon.
After understanding the problem better, I agreed that this was a good move on their part.
My host suggested using the PHP cURL library instead. I tried that, but found
it to be less reliable. Never wanting to admit defeat, I re-wrote the class
to use direct socket I/O. It now works reliably.
If you have an older version, I'd suggest you download the updated version.
It has no external dependencies. You don't need the allow_url_fopen
setting to be turned on and you don't need the cURL library. Usage is
exactly the same. You will not need to change any existing code that
uses the class. Simply replace the old rssreader.php file with the new
one.
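With direct socket I/O you get back the raw HTTP response, headers and all, so the first job is to separate the headers from the body. The sketch below shows the general idea on a hard-coded response string. splitHttpResponse() is a hypothetical helper written for this illustration; the gwSocket class in the listing does the equivalent work in its own way.

```php
<?php
// Sketch: split a raw HTTP response into a status line, headers and a body.
// splitHttpResponse() is a hypothetical helper, not the gwSocket code itself.
function splitHttpResponse($raw) {
    $result = array('status' => '', 'headers' => array(), 'body' => '');
    // A blank line (CRLF CRLF) separates the header block from the body.
    $parts = explode("\r\n\r\n", $raw, 2);
    $result['body'] = isset($parts[1]) ? $parts[1] : '';
    $lines = explode("\r\n", $parts[0]);
    // The first line is the status line, e.g. "HTTP/1.0 200 OK".
    $result['status'] = array_shift($lines);
    foreach ($lines as $line) {
        // Split on the first colon only, so values may contain colons (URLs).
        $pair = explode(":", $line, 2);
        if (count($pair) == 2) {
            // Header names are case-insensitive, so store them lower-cased.
            $result['headers'][strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }
    return $result;
}

$raw = "HTTP/1.0 200 OK\r\nContent-Type: text/xml\r\n\r\n<rss></rss>";
$r = splitHttpResponse($raw);
echo $r['status'] . "\n";
echo $r['headers']['content-type'] . "\n";
echo $r['body'] . "\n";
```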
rssreader.php
<?php
// Generic container for the complete RSS feed
class rssFeed{
var $title="";
var $copyright="";
var $description="";
var $image;
var $stories=array();
var $url="";
var $xml="";
var $link="";
var $error="";
var $maxstories=0;
// public methods
function parse(){
$parser=xml_parser_create();
xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "characterData");
xml_parse($parser, $this->xml, true)
or die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($parser)),
xml_get_current_line_number($parser)));
xml_parser_free($parser);
}
function showHeading($tag=""){
$tag=$tag?$tag:"h1";
if($this->title)
print "<$tag>$this->title</$tag>\n";
}
function showImage($align=""){
$this->image->show($align);
}
function showLink(){
if($this->link)
print "<a href=\"$this->link\">$this->link</a>\n";
}
function showDescription(){
if($this->description)
print "<p>$this->description</p>\n";
}
function showStories(){
echo "<dl>\n";
$n=0;
foreach($this->stories as $story){
$n++;
if ($this->maxstories && $n>$this->maxstories)
break;
$story->show();
}
echo "</dl>\n";
}
// Methods used internally
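// Constructor: Expects one string parameter that is the URI of the RSS feed
// (named __construct so it also runs on PHP versions that no longer
// treat a method named after the class as the constructor)
function __construct($uri=''){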
$this->image=new rssImage();
if($uri){
$this->url=$uri;
$this->getFeed();
} else {
$this->error="No URL for RSS feed";
}
}
// Retrieves the XML from the RSS supplier
function getFeed(){
// if we have a URL, fetch it over a direct socket connection
// (neither cURL nor allow_url_fopen is required)
if ($this->url){
$this->xml=$this->getRemoteFile($this->url);
}
}
function getRemoteFile($url){
$s=new gwSocket();
if($s->getUrl($url)){
if(is_array($s->headers)){
$h=array_change_key_case($s->headers, CASE_LOWER);
if($s->error) // failed to connect with host
$buffer=$this->errorReturn($s->error);
elseif(preg_match("/404/",$h['status'])) // page not found
$buffer=$this->errorReturn("Page Not Found");
elseif(preg_match("/xml/i",$h['content-type'])) // got XML back
$buffer=$s->page;
else // got a page, but wrong content type
$buffer=$this->errorReturn("The server did not return XML. The content type returned was ".$h['content-type']);
} else {
$buffer=$this->errorReturn("An unknown error occurred.");
}
}else{
$buffer=$this->errorReturn("An unknown error occurred.");
}
return $buffer;
}
function errorReturn($error){
$retVal="<?xml version=\"1.0\" ?>\n".
"<rss version=\"2.0\">\n".
"\t<channel>\n".
"\t\t<title>Failed to Get RSS Data</title>\n".
"\t\t<description>An error was encountered attempting to get the RSS data: $error</description>\n".
"\t\t<pubDate>".date("D, d M Y H:i:s T")."</pubDate>\n".
"\t\t<lastBuildDate>".date("D, d M Y H:i:s T")."</lastBuildDate>\n".
"\t</channel>\n".
"</rss>\n";
return $retVal;
}
function addStory($o){
if(is_object($o))
$this->stories[]=$o;
else
$this->error="Type mismatch: expected object";
}
}
class rssImage{
var $title="";
var $url="";
var $link="";
var $width=0;
var $height=0;
function show($align=""){
if($this->url){
if($this->link)
print "<a href=\"$this->link\">";
print "<img src=\"$this->url\" style=\"border:none;\"";
if($this->title)
print " alt=\"$this->title\"";
if($this->width)
print " width=\"$this->width\" height=\"$this->height\"";
if($align)
print " align=\"$align\"";
print ">";
if($this->link)
print "</a>";
}
}
}
class newsStory{
var $title="";
var $link="";
var $description="";
var $pubdate="";
function show(){
if($this->title){
if($this->link){
echo "<dt><a href=\"$this->link\">$this->title</a></dt>\n";
}else{
// no link supplied: show the title as plain text
echo "<dt>$this->title</dt>\n";
}
echo "<dd>";
if($this->pubdate)
echo "<i>$this->pubdate</i> - ";
if($this->description)
echo "$this->description";
echo "</dd>\n";
}
}
}
class gwSocket{
var $Name="gwSocket";
var $Version="0.1";
var $userAgent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)";
var $headers;
var $page="";
var $result="";
var $redirects=0;
var $maxRedirects=3;
var $error="";
function getUrl( $url ) {
$retVal="";
$url_parsed = parse_url($url);
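// parse_url() only returns the components present in the URL,
// so check each one before reading it
$scheme = isset($url_parsed["scheme"])?$url_parsed["scheme"]:"";
$host = isset($url_parsed["host"])?$url_parsed["host"]:"";
$port = isset($url_parsed["port"])?$url_parsed["port"]:"80";
$user = isset($url_parsed["user"])?$url_parsed["user"]:"";
$pass = isset($url_parsed["pass"])?$url_parsed["pass"]:"";
$path = isset($url_parsed["path"])?$url_parsed["path"]:"/";
$query = isset($url_parsed["query"])?$url_parsed["query"]:"";
$anchor = isset($url_parsed["fragment"])?$url_parsed["fragment"]:"";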
if (!empty($host)){
// attempt to open the socket
if($fp = fsockopen($host, $port, $errno, $errstr, 2)){
$path .= $query?"?$query":"";
// the fragment ($anchor) is only meaningful to the client, so it is not sent to the server
// this is the request we send to the host
$out = "GET $path ".
"HTTP/1.0\r\n".
"Host: $host\r\n".
"Connection: Close\r\n".
"User-Agent: $this->userAgent\r\n";
if($user)
$out .= "Authorization: Basic ".
base64_encode("$user:$pass")."\r\n";
$out .= "\r\n";
fputs($fp, $out);
while (!feof($fp)) {
$retVal.=fgets($fp, 128);
}
fclose($fp);
} else {
$this->error="Failed to make connection to host.";//$errstr;
}
$this->result=$retVal;
$this->headers=$this->parseHeaders(trim(substr($retVal,0,strpos($retVal,"\r\n\r\n"))));
$this->page=trim(stristr($retVal,"\r\n\r\n"))."\n";
if(isset($this->headers['Location'])){
$this->redirects++;
if($this->redirects<$this->maxRedirects){
$location=$this->headers['Location'];
$this->headers=array();
$this->result="";
$this->page="";
$this->getUrl($location);
}
}
}
// true if we received any response at all
return ($retVal!="");
}
function parseHeaders($s){
$h=preg_split("/[\r\n]/",$s);
foreach($h as $i){
$i=trim($i);
if(strstr($i,":")){
list($k,$v)=explode(":",$i);
$hdr[$k]=substr(stristr($i,":"),2);
}else{
if(strlen($i)>3)
$hdr[]=$i;
}
}
if(isset($hdr[0])){
$hdr['Status']=$hdr[0];
unset($hdr[0]);
}
return $hdr;
}
}
/*
end of classes - global functions follow
*/
function startElement($parser, $name, $attrs) {
global $insideitem, $tag, $isimage;
$tag = $name;
if($name=="IMAGE")
$isimage=true;
if ($name == "ITEM") {
$insideitem = true;
}
}
function endElement($parser, $name) {
// NOTE: this handler relies on a global variable named $rss holding the rssFeed instance
global $insideitem, $title, $description, $link, $pubdate, $stories, $rss, $globaldata, $isimage;
$globaldata=trim($globaldata);
// if we're finishing a news item
if ($name == "ITEM") {
// create a new news story object
$story=new newsStory();
// assign the title, link, description and publication date
$story->title=trim($title);
$story->link=trim($link);
$story->description=trim($description);
$story->pubdate=trim($pubdate);
// add it to our array of stories
$rss->addStory($story);
// reset our global variables
$title = "";
$description = "";
$link = "";
$pubdate = "";
$insideitem = false;
} else {
switch($name){
case "TITLE":
if(!$isimage)
if(!$insideitem)
$rss->title=$globaldata;
break;
case "LINK":
if(!$insideitem)
$rss->link=$globaldata;
break;
case "COPYRIGHT":
if(!$insideitem)
$rss->copyright=$globaldata;
break;
case "DESCRIPTION":
if(!$insideitem)
$rss->description=$globaldata;
break;
}
}
if($isimage){
switch($name){
case "TITLE": $rss->image->title=$globaldata;break;
case "URL": $rss->image->url=$globaldata;break;
case "LINK": $rss->image->link=$globaldata;break;
case "WIDTH": $rss->image->width=$globaldata;break;
case "HEIGHT": $rss->image->height=$globaldata;break;
}
}
if($name=="IMAGE")
$isimage=false;
$globaldata="";
}
function characterData($parser, $data) {
global $insideitem, $tag, $title, $description, $link, $pubdate, $globaldata;
if ($insideitem) {
switch ($tag) {
case "TITLE":
$title .= $data;
break;
case "DESCRIPTION":
$description .= $data;
break;
case "LINK":
$link .= $data;
break;
case "PUBDATE":
case "DC:DATE":
$pubdate .= $data;
break;
}
} else {
$globaldata.=$data;
}
}
?>
Example Code
<?php
// Include the file that does all the work
include("rssreader.php");
// This is the URL to the actual RSS feed. Change this value
// if you want to show a different feed.
$url="http://www.wired.com/news/feeds/rss2/0,2610,,00.xml";
// Create an instance of the rssFeed object, passing it
// the URL of the feed
$rss=new rssFeed($url);
// If there was an error getting the data
if($rss->error){
// Show the error
print "<h1>Error:</h1>\n<p><strong>$rss->error</strong></p>";
}else{
// Otherwise, we have the data, so we call the parse method
$rss->parse();
// The showHeading method can accept a parameter that will be used
// as the tag to wrap the heading. In this case, we're wrapping
// the title in an <h1> tag
$rss->showHeading("h1");
// Display the image if there is one
$rss->showImage("left");
// If the RSS feed provides a link
if($rss->link){
// Display it
print "<p>Provided courtesy of:<br>\n";
$rss->showLink();
// close the paragraph opened above
print "</p>\n";
}
// Display the description
$rss->showDescription();
// Show the news stories
$rss->showStories();
}
?>
So, what does that do? Well, here's an
example using the exact code above, with minimal attempt at styling. What
could be easier?
Styling the Output
You have your choice of tags with which to wrap the title displayed by the
showHeading method. Simply pass the tag you want to use (without the angle
brackets) to the showHeading call. For example, to wrap the heading in <h2>
tags, call the method as
showHeading("h2")
You can use whatever CSS you want to style the tag used to display the heading.
News stories are displayed in a definition list. The story headline is displayed
as a defined term, a <dt>, and the description is displayed as the definition,
a <dd>. You can use CSS to style those two tags so they appear the way you
want.
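To see exactly which elements your CSS needs to target, it can help to look at the markup itself. The sketch below builds the same kind of definition list that showStories() prints for stories that have a link. storiesToDefinitionList() and the sample story are hypothetical and exist only for this illustration.

```php
<?php
// Sketch: build the same kind of <dl> markup that showStories() prints.
// storiesToDefinitionList() and the sample data are hypothetical.
function storiesToDefinitionList($stories) {
    $html = "<dl>\n";
    foreach ($stories as $story) {
        // headline: a linked <dt>
        $html .= "<dt><a href=\"" . $story['link'] . "\">" . $story['title'] . "</a></dt>\n";
        // summary: a <dd>
        $html .= "<dd>" . $story['description'] . "</dd>\n";
    }
    return $html . "</dl>\n";
}

echo storiesToDefinitionList(array(
    array(
        'title'       => 'First story',
        'link'        => 'http://example.com/1',
        'description' => 'A short summary.',
    ),
));
```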
If any of the styles conflict with other styles used in your page, I would
suggest simply wrapping the RSS data in a container, a <div> for example,
and styling that. Let's say we wanted to make the links green in our
RSS display:
#rss a{color: green;}
Then, just put the RSS output inside a <div id="rss">:
<div id="rss">
<?php
$rss=new rssFeed($url);
...
?>
</div>
Conclusion
I hope this is helpful to you. It should get you started on the road to providing
continually fresh information on your web site. Good luck!