I want my RSS valid!

I am building a set of components that other developers at work are using. One of them fetches remote content and handles timeouts, caching and so forth. Another one uses it to fetch RSS feeds.

The method that fetches the RSS feeds looked a little something like:

protected function _get($url) {
    $content = parent::_get($url);
    return simplexml_load_string($content);
}

Now, this works fine as long as $url points to a valid RSS feed. When it doesn’t, simplexml_load_string emits warnings, and in the cases I ran into it still handed back a SimpleXMLElement (for content that libxml cannot parse at all it returns false instead). I would rather have the method throw an exception than return a fubar SimpleXMLElement or false. After a quick fix I landed on something like this:

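protected function _get($url) {
    $contents = parent::_get($url);

    // collect libxml errors internally instead of emitting PHP warnings,
    // remembering the previous setting so it can be restored afterwards
    $previous = libxml_use_internal_errors(true);

    // clear possible libxml errors left over from earlier parsing
    libxml_clear_errors();

    // generate SimpleXMLElement object (false when parsing fails outright)
    $sxml = simplexml_load_string($contents);

    // grab the last error, then leave libxml the way we found it
    $error = libxml_get_last_error();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // do we have any errors?
    if ($sxml === false || $error !== false) {
        $message = ($error !== false) ? trim($error->message) : 'unknown parse error';
        throw new VG_Wget_Rss_Exception('Invalid RSS: ' . $message);
    }

    // yey! The XML parsed cleanly, return the SimpleXMLElement
    return $sxml;
}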
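
One caveat: a clean parse only tells us that the content is well-formed XML, not that it is actually a feed. A front page served as well-formed XHTML would slip through. Below is a minimal sketch of an extra guard, assuming that looking at the root element name is good enough for our purposes. The helper name looksLikeFeed is hypothetical and not part of the component described here.

```php
<?php
// Hypothetical helper for illustration; not part of the real component.
function looksLikeFeed(SimpleXMLElement $sxml)
{
    // getName() gives the name of the root element:
    // "rss" for RSS 0.9x/2.0, "RDF" for RSS 1.0, "feed" for Atom.
    return in_array($sxml->getName(), array('rss', 'RDF', 'feed'), true);
}

// A tiny feed and a well-formed (X)HTML page, both of which parse without errors.
$rss  = simplexml_load_string('<rss version="2.0"><channel><title>t</title></channel></rss>');
$html = simplexml_load_string('<html><head><title>Front page</title></head><body/></html>');

var_dump(looksLikeFeed($rss));  // bool(true)
var_dump(looksLikeFeed($html)); // bool(false)
```

In _get() a check like this could sit right before the return and throw the same exception when it fails. It is still no real validation against the RSS specification, just a cheap sanity check.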

So … don’t even think about providing any invalid RSS feeds to us!

This entry was posted in PHP, Technology, Work related.

2 Responses to I want my RSS valid!

  1. Sniper says:

    What about using tidy to fix the invalid RSS feed? I think RSS is a normal XML document, so tidy should be able to fix it.

  2. christer says:

    @Sniper: The problem arose when I was fetching a feed from our discussion board, which uses pretty URLs of the form:

    http://domain/category/forum/format/rss

    When a developer specifies a non-existent category in the component I made, the discussion board redirects the client to the front page, which caused some problems.
