Parsing RSS At All Costs (O'Reilly)
[Posted January 29, 2003 by cook]
Mark Pilgrim talks about dealing with malformed RSS data on O'Reilly. "As I said in last month's article, RSS is an XML-based format for syndicating news and news-like sites. XML was chosen, among other reasons, to make it easier to parse with off-the-shelf XML tools. Unfortunately in the past few years, as RSS has gained popularity, the quality of RSS feeds has dropped. There are now dozens of versions of hundreds of tools producing RSS feeds. Many have bugs. Few build RSS feeds using XML libraries; most treat it as text, by piecing the feed together with string concatenation, maybe (or maybe not) applying a few manually coded escaping rules, and hoping for the best."
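
To make the failure mode in the excerpt concrete, here is a minimal Python sketch. It is not taken from Pilgrim's article; the sample title and the is_well_formed helper are hypothetical, chosen only for illustration. It builds the same feed item twice, once by string concatenation and once with the standard library's xml.etree.ElementTree, then checks whether each result parses as XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical headline containing characters that are special in XML.
title = "Fish & Chips <review>"

# Approach 1: treat the feed as text and concatenate, with no escaping.
naive = "<item><title>" + title + "</title></item>"

# Approach 2: build the item with an XML library, which escapes text for us.
item = ET.Element("item")
ET.SubElement(item, "title").text = title
safe = ET.tostring(item, encoding="unicode")


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise (illustrative helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print("concatenated:", is_well_formed(naive))
print("library-built:", is_well_formed(safe))
```

The concatenated string leaves the bare ampersand and angle brackets in place, so a strict XML parser rejects it. The library-built item escapes them on output and yields the original title again when parsed.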