|
Lots Of Broken Feeds
I have, on occasion, emailed people to tell them about problems I've encountered pulling their XML feeds into my bloggregator. The thing is, I don't often notice that someone's feed has gone bad. Usually the posts simply stop coming in, which is indistinguishable from the poster taking a break. Then the feed becomes good again, there's a rush of posts, and I realize that I haven't read anything from that particular source in a month.
Recently, I added code to my aggregator to actually flag the bad feeds, and I've become aware that it's not just one or two feeds that are broken at any one time. I've decided to post the information about the broken feeds rather than email the individual bloggers.
Why? I'm not out to embarrass anyone, really. In some cases, I think the blog templates we're using could do a better job of making sure that the XML they produce is valid, and I want to get the word out about how many feeds are being affected. Also, I think that the Domino blogger community has gotten something pretty cool going, but it would benefit from wider exposure to the overall blogging community, and that requires blog template coders and blog maintainers to do a better job of enforcing and checking the validity of feeds.
So, here is my list of current broken feeds:
Mal-Formed Dates and/or Time Travel
These are feeds that show up out of sequence in my aggregator's date/time sorted views, due to problems either with the date format or perhaps with the system date/time setting.
- Jack Dausman (Leadership By Numbers): PubDate tag of the two most recent articles is "Fri, 9 Apr 2004 -1:-1:-1 -0400", which the parser in my bloggregator code decided was 1 September 2004.
- NotesTips: PubDate tag of the most recent article is "Fri, 4 Jun 2004 10:02:38 -0100".
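Both kinds of date problem can be caught before a post reaches a sorted view. What follows is only a minimal sketch in Python, not the LotusScript that the bloggregator actually runs. The names `parse_pubdate` and `classify_pubdate` are made up for illustration, and the sketch assumes dates with a numeric zone offset such as -0400 (named zones like GMT are not handled). It rejects a date outright instead of guessing at one, and it flags dates that lie later than the current time.

```python
import re
from datetime import datetime, timedelta, timezone

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Optional day name, then "9 Apr 2004 10:02:38 -0400" (seconds optional).
PUBDATE_RE = re.compile(
    r"^(?:(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun),\s+)?"
    r"(\d{1,2})\s+([A-Za-z]{3})\s+(\d{4})\s+"
    r"(\d{2}):(\d{2})(?::(\d{2}))?\s+([+-])(\d{2})(\d{2})$"
)


def parse_pubdate(text):
    """Return a timezone-aware datetime, or None if the text is malformed."""
    m = PUBDATE_RE.match(text.strip())
    if not m:
        return None
    day, mon, year, hh, mm, ss, sign, off_h, off_m = m.groups()
    if mon not in MONTHS:
        return None
    offset = timedelta(hours=int(off_h), minutes=int(off_m))
    if sign == "-":
        offset = -offset
    try:
        # datetime() and timezone() raise ValueError on out-of-range fields.
        return datetime(int(year), MONTHS.index(mon) + 1, int(day),
                        int(hh), int(mm), int(ss or 0),
                        tzinfo=timezone(offset))
    except ValueError:
        return None


def classify_pubdate(text, now):
    """Return 'malformed', 'future', or 'ok' for one PubDate value."""
    parsed = parse_pubdate(text)
    if parsed is None:
        return "malformed"
    if parsed > now:
        return "future"  # the "time travel" case
    return "ok"
```

With this check, the "-1:-1:-1" value is reported as malformed rather than being turned into some other date. A well-formed date carrying the wrong zone offset can only be caught by comparing it against the time the feed was fetched, which is why `now` is passed in.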
URL Issues
These are feeds whose published URLs don't seem to be pointing to an actual XML page. They come back with nothing at all, an error page, an empty page of HTML, or a response from a totally unexpected server.
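An aggregator can tell most of these cases apart by looking at the first element of whatever came back. The Python sketch below is a rough illustration under stated assumptions: the function name `classify_feed_body` and its return labels are hypothetical, the response body is assumed to be already fetched and decoded to a string, and only the root element names rss, feed, and rdf:RDF are treated as feeds. It cannot detect a response from an unexpected server, since that needs the final URL and not the body.

```python
import re

# Root element names treated as feeds in this sketch (an assumption).
FEED_ROOTS = {"rss", "feed", "rdf:rdf"}

# First start tag; "<?xml ...?>" and "<!DOCTYPE ...>" are skipped because
# the character after "<" must be a letter or underscore.
FIRST_TAG_RE = re.compile(r"<([A-Za-z_][\w.\-]*(?::[\w.\-]+)?)")


def classify_feed_body(body):
    """Return 'empty', 'feed', 'html', 'other-xml', or 'not-xml'."""
    text = body.lstrip("\ufeff \t\r\n")  # drop a BOM and leading whitespace
    if not text:
        return "empty"
    m = FIRST_TAG_RE.search(text)
    if not m:
        return "not-xml"
    root = m.group(1).lower()
    if root in FEED_ROOTS:
        return "feed"
    if root == "html":
        return "html"
    return "other-xml"
```

Anything other than "feed" is worth flagging, so that a feed that has moved or broken does not simply look like a blogger taking a break.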
XML Character or Entity Problems
These are feeds that generate errors in the DOM parser class in LotusScript (and in the IE 6 XML parser as well) due to invalid entities. The most common cause is an nbsp entity included in text. One or more of the blog templates that we're using probably needs a fix to deal with this.
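XML itself predefines only five named entities (amp, lt, gt, quot, apos), so an HTML entity such as nbsp is undefined unless the feed declares it. One possible template-side fix is to emit numeric character references instead. The following is a minimal Python sketch of that idea, not the fix any particular blog template uses; `replace_html_entities` and `is_well_formed` are illustrative names.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The only named entities that XML defines without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_html_entities(xml_text):
    """Rewrite known HTML named entities as numeric character references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML's own or unknown names alone
        return "&#%d;" % name2codepoint[name]
    return ENTITY_RE.sub(repl, xml_text)


def is_well_formed(xml_text):
    """Return True if the text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


broken = "<item><title>Hello&nbsp;world</title></item>"
fixed = replace_html_entities(broken)  # nbsp becomes &#160;
```

One caveat: this plain text substitution also rewrites entity-like text inside CDATA sections, where it is literal content, so a real template fix would apply it when the text is escaped and not to the finished feed.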
Miscellaneous XML Issues
- Rob Novak: This feed uses a namespace called 'content', but it isn't declared.
- Libby Schwarz (NotesGirl): There's a full feed there, but it appears to be nested inside an HTML document.
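The undeclared-namespace problem is easy to test for before a feed is published. The Python sketch below is a rough check, assuming a hypothetical helper named `undeclared_prefixes`: it compares the prefixes used in element names against the `xmlns:` declarations found anywhere in the document. It ignores declaration scope and prefixed attributes, so a namespace-aware XML parser remains the authoritative test.

```python
import re

# Prefixes declared anywhere in the document, e.g. xmlns:content="...".
DECLARED_RE = re.compile(r"\bxmlns:([A-Za-z_][\w.\-]*)\s*=")
# Prefixes used in start or end tags, e.g. <content:encoded>.
USED_RE = re.compile(r"</?([A-Za-z_][\w.\-]*):[A-Za-z_]")


def undeclared_prefixes(xml_text):
    """Return a sorted list of element prefixes that are never declared."""
    declared = set(DECLARED_RE.findall(xml_text))
    declared.add("xml")  # the xml prefix is always bound
    used = set(USED_RE.findall(xml_text))
    return sorted(used - declared)


feed = ('<rss version="2.0"><channel><item>'
        '<content:encoded>hi</content:encoded>'
        '</item></channel></rss>')
missing = undeclared_prefixes(feed)
```

For the sample above, `missing` holds the single prefix content. Adding a matching `xmlns:content` attribute to the root element empties the list.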
|