Rethinking the Aggregator

A while back I pondered two questions about my aggregator database:
I'm back to thinking about both of these questions. I have pulled in more than 1500 articles from 74 RSS feeds and (needless to say) stored each article as a response doc to the parent doc that contains all the details of its feed. I've also become quite dependent on unread marks on the articles.
Why wouldn't I depend on unread marks? It's the way I've dealt with highly active Notes databases for years. The speed of navigation through a view and the in-your-face visual difference between read and unread docs allow me to quickly sort through the full collection of documents. I also have that great "next unread" smarticon to further speed up the process. Web users, however, will not want to scroll through thousands of links every time, won't always cache enough history to reliably see all their read links in a different color, and won't have a "next unread" button available -- so I've planned all along to offer a web UI with the option to filter down to just the last day's worth of articles, the last five days, etc.
The problem with this, however, is that I'll need to do one of four things: depend on agents (or servlets) to do the time comparisons and present the web UI; have time-dependent views that are always out of date and cause lots of indexing grief as the database grows from 1500 to 15000 to 150000 entries; use the old @TextToTime() trick to have time-dependent views that are always out of date but at least don't kill performance; or have a daily agent "age" the documents by setting a field that is referenced in the selection formulas for the "one day", "five day", etc. views. The last is the approach I would use for a pure web app, but it makes unread marks useless for me in the Notes client, so I really don't want to do it. Since none of the other techniques appeals to me either, I'm wondering whether I should double-store the last 30 days' worth of feed data: once as response docs for use by the Notes client, and once as entries in a table in the parent doc for use by the web UI...
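To make the double-storage idea a little more concrete, here is a minimal sketch. It is written in Python rather than LotusScript or a Notes Java agent, and every name in it (`Entry`, `FeedParent`, `add_article`, `prune`, `recent`) is hypothetical; the in-memory list merely stands in for the table in the parent doc. The only point is that the "one day" and "five day" filters become a comparison at request time against a small, capped list instead of a time-dependent view.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

RETENTION_DAYS = 30  # the "last 30 days" window described above


@dataclass
class Entry:
    """One row of the hypothetical table kept in the parent (feed) doc."""
    title: str
    link: str
    published: datetime


@dataclass
class FeedParent:
    """Stand-in for the parent doc that holds a feed's details."""
    name: str
    entries: list = field(default_factory=list)

    def add_article(self, entry, now):
        """Store the second copy of an article (the response doc is separate)."""
        self.entries.append(entry)
        self.prune(now)

    def prune(self, now):
        """Drop rows older than the retention window."""
        cutoff = now - timedelta(days=RETENTION_DAYS)
        self.entries = [e for e in self.entries if e.published >= cutoff]

    def recent(self, days, now):
        """Rows for the web UI's 'last N days' filter, newest first."""
        cutoff = now - timedelta(days=days)
        hits = [e for e in self.entries if e.published >= cutoff]
        return sorted(hits, key=lambda e: e.published, reverse=True)


# Usage: four articles of different ages; the 45-day-old one is pruned.
now = datetime(2024, 1, 31, 12, 0)
feed = FeedParent("Example feed")
for age in (0.5, 3, 20, 45):
    published = now - timedelta(days=age)
    feed.add_article(Entry(f"Article {age}", f"http://example.com/{age}", published), now)

one_day = feed.recent(1, now)    # only the half-day-old article
five_day = feed.recent(5, now)   # the half-day-old and 3-day-old articles
```

Because nothing here touches the response docs after they are created, the Notes client's unread marks are left alone.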
Even though anyone with a relational background would cringe at the thought of double-storage, I'm very tempted to do this. The downside will be having to worry about blowing the summary limit. The upsides will be that the application will efficiently pre-compute elements required for the web UI, there will be no indexing complications, and the Notes UI's unread marks will continue to be useful.
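One way to keep the summary-limit worry in check would be to budget the table before each write. The sketch below (Python again, with hypothetical names `encoded_size` and `fit_to_budget`) estimates the size of the rows and keeps only the newest ones that fit. The 32,000-byte default is an assumed budget for illustration, not a figure taken from the Notes documentation; substitute whatever limit applies to your release.

```python
def encoded_size(rows):
    """Approximate stored size: UTF-8 bytes of each row plus one separator."""
    return sum(len(row.encode("utf-8")) + 1 for row in rows)


def fit_to_budget(rows, budget_bytes=32_000):
    """Return the newest rows that fit within budget_bytes.

    rows is assumed to be ordered newest first, one string per article
    (for example "date|title|link"). The default budget is an assumption.
    """
    kept = []
    used = 0
    for row in rows:
        cost = len(row.encode("utf-8")) + 1
        if used + cost > budget_bytes:
            break  # every remaining row is older, so stop here
        kept.append(row)
        used += cost
    return kept
```

If the trimmed list comes back shorter than 30 days' worth for a very busy feed, the web UI would simply show a shorter history for that feed while the response docs in the Notes client stay complete.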