Creating your own RSS aggregator with Drupal

Sites such as Bloglines make aggregating RSS content easy and are quite useful, but I wanted a bit more control. One way to get that is to write your own aggregator; another is to install an existing system on your own server. Several people tipped me to install Drupal and see whether it suited my needs. Since RSS aggregation is a default module in Drupal, that did sound interesting.

Reading around a bit, I came across quite a few mentions of a superior API as well, which would be useful later if I wanted to extend Drupal with my own functionality. For now I'll have to take people's word for it, because I haven't looked into the API yet.

What I did do, however, was create my own RSS aggregation site. It was literally just a matter of installing Drupal and configuring it correctly, and it was up and running.

The one thing that is useful and that not every server supports is cron jobs, a system that lets you schedule certain actions to run on a regular basis.
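
As a rough sketch of what such a scheduled job could do: Drupal runs its periodic tasks when the site's cron.php page is requested, so the job only needs to fetch that URL. The function names below are my own and not part of Drupal, and the site address in the usage note is a placeholder.

```python
import urllib.request
from urllib.parse import urljoin


def cron_url(site_base):
    """Build the cron.php URL for a Drupal site rooted at site_base."""
    # Add a trailing slash so urljoin appends to the path
    # instead of replacing its last segment.
    if not site_base.endswith("/"):
        site_base += "/"
    return urljoin(site_base, "cron.php")


def trigger_cron(site_base, timeout=30):
    """Request cron.php once and return the HTTP status code."""
    with urllib.request.urlopen(cron_url(site_base), timeout=timeout) as response:
        return response.status
```

A scheduled job would then simply call something like trigger_cron("http://www.example.com/") once an hour, or at whatever interval suits your feeds.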

My configuration is quite simple. Aside from the site title, obviously, I changed only a few settings to get everything running the way I wanted. The first thing to change is the default front page, which is the page the site loads as its homepage. I set this to 'aggregator' so that the aggregated items are shown immediately on the homepage. I also enabled clean URLs, since my server supports them and they look better.

Then I went on to configure the aggregator, which is nothing more than adding categories and feeds to the system. You can fetch items manually the first time, but once cron is set up correctly, it will fetch them automatically at the configured intervals.
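
To give an idea of what fetching the items of a feed amounts to, here is a minimal sketch that pulls titles and links out of an RSS 2.0 document. This is not how Drupal's aggregator module is implemented; the sample feed and the parse_items name are made up purely for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed, used only for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example weblog</title>
    <item>
      <title>First post</title>
      <link>http://www.example.com/first</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://www.example.com/second</link>
    </item>
  </channel>
</rss>
"""


def parse_items(feed_xml):
    """Return a list of (title, link) tuples, one per <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        # findtext returns the default when the child element is missing.
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items
```

An aggregator does roughly this for every configured feed on each run and stores whatever items it has not seen before.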

One small issue at first prevented the automated import of new items from running: access control. You need to make sure that the anonymous user, which is the user the cron jobs run as, has permission to access the news feeds. That is enough for it to work.

Though still using a default template, I now have a fully running system that I already use on a regular basis to read tech-related weblogs.