RSS Reader Using Hugo


Hey πŸ‘‹,

Inspired by this RSS Server Side Reader idea, I created my own within Hugo.

Find it at /reader!

Why?

For the last year, I have been trying to find interesting blogs to follow. Every time I found one, I added it to Feedly. That was nice, but if I didn’t check it for a while, content would pile up.

After reading the aforementioned post, I thought that was a nice and simple idea. And I love nice and simple ideas so… I decided to implement it for myself.

Generating the feed

With the help of Claude Code I created the following PHP script at scripts/generate-reader.php. I used PHP because it’s the language I am most comfortable with, but any similar script should work.

πŸ’‘ You can ask your favorite AI tool to rewrite it into your preferred language, e.g. Python, JavaScript, etc.

<?php

// Local feeds configuration - replace with your desired RSS feeds
const FEEDS = [
    [
        'rss_url' => 'https://aaron.com.es/index.xml',
    ],
    // Add more feeds here as needed
];

const OUTPUT_FILE = __DIR__ . '/../content/reader/index.html';
const CACHE_DIR = '/tmp/generate-reader-cache';
const CACHE_DURATION = 24 * 60 * 60; // 24 hours
const MAX_AGE_DAYS = 30; // Only show entries from last 30 days

function getCacheKey($url) {
    return CACHE_DIR . '/' . md5($url) . '.cache';
}

function fetchRSSFeed($url) {
    // Check cache first
    $cacheFile = getCacheKey($url);
    if (file_exists($cacheFile) && (time() - filemtime($cacheFile)) < CACHE_DURATION) {
        echo "Using cached data for: $url\n";
        $content = file_get_contents($cacheFile);
    } else {
        echo "Fetching fresh data for: $url\n";
        $context = stream_context_create([
            'http' => [
                'timeout' => 10,
                'user_agent' => 'Mozilla/5.0 (compatible; PHP RSS Reader)'
            ]
        ]);

        $content = file_get_contents($url, false, $context);
        if ($content === false) {
            return null;
        }

        // Ensure cache directory exists
        if (!is_dir(CACHE_DIR)) {
            mkdir(CACHE_DIR, 0755, true);
        }

        // Save to cache
        file_put_contents($cacheFile, $content);
    }

    libxml_use_internal_errors(true);
    $xml = simplexml_load_string($content);
    if ($xml === false) {
        return null;
    }

    return $xml;
}

function parseRSSFeed($xml) {
    $items = [];

    // Handle RSS format
    if (isset($xml->channel->item)) {
        foreach ($xml->channel->item as $item) {
            $pubDate = (string)$item->pubDate;
            $timestamp = $pubDate ? strtotime($pubDate) : 0;

            $items[] = [
                'title' => (string)$item->title,
                'link' => (string)$item->link,
                'date' => $timestamp ?: time(),
                'dateString' => $pubDate ?: date('Y-m-d')
            ];
        }
    }
    // Handle Atom format
    elseif (isset($xml->entry)) {
        foreach ($xml->entry as $entry) {
            $published = (string)$entry->published;
            $updated = (string)$entry->updated;
            $dateStr = $published ?: $updated;
            $timestamp = $dateStr ? strtotime($dateStr) : 0;

            $link = '';
            if (isset($entry->link)) {
                if (isset($entry->link->attributes()->href)) {
                    $link = (string)$entry->link->attributes()->href;
                } else {
                    $link = (string)$entry->link;
                }
            }

            $items[] = [
                'title' => (string)$entry->title,
                'link' => $link,
                'date' => $timestamp ?: time(),
                'dateString' => $dateStr ?: date('Y-m-d')
            ];
        }
    }

    // Sort by date descending and take first 3
    usort($items, fn($a, $b) => $b['date'] - $a['date']);
    return array_slice($items, 0, 3);
}

function generateHTML($allEntries) {
    $html = "---\ntitle: Reader\ntype: page\n---\n\n";
    $html .= "<link rel=\"stylesheet\" href=\"/css/reader.css\">\n\n";
    $html .= "<p>These are the most recent articles from the blogs/sites I follow. Updated daily.</p>\n\n";

    // Group entries by date
    $entriesByDate = [];
    foreach ($allEntries as $entry) {
        $date = date('Y-m-d', $entry['date']);
        if (!isset($entriesByDate[$date])) {
            $entriesByDate[$date] = [];
        }
        $entriesByDate[$date][] = $entry;
    }

    // Generate HTML grouped by date
    foreach ($entriesByDate as $date => $entries) {
        $html .= "<h3 class=\"reader-date-header\">{$date}</h3>\n";
        $html .= "<ul class=\"reader-list\">\n";

        foreach ($entries as $entry) {
            // Escape HTML for safety
            $articleTitle = htmlspecialchars($entry['title'], ENT_QUOTES);
            $articleUrl = htmlspecialchars($entry['link'], ENT_QUOTES);
            $blogTitle = htmlspecialchars($entry['blog_title'], ENT_QUOTES);

            $html .= "<li>\n";
            $html .= "  <a class=\"reader-link\" href=\"{$articleUrl}\" target=\"_blank\">{$articleTitle}</a> ";

            $html .= "<span class=\"reader-blog\">{$blogTitle}</span>";

            $html .= "\n</li>\n";
        }

        $html .= "</ul>\n\n";
    }

    $html .= "<p><em>Generated on " . date('Y-m-d H:i:s') . "</em></p>\n";

    return $html;
}

try {
    echo "Processing local feeds...\n";
    $feeds = FEEDS;

    $allEntries = [];
    $cutoffTimestamp = time() - (MAX_AGE_DAYS * 24 * 60 * 60);
    $feedsWithoutRecentEntries = [];

    foreach ($feeds as $feed) {
        if (empty($feed['rss_url'])) {
            echo "Skipping feed with empty rss_url: " . ($feed['url'] ?? 'unknown') . "\n";
            continue;
        }

        echo "Processing: {$feed['rss_url']}\n";

        $xml = fetchRSSFeed($feed['rss_url']);
        if ($xml === null) {
            echo "Failed to fetch or parse: {$feed['rss_url']}\n";
            continue;
        }

        $items = parseRSSFeed($xml);
        $recentItems = [];

        foreach ($items as $item) {
            if ($item['date'] >= $cutoffTimestamp) {
                $item['blog_title'] = parse_url($feed['rss_url'], PHP_URL_HOST);
                $recentItems[] = $item;
                $allEntries[] = $item;
            }
        }

        if (empty($recentItems)) {
            $feedsWithoutRecentEntries[] = [
                'url' => $feed['url'] ?? $feed['rss_url'],
                'title' => parse_url($feed['url'] ?? $feed['rss_url'], PHP_URL_HOST),
                // max() over the timestamps yields the feed's most recent entry
                'oldest_entry_date' => !empty($items) ? date('Y-m-d', max(array_column($items, 'date'))) : 'No entries found'
            ];
        }

        echo "Found " . count($items) . " total items, " . count($recentItems) . " within last " . MAX_AGE_DAYS . " days\n";
    }

    // Sort all entries chronologically (newest first)
    usort($allEntries, fn($a, $b) => $b['date'] - $a['date']);

    // Ensure output directory exists
    $outputDir = dirname(OUTPUT_FILE);
    if (!is_dir($outputDir)) {
        mkdir($outputDir, 0755, true);
    }

    // Generate and write HTML
    $html = generateHTML($allEntries);
    file_put_contents(OUTPUT_FILE, $html);

    echo "Generated " . count($allEntries) . " total entries in " . OUTPUT_FILE . "\n";

    if (!empty($feedsWithoutRecentEntries)) {
        echo "\n⚠️  NOTICE: The following sources have no entries within the last " . MAX_AGE_DAYS . " days:\n";
        foreach ($feedsWithoutRecentEntries as $feed) {
            echo "  - {$feed['title']} (latest: {$feed['oldest_entry_date']})\n";
        }
        echo "\n";
    }

    echo "Done!\n";

} catch (Exception $e) {
    echo "Error: " . $e->getMessage() . "\n";
    exit(1);
}

This script:

- Fetches each feed in FEEDS (RSS or Atom), caching the raw response in /tmp/generate-reader-cache for 24 hours.
- Keeps the 3 most recent entries per feed and discards anything older than 30 days.
- Groups the remaining entries by date and writes them as a Hugo page to content/reader/index.html.
- Warns about feeds that have published nothing in the last 30 days.

πŸ“ The script above is not exactly the same I am using. I am retrieving the list of feeds from my beloved API instead of the local FEEDS constant.

Note-to-self: I seriously need to talk about the features I add to my API.
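For anyone wanting to do the same, here is a minimal sketch of loading the feed list from a JSON API instead of the constant. The endpoint URL and the response shape below are made-up assumptions, not my actual API:

<?php

// Hypothetical sketch: fetch the feed list from a JSON endpoint
// instead of the local FEEDS constant.
function loadFeedsFromApi(string $endpoint): array
{
    $json = @file_get_contents($endpoint);
    if ($json === false) {
        return [];
    }

    // Expected shape: [{"rss_url": "https://example.com/index.xml"}, ...]
    $data = json_decode($json, true);

    return is_array($data) ? $data : [];
}

// Then, in the main block, replace `$feeds = FEEDS;` with:
$feeds = loadFeedsFromApi('https://api.example.com/feeds');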

Apart from the script, I have added a small stylesheet at /css/reader.css, which the generated page links to.

How to run it?

You can, of course, run it manually with php scripts/generate-reader.php && hugo. But I already have a GitHub Action that deploys my blog on every new commit to the main branch, so I added a new workflow at .github/workflows/generate-reader.yml to generate the reader daily at 00:00 UTC:

name: Generate Reader RSS

on:
  schedule:
    - cron: '0 0 * * *'  # Daily at 00:00 UTC
  workflow_dispatch:

jobs:
  generate-reader:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      
      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          php-version: '8.1'
      
      - name: Generate reader RSS
        run: php scripts/generate-reader.php
      
      - name: Setup GIT configuration
        run: |
          git config user.name github-actions
          git config user.email github-actions@github.com          
      
      - name: Commit and push changes
        run: |
          git add content/reader/
          if git diff --staged --quiet; then
            echo "No changes to commit"
          else
            git commit -m "Auto-generate reader RSS feed"
            git push origin main
          fi          

One could easily add the hugo build step to this workflow itself (since the script only generates the content/ file). But in my case I wanted to reuse my existing “build and deploy” workflow, so I edited it to be triggered when this new workflow completes.

on:
  workflow_run:
    workflows: ["Generate Reader RSS"]  # the workflow's name: field, not its filename
    types:
      - completed

πŸ“ This might sound confusing because I haven’t shared my “build and deploy” workflow yet. If this is strange to you send me a comment mentioning this and I will try to clarify better or give a more complete example.

Hope you like this! And feel free to peek at what I am reading anytime! You might find the blogs as interesting as I do.

As always, if this was helpful, please click the like button below!