02 // project

Vertretungsplan

A two-worker Cloudflare system that scrapes, parses, stores and serves the AES Schwalbach substitution schedule. A cron worker snapshots PDFs into R2 every hour; a second worker parses the raw HTML live, merges dual sources, exposes a clean JSON API, and drives a React + Tailwind frontend at vp.l3on.org.

live · cloudflare workers · durable objects · r2 storage · d1 sqlite
[Diagram: vplan-aes-cf-worker — main worker architecture (light/dark mode)]
2 CF Workers · 1h Cron Interval · 60s Cache TTL · Edge PoPs
vplan-worker — the snapshot engine [TypeScript]
01 A Cloudflare Cron Trigger fires every hour. The worker launches a Puppeteer browser session (via @cloudflare/puppeteer) and navigates to both subst_001.htm and subst_002.htm on the Untis portal.
02 Before generating a PDF it scrapes the page for a date string — trying the title, a "Stand:" timestamp, header tags, and finally falling back to today's date if nothing is found.
03 Each page is printed to A4 PDF with page.pdf() and uploaded to R2 under the key DD.MM.YYYY/subst_00N.pdf with rich custom metadata (date, source URL, polledAt).
04 A separate GET /vplan/parse/:date endpoint extracts plain text from stored PDFs via unpdf (WASM, no Node) and runs a custom token-stream parser that handles the flat Untis table layout — splitting rows on repeating day+date anchors, expanding multi-class cells, and detecting Entfall via column 12 "x" markers.
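The date-fallback chain in step 02 can be sketched as a small pure function. This is an illustrative reconstruction, not the worker's actual code: the function name, regexes, and candidate order are assumptions based on the description above (title, then "Stand:" timestamp, then header tags, then today's date).

```javascript
// Hypothetical sketch of the snapshot worker's date extraction.
// Tries several page locations for a DD.MM.YYYY date, falling back to today.
function extractPlanDate(html, now = new Date()) {
  const DATE_RE = /(\d{1,2})\.(\d{1,2})\.(\d{4})/;

  const title = html.match(/<title>([^<]*)<\/title>/i)?.[1] ?? "";      // 1. <title>
  const stand = html.match(/Stand:\s*([^<\n]*)/i)?.[1] ?? "";           // 2. "Stand:" line
  const header = html.match(/<h[1-3][^>]*>([^<]*)<\/h[1-3]>/i)?.[1] ?? ""; // 3. headers

  for (const candidate of [title, stand, header]) {
    const m = candidate.match(DATE_RE);
    if (m) {
      const [, d, mo, y] = m;
      return `${d.padStart(2, "0")}.${mo.padStart(2, "0")}.${y}`;
    }
  }
  // 4. nothing found: fall back to today's date as DD.MM.YYYY
  const pad = (n) => String(n).padStart(2, "0");
  return `${pad(now.getDate())}.${pad(now.getMonth() + 1)}.${now.getFullYear()}`;
}
```

The normalized DD.MM.YYYY result doubles as the R2 key prefix, so snapshots group naturally by day.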
vplan-aes-cf-worker — the API + frontend [JavaScript]
01 The main worker fetches subst_001.htm and subst_002.htm live on each request (with a 60-second Cloudflare cache). It decodes the response with the correct charset from the Content-Type header (often ISO-8859-1 for German Umlauts).
02 A regex-based HTML parser extracts table rows matching .list* CSS classes, reconstructing 12-column substitution records. Entfall detection is tri-modal: explicit "x" in column 12, info field containing "entfall", or both teacher and room being empty dashes.
03 The admin dual-merge endpoint fetches both files in parallel via Promise.allSettled. It compares their "Stand:" timestamps to decide which is primary, then merges them into a deduplicated map keyed by datum|klasse|stunde|fach. Multi-class rows like "07A, 07B, 07C" are expanded into individual records.
04 The frontend is a React 18 + Tailwind SPA served via env.ASSETS. It auto-refreshes every 60 s, splits compound lesson slots like "1-2" into individual hour cards, and persists the class filter and view mode to localStorage.
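The dual-merge in step 03 boils down to two pure operations: expanding multi-class rows and deduplicating into a map keyed by datum|klasse|stunde|fach. A minimal sketch, with illustrative names (the worker's real identifiers may differ):

```javascript
// "07A, 07B, 07C" -> one record per class
function expandClasses(entry) {
  return entry.klasse.split(",").map((k) => ({ ...entry, klasse: k.trim() }));
}

// Merge two parsed sources; the primary (newer "Stand:") wins on conflicts.
function mergeDual(primaryEntries, secondaryEntries) {
  const key = (e) => `${e.datum}|${e.klasse}|${e.stunde}|${e.fach}`;
  const merged = new Map();
  // Insert secondary first so primary rows overwrite on key collisions.
  for (const e of secondaryEntries.flatMap(expandClasses)) merged.set(key(e), e);
  for (const e of primaryEntries.flatMap(expandClasses)) merged.set(key(e), e);
  return [...merged.values()];
}
```

Inserting the secondary source first is what makes "primary overwrites" fall out of plain `Map.set` semantics, with no explicit conflict branch.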
ChatRoom Durable Object — realtime chat [Durable Objects]
01 Each chat room is a Cloudflare Durable Object with an embedded SQLite database. On construction it creates two tables: messages and meta. Messages older than 1 hour are pruned on each write to keep storage lean.
02 Clients connect via WebSocket. The DO uses hibernation (state.acceptWebSocket) so it only consumes compute when messages arrive. Incoming JSON is validated, stored, and broadcast to all other connected sockets in the same room.
03 Mentioning @ai or @smartypants triggers Groq (llama-3.3-70b-versatile) via a 15-second timeout fetch. The last 20 messages are sent as conversation context. A 3-second per-room cooldown (stored in the meta table) prevents API flooding.
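The AI gating in step 03 reduces to two small decisions: whether a message should trigger a Groq call, and what context window to send. A hedged sketch, assuming illustrative names and message shapes:

```javascript
const AI_COOLDOWN_MS = 3_000; // per-room cooldown, stored in the meta table
const CONTEXT_SIZE = 20;      // messages sent as conversation context

// Trigger only when mentioned AND the room's cooldown has elapsed.
function shouldTriggerAi(text, lastAiCallMs, nowMs) {
  const mentioned = /@(ai|smartypants)\b/i.test(text);
  const coolingDown = nowMs - lastAiCallMs < AI_COOLDOWN_MS;
  return mentioned && !coolingDown;
}

// Shape the last N messages as chat-completion turns, oldest first.
function buildContext(messages) {
  return messages.slice(-CONTEXT_SIZE).map((m) => ({
    role: m.fromAi ? "assistant" : "user",
    content: m.text,
  }));
}
```

Keeping both checks pure makes them trivially testable outside the Durable Object runtime; only the `meta`-table reads and the actual `fetch` to Groq need the DO environment.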
API endpoints exposed [Routes]
Method + Path · What it does · Auth
GET /api/vertretungsplan/today · Live parse of subst_001.htm → JSON entries · none
GET /api/vertretungsplan/tomorrow · Live parse of subst_002.htm → JSON entries · none
GET /api/admin/vplan-dual · Parallel fetch + timestamp-aware merge of both sources · X-Admin-Code
GET /api/news · Scrape + decode AES news feed, cache 5 min · none
GET /api/calendar · Proxy school iCal feed, cache 10 min · none
GET /api/chat/:code/history · Last hour of chat messages for a room · none
GET /api/chat/:code/ws · WebSocket upgrade → Durable Object room · none
GET /api/hub/files · List all files in the R2 file hub · none
POST /api/hub/upload · Multipart file upload to R2 (PNG/JPEG/PDF ≤10 MB) · none
POST /api/mensa/sync · FlareSolverr bypass → scrape & store Mensa PDFs · env secret
GET /vplan/parse/:date · Extract & parse stored R2 PDFs for a date (vplan-worker) · X-API-Key
Interesting engineering decisions [Design]
⚡ No n8n dependency — early versions piped everything through n8n webhooks, adding 8–10 s of latency. Rewriting the HTML parser directly in the Worker dropped response time to under 300 ms globally.
🔀 Dual-source merge — the school sometimes updates subst_001 and subst_002 asynchronously. The admin merge endpoint compares ISO timestamps from the "Stand:" field and designates the newer file as "primary", letting it overwrite any conflicting rows from the secondary.
🔤 Charset detection — Untis serves HTML in ISO-8859-1 without declaring it consistently. The parser reads the charset= parameter from the Content-Type header before decoding, so German Umlauts (ä, ö, ü, ß) always survive the round-trip.
🗄️ SQLite in a Durable Object — instead of a separate D1 database for chat history, the DO uses its embedded SQLite via state.storage.sql.exec(). This keeps all room state co-located with the WebSocket connections and eliminates cross-datacenter round-trips.
🦎 WASM PDF parsing on the edge — the vplan-worker uses unpdf (compiled to WASM) to extract text from stored R2 PDFs without any Node.js runtime. The token stream is then walked with a state machine that anchors on repeating "Mi 6.5." / "Do 7.5." day+date patterns to split rows.
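The charset-detection note above is essentially one helper around the standard `TextDecoder`. A minimal sketch, assuming the ISO-8859-1 default described for Untis (function name is illustrative):

```javascript
// Decode a response body with the charset declared in Content-Type,
// defaulting to ISO-8859-1 when Untis omits the declaration.
function decodeBody(bytes, contentType) {
  const m = /charset=([^;\s]+)/i.exec(contentType ?? "");
  const charset = m ? m[1].toLowerCase() : "iso-8859-1";
  return new TextDecoder(charset).decode(bytes);
}
```

Skipping this and decoding as UTF-8 would mangle every umlaut into replacement characters, which is exactly the failure mode the design note guards against.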
Current state & future plans [Post Mortem]
The app is live and used daily during the school year. The core vplan pipeline is stable — the main active development is around the surrounding features: the Datei-Hub (file sharing for class materials), the Mensa scraper (bypassing Cloudflare via self-hosted FlareSolverr), and the school calendar ICS proxy.

Next planned: push notifications when a class's schedule changes (comparing the new parse result against the previous snapshot stored in D1), and a teacher-view mode behind the admin code that shows room conflicts across all classes simultaneously.
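The planned change detection could be a set-difference over stable row keys: diff the fresh parse against the previous D1 snapshot and report which classes gained or lost entries. A hedged sketch of that idea, with assumed field names mirroring the merge key used elsewhere:

```javascript
// Return the classes whose entries differ between two parse snapshots.
function changedClasses(prevEntries, nextEntries) {
  const key = (e) => `${e.datum}|${e.klasse}|${e.stunde}|${e.fach}|${e.vertreter ?? ""}`;
  const prev = new Set(prevEntries.map(key));
  const next = new Set(nextEntries.map(key));
  const changed = new Set();
  for (const e of nextEntries) if (!prev.has(key(e))) changed.add(e.klasse); // added/modified
  for (const e of prevEntries) if (!next.has(key(e))) changed.add(e.klasse); // removed/modified
  return [...changed];
}
```

Each class name in the result would then map to a push-notification topic, so only affected students get pinged.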