A modern, scalable web app and data pipeline for viewing and filtering public ice rink schedules in the Denver metro area. Built with a React frontend and a Cloudflare Workers backend for global edge deployment.
- Fast, filterable schedule viewer for multiple rinks with real-time data
- Progressive Web App (PWA) with offline support and installable app experience
- Native Android App available for Google Play Store distribution
- Modular React frontend with custom hooks, filter components, and URL state management
- Robust Cloudflare Workers backend with automated scraping and edge caching
- Durable Objects scheduling for reliable, distributed scraper orchestration
- KV storage for fast, globally distributed data access
- Mobile-friendly, accessible UI with modern design and smart install detection
- Automated data freshness with configurable splay timing to avoid rate limiting
- Service worker caching with stale-while-revalidate strategy for optimal performance
The Denver Rink Schedule Viewer is available as a native Android app built with Capacitor, providing a native app experience while leveraging the existing PWA infrastructure.
- 📱 Native Android App: Full native app experience with Play Store distribution
- 🔄 All PWA Features: Offline support, caching, and background sync
- ⚡ Native Performance: Optimized for Android with native UI elements
- 🚀 Automatic Updates: Seamless updates through the web layer
- 🔔 Native Integration: Android-specific features and optimizations
- Capacitor Integration: Wraps the existing PWA in a native Android container
- Automated CI/CD: GitHub Actions for building and deploying to Play Store
- Multiple Build Variants: Debug, release, and development builds
- Comprehensive Testing: Automated testing pipeline for Android builds
See Android Development Guide for detailed setup and development instructions.
This application is a fully-featured Progressive Web App that works offline and can be installed on mobile devices and desktop computers.
- 📱 Installable App: Install directly from browser on iOS, Android, and desktop
- 🔄 Offline Support: View cached schedule data when internet is unavailable
- ⚡ Fast Loading: Service worker caching for instant startup and navigation
- 🚀 App-like Experience: Runs in standalone mode without browser UI
- 🔔 Fresh Data Notifications: Background updates with stale-while-revalidate strategy
- Open the site in Chrome or Safari
- Tap the "📱 Install App" button when available
- Or use the browser's "Add to Home Screen" option
- App icon will appear on your home screen
- Look for the install icon in the browser address bar
- Click "💻 Install" button on the site
- App will open in its own window
- Pin to taskbar for easy access
- ✅ Chrome/Edge: Full PWA support with install prompts
- ✅ Safari: Add to home screen functionality
- ❌ Firefox: Limited PWA support (no install button shown)
- Schedule Data: Last loaded events remain available offline
- Filtering: All filter options work with cached data
- Navigation: Full app navigation works without internet
- Auto-Update: Fresh data loads automatically when connection returns
- Service Worker: Custom caching strategy in `public/sw.js`
- Web App Manifest: PWA configuration in `public/manifest.json`
- Smart Install Detection: Cross-browser compatibility in `HeaderActions.tsx`
- Cache Strategy: Stale-while-revalidate for optimal performance
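As a rough illustration, the core of a stale-while-revalidate handler looks like the sketch below. The cache name and handler shape are assumptions for illustration, not the actual contents of `public/sw.js`:

```typescript
// Minimal stale-while-revalidate sketch (hypothetical, not the real sw.js):
// answer from cache immediately when possible, and refresh the cache from
// the network in the background for the next visit.
declare const self: ServiceWorkerGlobalScope;

self.addEventListener('fetch', (event: FetchEvent) => {
  event.respondWith(
    caches.open('rink-schedule-cache').then(async (cache) => {
      const cached = await cache.match(event.request);
      const refresh = fetch(event.request).then((response) => {
        cache.put(event.request, response.clone());
        return response;
      });
      return cached ?? refresh; // serve stale now, revalidate for next time
    })
  );
});
```

The cached copy is returned immediately when present, so startup and navigation stay instant even offline, while the network response quietly replaces it for next time.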
- Deployed to GitHub Pages
- Fetches data from Cloudflare Workers API
- Modular component architecture with TypeScript
- Centralized Scheduler (`workers/scheduler.ts`) - Single cron trigger manages all scraper scheduling
- Data API Worker (`workers/data-api.ts`) - Serves aggregated data from KV storage
- Scraper Workers - Individual workers for each rink with Durable Object scheduling
- ScraperHelpers (`workers/helpers/scraper-helpers.ts`) - Shared utilities and patterns
- 🧊 Ice Ranch (Littleton - South Park) - RSS feed scraper
- 🐻 Big Bear Ice Arena (Lowry) - HTML scraper
- 🏫 DU Ritchie Center (University of Denver) - HTML scraper
- ⛸️ Foothills Edge Ice Arena (Littleton - Ken Caryl) - HTML scraper
- 🏢 SSPRD Ice Center (Englewood/Highlands Ranch) - HTML scraper
```bash
# Install dependencies
npm install

# For Android development, also install the Android SDK and Java 17

npm run dev    # Start Vite dev server on http://localhost:5173

# Build web app and sync to Android
npm run cap:build

# Run Android app in development
npm run cap:android

# Build Android APK
cd android && ./gradlew assembleDebug
```
```bash
# Start all workers for development
./scripts/dev-workers.sh

# Advanced options
./scripts/dev-workers.sh --port-start 9000 --test
./scripts/dev-workers.sh --include "data-api" --include "ice-ranch"
./scripts/dev-workers.sh --exclude "big-bear" --exclude "ssprd"

# Test data API
curl http://localhost:8787/api/health
curl http://localhost:8787/api/all-events

# Trigger individual scrapers
curl -X POST http://localhost:8788  # Ice Ranch
curl -X POST http://localhost:8789  # Big Bear
# etc.
```
The frontend automatically deploys to GitHub Pages when changes are pushed to the main branch.
```bash
# Deploy centralized scheduler (single cron trigger for all scrapers)
wrangler deploy --config wrangler-scheduler.toml

# Deploy data API
wrangler deploy --config wrangler.toml

# Deploy individual scrapers (no cron triggers - managed by scheduler)
wrangler deploy --config wrangler-ice-ranch.toml
wrangler deploy --config wrangler-big-bear.toml
wrangler deploy --config wrangler-du-ritchie.toml
wrangler deploy --config wrangler-foothills-edge.toml
wrangler deploy --config wrangler-ssprd.toml
```
- `App.tsx` — Main app component with routing
- `components/` — Modular React components
  - `EventList.tsx`, `EventCard.tsx` — Event display components
  - `FilterControls.tsx` — Master filter component
  - `CategoryFilter.tsx`, `DateFilter.tsx`, `TimeFilter.tsx` — Individual filters
  - `RinkFilter.tsx`, `RinkTabs.tsx` — Rink selection components
  - `ErrorBoundary.tsx` — Error handling component
- `hooks/` — Custom React hooks
  - `useEventData.ts` — Data fetching and management
  - `useEventFiltering.ts` — Event filtering logic
  - `useUrlState.ts` — URL state synchronization
- `rinkConfig.ts` — Rink metadata and configuration
- `types.ts` — Shared TypeScript definitions
- `workers/scheduler.ts` — Centralized scheduler with single cron trigger for all scrapers
- `workers/data-api.ts` — Main API worker for serving aggregated data
- `workers/helpers/scraper-helpers.ts` — Shared utilities and Durable Object patterns
- `workers/scrapers/` — Individual scraper workers
  - `ice-ranch.ts` — RSS feed scraper with Durable Object scheduling
  - `big-bear.ts` — HTML scraper with Durable Object scheduling
  - `du-ritchie.ts` — HTML scraper with Durable Object scheduling
  - `foothills-edge.ts` — HTML scraper with Durable Object scheduling
  - `ssprd.ts` — HTML scraper with Durable Object scheduling
- `wrangler-scheduler.toml` — Centralized scheduler configuration with cron trigger
- `wrangler*.toml` — Individual worker configurations (no cron triggers)
- `scripts/dev-workers.sh` — Development script for running all workers locally
- `refactor/` — Technical documentation and improvement plans
The project uses a centralized scheduler to overcome Cloudflare's 5-cron trigger limit and enable unlimited rink additions:
- Single Cron Trigger - Only the scheduler worker has a cron trigger (`0 */6 * * *`)
- Dynamic Worker Communication - Scheduler calls scrapers via HTTP with the `global_fetch_strictly_public` flag
- Environment-Based Configuration - Add new rinks by updating `SCRAPER_ENDPOINTS` in `wrangler-scheduler.toml`
- No Hardcoded Dependencies - Scheduler dynamically generates URLs from template patterns
- Scalable Architecture - Support for unlimited rinks without hitting Cloudflare limits
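To make the fan-out concrete, here is a minimal sketch of a cron handler in this style. The URL template and binding names are assumptions for illustration; the actual logic lives in `workers/scheduler.ts`:

```typescript
// Hypothetical sketch: on each cron tick, fan out an HTTP POST to every
// scraper named in the SCRAPER_ENDPOINTS environment variable.
interface Env {
  SCRAPER_ENDPOINTS: string; // e.g. "ice-ranch,big-bear,du-ritchie"
}

export default {
  async scheduled(_controller: ScheduledController, env: Env): Promise<void> {
    const names = env.SCRAPER_ENDPOINTS.split(',').map((n) => n.trim());
    // allSettled so a single failing scraper can't block the others
    await Promise.allSettled(
      names.map((name) =>
        fetch(`https://rink-scraper-${name}.qbrd.workers.dev/`, { method: 'POST' })
      )
    );
  },
};
```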
All scrapers use a consistent Durable Object pattern abstracted into `ScraperHelpers`:
- `handleSchedulerFetch()` — Common request handling with status endpoints
- `handleSchedulerAlarm()` — Automatic rescheduling with configurable splay timing
- `getAlarmTime()` — Random delay calculation to avoid rate limiting
Scrapers use environment variables for timing configuration:
- `SCRAPER_SPLAY_MINUTES` — Maximum random delay between scrapes (default: 360 minutes)
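A minimal sketch of how such a splay could be computed; the real implementation lives in `workers/helpers/scraper-helpers.ts`, so the names and details here are assumptions:

```typescript
// Hypothetical sketch of getAlarmTime(): pick a random delay up to the
// configured splay so scrapers don't all hit their sources in lockstep.
function getAlarmTime(env: { SCRAPER_SPLAY_MINUTES?: string }): number {
  const splayMinutes = Number(env.SCRAPER_SPLAY_MINUTES ?? '360');
  const delayMs = Math.random() * splayMinutes * 60 * 1000;
  return Date.now() + delayMs; // epoch millis, suitable for storage.setAlarm()
}
```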
- Durable Objects schedule scraper execution with random splay delays
- Scrapers fetch data from rink websites and parse into standardized format
- KV Storage stores both events data and metadata for each rink
- Data API aggregates data from KV storage and serves to frontend
- Frontend displays real-time data with client-side filtering
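In code terms, the hand-off from scraper to KV storage might look like the sketch below; the key names mirror the `/data/{rinkId}.json` routes served by the data API, but the exact shapes are assumptions:

```typescript
// Hypothetical sketch of a scraper's final step: store events and metadata
// under separate per-rink keys that mirror the data API's routes.
async function saveRinkData(
  kv: KVNamespace,
  rinkId: string,
  events: unknown[]
): Promise<void> {
  await kv.put(`${rinkId}.json`, JSON.stringify(events));
  await kv.put(
    `${rinkId}-metadata.json`,
    JSON.stringify({ lastUpdated: new Date().toISOString(), eventCount: events.length })
  );
}
```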
- `GET /api/health` — Health check
- `GET /api/all-events` — All events from all rinks
- `GET /api/all-metadata` — Metadata for all rinks
- `GET /data/{rinkId}.json` — Events for a specific rink
- `GET /data/{rinkId}-metadata.json` — Metadata for a specific rink
- `GET /status` — Scheduler status and next run time
- `POST /` — Manually trigger scraper execution
```bash
bun test         # Run all tests
bun test:watch   # Run tests in watch mode

# Automated testing with dev script
./scripts/dev-workers.sh --test

# Manual endpoint testing
curl http://localhost:8787/api/health
curl -X POST http://localhost:8788  # Trigger scraper
```
The centralized scheduler architecture makes adding new rinks simple:
- Create a scraper worker in `workers/scrapers/new-rink.ts`
- Create a wrangler config `wrangler-new-rink.toml` (without cron triggers)
- Add to scheduler - Update `SCRAPER_ENDPOINTS` in `wrangler-scheduler.toml`:
  `SCRAPER_ENDPOINTS = "ice-ranch,big-bear,du-ritchie,foothills-edge,ssprd,new-rink"`
- Add to frontend `rinkConfig.ts`
- Scraper Worker (`workers/scrapers/new-rink.ts`):

  ```typescript
  export class NewRinkScheduler {
    async fetch(request: Request): Promise<Response> {
      return ScraperHelpers.handleSchedulerFetch(
        request,
        this.state,
        this.env,
        'new-rink',
        () => this.runScraper()
      );
    }

    async alarm(): Promise<void> {
      return ScraperHelpers.handleSchedulerAlarm(
        this.state,
        this.env,
        'new-rink',
        () => this.runScraper()
      );
    }
  }
  ```
- Wrangler Configuration (`wrangler-new-rink.toml`):

  ```toml
  name = "rink-scraper-new-rink"
  main = "workers/scrapers/new-rink.ts"
  compatibility_date = "2024-10-21"

  [[kv_namespaces]]
  binding = "RINK_DATA"
  id = "a38bbfdc3fe74d69a0ef39550960eca3"

  [[durable_objects.bindings]]
  name = "NEW_RINK_SCHEDULER"
  class_name = "NewRinkScheduler"

  [vars]
  SCRAPER_SPLAY_MINUTES = "350"

  # No cron triggers - managed by centralized scheduler
  ```
- Deploy Everything:

  ```bash
  # Deploy new scraper
  wrangler deploy --config wrangler-new-rink.toml

  # Redeploy scheduler with updated SCRAPER_ENDPOINTS
  wrangler deploy --config wrangler-scheduler.toml
  ```
- ✅ No Cron Limit - Add unlimited rinks without hitting Cloudflare's 5-cron limit
- ✅ Simple Configuration - Just update one environment variable
- ✅ Automatic Scheduling - New scrapers immediately get scheduled
- ✅ No Code Changes - Scheduler dynamically discovers new scrapers
- ✅ Consistent Monitoring - All scrapers visible in scheduler status endpoint
```bash
# Test scraper directly
curl https://rink-scraper-new-rink.qbrd.workers.dev/

# Check scheduler status
curl https://rink-scheduler.qbrd.workers.dev/status

# Manually trigger all scrapers
curl https://rink-scheduler.qbrd.workers.dev/trigger
```
- Edge deployment via Cloudflare Workers for global low latency
- Automatic retries and error handling in scrapers
- Rate limiting protection with random splay delays
- Data caching at edge locations via KV storage
- Graceful degradation if individual scrapers fail
- Real-time updates with configurable refresh intervals
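Graceful degradation falls out of how aggregation can be written: if each rink's KV read is settled independently, one failed or empty rink never empties the whole response. A sketch under those assumptions (key names as above, not the actual `workers/data-api.ts` code):

```typescript
// Hypothetical sketch: aggregate events across rinks, skipping any rink
// whose KV read failed or returned nothing instead of failing the request.
async function aggregateEvents(kv: KVNamespace, rinkIds: string[]): Promise<unknown[]> {
  const reads = await Promise.allSettled(
    rinkIds.map((id) => kv.get(`${id}.json`, 'json'))
  );
  return reads.flatMap((r) =>
    r.status === 'fulfilled' && Array.isArray(r.value) ? r.value : []
  );
}
```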
Contributions welcome! The codebase is designed for maintainability:
- Modular architecture with clear separation of concerns
- Shared patterns via ScraperHelpers reduce code duplication
This project has provided valuable insights into scraping different types of websites for ice rink schedules. Here are the key challenges, solutions, and lessons learned:
Ice Ranch - Approach: RSS feed parsing with XML string manipulation
Challenges:
- No `xml2js` library available in the Cloudflare Workers runtime
- CDATA sections and HTML entities requiring manual cleaning
- Inconsistent date formats and timezone handling
Solutions:
- Custom XML parsing using regex patterns
- Manual HTML entity decoding (`&amp;`, `&lt;`, etc.)
- Robust timezone conversion from Mountain Time to UTC
- Tag-based event categorization from RSS metadata
Key Code Pattern:
```typescript
private parseBasicXML(xml: string): any[] {
  const items: any[] = [];
  const itemRegex = /<item>(.*?)<\/item>/gs;
  for (const match of xml.matchAll(itemRegex)) {
    const itemContent = match[1];
    // Extract title with CDATA handling, falling back to a plain <title> tag
    let titleMatch = itemContent.match(/<title><!\[CDATA\[(.*?)\]\]><\/title>/s);
    if (!titleMatch) {
      titleMatch = itemContent.match(/<title>(.*?)<\/title>/s);
    }
    items.push({ title: titleMatch?.[1]?.trim() ?? '' });
  }
  return items;
}
```
Big Bear Ice Arena - Approach: Reverse-engineered API calls with form data
Challenges:
- Complex form parameters with multiple reservation types and resources
- Server-side timezone assumptions (Mountain Time returned as UTC)
- API responses requiring time zone correction
Solutions:
- Form data analysis to identify required parameters
- Manual timezone adjustment (+6 hours for MT to UTC conversion)
- Comprehensive form field mapping for all event types
Key Code Pattern:
```typescript
const formData = new URLSearchParams({
  'ReservationTypes[0].Selected': 'true',
  'ReservationTypes[0].Id': '-1',
  // ... 14 more reservation types
  'Resources[0].Id': '-1',
  // ... 6 more resources
});
```
DU Ritchie Center - Approach: Google Calendar iCal parsing
Challenges:
- Complex iCal format with multi-line folding
- Timezone data blocks requiring parsing
- HTML descriptions needing cleaning
- Multiple calendar aggregation
Solutions:
- Custom iCal parser handling line folding and escaping
- HTML description cleaning with essential info extraction
- Event deduplication across multiple calendars
- Selective event filtering (basketball vs. ice events)
Key Code Pattern:
```typescript
// Handle iCal line folding: continuation lines begin with a space or tab
// and belong to the previous logical line.
for (let i = 0; i < lines.length; i++) {
  let line = lines[i];
  while (i + 1 < lines.length && (lines[i + 1].startsWith(' ') || lines[i + 1].startsWith('\t'))) {
    i++;
    line += lines[i].substring(1);
  }
  // ...parse the unfolded line
}
```
Foothills Edge - Approach: Extract JSON data from inline JavaScript
Challenges:
- JavaScript object parsing without `eval()`
- Dynamic event object structure
- Time parsing from various formats
- Fallback parsing when JavaScript extraction fails
Solutions:
- Regex-based JavaScript object extraction
- Brace counting for proper JSON boundary detection
- Multiple time format parsing (12-hour with AM/PM)
- DOM parsing fallback for robust data extraction
Key Code Pattern:
```typescript
// Find the start of the inline `events = {...}` object, then walk the
// string counting braces to locate the matching closing brace.
const eventsStartMatch = html.match(/events\s*=\s*\{"[0-9]{4}-[0-9]{2}-[0-9]{2}"/);
const startIndex = html.indexOf('{', eventsStartMatch!.index!);
let braceCount = 0;
let endIndex = startIndex;
for (let i = startIndex; i < html.length; i++) {
  if (html[i] === '{') braceCount++;
  if (html[i] === '}') braceCount--;
  if (braceCount === 0) { endIndex = i; break; }
}
const eventsJson = html.slice(startIndex, endIndex + 1);
```
SSPRD - Approach: Server-side rendered data extraction
Challenges:
- JavaScript variable extraction from HTML
- Multi-facility data aggregation
- Facility ID to rink mapping
- Event categorization without explicit tags
Solutions:
- Regex extraction of the `_onlineScheduleList` JavaScript array
- Facility-based event routing and aggregation
- Custom facility metadata for each location
- Heuristic event categorization
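No code pattern is shown above for this scraper, so here is a hedged sketch of what the extraction could look like; the regex, array shape, and `FacilityId` field are assumptions rather than the actual `workers/scrapers/ssprd.ts` logic:

```typescript
// Hypothetical sketch: pull the server-rendered _onlineScheduleList array
// out of the page HTML (assuming it serializes as valid JSON), then route
// each event to its rink by facility ID.
function routeSsprdEvents(html: string): Map<number, unknown[]> {
  const match = html.match(/_onlineScheduleList\s*=\s*(\[[\s\S]*?\]);/);
  const byFacility = new Map<number, unknown[]>();
  if (!match) return byFacility; // fall back to other parsing strategies
  const schedule = JSON.parse(match[1]) as Array<{ FacilityId: number }>;
  for (const event of schedule) {
    const bucket = byFacility.get(event.FacilityId) ?? [];
    bucket.push(event);
    byFacility.set(event.FacilityId, bucket);
  }
  return byFacility;
}
```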
Problem: Websites assume local timezone (Mountain Time) but don't specify
Solution: Standardized UTC conversion in ScraperHelpers
```typescript
static parseMountainTime(dateStr: string, timeStr: string): Date {
  // Convert MT to UTC by adding 6-7 hours depending on DST
  // (simplified sketch: a fixed +6 shown; the real helper picks 6 or 7).
  const wallClock = new Date(`${dateStr}T${timeStr}:00Z`); // inputs assumed "YYYY-MM-DD" / "HH:MM"
  return new Date(wallClock.getTime() + 6 * 60 * 60 * 1000);
}
```
Problem: Inconsistent event naming across venues
Solution: Shared categorization logic with keyword matching
```typescript
static categorizeEvent(title: string): string {
  const lower = title.toLowerCase();
  if (lower.includes('public') || lower.includes('open')) return 'Public Skate';
  if (lower.includes('hockey')) return 'Hockey';
  // ... more patterns
}
```
Problem: Event descriptions contain HTML tags and entities
Solution: Progressive HTML cleaning with essential info preservation
```typescript
private cleanHtmlDescription(htmlDescription: string): string {
  return htmlDescription
    .replace(/&amp;/g, '&')       // Decode HTML entities
    .replace(/<\/p>/g, '\n\n')    // Paragraph breaks become blank lines
    .replace(/<[^>]*>/g, '')      // Remove all remaining tags
    .replace(/\n{3,}/g, '\n\n');  // Collapse runs of newlines
}
```
- Graceful degradation when individual scrapers fail
- Specific error logging for debugging different website issues
- Automatic retry logic with exponential backoff (planned)
- Fallback parsing methods when primary extraction fails
- Random splay delays (0-360 minutes default) to avoid detection
- Respectful User-Agent strings mimicking real browsers
- Configurable timing via environment variables
- Distributed scheduling via Durable Objects
- Event date filtering (next 30 days only)
- Duplicate removal across multiple data sources
- Required field validation (title, start/end times)
- Timezone consistency (all stored as UTC)
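A sketch of what those data-quality rules amount to in code (field names are assumptions for illustration):

```typescript
// Hypothetical sketch of event validation: require a title plus start and
// end times, and keep only events starting within the next 30 days.
interface RawEvent { title?: string; startTime?: string; endTime?: string }

function isValidEvent(event: RawEvent, now: Date = new Date()): boolean {
  if (!event.title || !event.startTime || !event.endTime) return false;
  const start = new Date(event.startTime);
  const cutoff = new Date(now.getTime() + 30 * 24 * 60 * 60 * 1000);
  return !Number.isNaN(start.getTime()) && start >= now && start <= cutoff;
}
```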
- Regex-based extraction instead of full DOM parsing where possible
- Stream processing for large data sets
- Early termination on parsing errors
- Minimal memory allocation in Workers environment
- KV storage for globally distributed event data
- Metadata separation for status and error information
- Incremental updates rather than full rebuilds
- Edge caching via Cloudflare infrastructure
- Comprehensive logging with emoji prefixes for easy identification
- HTML snapshot saving during development for offline testing
- Response validation to catch API changes early
- Error context preservation for remote debugging
- Local HTML files for testing parsing logic
- Mock data generation for consistent testing
- Integration tests with real endpoints (limited)
- Fallback validation ensuring robustness
- Shared helper functions for common operations
- Consistent error handling across all scrapers
- Configuration externalization via environment variables
- Clear separation between parsing and scheduling logic
Ice Ranch:
- RSS feed sometimes includes HTML in descriptions requiring cleaning
- Event tags are encoded as URL parameters, not in RSS categories
- Date formats are inconsistent between title and pubDate fields

Big Bear:
- API returns times in Mountain Time that parse as if they were UTC
- Complex reservation type system requiring specific ID mapping
- Resource allocation affects which events are visible

DU Ritchie:
- Multiple calendars require aggregation and deduplication
- HTML descriptions need aggressive cleaning for mobile display
- Basketball events mixed with ice events requiring filtering

Foothills Edge:
- Event data embedded in page JavaScript, not an API
- Time formats vary between "12:00 PM" and "12:00PM"
- Calendar system occasionally changes its JavaScript structure

SSPRD:
- Multi-facility site requiring event routing by facility ID
- JavaScript variables change names between updates
- Event names often include program codes requiring cleanup
These lessons learned have shaped the robust, maintainable scraping architecture that can adapt to website changes and handle edge cases gracefully.
🏒 Built with ❤️ for the Denver hockey community
© 2025 Denver Rink Schedule Viewer