A site that scrapes jobs directly from company portals with LLM-powered scrapers, so I don't have to write a parser for each site by hand.
When applying, it helps to target recently posted jobs, but searching for them manually across multiple sites gets old.
Sometimes these job sites do let you set up notifications, but even then the implementation is inconsistent.
There are browser extensions that automatically watch for changes, but that still feels clunky.
There is a site that does this, JobRadar, but it seems to be missing a lot of companies and is not user-configurable.
The goal is a site where users can add sites to track themselves, but right now a config file is used (a hypothetical example follows). There are also cost concerns as the number of sources grows.
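For illustration only, that config could look something like the sketch below. The file name `sources.yaml`, the keys, and the companies are all assumptions, not the project's actual schema:

```python
import yaml  # PyYAML; the real config format and loader may differ

# Hypothetical sources.yaml -- every name and key here is illustrative.
EXAMPLE_CONFIG = """
sources:
  - name: Acme Corp
    url: https://careers.acme.example.com
    check_every_hours: 6
  - name: Globex
    url: https://globex.example.com/jobs
    check_every_hours: 12
"""

sources = yaml.safe_load(EXAMPLE_CONFIG)["sources"]
for src in sources:
    print(f"tracking {src['name']}: {src['url']}")
```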
Install uv:
# macOS
brew install uv
# Fedora
sudo dnf install uv
Then, on first run, execute the init script:
./init.sh
Now you can use the start script and follow its instructions
./start.sh
NOTE 1: Ollama is currently not being used, since using smaller local models results in bad parsing.
Next good-to-haves:
- Location filter
- Browser notification
- Team/Category filtering
Design choices:
- User config is stored client-side (exportable).
- API service handles requests, email/browser notifications, and scraper scheduling
- Postgres stores scraped content, page hashes, and scraping history (see the schema sketch after this list)
- The scraper first does a diff detect on the page hash; only if the page has changed does it use crawl4ai with local Ollama to parse (sketched after this list).
- Ollama will be used to host SmolLM 1.7B (not in use yet; see NOTE 1 above).
- SmolLM should let us scrape more reliably from any website without having to manually configure rules for each site.
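For concreteness, here is a minimal sketch of what those Postgres tables could look like. The table and column names are my assumptions, not the project's actual schema, and the psycopg connection string is a placeholder:

```python
import psycopg  # psycopg 3; the connection string below is a placeholder

# Illustrative tables: last content + hash per tracked page, plus a
# history of scrape runs recording whether the page hash changed.
PAGES_DDL = """
CREATE TABLE IF NOT EXISTS pages (
    id        BIGSERIAL PRIMARY KEY,
    url       TEXT UNIQUE NOT NULL,
    page_hash TEXT,          -- hash of the last fetched page
    content   TEXT           -- last parsed job listings (e.g. JSON)
)
"""

HISTORY_DDL = """
CREATE TABLE IF NOT EXISTS scrape_history (
    id         BIGSERIAL PRIMARY KEY,
    page_id    BIGINT REFERENCES pages(id),
    scraped_at TIMESTAMPTZ DEFAULT now(),
    changed    BOOLEAN NOT NULL   -- did the hash change on this run?
)
"""

with psycopg.connect("postgresql://localhost/jobs") as conn:
    conn.execute(PAGES_DDL)
    conn.execute(HISTORY_DDL)
```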
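And a rough sketch of the hash-gated scraping flow: hash the fetched page, and only invoke the LLM parser when the hash differs from the stored one. The `extract_jobs` stub stands in for the crawl4ai + Ollama step (its exact API varies by version); `requests`, the helper names, and the loop timing are likewise assumptions:

```python
import hashlib
import time

import requests  # assumed fetch layer; crawl4ai can also handle fetching

def page_hash(html: str) -> str:
    # Cheap fingerprint of the page; unchanged pages never reach the LLM.
    return hashlib.sha256(html.encode("utf-8")).hexdigest()

def extract_jobs(html: str) -> list[dict]:
    # Placeholder for the crawl4ai + Ollama extraction described above.
    raise NotImplementedError

def scrape_if_changed(url: str, last_hash: str | None):
    html = requests.get(url, timeout=30).text
    new_hash = page_hash(html)
    if new_hash == last_hash:
        return last_hash, None        # no change: zero LLM cost this run
    return new_hash, extract_jobs(html)

# Toy loop standing in for the API service's real scheduler:
hashes: dict[str, str | None] = {}
while True:
    for url in ["https://careers.acme.example.com"]:  # from user config
        hashes[url], jobs = scrape_if_changed(url, hashes.get(url))
        if jobs:
            ...  # store in Postgres, then send notifications
    time.sleep(6 * 60 * 60)  # re-check interval is illustrative
```

The point of the gate is that the expensive LLM call only happens on real changes, which is what keeps the cost concern above manageable as the number of sources grows.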
Docs TODO:
- Update the architecture docs and guide (the design notes above now reference crawl4ai rather than llm-scraper).