JoeyLLM is an open-source, developer-led initiative to build Australia’s first sovereign foundational large language model (LLM)—designed to reflect national values, understand local language, and serve long-term public and strategic interests.
This is not a fine-tuned version of an overseas model. JoeyLLM is foundational infrastructure—a homegrown base layer for AI innovation, national capability, and digital independence.
Simple Explanation - imagine you're talking to a really, really smart parrot that has read almost the entire internet. That's kind of what a large language model (LLM) is like - a computer program that has learned to understand and generate human language by studying a massive amount of text. When you talk to it, it tries to figure out what you mean based on what it has "read" before and then puts together a response that sounds like a human wrote it.
Extended Explanation - a large language model is a complex statistical model built on neural network principles and trained on massive datasets to perform language-related tasks. It takes text as input and produces text as output. Think of the LLM as a function with a vast internal state (its parameters): the input text is the argument you pass to the function, which processes it using that learned internal state and returns an output string (the generated text). Training is a very long, complex optimization process that tunes the parameters until the function generates language effectively.
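The function-with-internal-state analogy can be sketched in miniature. The toy model below is a hypothetical illustration, not how JoeyLLM or any real LLM works: it "trains" by counting which word follows which in a tiny corpus, then "generates" by repeatedly emitting the most likely next word. A real LLM replaces these frequency counts with billions of learned neural-network parameters, but the shape is the same: text in, text out, behaviour determined by trained internal state.

```python
from collections import defaultdict, Counter


def train(corpus: str) -> dict:
    """Learn the 'internal state': counts of which word follows which."""
    params = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        params[prev][nxt] += 1
    return params


def generate(params: dict, prompt: str, max_words: int = 5) -> str:
    """Text in, text out: extend the prompt with the most likely next word."""
    out = prompt.split()
    for _ in range(max_words):
        followers = params.get(out[-1])
        if not followers:  # never saw this word mid-sentence; stop
            break
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


corpus = "the kangaroo hops across the outback and the kangaroo rests"
params = train(corpus)
print(generate(params, "the outback", max_words=3))
```

Where a production model predicts a probability distribution over tens of thousands of tokens at each step, this sketch only ever picks the single most-seen follower, but it shows why output quality depends entirely on the data the model was trained on.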
Today’s AI models are overwhelmingly developed offshore, trained on global data that doesn’t reflect Australian context, needs, or sovereignty. JoeyLLM is designed to change that.
- ✅ Open Source – Transparent, auditable, and permissively licensed
- ✅ Trusted Infrastructure – Built with a focus on safety, integrity, and national interest
- Build an Australian foundational LLM using high-quality, diverse, and locally relevant datasets
- Support downstream applications across research, public sector, education, and national industries
- Release open-source training pipelines, datasets, and model checkpoints for auditability and reuse
- Contribute to long-term sovereign digital capability through open infrastructure
JoeyLLM is built using modern transformer-based architectures, designed for performance, extensibility, and fine-tuning across multiple domains.
Focus areas include:
- Pretraining on Australian-relevant corpora and curated local knowledge
- Training on secure, Australian-hosted compute infrastructure
- Supporting downstream fine-tuning for health, education, legal, public sector, and more
- Publishing model weights, code, and documentation for transparency and open science
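The "transformer-based architecture" mentioned above is built around self-attention, where every token position computes a weighted mixture of all other positions. The snippet below is a minimal NumPy sketch of scaled dot-product attention, the core operation of a transformer layer; the matrix sizes and random inputs are illustrative only and have no connection to JoeyLLM's actual configuration.

```python
import numpy as np


def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: each position attends to every position.

    Q, K, V: (seq_len, d) query/key/value matrices.
    Returns a (seq_len, d) weighted mixture of the value rows.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                  # query/key similarity
    scores -= scores.max(axis=-1, keepdims=True)   # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V


rng = np.random.default_rng(0)
seq_len, d = 4, 8  # toy sizes: 4 tokens, 8-dimensional embeddings
Q, K, V = (rng.standard_normal((seq_len, d)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

A full transformer stacks many such layers (with learned projections producing Q, K, and V, plus feed-forward blocks), and it is those learned projections that pretraining on Australian-relevant corpora would shape.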
JoeyLLM is driven by a DevRes (developer-researcher) community focused on building public-interest AI. You can help by contributing to:
- Model training & evaluation
- Dataset curation and filtering
- Tooling, docs, and audits
- Deployment and benchmarking

- 📢 Join the discussion: GitHub Discussions
- 🛠️ Contribute: See our Contributing Guide
- 📄 Docs & Resources: Visit the Wiki
- 📉 Data Collection: See data collection and how to contribute
JoeyLLM is released under an open-source license to support sovereign AI development in the public interest. You’re welcome to fork it, build on it, and use it to support downstream applications across research, public services, and responsible innovation.
JoeyLLM is being developed to meet Australia’s strategic need for a locally governed foundational model—one that supports national resilience, trusted public-sector adoption, and innovation rooted in Australian values.
This is not just a model—it’s foundational AI infrastructure for a sovereign digital future.