🦘 JoeyLLM | Australia’s Sovereign Foundational AI Model

JoeyLLM is an open-source, developer-led initiative to build Australia's first sovereign foundational large language model (LLM), designed to reflect national values, understand local language, and serve long-term public and strategic interests.

This is not a fine-tuned version of an overseas model. JoeyLLM is foundational infrastructure—a homegrown base layer for AI innovation, national capability, and digital independence.


What is an LLM?

Simple Explanation: imagine you're talking to a really, really smart parrot that has read almost the entire internet. That's roughly what a large language model (LLM) is: a computer program that has learned to understand and generate human language by studying a massive amount of text. When you talk to it, it works out what you mean based on what it has "read" before and then puts together a response that sounds like a human wrote it.

Extended Explanation: a large language model is a complex statistical model built on neural-network principles and trained on massive datasets to perform language tasks. It takes text as input and produces text as output. Think of the LLM as a function with a vast internal state (its parameters): the input text is the argument you pass to the function, the function processes it according to that learned state, and it returns an output string (the generated text). Training is a very long, compute-intensive optimization process that tunes those parameters so the function generates language effectively.
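The function analogy maps directly onto code: text goes in, text comes out. A minimal sketch using the Hugging Face transformers library, with the small public GPT-2 checkpoint as a stand-in (it is not a JoeyLLM artifact):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a small public model and its tokenizer (stand-ins, not JoeyLLM).
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# The "function call": input text goes in, generated text comes out.
inputs = tokenizer("G'day! A large language model is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Everything the model "knows" lives in its parameters; generation is just repeated application of that learned function, one token at a time.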

🇦🇺 Why JoeyLLM?

Today’s AI models are overwhelmingly developed offshore, trained on global data that doesn’t reflect Australian context, needs, or sovereignty. JoeyLLM is designed to change that.

  • Open Source – Transparent, auditable, and permissively licensed
  • Trusted Infrastructure – Built with a focus on safety, integrity, and national interest

🎯 Project Objectives

  • Build an Australian foundational LLM using high-quality, diverse, and locally relevant datasets
  • Support downstream applications across research, public sector, education, and national industries
  • Release open-source training pipelines, datasets, and model checkpoints for auditability and reuse
  • Contribute to long-term sovereign digital capability through open infrastructure

🧠 Technical Overview

JoeyLLM is built using modern transformer-based architectures, designed for performance, extensibility, and fine-tuning across multiple domains.
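The exact JoeyLLM architecture is not specified here, but most modern foundational models stack decoder-only transformer blocks. A rough PyTorch sketch of one such block (dimensions are illustrative, not JoeyLLM's):

```python
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    """One decoder-only transformer block: masked self-attention plus an MLP."""

    def __init__(self, d_model: int = 768, n_heads: int = 12):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: each token may only attend to itself and earlier tokens.
        seq_len = x.size(1)
        mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out               # residual connection around attention
        x = x + self.mlp(self.ln2(x))  # residual connection around the MLP
        return x
```

A full model is dozens of these blocks sandwiched between a token embedding and an output projection; pretraining tunes all of it at once.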

Focus areas include:

  • Pretraining on Australian-relevant corpora and curated local knowledge
  • Training on secure, Australian-hosted compute infrastructure
  • Supporting downstream fine-tuning for health, education, legal services, the public sector, and other domains (see the sketch after this list)
  • Publishing model weights, code, and documentation for transparency and open science
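
Once weights are published, downstream fine-tuning can reuse standard open-source tooling. A hedged sketch with the Hugging Face peft library, again using the public GPT-2 checkpoint as a placeholder because no JoeyLLM checkpoint is named here:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Placeholder base model; swap in a published JoeyLLM checkpoint when available.
model = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA adds small trainable adapters instead of updating all base weights,
# which keeps domain fine-tuning (health, legal, ...) cheap and auditable.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only a small fraction of weights train
```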

🤝 How to Get Involved

JoeyLLM is driven by a DevRes (developer-researcher) community focused on building real public-interest AI. You can help by contributing to the training pipelines, datasets, code, evaluations, and documentation in this repository.


📝 License

JoeyLLM is released under an open-source license to support sovereign AI development in the public interest. You’re welcome to fork it, build on it, and use it to support downstream applications across research, public services, and responsible innovation.


🦘 A Sovereign Foundational Model with a National Mission

JoeyLLM is being developed to meet Australia’s strategic need for a locally governed foundational model—one that supports national resilience, trusted public-sector adoption, and innovation rooted in Australian values.

This is not just a model—it’s foundational AI infrastructure for a sovereign digital future.
