Kilimandjaro

Kilimandjaro is a program that maps arbitrary labels to medical terminologies.

Currently two sources are used:

the French SNOMED CT
the CCAM terminology

In order to provide a useful mapping, we compare text embeddings.
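As an illustration of what comparing embeddings means, here is a minimal sketch that ranks candidate terms by cosine similarity to a query vector. In Kilimandjaro the comparison is delegated to ChromaDB; the metric, the function names and the toy 3-dimensional vectors below are assumptions made for the example, not the project's code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_candidates(
    query: list[float], candidates: dict[str, list[float]]
) -> list[tuple[str, float]]:
    """Return (label, score) pairs sorted by decreasing similarity."""
    scored = [
        (label, cosine_similarity(query, vector))
        for label, vector in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors invented for illustration; real embeddings have hundreds of dimensions.
query = [0.9, 0.1, 0.0]
candidates = {
    "term A": [1.0, 0.0, 0.0],
    "term B": [0.0, 1.0, 0.0],
    "term C": [0.7, 0.7, 0.0],
}
ranking = rank_candidates(query, candidates)
```

The best mapping for a label is then the terminology entry at the top of the ranking.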

The main way to use Kilimandjaro is to:

produce and store embeddings
query through the UI.

We use uv to manage dependencies and run programs.

To produce the embeddings - here for the CCAM data source:

uv run src/kilimandjaro/indexer.py add ccam

This fetches the source data, then produces and stores the embeddings in a local ChromaDB instance.

For this source, the whole process takes several minutes.

Then launch the web UI:

uv run streamlit run src/kilimandjaro/main.py

Architecture

The three main pieces of this application are:

the vector database, using ChromaDB, to produce and store embeddings
the indexer program, which fetches source data and pushes it to ChromaDB
the web UI, through which users query the application

graph
    DB[(vector DB)]
    INDEXER([Indexer])
    WEBUI(Web UI)
    SOURCES(Sources)

    INDEXER -->|fetches| SOURCES
    INDEXER -->|indexes| DB
    DB -->|produces embeddings| DB
    WEBUI -->|queries| DB

The indexer is a command-line tool. This:

uv run src/kilimandjaro/indexer.py

will display available commands.

It currently fetches data from a triple store.
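To give an idea of what fetching from a triple store involves, here is a sketch that builds a SPARQL query request following the standard SPARQL protocol (POST with a form-encoded `query` parameter). The query itself, the SKOS predicates and the `build_sparql_request` helper are hypothetical; the indexer's real queries are not described in this README.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical query: the predicates actually used by the indexer are not documented here.
EXAMPLE_QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?code ?label WHERE {
  ?concept skos:notation ?code ;
           skos:prefLabel ?label .
}
LIMIT 10
"""


def build_sparql_request(endpoint_url: str, query: str) -> Request:
    """Build a SPARQL protocol POST request asking for JSON results."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


# Placeholder endpoint; replace it with the URL from your configuration.
request = build_sparql_request("https://example.org/sparql", EXAMPLE_QUERY)
# The request is not sent here so the sketch runs offline;
# urllib.request.urlopen(request) would perform the call.
```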

Configuration

To be able to fetch data, you must provide a triple store endpoint in the corresponding configuration section:

[kilimandjaro.sources]
triple-store-url = "<ENDPOINT URL>"

Notes

CCAM

When the JSON payload is parsed externally, with rye run indexer add ccam | yq, some errors appear,
for example for this procedure (acte): {'code':'MBFA001', 'label': 'Résection "en bloc" d\'une extrémité et/ou de la diaphyse de l\'humérus'}
Would this be a better encoding: "label":"Résection \"en bloc\" d\\'une extrémité et/ou de la diaphyse de l\\'humérus"?
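One possible explanation, assumed here and not checked against the indexer code, is that the record is printed as a Python dict representation (single quotes, `\'` escapes) and not as JSON. The sketch below compares that representation with the output of `json.dumps` for the same record.

```python
import json

# The record from the note above, written as a Python dict.
acte = {
    "code": "MBFA001",
    "label": 'Résection "en bloc" d\'une extrémité et/ou de la diaphyse de l\'humérus',
}

# What print(acte) shows: single-quoted strings with \' escapes, which is not JSON.
python_repr = repr(acte)

# A JSON encoding: double-quoted strings, inner double quotes escaped as \",
# apostrophes left untouched.
json_text = json.dumps(acte, ensure_ascii=False)

# The JSON text round-trips to the original record.
decoded = json.loads(json_text)
```

In JSON, apostrophes need no escaping and `\'` is not a valid escape sequence, while `\\'` decodes to a literal backslash followed by an apostrophe. Only the inner double quotes have to be escaped.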
