open-and-sustainable/alembica

Open Science Software for Semantic Synthesis and Extraction of Information from Unstructured Sources.


About

alembica simplifies the use of Large Language Models (LLMs) to extract structured datasets from unstructured corpora of text. It provides a flexible and scalable framework to process, synthesize, and transform textual information into structured formats suitable for analysis and further processing.


Installation (Go)

To install alembica in Go, run:

go get github.com/open-and-sustainable/alembica

If you want to use alembica in other programming languages, check out the C-Shared Library in the User Guide.


Documentation

  • User Guide – Learn how to use alembica in different programming languages.
  • API Reference – Explore the Go package documentation.


Features

  • Validation of Input – Ensures that queries are correctly formatted to support proper interaction with models.
  • Cost Assessment – Calculates token costs based on the requested extraction and different model pricing.
  • Data Extraction – Processes unstructured text and transforms it into structured datasets for further analysis.

Authors & Contributions

Author: Riccardo Boero - ribo@nilu.no

Contributions are welcome!


License

alembica is licensed under the GNU Affero General Public License, Version 3.


Citation

Boero, R. (2025). alembica - Open Science Software for Semantic Synthesis and Extraction of Information from Unstructured Sources. Zenodo. https://doi.org/10.5281/zenodo.14899666
