Agent-K is a novel framework for extracting mathematically complex properties from unstructured documents using large language models (LLMs). Unlike standard batch-extraction approaches based on structured-output prompting, which often fail on multi-step numerical reasoning, Agent-K decomposes the task into three stages: extracting intermediate facts, generating and executing Python code via a ReAct agent with self-reflection, and validating outputs against inter-property constraints. On a benchmark built from real-world NI 43-101 mineral reports, Agent-K significantly reduces error (sMAPE -22.1%) and improves accuracy (pass@1 +15.8%) over baselines, and it further generalizes to the financial domain (FinQA), improving pass@1 accuracy by up to 11.1% in a zero-shot setting. Our empirical results show that Agent-K is a robust framework for structured data extraction that does not rely on the availability of structured-output APIs.
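For intuition, here is a toy sketch of the three-stage flow. All function names, numbers, and the example property are illustrative placeholders, not the repository's actual API:

```python
# Toy illustration of Agent-K's three stages; placeholder logic only.

def extract_intermediate_facts(document: str) -> dict:
    # Stage 1: an LLM extracts the raw facts the target property depends on.
    return {"ore_tonnage_t": 12_500_000, "gold_grade_g_per_t": 1.8}

def react_agent_compute(facts: dict) -> dict:
    # Stage 2: a ReAct agent generates and executes Python code over the
    # facts, self-reflecting on execution errors before retrying.
    grams = facts["ore_tonnage_t"] * facts["gold_grade_g_per_t"]
    return {"contained_gold_oz": grams / 31.1035}  # grams -> troy ounces

def validate(outputs: dict) -> dict:
    # Stage 3: enforce inter-property constraints before accepting values.
    assert outputs["contained_gold_oz"] >= 0
    return outputs

facts = extract_intermediate_facts("...NI 43-101 report text...")
print(validate(react_agent_compute(facts)))  # ~723,391 oz
```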
- 2025-08: The first version of Agent-K is released!
- Python 3.12 or higher
- Docker (for running the code interpreter tool)
- `uv` package manager (recommended for Python dependency management)
- Environment variables:
  - `OPENAI_API_KEY`: OpenAI API token
  - `HF_TOKEN`: HuggingFace token
- Clone the repository
- Install the `uv` package manager
- Create and activate a virtual environment
- Install dependencies using either `uv sync` or `pip install -r requirements.txt`
- Build the Docker image for the code interpreter using `make build`
- Run the code interpreter Docker container using `make run`
- Configure API tokens by renaming the `.env.example` file to `.env` and adding your API tokens:
  - `OPENAI_API_KEY`: needed for running OpenAI models (e.g. `gpt-4o-mini-2024-07-18`)
  - `HF_TOKEN`: needed for running open-source models (e.g. `Qwen/Qwen3-30B-A3B`)
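For reference, the resulting `.env` file should look like the following (the token values shown are placeholders):

```
OPENAI_API_KEY=sk-...
HF_TOKEN=hf_...
```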
The FinQA experiments test Agent-K on the FinQA financial question answering dataset.
- Run Agent-K predictions on the FinQA test set:

  ```bash
  uv run python src/experiments/fin_qa/fin_qa_pred.py
  ```

- Evaluate the results:

  ```bash
  uv run python src/experiments/fin_qa/fin_qa_eval.py
  ```
Run the batch extraction experiments (the long-context and RAG-based variants are run separately):

```bash
uv run python src/experiments/batch_extraction.py
```

Configure batch extraction settings in `src/config/experiment_config.py`:

- `BATCH_EXTRACTION_MODEL`: model to use
- `MAX_NUM_RETRIEVED_DOCS`: number of documents to retrieve in RAG-based batch extraction
- `BATCH_METHOD`: choose between `LONG_CONTEXT` and `RAG_BASED`
- `BATCH_EXTRACTION_SAMPLE_SIZE`: set to `None` for the full dataset, or specify a number of samples
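For example, a minimal configuration might look like the sketch below. The values are illustrative; check `experiment_config.py` for how `BATCH_METHOD` is actually represented (e.g. as a string or an enum):

```python
# src/config/experiment_config.py -- illustrative values only
BATCH_EXTRACTION_MODEL = "gpt-4o-mini-2024-07-18"
MAX_NUM_RETRIEVED_DOCS = 10          # used only for RAG-based extraction
BATCH_METHOD = "LONG_CONTEXT"        # or "RAG_BASED"
BATCH_EXTRACTION_SAMPLE_SIZE = None  # None = full dataset
```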
To run TAT-LLM experiments:

- Configure the extraction method in `src/config/experiment_config.py`:

  ```python
  PDF_EXTRACTION_METHOD = ExtractionMethod.TAT_LLM
  ```

- Configure the sample size and evaluation set:

  ```python
  PDF_EXTRACTION_SAMPLE_SIZE = None  # None for all 50 PDFs, or specify number
  PDF_EXTRACTION_EVAL_TYPE = "FULL"
  TAT_LLM_MODEL = "gpt-4o-mini-2024-07-18"
  TAT_LLM_TEMPERATURE = 0.2
  ```

- Run the TAT-LLM extraction:

  ```bash
  uv run python src/experiments/multi_method_extraction.py
  ```
To run Self-RAG experiments:

- Configure the extraction method in `src/config/experiment_config.py`:

  ```python
  PDF_EXTRACTION_METHOD = ExtractionMethod.SELF_RAG
  ```

- Configure the sample size and evaluation set:

  ```python
  PDF_EXTRACTION_SAMPLE_SIZE = None  # None for all 50 PDFs, or specify number
  PDF_EXTRACTION_EVAL_TYPE = "FULL"
  SELF_RAG_MODEL = "gpt-4o-mini-2024-07-18"
  SELF_RAG_TEMPERATURE = 0.2
  ```

- Run the Self-RAG extraction:

  ```bash
  uv run python src/experiments/multi_method_extraction.py
  ```
To run Agent-K experiments:

- Configure the extraction method in `src/config/experiment_config.py`:

  ```python
  PDF_EXTRACTION_METHOD = ExtractionMethod.AGENT_K
  ```

- Configure the sample size and evaluation set:

  ```python
  PDF_EXTRACTION_SAMPLE_SIZE = None  # None for all 50 PDFs, or specify number
  PDF_EXTRACTION_EVAL_TYPE = "FULL"
  AGENT_K_MODEL = "gpt-4o-mini-2024-07-18"
  AGENT_K_TEMPERATURE = 0.2
  MAX_REFLECTION_ITERATIONS = 5
  ```

- Run the Agent-K extraction:

  ```bash
  uv run python src/experiments/multi_method_extraction.py
  ```
After running experiments, results will be stored in `data/experiments/` under the corresponding method directory. To evaluate the results:

- Copy the experiment result path into `src/eval.py`:
  - Find the result files under `data/experiments/` (e.g. `data/experiments/agent_k/agent-k_2025-08-29_11-55-40.csv`).
  - Add the result path to the `agent_extractions` list in `src/eval.py`. You can also add multiple result paths to the list to calculate pass@k scores (see the sketch after this list).
- Run the evaluation:

  ```bash
  uv run python src/eval.py
  ```

- The evaluation will output two files:
  - `pdf_extraction_metrics_<timestamp>.csv`: aggregated metrics (absolute mean error, R-squared, sMAPE, pass@1) for each complex numerical property, plus the average of all metrics.
  - `df_merged_<timestamp>.csv`: mineral-report-level metrics.
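A sketch of the `agent_extractions` list (the path shown is the example above; your result filenames will differ):

```python
# src/eval.py -- example paths; your result filenames will differ
agent_extractions = [
    "data/experiments/agent_k/agent-k_2025-08-29_11-55-40.csv",
    # add further runs of the same method here to compute pass@k
]
```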
To understand the contribution of different components, run the ablation tests:

```bash
uv run python src/experiments/ablation_tests.py
```

This will test various configurations with components disabled to measure their impact. The output will be saved in `data/experiments/ablation_tests/` under the corresponding variant directory.
The parameter search script tests different combinations of key parameters to find the configuration that yields the best extraction performance. It evaluates three main parameters:

- Max reflection iterations: the maximum number of self-reflection iterations before falling back to self-consistency.
- Temperature: the LLM sampling temperature.
- Number of retrieved documents: the number of documents retrieved as context for each complex numerical property.
To find optimal hyperparameters for your specific use case:
- Configure the search parameters in `src/experiments/parameter_search/parameter_search_config.yaml`:
  - Model: which model to use (default: `gpt-4o-mini`)
  - Sample size: number of PDFs to process per experiment (default: 5)
  - Evaluation set: `DEV`, `TEST`, or `FULL` dataset
  - Parameter values: specific values to test for each parameter
- Run the parameter search:

  ```bash
  uv run python src/experiments/parameter_search/parameter_search.py
  ```

- Visualize the results. You can also specify the weights for the composite metric, which is calculated as $\alpha \times (1 - \text{sMAPE}) + \beta \times \text{pass@1}$, where $\alpha$ and $\beta$ are configurable weights (default: 0.5 each). A worked example follows this list.

  ```bash
  # Prioritize sMAPE over pass@1 (alpha=0.8, beta=0.2)
  uv run python src/experiments/parameter_search/visualize_parameter_search.py --alpha 0.8 --beta 0.2
  ```
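For instance, a run with sMAPE = 0.20 and pass@1 = 0.70 scores $0.8 \times 0.80 + 0.2 \times 0.70 = 0.78$ under $\alpha = 0.8$, $\beta = 0.2$. A minimal sketch of the computation:

```python
def composite_metric(smape: float, pass_at_1: float,
                     alpha: float = 0.5, beta: float = 0.5) -> float:
    """Composite score used to rank parameter-search runs (higher is better)."""
    return alpha * (1 - smape) + beta * pass_at_1

print(composite_metric(smape=0.20, pass_at_1=0.70, alpha=0.8, beta=0.2))  # ~0.78
```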
This project is licensed under the MIT License - see the LICENSE file for details.