Language Interface for Maps & WRI/LCL data APIs.
The core of this project is an LLM-powered agent that drives the conversations for Global Nature Watch. The project is fully open source and can be run locally with the appropriate keys for accessing external services.
Our agent is a simple ReAct agent implemented in LangGraph. It uses a set of tools that, at a high level, do the following:
- Provide information about its capabilities
- Retrieve areas of interest
- Select appropriate datasets
- Retrieve statistics from the WRI Analytics API
- Generate insights including charts from the data
The LLM is plug and play; we rely mostly on Sonnet and Gemini for planning and tool calling.
For detailed technical architecture, see Agent Architecture Documentation.
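To give a feel for the overall shape, here is a minimal, hypothetical sketch of a LangGraph ReAct agent with a single illustrative tool. The tool, prompt, and model choice are placeholders for illustration, not the project's actual implementation.

```python
# Minimal sketch of a LangGraph ReAct agent with one illustrative tool.
# The tool, model, and query below are placeholders, not project-zeno's code.
from langchain_core.tools import tool
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent


@tool
def get_area_of_interest(name: str) -> str:
    """Look up an area of interest by name (placeholder implementation)."""
    return f"Found AOI '{name}' with id aoi-123"


llm = ChatAnthropic(model="claude-3-5-sonnet-latest")
agent = create_react_agent(llm, tools=[get_area_of_interest])

result = agent.invoke(
    {"messages": [("user", "What protected areas are in the Amazon basin?")]}
)
print(result["messages"][-1].content)
```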
To enable this, the project relies on a set of services deployed alongside it:
- eoAPI to provide access to the LCL data through a STAC catalog and to serve tiles
- Langfuse for tracing agent interactions
- PostgreSQL for the API data and geographic search of AOIs
- FastAPI deployment for the API
All of these services are managed and deployed through our deploy repository at project-zeno-deploy.
The frontend application for this project is a Next.js project that can be found at project-zeno-next.
- uv
- postgresql (for using local DB instead of docker)
- docker
We use uv for package management and docker-compose for running the system locally.
- Clone and setup:

  ```bash
  git clone git@github.com:wri/project-zeno.git
  cd project-zeno
  uv sync
  source .venv/bin/activate
  ```
- Environment configuration:

  ```bash
  cp .env.example .env              # Edit .env with your API keys and credentials
  cp .env.local.example .env.local  # .env.local contains local development overrides (auto-created by make commands)
  ```
- Build the dataset RAG database:

  Our agent uses a RAG database to select datasets. The RAG database can be built locally with:

  ```bash
  uv run python src/ingest/embed_datasets.py
  ```

  Alternatively, the current production table can be retrieved from S3 if you have the corresponding access permissions:

  ```bash
  aws s3 sync s3://zeno-static-data/ data/
  ```
- Start infrastructure services:

  ```bash
  make up  # Start Docker services (PostgreSQL + Langfuse + ClickHouse)
  ```

- Ingest data (required after starting the database):

  After starting the database and infrastructure services, you need to ingest the required datasets. Feel free to run all of them or just the ones you need. This downloads ~2 GB of data per dataset, except for WDPA which is ~10 GB; it's fine to skip WDPA if you don't need it. Make sure you're set up with WRI AWS credentials in your `.env` file to access the S3 bucket.

  ```bash
  python src/ingest/ingest_gadm.py
  python src/ingest/ingest_kba.py
  python src/ingest/ingest_landmark.py
  python src/ingest/ingest_wdpa.py
  ```

  See the `src/ingest/` directory for details on each ingestion script.
- Start application services:

  ```bash
  make api       # Run API locally (port 8000)
  make frontend  # Run Streamlit frontend (port 8501)
  ```

  Or start everything at once (after data ingestion):

  ```bash
  make dev  # Starts API + frontend (requires infrastructure already running)
  ```
- Set up local Langfuse:

  a. Clone the Langfuse repository outside your current project directory:

  ```bash
  cd ..
  git clone https://github.com/langfuse/langfuse.git
  cd langfuse
  ```

  b. Start the Langfuse server:

  ```bash
  docker compose up -d
  ```

  c. Access the Langfuse UI at http://localhost:3000:

  - Create an account
  - Create a new project
  - Copy the API keys from your project settings

  d. Return to your project directory and update your `.env.local` file:

  ```bash
  cd ../project-zeno

  # Update these values in your .env.local file:
  LANGFUSE_HOST=http://localhost:3000
  LANGFUSE_PUBLIC_KEY=your_public_key_here
  LANGFUSE_SECRET_KEY=your_secret_key_here
  ```
- Access the application:
  - Frontend: http://localhost:8501
  - API: http://localhost:8000
  - Langfuse: http://localhost:3000
```bash
make help      # Show all available commands
make up        # Start Docker infrastructure
make down      # Stop Docker infrastructure
make api       # Run API with hot reload
make frontend  # Run frontend with hot reload
make dev       # Start full development environment
```

Running `make up` will bring up a `zeno-db_test` database that's used by pytest. The tests look for a `TEST_DATABASE_URL` environment variable (also set in `.env.local`). You can also create the database manually with the following commands:
```bash
createuser -s postgres  # if you don't have a postgres user
createdb -U postgres zeno-data_test
```

Then run the API tests using pytest:

```bash
uv run pytest tests/api/
```

For user administration commands (making users admin, whitelisting emails), see the CLI Documentation.
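As an illustration only (the project's actual fixtures in `tests/api/` may be organized differently), a test could pick up `TEST_DATABASE_URL` roughly like this, assuming pytest-asyncio and SQLAlchemy's async engine:

```python
# Hypothetical example of how a test might use TEST_DATABASE_URL.
# Requires pytest-asyncio; see tests/api/ for the project's real fixtures.
import os

import pytest
from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine


@pytest.mark.asyncio
async def test_database_is_reachable():
    url = os.environ["TEST_DATABASE_URL"]  # set in .env.local
    engine = create_async_engine(url)
    async with engine.connect() as conn:
        assert (await conn.execute(text("SELECT 1"))).scalar() == 1
    await engine.dispose()
```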
- `.env` - Base configuration (production settings)
- `.env.local` - Local development overrides (auto-created)
The system automatically loads .env first, then overrides with .env.local for local development.
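Conceptually, the precedence works like the following sketch using python-dotenv; the project's own settings loading may differ in detail:

```python
# Sketch of the .env / .env.local precedence using python-dotenv.
# Values in .env.local win because it is loaded last with override=True.
from dotenv import load_dotenv

load_dotenv(".env")                       # base configuration
load_dotenv(".env.local", override=True)  # local development overrides
```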
```bash
uv run streamlit run src/frontend/app.py
```

- Using docker:

  ```bash
  docker compose up -d
  uv run streamlit run frontend/app.py
  ```
- Using postgresql:

  a. Create a new database:

  ```bash
  createuser -s postgres  # if you don't have a postgres user
  createdb -U postgres zeno-data-local
  alembic upgrade head

  # Check if you have the database running
  psql zeno-data-local

  # Check if you have the tables created
  \dt
  # Output
  #            List of relations
  #  Schema |      Name       | Type  |  Owner
  # --------+-----------------+-------+----------
  #  public | alembic_version | table | postgres
  #  public | threads         | table | postgres
  #  public | users           | table | postgres
  ```

  b. Add the database URL to the `.env` file:

  ```
  DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/zeno-data-local
  ```
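The `postgresql+asyncpg` scheme points SQLAlchemy's async engine at the asyncpg driver. A quick connectivity check might look like the following sketch (not project code), assuming the `DATABASE_URL` above:

```python
# Quick connectivity check for the local database (sketch, assuming
# SQLAlchemy's asyncio engine with the asyncpg driver).
import asyncio

from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine

DATABASE_URL = "postgresql+asyncpg://postgres:postgres@localhost:5432/zeno-data-local"


async def main() -> None:
    engine = create_async_engine(DATABASE_URL)
    async with engine.connect() as conn:
        print((await conn.execute(text("SELECT version()"))).scalar())
    await engine.dispose()


asyncio.run(main())
```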
```bash
docker compose up langfuse-server
```

(or just spin up the whole backend with `docker compose up`)

- Open your browser and navigate to http://localhost:3000 to create a Langfuse account.
- Within the Langfuse UI, create an organization and then a project.
- Copy the API keys (public and secret) generated for your project.
- Update the `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` environment variables in your `docker-compose.yml` file with the copied keys.
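With the keys in place, traces are produced by attaching a Langfuse callback handler to the agent's LLM calls. The rough sketch below assumes the Langfuse Python SDK v2 style LangChain integration; import paths and key handling differ across SDK versions, so treat it as illustrative only.

```python
# Rough sketch: attach a Langfuse callback handler so agent runs show up in
# the local Langfuse UI. Assumes the v2-style LangChain integration; newer
# SDK versions expose the handler under a different import path.
import os

from langfuse.callback import CallbackHandler

langfuse_handler = CallbackHandler(
    host=os.environ["LANGFUSE_HOST"],
    public_key=os.environ["LANGFUSE_PUBLIC_KEY"],
    secret_key=os.environ["LANGFUSE_SECRET_KEY"],
)

# Pass the handler when invoking a LangChain/LangGraph runnable, e.g.:
# agent.invoke({"messages": [...]}, config={"callbacks": [langfuse_handler]})
```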
After syncing the data, use the latest version of the zeno data clean csv file to create embeddings that are used for looking up datasets based on queries.
The latest csv reference file currently is:

```bash
aws s3 cp s3://zeno-static-data/zeno_data_clean_v2.csv data/
```

Then run:

```bash
python src/ingest/embed_datasets.py
```

This will update the local database at `data/zeno-docs-openai-index`.
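As a rough illustration of what the embedding step enables (not the actual code in `embed_datasets.py` or the on-disk index format), dataset lookup boils down to embedding dataset descriptions and ranking them against an embedded question:

```python
# Illustrative only: embed dataset descriptions and rank them against a user
# query with cosine similarity. The real index lives in data/zeno-docs-openai-index
# and is built by src/ingest/embed_datasets.py; the rows below are made up.
import numpy as np
from langchain_openai import OpenAIEmbeddings  # assumes OPENAI_API_KEY is set

datasets = [
    {"id": "tree_cover_loss", "description": "Annual tree cover loss from satellite data"},
    {"id": "protected_areas", "description": "Boundaries of protected areas worldwide"},
]

embedder = OpenAIEmbeddings()
doc_vectors = np.array(embedder.embed_documents([d["description"] for d in datasets]))


def select_dataset(query: str) -> dict:
    """Return the dataset whose description is most similar to the query."""
    q = np.array(embedder.embed_query(query))
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return datasets[int(scores.argmax())]


print(select_dataset("Where was forest lost last year?"))
```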