stac-fastapi-elasticsearch-opensearch

Jump to: Project Introduction | Quick Start | Table of Contents

Sponsors & Supporters

The following organizations have contributed time and/or funding to support the development of this project:

Project Introduction - What is SFEOS?

SFEOS (stac-fastapi-elasticsearch-opensearch) is a high-performance, scalable API implementation for serving SpatioTemporal Asset Catalog (STAC) data - an enhanced GeoJSON format designed specifically for geospatial assets like satellite imagery, aerial photography, and other Earth observation data. This project enables organizations to:

Efficiently catalog and search geospatial data such as satellite imagery, aerial photography, DEMs, and other geospatial assets using Elasticsearch or OpenSearch as the database backend
Implement standardized STAC APIs that support complex spatial, temporal, and property-based queries across large collections of geospatial data
Scale to millions of geospatial assets with fast search performance through optimized spatial indexing and query capabilities
Support OGC-compliant filtering including spatial operations (intersects, contains, etc.) and temporal queries
Perform geospatial aggregations to analyze data distribution across space and time

This implementation builds on the STAC-FastAPI framework, providing a production-ready solution specifically optimized for Elasticsearch and OpenSearch databases. It's ideal for organizations managing large geospatial data catalogs who need efficient discovery and access capabilities through standardized APIs.

Common Deployment Patterns

stac-fastapi-elasticsearch-opensearch can be deployed in several ways depending on your needs:

Containerized Application: Run as a Docker container with connections to Elasticsearch/OpenSearch databases
Serverless Function: Deploy as AWS Lambda or similar serverless function with API Gateway
Traditional Server: Run on virtual machines or bare metal servers in your infrastructure
Kubernetes: Deploy as part of a larger microservices architecture with container orchestration

The implementation is flexible and can scale from small local deployments to large production environments serving millions of geospatial assets.

Technologies

This project is built on the following technologies: STAC, stac-fastapi, FastAPI, Elasticsearch, Python, OpenSearch

Table of Contents

Documentation & Resources
Package Structure
Examples
Performance
Quick Start
- Installation
- Running Locally
Configuration Reference
Interacting with the API
Configure the API
Collection Pagination
Ingesting Sample Data CLI Tool
Elasticsearch Mappings
Managing Elasticsearch Indices
- Snapshots
- Reindexing
Auth
Aggregation
Rate Limiting

Documentation & Resources

Online Documentation: https://stac-utils.github.io/stac-fastapi-elasticsearch-opensearch
Source Code: https://github.com/stac-utils/stac-fastapi-elasticsearch-opensearch
API Examples: Postman Documentation - Examples of how to use the API endpoints
Community:
- Gitter Chat - For real-time discussions
- GitHub Discussions - For longer-form questions and answers

Package Structure

This project is organized into several packages, each with a specific purpose:

stac_fastapi_core: Core functionality that's database-agnostic, including API models, extensions, and shared utilities. This package provides the foundation for building STAC API implementations with any database backend. See stac-fastapi-mongo for a working example.
sfeos_helpers: Shared helper functions and utilities used by both the Elasticsearch and OpenSearch backends. This package includes:
- database: Specialized modules for index, document, and database utility operations
- aggregation: Elasticsearch/OpenSearch-specific aggregation functionality
- Shared logic and utilities that improve code reuse between backends
stac_fastapi_elasticsearch: Complete implementation of the STAC API using Elasticsearch as the backend database. This package depends on both stac_fastapi_core and sfeos_helpers.
stac_fastapi_opensearch: Complete implementation of the STAC API using OpenSearch as the backend database. This package depends on both stac_fastapi_core and sfeos_helpers.

Examples

The /examples directory contains several useful examples and reference implementations:

pip_docker: Examples of running stac-fastapi-elasticsearch from PyPI in Docker without needing any code from the repository
auth: Authentication examples including:
- Basic authentication
- OAuth2 with Keycloak
- Route dependencies configuration
rate_limit: Example of implementing rate limiting for API requests
postman_collections: Postman collection files you can import for testing API endpoints

These examples provide practical reference implementations for various deployment scenarios and features.

Performance

Direct Response Mode

The enable_direct_response option is provided by the stac-fastapi core library (introduced in stac-fastapi 5.2.0) and is available in this project starting from v4.0.0.
Control via environment variable: Set ENABLE_DIRECT_RESPONSE=true to enable this feature.
How it works: When enabled, endpoints return Starlette Response objects directly, bypassing FastAPI's default serialization for improved performance.
Important limitation: All FastAPI dependencies (including authentication, custom status codes, and validation) are disabled for all routes when this mode is enabled.
Best use case: This mode is best suited for public or read-only APIs where authentication and custom logic are not required.
Default setting: false for safety.
More information: See issue #347 for background and implementation details.

Quick Start

This section helps you get up and running with stac-fastapi-elasticsearch-opensearch quickly.

Installation

For versions 4.0.0a1 and newer (PEP 625 compliant naming):

pip install stac-fastapi-elasticsearch  # Elasticsearch backend
pip install stac-fastapi-opensearch    # OpenSearch backend
pip install stac-fastapi-core          # Core library

For versions 4.0.0a0 and older:

pip install stac-fastapi.elasticsearch  # Elasticsearch backend
pip install stac-fastapi.opensearch    # OpenSearch backend
pip install stac-fastapi.core          # Core library

Important Note: Starting with version 4.0.0a1, package names have changed from using periods (e.g., stac-fastapi.core) to using hyphens (e.g., stac-fastapi-core) to comply with PEP 625. The internal package structure uses underscores, but users should install with hyphens as shown above. Please update your requirements files accordingly.
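
As a minimal sketch of that update, the snippet below rewrites the old period-style names at the start of a requirements line to the hyphenated form. The function name `migrate_requirement` is hypothetical and not part of this project, and the sketch only covers the three packages listed above.

```python
import re

# Old (4.0.0a0 and earlier) package names mapped to the new hyphenated names.
RENAMED = {
    "stac-fastapi.core": "stac-fastapi-core",
    "stac-fastapi.elasticsearch": "stac-fastapi-elasticsearch",
    "stac-fastapi.opensearch": "stac-fastapi-opensearch",
}

# Match an old name at the start of a requirement line, followed by a version
# specifier, extras, whitespace, a comment, or the end of the line.
_PATTERN = re.compile(
    r"^(\s*)(stac-fastapi\.(?:core|elasticsearch|opensearch))(?=$|[\s=<>!~\[;#])",
    re.IGNORECASE,
)


def migrate_requirement(line: str) -> str:
    """Return the line with an old period-style name replaced, if present."""

    def _swap(match: re.Match) -> str:
        # Keep any leading whitespace and swap in the new package name.
        return match.group(1) + RENAMED[match.group(2).lower()]

    return _PATTERN.sub(_swap, line)
```

Applying the function to each line of a requirements file leaves unrelated lines untouched.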

Running Locally

There are two main ways to run the API locally:

Using Pre-built Docker Images

We provide ready-to-use Docker images through GitHub Container Registry:
- Elasticsearch backend
- OpenSearch backend

Pull and run the images:

# For Elasticsearch backend
docker pull ghcr.io/stac-utils/stac-fastapi-es:latest

# For OpenSearch backend
docker pull ghcr.io/stac-utils/stac-fastapi-os:latest

Using Docker Compose

Prerequisites: Ensure Docker Compose or Podman Compose is installed on your machine.

Start the API:

docker compose up elasticsearch app-elasticsearch

Configuration: By default, Docker Compose uses Elasticsearch 8.x and OpenSearch 2.11.1. To use different versions, create a .env file:
```
ELASTICSEARCH_VERSION=8.11.0
OPENSEARCH_VERSION=2.11.1
ENABLE_DIRECT_RESPONSE=false
```
Compatibility: The most recent Elasticsearch 7.x versions should also work. See the opensearch-py docs for compatibility information.

Configuration Reference

You can customize additional settings in your .env file:

| Variable | Description | Default | Required |
|---|---|---|---|
| `ES_HOST` | Hostname for external Elasticsearch/OpenSearch. | `localhost` | Optional |
| `ES_PORT` | Port for Elasticsearch/OpenSearch. | `9200` (ES) / `9202` (OS) | Optional |
| `ES_USE_SSL` | Use SSL for connecting to Elasticsearch/OpenSearch. | `false` | Optional |
| `ES_VERIFY_CERTS` | Verify SSL certificates when connecting. | `false` | Optional |
| `STAC_FASTAPI_TITLE` | Title of the API in the documentation. | `stac-fastapi-<backend>` | Optional |
| `STAC_FASTAPI_DESCRIPTION` | Description of the API in the documentation. | N/A | Optional |
| `STAC_FASTAPI_VERSION` | API version. | `2.1` | Optional |
| `STAC_FASTAPI_LANDING_PAGE_ID` | Landing page ID. | `stac-fastapi` | Optional |
| `APP_HOST` | Server bind address. | `0.0.0.0` | Optional |
| `APP_PORT` | Server port. | `8080` | Optional |
| `ENVIRONMENT` | Runtime environment. | `local` | Optional |
| `WEB_CONCURRENCY` | Number of worker processes. | `10` | Optional |
| `RELOAD` | Enable auto-reload for development. | `true` | Optional |
| `STAC_FASTAPI_RATE_LIMIT` | API rate limit per client. | `200/minute` | Optional |
| `BACKEND` | Tests-related variable. | `elasticsearch` or `opensearch` based on the backend | Optional |
| `ELASTICSEARCH_VERSION` | Version of Elasticsearch to use. | `8.11.0` | Optional |
| `OPENSEARCH_VERSION` | Version of OpenSearch to use. | `2.11.1` | Optional |
| `ENABLE_DIRECT_RESPONSE` | Enable direct response for maximum performance (disables all FastAPI dependencies, including authentication, custom status codes, and validation). | `false` | Optional |
| `RAISE_ON_BULK_ERROR` | Controls whether bulk insert operations raise exceptions on errors. If set to `true`, the operation will stop and raise an exception when an error occurs. If set to `false`, errors will be logged, and the operation will continue. Note: STAC Item and ItemCollection validation errors will always raise, regardless of this flag. | `false` | Optional |
| `DATABASE_REFRESH` | Controls whether database operations refresh the index immediately after changes. If set to `true`, changes will be immediately searchable. If set to `false`, changes may not be immediately visible but can improve performance for bulk operations. If set to `wait_for`, changes will wait for the next refresh cycle to become visible. | `false` | Optional |
| `ENABLE_TRANSACTIONS_EXTENSIONS` | Enables or disables the Transactions and Bulk Transactions API extensions. If set to `false`, the POST `/collections` route and related transaction endpoints (including bulk transaction operations) will be unavailable in the API. This is useful for deployments where mutating the catalog via the API should be prevented. | `true` | Optional |

Note

The variables ES_HOST, ES_PORT, ES_USE_SSL, and ES_VERIFY_CERTS apply to both Elasticsearch and OpenSearch backends, so there is no need to rename the key names to OS_ even if you're using OpenSearch.
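
To illustrate how these connection variables and their defaults fit together, here is a minimal sketch that reads them into a small settings object. It is not the project's own settings code: the names `BackendSettings`, `load_backend_settings`, and `_as_bool` are hypothetical, and the set of strings treated as true is an assumption.

```python
import os
from dataclasses import dataclass


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings (assumed set: 1, true, yes, on)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class BackendSettings:
    host: str
    port: int
    use_ssl: bool
    verify_certs: bool


def load_backend_settings(env=None) -> BackendSettings:
    """Build settings from ES_* variables, falling back to the table defaults."""
    env = os.environ if env is None else env
    return BackendSettings(
        host=env.get("ES_HOST", "localhost"),
        # 9200 is the documented Elasticsearch default; OpenSearch uses 9202.
        port=int(env.get("ES_PORT", "9200")),
        use_ssl=_as_bool(env.get("ES_USE_SSL", "false")),
        verify_certs=_as_bool(env.get("ES_VERIFY_CERTS", "false")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the same function usable for both backends, since the `ES_` keys are shared.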

Interacting with the API

Creating a Collection:

curl -X "POST" "http://localhost:8080/collections" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "id": "my_collection"
}'

Adding an Item to a Collection:

curl -X "POST" "http://localhost:8080/collections/my_collection/items" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d @item.json

Searching for Items:

curl -X "POST" "http://localhost:8080/search" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "collections": ["my_collection"],
  "limit": 10
}'

Filtering by Bbox:

curl -X "POST" "http://localhost:8080/search" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "collections": ["my_collection"],
  "bbox": [-180, -90, 180, 90]
}'

Filtering by Datetime:

curl -X "POST" "http://localhost:8080/search" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "collections": ["my_collection"],
  "datetime": "2020-01-01T00:00:00Z/2020-12-31T23:59:59Z"
}'
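
For readers who prefer Python, a search with a JSON body can be sketched with the standard library alone. The helper name `build_search_request` is hypothetical, the base URL is the local default used in this section, and the request is sent as POST because that is how the STAC API specification defines searches with a JSON body.

```python
import json
import urllib.request


def build_search_request(base_url, collections, bbox=None, datetime=None, limit=10):
    """Build (but do not send) a POST /search request with a JSON body."""
    body = {"collections": list(collections), "limit": limit}
    if bbox is not None:
        body["bbox"] = list(bbox)  # [west, south, east, north]
    if datetime is not None:
        body["datetime"] = datetime  # single instant or "start/end" interval
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With a running instance, passing the returned object to `urllib.request.urlopen` sends the search, and the response body can be decoded with `json.load`.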

Configure the API

API Title and Description: By default set to stac-fastapi-<backend>. Customize these by setting:
- STAC_FASTAPI_TITLE: Changes the API title in the documentation
- STAC_FASTAPI_DESCRIPTION: Changes the API description in the documentation
Database Indices: By default, the API reads from and writes to:
- collections index for collections
- items_<collection name> indices for items
- Customize with STAC_COLLECTIONS_INDEX and STAC_ITEMS_INDEX_PREFIX environment variables
Root Path Configuration: The application root path is the base URL by default.
- For AWS Lambda with API Gateway: Set STAC_FASTAPI_ROOT_PATH to match the API Gateway stage name (e.g., /v1)

Collection Pagination

Overview: The collections route supports pagination through optional query parameters.
Parameters:
- limit: Controls the number of collections returned per page
- token: Used to retrieve subsequent pages of results
Response Structure: The links field in the response contains a next link with the token for the next page of results.

Example Usage:

curl -X "GET" "http://localhost:8080/collections?limit=1&token=example_token"
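
A client can walk through all pages by following the next link until it disappears. The sketch below assumes the next link carries the token as a `token` query parameter, which the source implies but does not spell out; `iter_collections` and `fetch_page` are hypothetical names.

```python
from urllib.parse import parse_qs, urlparse


def iter_collections(fetch_page, limit=10):
    """Yield every collection by following `next` links until none remain.

    `fetch_page(limit, token)` is a caller-supplied function that returns the
    parsed JSON of GET /collections for the given query parameters.
    """
    token = None
    while True:
        page = fetch_page(limit, token)
        yield from page.get("collections", [])
        next_href = next(
            (link["href"] for link in page.get("links", []) if link.get("rel") == "next"),
            None,
        )
        if next_href is None:
            return
        # Assumption: the token is exposed as a `token` query parameter.
        token = parse_qs(urlparse(next_href).query).get("token", [None])[0]
        if token is None:
            return
```

Keeping the HTTP call behind `fetch_page` lets the same loop work with any HTTP client and makes it easy to exercise without a running server.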

Ingesting Sample Data CLI Tool

Overview: The data_loader.py script provides a convenient way to load STAC items into the database.

Usage:

python3 data_loader.py --base-url http://localhost:8080

Options:

--base-url TEXT       Base URL of the STAC API  [required]
--collection-id TEXT  ID of the collection to which items are added
--use-bulk            Use bulk insert method for items
--data-dir PATH       Directory containing collection.json and feature
                      collection file
--help                Show this message and exit.

Example Workflows:

Loading Sample Data:

python3 data_loader.py --base-url http://localhost:8080

Loading Data to a Specific Collection:

python3 data_loader.py --base-url http://localhost:8080 --collection-id my-collection

Using Bulk Insert for Performance:

python3 data_loader.py --base-url http://localhost:8080 --use-bulk

Elasticsearch Mappings

Overview: Mappings apply to the search index, not to the source data. They define how documents and their fields are stored and indexed.
Implementation:
- Mappings are stored in index templates that are created on application startup
- These templates are automatically applied when creating new Collection and Item indices
- The sfeos_helpers package contains shared mapping definitions used by both Elasticsearch and OpenSearch backends
Customization: Custom mappings can be defined by extending the base mapping templates.

Managing Elasticsearch Indices

Snapshots

Overview: Snapshots provide a way to back up and restore your indices.

Creating a Snapshot Repository:

curl -X "PUT" "http://localhost:9200/_snapshot/my_fs_backup" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "type": "fs",
  "settings": {
    "location": "/usr/share/elasticsearch/snapshots/my_fs_backup"
  }
}'

- This creates a snapshot repository that stores files in the elasticsearch/snapshots directory in your clone of this Git repository
- The elasticsearch.yml and compose files map that directory to /usr/share/elasticsearch/snapshots within the Elasticsearch container and grant permissions for using it

Creating a Snapshot:

curl -X "PUT" "http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2?wait_for_completion=true" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "metadata": {
    "taken_because": "dump of all items",
    "taken_by": "pvarner"
  },
  "include_global_state": false,
  "ignore_unavailable": false,
  "indices": "items_my-collection"
}'

- This creates a snapshot named my_snapshot_2 and waits for the action to complete before returning
- This can also be done asynchronously by omitting the wait_for_completion parameter and querying the snapshot status later
- The indices parameter determines which indices are snapshotted, and can include wildcards

Viewing Snapshots:

# View a specific snapshot
curl http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2

# View all snapshots
curl http://localhost:9200/_snapshot/my_fs_backup/_all

- These commands allow you to check the status and details of your snapshots

Restoring a Snapshot:

curl -X "POST" "http://localhost:9200/_snapshot/my_fs_backup/my_snapshot_2/_restore?wait_for_completion=true" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "include_aliases": false,
  "include_global_state": false,
  "ignore_unavailable": true,
  "rename_replacement": "items_$1-copy",
  "indices": "items_*",
  "rename_pattern": "items_(.+)"
}'

- This command restores every index that matches items_* and renames it so that the new index name is suffixed with -copy
- The rename_pattern and rename_replacement parameters allow you to restore indices under new names
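
To preview what a given rename_pattern and rename_replacement pair would produce, the substitution can be approximated locally. This sketch uses Python's `re.sub` as a stand-in for the rename that Elasticsearch performs, translating `$1`-style group references to Python's syntax; the function name `preview_restored_name` is hypothetical, and edge cases may differ between the two regex engines.

```python
import re


def preview_restored_name(index_name, rename_pattern, rename_replacement):
    """Approximate the index name a restore would produce for one index."""
    # Convert Elasticsearch-style group references ($1) to Python style (\g<1>).
    python_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, python_replacement, index_name)


print(preview_restored_name("items_my-collection", "items_(.+)", "items_$1-copy"))
```

Index names that do not match the pattern are returned unchanged.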

Updating Collection References:

curl -X "POST" "http://localhost:9200/items_my-collection-copy/_update_by_query" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "query": {
    "match_all": {}
  },
  "script": {
    "lang": "painless",
    "params": {
      "collection": "my-collection-copy"
    },
    "source": "ctx._source.collection = params.collection"
  }
}'

- After restoring, the item documents exist in the new index (e.g., items_my-collection-copy), but the collection field in those documents still holds the original value, my-collection
- This command updates these values to match the new collection name using Elasticsearch's Update By Query feature

Creating a New Collection:
```
curl -X "POST" "http://localhost:8080/collections" \
     -H 'Content-Type: application/json' \
     -d $'{
  "id": "my-collection-copy"
}'
```
- The final step is to create a new collection through the API with the new name for each of the restored indices
- This gives you a copy of the collection that has a resource URI (/collections/my-collection-copy) and can be correctly queried by collection name

Reindexing

Overview: Reindexing allows you to copy documents from one index to another, optionally transforming them in the process.
Use Cases:
- Apply changes to documents
- Correct dynamically generated mappings
- Transform data (e.g., lowercase identifiers)
Note: The index templates ensure that manually created indices also have the correct mappings and settings.

Example: Reindexing with Transformation:

curl -X "POST" "http://localhost:9200/_reindex" \
  -H 'Content-Type: application/json' \
  -d $'{
    "source": {
      "index": "items_my-collection-lower_my-collection-hex-000001"
    }, 
    "dest": {
      "index": "items_my-collection-lower_my-collection-hex-000002"
    },
    "script": {
      "source": "ctx._source.id = ctx._source.id.toLowerCase()",
      "lang": "painless"
    }
  }'

- In this example, we make a copy of an existing Item index but change the Item identifier to be lowercase
- The script parameter allows you to transform documents during the reindexing process

Updating Aliases:

curl -X "POST" "http://localhost:9200/_aliases" \
  -H 'Content-Type: application/json' \
  -d $'{
    "actions": [
      {
        "remove": {
          "index": "*",
          "alias": "items_my-collection"
        }
      },
      {
        "add": {
          "index": "items_my-collection-lower_my-collection-hex-000002",
          "alias": "items_my-collection"
        }
      }
    ]
  }'

- If you are happy with the data in the newly created index, you can move the alias items_my-collection to the new index
- This makes the modified Items with lowercase identifiers visible to users accessing my-collection in the STAC API
- Using aliases allows you to switch between different index versions without changing the API endpoint
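
When the alias swap is scripted, the request body can be generated rather than written by hand. The sketch below builds the same two-action body shown in the curl command; `build_alias_swap` is a hypothetical helper name. Because both actions travel in a single /_aliases request, Elasticsearch applies them atomically, so the alias never points at nothing.

```python
import json


def build_alias_swap(alias, new_index):
    """Return the /_aliases body that moves `alias` to `new_index`."""
    return {
        "actions": [
            # Detach the alias from every index that currently carries it.
            {"remove": {"index": "*", "alias": alias}},
            # Attach the alias to the newly created index.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


body = build_alias_swap(
    "items_my-collection",
    "items_my-collection-lower_my-collection-hex-000002",
)
print(json.dumps(body, indent=2))
```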

Auth

Overview: Authentication is an optional feature that can be enabled through Route Dependencies.
Implementation Options:
- Basic authentication
- OAuth2 with Keycloak
- Custom route dependencies
Configuration: Authentication can be configured using the STAC_FASTAPI_ROUTE_DEPENDENCIES environment variable.
Examples and Documentation: Detailed examples and implementation guides can be found in the examples/auth directory.

Aggregation

Supported Aggregations:
- Spatial aggregations of points and geometries
- Frequency distribution aggregation of any property including dates
- Temporal distribution of datetime values
Endpoint Locations:
- Root Catalog level: /aggregations
- Collection level: /collections/<collection_id>/aggregations
Implementation Details: The sfeos_helpers.aggregation package provides specialized functionality for both Elasticsearch and OpenSearch backends.
Documentation: Detailed information about supported aggregations can be found in the aggregation docs.

Rate Limiting

Overview: Rate limiting is an optional security feature that limits how frequently each remote address can send requests to the API.
Configuration: Enabled by setting the STAC_FASTAPI_RATE_LIMIT environment variable:
```
STAC_FASTAPI_RATE_LIMIT=500/minute
```
Functionality:
- Limits each client to a specified number of requests per time period (e.g., 500 requests per minute)
- Helps prevent API abuse and maintains system stability
- Ensures fair resource allocation among all clients
Examples: Implementation examples are available in the examples/rate_limit directory.
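
To make the `<count>/<period>` format concrete, the following sketch parses a limit string and applies it with a simple fixed-window counter keyed by remote address. It is an illustration of the idea only, not the limiter this project uses; `parse_rate_limit` and `FixedWindowLimiter` are hypothetical names, and the supported period names are an assumption.

```python
import time
from collections import defaultdict

# Assumed period names; the real limiter may accept more forms.
_PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate_limit(value):
    """Split a string such as '500/minute' into (max_requests, window_seconds)."""
    count, period = value.strip().split("/", 1)
    return int(count), _PERIOD_SECONDS[period.strip().lower()]


class FixedWindowLimiter:
    """Allow at most `max_requests` per client address in each time window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._counts = defaultdict(int)  # (address, window index) -> count

    def allow(self, address):
        """Record one request from `address` and report whether it is allowed."""
        window = int(self.clock() // self.window_seconds)
        key = (address, window)
        if self._counts[key] >= self.max_requests:
            return False
        self._counts[key] += 1
        return True
```

The sketch never evicts old windows, so a long-running service would need cleanup or a different algorithm such as a sliding window.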
