allenai/lighthouse

Lighthouse

Fast and precise distance to shoreline calculations from anywhere on earth (AoE). See arXiv for details.

Key Features:

  • 10-meter resolution land/water classification
  • millisecond distance-to-coast calculations from anywhere on earth
  • global coverage
  • includes inland bodies of water such as rivers, lakes, and bays

Requirements

  • Docker 24.0+
  • 1 CPU
  • 2GB RAM

For streaming/real-time use cases, it is recommended to download the entire dataset (500 GB) to disk.

Quick Start

docker pull ghcr.io/allenai/lighthouse
docker run -d \
  --name lighthouse \
  -p 8000:8000 \
  -v /path/to/data:/data \
  ghcr.io/allenai/lighthouse

See Installation for instructions on downloading the dataset from Google Cloud Storage.

Example Usage

import requests

response = requests.post(
    "http://localhost:8000/detect",
    json={"lat": 47.636895, "lon": -122.334984},
    timeout=30
)

print(response.json())

Expected output:

{
  "distance_to_coast_m": 275,
  "land_cover_class": "Permanent water bodies",
  "nearest_coastal_point": [47.63742, -122.33858],
  "version": "2024-11-12T00:25:16.667195"
}
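As a quick sanity check on the response above, the reported distance can be recomputed from the query point and the returned `nearest_coastal_point`. The sketch below is illustrative only: `CoastResult`, `parse_result`, and `haversine_m` are hypothetical helper names, not part of Lighthouse. It assumes the point is ordered `[lat, lon]` (as the example suggests) and a spherical Earth of radius 6,371 km.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # assumption: mean spherical Earth radius


@dataclass(frozen=True)
class CoastResult:
    """Hypothetical container for the fields shown in the example response."""
    distance_m: float
    land_cover: str
    nearest_point: tuple[float, float]  # assumed (lat, lon) order


def parse_result(payload: dict) -> CoastResult:
    """Convert the JSON payload from /detect into a typed record."""
    lat, lon = payload["nearest_coastal_point"]
    return CoastResult(
        distance_m=float(payload["distance_to_coast_m"]),
        land_cover=payload["land_cover_class"],
        nearest_point=(lat, lon),
    )


def haversine_m(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


# Payload copied from the expected output above (no server needed).
EXAMPLE = {
    "distance_to_coast_m": 275,
    "land_cover_class": "Permanent water bodies",
    "nearest_coastal_point": [47.63742, -122.33858],
    "version": "2024-11-12T00:25:16.667195",
}

result = parse_result(EXAMPLE)
recomputed = haversine_m((47.636895, -122.334984), result.nearest_point)
print(f"reported={result.distance_m:.0f} m, recomputed={recomputed:.0f} m")
```

The recomputed value should land within a few metres of the reported 275 m; small differences are expected from rounding of the returned coordinates and the choice of Earth radius.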

Installation

Dataset Download

Note that the full dataset requires approximately 500 GB of storage space. The dataset is stored in a public Google Cloud Storage bucket at:

gs://ai2-coastlines/v1/data
mkdir -p data
# using gcloud (see: https://cloud.google.com/sdk/docs/install)
gcloud storage cp --recursive gs://ai2-coastlines/v1/data /path/to/local/data

# using gsutil (see https://cloud.google.com/storage/docs/gsutil_install)
gsutil -m cp -r gs://ai2-coastlines/v1/data /path/to/local/data

# using wget
wget -r -np -nH --cut-dirs=3 -P data https://storage.googleapis.com/ai2-coastlines/v1/data/

Each of the commands above downloads two types of files:

a. Ball Trees: (ai2-coastlines/v1/data/ball_trees) Example: ai2-coastlines/v1/data/ball_trees/Ai2_WorldCover_10m_2024_v1_N00E006_Map_coastal_points_ball_tree.joblib (1.4 MB)

b. Resampled H5s: (ai2-coastlines/v1/data/resampled_h5s) Example: ai2-coastlines/v1/data/resampled_h5s/Ai2_WorldCover_10m_2024_v1_N00E006_Map.h5 (584.2 KB)

Individual tiles (1 degree by 1 degree) can also be downloaded. The lat/lon in the filename corresponds to the upper left corner of the tile (e.g. N00E006).
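To fetch only the tile covering a given location, the tile name can be derived from the coordinates. The following is a minimal sketch, assuming the naming rule stated above (1 degree tiles named by their upper left corner) and the filename pattern seen in the two example files. `tile_id` and `tile_urls` are hypothetical helpers, the HTTPS base URL is taken from the wget command, and behaviour at exact tile boundaries or the antimeridian is not verified against the actual dataset.

```python
import math

# Assumptions: public HTTPS mirror of the bucket and the filename prefix
# seen in the example files above.
BASE_URL = "https://storage.googleapis.com/ai2-coastlines/v1/data"
PREFIX = "Ai2_WorldCover_10m_2024_v1"


def tile_id(lat: float, lon: float) -> str:
    """Name of the 1x1 degree tile containing (lat, lon), by upper left corner."""
    top = math.ceil(lat)     # upper edge of the tile
    left = math.floor(lon)   # left edge of the tile
    ns = "N" if top >= 0 else "S"
    ew = "E" if left >= 0 else "W"
    return f"{ns}{abs(top):02d}{ew}{abs(left):03d}"


def tile_urls(lat: float, lon: float) -> dict[str, str]:
    """URLs of the ball tree and H5 files for the tile containing (lat, lon)."""
    tid = tile_id(lat, lon)
    return {
        "ball_tree": f"{BASE_URL}/ball_trees/{PREFIX}_{tid}"
                     "_Map_coastal_points_ball_tree.joblib",
        "h5": f"{BASE_URL}/resampled_h5s/{PREFIX}_{tid}_Map.h5",
    }


if __name__ == "__main__":
    for name, url in tile_urls(-0.5, 6.3).items():
        print(name, url)
```

Under these assumptions, a point at latitude -0.5 and longitude 6.3 maps to the example tile N00E006.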

Deployment

Deployment requires the dataset downloaded above, although some sample inferences and examples can run without the full dataset.

Option 1: Using Pre-built Image (Recommended)

docker pull ghcr.io/allenai/lighthouse:sha-X
docker run -d \
  --name lighthouse \
  -p 8000:8000 \
  -v /path/to/data:/src/data \
  ghcr.io/allenai/lighthouse:sha-X

Option 2: Building from Source

git clone https://github.com/allenai/lighthouse.git
cd lighthouse
docker build -t lighthouse .
docker run -d \
  --name lighthouse \
  -p 8000:8000 \
  -v /path/to/data:/src/data \
  lighthouse

Development

Local Setup

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements/requirements.txt
pip install -r requirements/requirements-dev.txt

# Install pre-commit hooks
pre-commit install

Running Tests

pytest tests

Code Conventions

  • Black for formatting
  • Ruff for linting
  • MyPy for type checking
  • Pre-commit hooks for automated checks

Building Dataset from Scratch

  1. Download ESA WorldCover data:

    bash src/download_worldcover.sh
  2. Download OSM land polygons:

    wget -P data/osm \
      https://osmdata.openstreetmap.de/download/land-polygons-split-4326.zip
    unzip data/osm/land-polygons-split-4326.zip -d data/osm
  3. Process data:

    python src/gen_all_missing_tiles.py
    python src/convert_geotiff_to_h5.py
    python src/extract_coastal_points.py
    python src/convert_coastal_points_to_ball_trees.py

How does this algorithm work?

Lighthouse** (Layered Iterative Geospatial Hierarchical Terrain-Oriented Unified Search Engine) leverages

  1. pre-computed spherical Voronoi tessellation of the whole planet's coastlines (at low resolution) and
  2. ball trees (at high resolution) to produce very fast computations with minimal resources.

The ball trees were generated from a hybrid dataset of satellite-imagery-based annotations from two sources: ESA WorldCover and OpenStreetMap land polygons.

[Figure: voronoi] The spherical Voronoi tessellation.

[Figure: triplet_of_fun] A depiction of the method.

See the paper (https://arxiv.org/abs/2506.18842) for details.
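The two-layer idea can be illustrated with a toy sketch. It is not the Lighthouse implementation: the coordinates in `COARSE_TO_FINE` are mostly made up, the function names are hypothetical, and a brute-force nearest-neighbour scan stands in for both the precomputed Voronoi lookup and the ball trees. The coarse step relies on the fact that a query lies in the Voronoi cell of whichever site is nearest to it. The simplification that the true nearest coastal point always belongs to that cell is an assumption of this sketch.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumption: mean spherical Earth radius


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def nearest(query, points):
    """Brute-force stand-in for a Voronoi lookup or a ball tree query."""
    return min(points, key=lambda p: haversine_m(query, p))


# Toy data: each low-resolution site owns a small set of high-resolution
# coastal points. Only the first Seattle point comes from the example
# response in this README; the rest are invented for illustration.
COARSE_TO_FINE = {
    (47.6, -122.3): [(47.63742, -122.33858), (47.6100, -122.3400)],
    (21.3, -157.8): [(21.2700, -157.8200), (21.3100, -157.8600)],
}


def distance_to_coast(query):
    """Coarse step picks a cell; fine step searches only that cell's points."""
    site = nearest(query, COARSE_TO_FINE.keys())   # low-resolution layer
    point = nearest(query, COARSE_TO_FINE[site])   # high-resolution layer
    return haversine_m(query, point), point


if __name__ == "__main__":
    dist, point = distance_to_coast((47.636895, -122.334984))
    print(f"{dist:.0f} m to {point}")
```

The benefit of the layering is that the fine search touches only the points of one cell rather than every coastal point on the planet, which is what keeps memory and latency low.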

License

Code: Apache 2.0

Dataset: Open Database License (ODbL) v1.0

Acknowledgments

We gratefully acknowledge:

  • The European Space Agency (ESA) for creating the WorldCover land cover map and for making it openly accessible
  • The OpenStreetMap community for their invaluable contributions to global mapping

References

ESA WorldCover 2021

@article{zanaga2021esa,
  title={ESA WorldCover 10 m 2021 v200},
  author={Zanaga, D and Van De Kerchove, R and De Keersmaecker, W and Souverijns, N and Brockmann, C and Quast, R and Wevers, J and Grosu, A and Paccini, A and Vergnaud, S and others},
  year={2021},
  publisher={ESA},
  doi={10.5281/zenodo.5571936}
}

OpenStreetMap

Citation

Lighthouse: https://arxiv.org/abs/2506.18842

**Also Lighthouse is an excellent coffee shop in Seattle.
