GitHub - ETH-100/HarborX

HarborX is a high-performance data engine for Web3 that enhances data visibility and composability through open distribution.

You simply subscribe to the datasets you need, and HarborX will automatically catch up and stay in sync.

Datasets can be saved into your preferred database and queried with SQL or other familiar tools.

By standardizing all data into a unified intermediate format, any data stream can be published as a dataset — including social graphs, ENS records, token prices, transaction history, and more.

No more writing indexers, cleaning data, or juggling slow, complex API queries — everything happens locally.

👉 Live Demo · Architecture Guide

Core Capabilities

Subscribe & Go: Add a dataset with a single command (just like npm). HarborX fetches and continuously catches up — no need to build collectors, indexers, or multi-API queries.
Query Freely: Datasets are stored locally and can be queried directly using your favorite databases and SQL engines.
One-Click Publish: Turn your results into a new dataset and let the entire ecosystem benefit from your creativity.
Plug Any Upstream: Connect to diverse sources (Rollup Blobs, indexers, off-chain data, APIs, files, etc.) and compose them as needed.
Decentralized by Design: Persist datasets into decentralized storage so anyone can rebuild them — ensuring reliability even if the original source goes offline.

Quickstart (PoC)

Requirements: Python 3.9+

python -m pip install -U pip
pip install -e ".[cli]"

Add a Dataset

harborx add --base https://play.harborx.tech/data/ --subdir local -m --port 8080

You’ll see output similar to:

harborx add --base https://play.harborx.tech/data/ --subdir local -m --port 8080
[2025-08-13 23:46:03] 🔗 Using Lake base: https://play.harborx.tech/data
[2025-08-13 23:46:03] ⬇️  Fetching manifest.json from https://play.harborx.tech/data
[2025-08-13 23:46:04] 🧭 Rewriting relative paths → absolute URLs
[2025-08-13 23:46:04] 📄 No local manifest found, adopting remote manifest as local
[2025-08-13 23:46:04] 🧱 Materializing → HarborX\apps\web\data\local\objects
[2025-08-13 23:46:06]   ↓ https://play.harborx.tech/data/1456cfb63a334a39a06df3ee120daafb-0.arrow
    24cfd00e87ded65b.arrow done in 7.3s
[2025-08-13 23:46:17]   ↓ https://play.harborx.tech/data/2bb64d90458245e990c29acd605dadce-0.arrow
    db3c961ccec074e1.arrow done in 21.7s
[2025-08-13 23:52:05]   ↓ https://play.harborx.tech/data/chain_id=167001/date=20310/topic=state_diff/2bb64d90458245e990c29acd605dadce-0.parquet
    b2089242e6daa02b.parquet done in 1.7s
[2025-08-13 23:52:12]   ↓ https://play.harborx.tech/data/chain_id=167001/date=20310/topic=state_diff/7721bc175b1a4194a04f2779536f0d15-0.parquet
    4a1aea6b788713e2.parquet done in 1.5s
[2025-08-13 23:52:14] 🔍 Validating a few entries
[2025-08-13 23:52:14]   ✅ local OK: objects/24cfd00e87ded65b.arrow
[2025-08-13 23:52:14]   ✅ local OK: objects/db3c961ccec074e1.arrow
[2025-08-13 23:52:14]   ✅ local OK: objects/360ce304038b70f4.parquet
[2025-08-13 23:52:14]   ✅ local OK: objects/d7e92bfb8c72c3ac.parquet
[2025-08-13 23:52:14] 📊 Merge summary:
  • arrow:   2 items
  • parquet: 2 items
  • files:   0 items
  • segments:0 items
  (Δ added) arrow:+2 parquet:+2 files:+0 segments:+0
  Probed OK:4 FAIL:0
[2025-08-13 23:52:14] 🚀 Starting harborx static server at http://127.0.0.1:8080
[2025-08-13 23:52:14]     Serving directory: HarborX\apps\web (manifest at HarborX\apps\web\data\local\manifest.json)
[serve] http://127.0.0.1:8080 (root=HarborX\apps\web)

This PoC run is a one-off example; in production, HarborX is meant to catch up automatically and stay in sync.
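The "Rewriting relative paths → absolute URLs" step in the log can be pictured roughly as follows. This is a minimal sketch, not the HarborX implementation: it assumes a manifest that groups entries by kind (arrow, parquet, and so on, mirroring the merge summary) and that each entry has a "path" field. The real manifest schema may differ, and absolutize_manifest is a hypothetical helper.

```python
from urllib.parse import urljoin


def absolutize_manifest(manifest: dict, base_url: str) -> dict:
    """Return a copy of `manifest` with every relative "path" joined onto base_url.

    Assumed (hypothetical) shape: {"arrow": [{"path": ...}], "parquet": [...]}.
    """
    # urljoin drops the last path segment of the base unless it ends with "/".
    base = base_url if base_url.endswith("/") else base_url + "/"
    result = {}
    for kind, entries in manifest.items():
        result[kind] = [
            # Entries that already hold an absolute URL pass through unchanged.
            {**entry, "path": urljoin(base, entry["path"])}
            for entry in entries
        ]
    return result


# Paths taken from the log output above; the dict layout is an assumption.
remote = {
    "arrow": [{"path": "1456cfb63a334a39a06df3ee120daafb-0.arrow"}],
    "parquet": [
        {"path": "chain_id=167001/date=20310/topic=state_diff/"
                 "2bb64d90458245e990c29acd605dadce-0.parquet"}
    ],
}
local = absolutize_manifest(remote, "https://play.harborx.tech/data")
for kind, entries in local.items():
    for entry in entries:
        print(kind, entry["path"])
```

The function returns a new dictionary instead of mutating its input, so the remote manifest stays available for comparison with the rewritten one.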

Use Existing Static Data

Place your dataset (including manifest.json) under apps/web/data/ and run:

harborx serve --dir apps/web --port 8080
# Open http://127.0.0.1:8080
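For comparison, static serving of this kind can be approximated with the Python standard library alone. The sketch below is not how harborx serve is implemented; make_server is a hypothetical helper, and the loopback address simply mirrors the one shown in the log. Note that SimpleHTTPRequestHandler serves whole files and does not handle Range headers, so it is only a rough stand-in.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory: str, port: int = 8080) -> ThreadingHTTPServer:
    """Build (but do not start) a static file server rooted at `directory`."""
    # SimpleHTTPRequestHandler accepts a `directory` keyword since Python 3.7.
    handler = partial(SimpleHTTPRequestHandler, directory=directory)
    # Bind to loopback only; passing port=0 lets the OS pick a free port.
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted), left commented out on purpose:
# make_server("apps/web", 8080).serve_forever()
```

Building the server separately from starting it keeps the helper easy to reuse, for example on an ephemeral port in a test.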

Fetch a Small Real Dataset

harborx blobscan --limit 3 --out apps/web/data --parquet
harborx manifest --root apps/web --data data --include-parquet
harborx serve --dir apps/web --port 8080

On the page, click Rebuild state to load the files, then try queries like:

SELECT COUNT(*) AS rows FROM state;

SELECT value, COUNT(*) AS c
FROM state
GROUP BY value
ORDER BY c DESC
LIMIT 10;

-- Last-write-wins (LWW), simplified: there is no PARTITION BY,
-- so rn = 1 selects the single latest row overall
WITH ranked AS (
  SELECT value, timestamp, blob_index, position,
         ROW_NUMBER() OVER (ORDER BY timestamp DESC, blob_index DESC, position DESC) rn
  FROM state
)
SELECT value, timestamp
FROM ranked
WHERE rn = 1
LIMIT 50;
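For per-key last-write-wins, the same ordering (timestamp, then blob_index, then position) can be applied within each key. A minimal sketch in Python, assuming each row carries a hypothetical key field next to the columns used in the query; the actual schema of the state table is not shown here and may differ.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]


def write_order(row: Row) -> Tuple[object, object, object]:
    """Sort key matching the ORDER BY used in the SQL example."""
    return (row["timestamp"], row["blob_index"], row["position"])


def last_write_wins(rows: Iterable[Row]) -> Dict[object, Row]:
    """Keep, for each key, the row with the greatest write order."""
    latest: Dict[object, Row] = {}
    for row in rows:
        current = latest.get(row["key"])  # "key" is an assumed column name
        if current is None or write_order(row) > write_order(current):
            latest[row["key"]] = row
    return latest


# Made-up sample rows for illustration only.
rows = [
    {"key": "a", "value": "1", "timestamp": 100, "blob_index": 0, "position": 0},
    {"key": "a", "value": "2", "timestamp": 100, "blob_index": 0, "position": 1},
    {"key": "b", "value": "x", "timestamp": 90, "blob_index": 2, "position": 5},
    {"key": "a", "value": "0", "timestamp": 99, "blob_index": 9, "position": 9},
]
state = last_write_wins(rows)
for key, row in sorted(state.items()):
    print(key, row["value"], row["timestamp"])
```

In SQL, the equivalent change would be adding a PARTITION BY clause on the key column to the window definition.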

The frontend demo runs entirely in the browser with DuckDB-WASM, reading Arrow/Parquet files over HTTP.

No server-side database is required.

CLI Examples

All commands are provided via the single entrypoint harborx.

Fetch blobs + write Arrow/Parquet + generate manifest

harborx blobscan --limit 3 --out apps/web/data --parquet
harborx manifest --root apps/web --data data --include-parquet

Serve the static web demo

harborx serve --dir apps/web --port 8080

Frontend Demo

Location: apps/web/
Data folder: apps/web/data/
Manifest: apps/web/data/manifest.json

The app resolves file paths based on the manifest.
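A rough way to check manifest-driven path resolution locally, in the spirit of the "Validating a few entries" step in the Quickstart log, is sketched below. It assumes a manifest that maps each kind (arrow, parquet, ...) to a list of entries with a "path" relative to the manifest's directory; that layout is an assumption, and probe_manifest is a hypothetical helper rather than a HarborX API.

```python
import json
import tempfile
from pathlib import Path


def probe_manifest(manifest_path: Path) -> dict:
    """Check that every file referenced by a manifest exists next to it.

    Returns {"ok": [...], "fail": [...]} with paths relative to the
    manifest's directory.
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    data_dir = manifest_path.parent
    report = {"ok": [], "fail": []}
    for entries in manifest.values():
        for entry in entries:
            rel = entry["path"]  # assumed field name
            bucket = "ok" if (data_dir / rel).is_file() else "fail"
            report[bucket].append(rel)
    return report


# Demo on a throwaway directory: one file present, one missing.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "objects").mkdir()
    (data_dir / "objects" / "24cfd00e87ded65b.arrow").write_bytes(b"")
    manifest = {
        "arrow": [{"path": "objects/24cfd00e87ded65b.arrow"}],
        "parquet": [{"path": "objects/missing.parquet"}],
    }
    (data_dir / "manifest.json").write_text(json.dumps(manifest), encoding="utf-8")
    report = probe_manifest(data_dir / "manifest.json")

print(f"Probed OK:{len(report['ok'])} FAIL:{len(report['fail'])}")
```

Pointing probe_manifest at apps/web/data/manifest.json would run the same check against a real dataset, provided the manifest matches the assumed layout.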

License

Apache-2.0. See LICENSE.
