A blazing-fast, content-addressable, distributed database system designed as a modern, developer-friendly alternative to IPFS.
OpenHashDB is a distributed storage system that combines content-addressable storage with modern peer-to-peer networking. Built with Go and leveraging libp2p for networking, OpenHashDB provides both CLI and REST API interfaces for flexible use and straightforward integration.
- Content-Addressable Storage: Every piece of data is identified by its SHA-256 hash, ensuring data integrity and enabling efficient deduplication (see the sketch after this list)
- Distributed Architecture: Peer-to-peer networking via libp2p enables truly decentralized storage without single points of failure
- Dual Interface: Both command-line interface (CLI) and REST API for different use cases and integration scenarios
- Automatic Chunking: Large files are automatically split into manageable chunks for efficient storage and transfer
- Merkle DAG Support: Directory structures are represented as Merkle Directed Acyclic Graphs for efficient verification
- Content Replication: Configurable replication factors ensure data availability across the network
- Pinning System: Explicit content pinning prevents garbage collection of important data
- High Performance: Optimized for speed with concurrent processing and efficient storage backends
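To make content addressing concrete, here is a minimal Go sketch (illustrative only, not OpenHashDB's actual code) showing how a SHA-256 content address is derived and why identical bytes deduplicate to a single entry:

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
)

// contentAddress returns the hex-encoded SHA-256 digest of the raw bytes,
// which serves as the content's identifier and deduplication key.
func contentAddress(data []byte) string {
    sum := sha256.Sum256(data)
    return hex.EncodeToString(sum[:])
}

func main() {
    a := contentAddress([]byte("hello world"))
    b := contentAddress([]byte("hello world"))
    fmt.Println(a)      // the address: stable for these bytes
    fmt.Println(a == b) // identical content, identical address: true
}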
Prerequisites
- Go 1.21 or later
- Git
Build from Source
git clone <repository-url>
cd openhash
go build -o openhash .
Verify Installation
./openhash --help
Add a single file to OpenHashDB:
./openhash add myfile.txt
Add a directory:
./openhash add ./my-directory
List all stored content:
./openhash list
Retrieve content by hash:
./openhash get <hash>
View content metadata:
./openhash view <hash>
Start OpenHashDB with REST API:
./openhash daemon
The REST API will be available at http://localhost:8080 by default.
OpenHashDB follows a modular architecture with distinct layers:
- Hasher: SHA-256 content addressing and integrity verification
- Storage: LevelDB-based persistent storage with metadata management
- Chunker: Configurable file chunking with Merkle tree construction (see the sketch after this list)
- DAG Builder: Merkle DAG construction for directory structures
- libp2p Node: Peer-to-peer networking and protocol handling
- Replicator: Content replication and availability management
- Discovery: Peer discovery via mDNS and DHT
- REST Server: HTTP API for web integration
- CLI Interface: Command-line tools for direct usage
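To make the Chunker layer concrete, the following sketch splits a stream into fixed-size chunks and hashes each one, yielding the leaf hashes of a Merkle tree. The 256 KiB chunk size is an assumption for illustration; the real chunker in core/chunker is configurable and may differ:

package main

import (
    "bytes"
    "crypto/sha256"
    "fmt"
    "io"
)

const chunkSize = 256 * 1024 // assumed for illustration; the real size is configurable

// chunkAndHash reads r in fixed-size pieces and returns one SHA-256 digest
// per chunk -- the leaves from which a Merkle tree can be built.
func chunkAndHash(r io.Reader) ([][32]byte, error) {
    var leaves [][32]byte
    buf := make([]byte, chunkSize)
    for {
        n, err := io.ReadFull(r, buf)
        if n > 0 {
            leaves = append(leaves, sha256.Sum256(buf[:n]))
        }
        if err == io.EOF || err == io.ErrUnexpectedEOF {
            return leaves, nil
        }
        if err != nil {
            return nil, err
        }
    }
}

func main() {
    data := bytes.Repeat([]byte("x"), 600*1024) // ~600 KiB -> 3 chunks
    leaves, _ := chunkAndHash(bytes.NewReader(data))
    fmt.Println("chunks:", len(leaves))
}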
--db string
: Database path (default "./openhash.db")
--api-port int
: REST API port (default 8080)
--p2p-port int
: P2P port (0 for random)
--verbose
: Enable verbose output
Add a file or folder to OpenHashDB.
Examples:
./openhash add document.pdf
./openhash add ./website --verbose
Retrieve and display content by hash.
Examples:
./openhash get 8c449c94ffd47419a47710dbcdc2d0ab84e10e8a294fdc4b4d70a5be8608df1e
Display detailed metadata about content.
Examples:
./openhash view 8c449c94ffd47419a47710dbcdc2d0ab84e10e8a294fdc4b4d70a5be8608df1e
List all stored content with metadata.
Start the OpenHashDB daemon with networking and REST API.
Flags:
--enable-rest
: Enable REST API (default true)
Examples:
./openhash daemon
./openhash daemon --api-port 9000
http://localhost:8080
GET /health
Returns server health status.
Response:
{
"status": "healthy",
"timestamp": "2025-07-28T11:23:53.745950516-04:00",
"version": "1.0.0"
}
POST /upload/file
Content-Type: multipart/form-data
Upload a single file.
Parameters:
file
: File to upload (form data)
Response:
{
"hash": "8c449c94ffd47419a47710dbcdc2d0ab84e10e8a294fdc4b4d70a5be8608df1e",
"size": 611,
"filename": "test_file.txt",
"message": "File uploaded successfully"
}
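The same upload can be done programmatically. A minimal Go client sketch (the endpoint and the "file" form field come from the parameters above; the 30-second timeout is an arbitrary choice):

package main

import (
    "bytes"
    "fmt"
    "io"
    "mime/multipart"
    "net/http"
    "os"
    "time"
)

func uploadFile(apiURL, path string) error {
    f, err := os.Open(path)
    if err != nil {
        return err
    }
    defer f.Close()

    // Build the multipart body with the "file" field the API expects.
    var body bytes.Buffer
    w := multipart.NewWriter(&body)
    part, err := w.CreateFormFile("file", path)
    if err != nil {
        return err
    }
    if _, err := io.Copy(part, f); err != nil {
        return err
    }
    w.Close()

    client := &http.Client{Timeout: 30 * time.Second}
    resp, err := client.Post(apiURL+"/upload/file", w.FormDataContentType(), &body)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    reply, _ := io.ReadAll(resp.Body)
    fmt.Println(string(reply)) // JSON with hash, size, filename
    return nil
}

func main() {
    if err := uploadFile("http://localhost:8080", "test_file.txt"); err != nil {
        fmt.Fprintln(os.Stderr, err)
    }
}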
GET /download/{hash}
Download content by hash.
Response: Raw file content with appropriate headers.
GET /view/{hash}
View content inline (CDN-style).
Response: Content with inline viewing headers.
GET /info/{hash}
Get detailed information about content.
Response:
{
"hash": "8c449c94ffd47419a47710dbcdc2d0ab84e10e8a294fdc4b4d70a5be8608df1e",
"filename": "test_file.txt",
"mime_type": "text/plain",
"size": 611,
"mod_time": "2025-07-28T11:14:53.004720466-04:00",
"is_directory": false,
"created_at": "2025-07-28T11:21:49.814410096-04:00",
"ref_count": 1
}
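The response maps directly onto a Go struct. A decoding sketch with field names taken from the example above (the struct itself is hypothetical, not a type exported by OpenHashDB):

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "time"
)

// ContentInfo mirrors the /info/{hash} response shown above.
type ContentInfo struct {
    Hash        string    `json:"hash"`
    Filename    string    `json:"filename"`
    MimeType    string    `json:"mime_type"`
    Size        int64     `json:"size"`
    ModTime     time.Time `json:"mod_time"`
    IsDirectory bool      `json:"is_directory"`
    CreatedAt   time.Time `json:"created_at"`
    RefCount    int       `json:"ref_count"`
}

func main() {
    resp, err := http.Get("http://localhost:8080/info/8c449c94ffd47419a47710dbcdc2d0ab84e10e8a294fdc4b4d70a5be8608df1e")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    var info ContentInfo
    if err := json.NewDecoder(resp.Body).Decode(&info); err != nil {
        panic(err)
    }
    fmt.Printf("%s (%d bytes, %s)\n", info.Filename, info.Size, info.MimeType)
}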
GET /list
List all stored content.
Response: Array of content information objects.
POST /pin/{hash}?priority=1
Pin content to prevent garbage collection.
Query Parameters:
priority
: Pin priority (default 1)
DELETE /unpin/{hash}
Remove pin from content.
GET /pins
List all pinned content.
GET /stats
Get system statistics.
Response:
{
"storage": {
"content_count": 2,
"chunk_count": 0
},
"replication": {
"replication_factor": 3,
"pinned_content": 0,
"pending_requests": 0
},
"timestamp": "2025-07-28T11:24:31.585560549-04:00"
}
OpenHashDB uses LevelDB for persistent storage. The database path can be configured using the --db flag:
./openhash --db /path/to/database add myfile.txt
Configure networking ports:
./openhash --p2p-port 4001 --api-port 8080 daemon
The default replication factor is 3, meaning content will be replicated to 3 nodes in the network. This can be adjusted in the source code by modifying the DefaultReplicationFactor constant.
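As an illustration of what the replication factor controls, here is a hypothetical helper (the actual logic lives in network/replicator and will differ):

package main

import "fmt"

// DefaultReplicationFactor mirrors the constant described above.
const DefaultReplicationFactor = 3

// replicationTargets picks up to DefaultReplicationFactor peers to receive a
// copy. A real replicator would rank peers by availability rather than take
// the first n; this is illustrative only.
func replicationTargets(peers []string) []string {
    n := DefaultReplicationFactor
    if len(peers) < n {
        n = len(peers)
    }
    return peers[:n]
}

func main() {
    peers := []string{"peerA", "peerB", "peerC", "peerD"}
    fmt.Println(replicationTargets(peers)) // [peerA peerB peerC]
}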
openhash/
├── cmd/openhash/ # CLI command implementations
├── core/
│ ├── dag/ # Merkle DAG implementation
│ ├── chunker/ # File chunking logic
│ ├── hasher/ # SHA-256 hashing utilities
│ └── storage/ # LevelDB storage layer
├── network/
│ ├── libp2p/ # libp2p networking
│ └── replicator/ # Content replication
├── api/
│ └── rest/ # REST API server
├── gateway/
│ ├── cdn/ # CDN-style content delivery
│ └── raw/ # Raw content handlers
└── main.go # Application entry point
go build -o openhash .
Run the test suite:
go test ./...
Key dependencies include:
github.com/spf13/cobra
: CLI framework
github.com/syndtr/goleveldb/leveldb
: LevelDB storage
github.com/libp2p/go-libp2p
: P2P networking
github.com/gorilla/mux
: HTTP routing
OpenHashDB can serve as a decentralized CDN for static websites and assets:
- Upload website files:
./openhash add ./website
- Start daemon:
./openhash daemon
- Access content via REST API:
http://localhost:8080/view/{hash}/index.html
Use OpenHashDB for distributed backup storage:
- Add files to be backed up:
./openhash add ./important-documents
- Pin critical content:
curl -X POST http://localhost:8080/pin/{hash}?priority=10
Share files across a distributed network:
- Add file to local node:
./openhash add shared-file.pdf
- Share the hash with others
- Others can retrieve the file:
./openhash get {hash}
All content is verified using SHA-256 hashes, ensuring data integrity. Any tampering with content will result in hash mismatches that are automatically detected.
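In Go terms, verification reduces to recomputing the digest and comparing it with the address the content was requested by, along these lines (illustrative sketch):

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "errors"
    "fmt"
)

// verify fails unless data hashes to the expected content address.
func verify(expectedHash string, data []byte) error {
    sum := sha256.Sum256(data)
    if hex.EncodeToString(sum[:]) != expectedHash {
        return errors.New("hash mismatch: content corrupted or tampered with")
    }
    return nil
}

func main() {
    data := []byte("original content")
    sum := sha256.Sum256(data)
    addr := hex.EncodeToString(sum[:])
    fmt.Println(verify(addr, data))                     // <nil>
    fmt.Println(verify(addr, []byte("tampered bytes"))) // hash mismatch
}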
- All peer-to-peer communications use libp2p's built-in security features
- Content is immutable once stored
- No authentication is required for read operations (by design)
- Write operations are local to each node
- Content hashes may reveal information about stored data
- All content is stored unencrypted
- Network traffic is visible to peers
- Consider encryption at the application layer for sensitive data
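One way to do that application-layer encryption is AES-256-GCM before adding content. A minimal sketch (key generation, storage, and rotation are up to the application):

package main

import (
    "crypto/aes"
    "crypto/cipher"
    "crypto/rand"
    "fmt"
)

// seal encrypts plaintext with AES-256-GCM and prepends the nonce so the
// ciphertext is self-contained. Encrypt before adding content to OpenHashDB,
// decrypt after retrieving it.
func seal(key, plaintext []byte) ([]byte, error) {
    block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
    if err != nil {
        return nil, err
    }
    gcm, err := cipher.NewGCM(block)
    if err != nil {
        return nil, err
    }
    nonce := make([]byte, gcm.NonceSize())
    if _, err := rand.Read(nonce); err != nil {
        return nil, err
    }
    return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

func main() {
    key := make([]byte, 32)
    rand.Read(key) // in practice, derive and persist the key securely
    ct, err := seal(key, []byte("sensitive document"))
    fmt.Println(len(ct), err)
}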
- Use SSD storage for better performance
- Configure appropriate chunk sizes based on content type
- Regular garbage collection to remove unreferenced content
- Ensure good network connectivity between peers
- Use appropriate replication factors (higher for critical content)
- Monitor peer connections and network health
- LevelDB uses configurable write buffers (default 64MB)
- In-memory caching improves access times for frequently used content
- Monitor memory usage in production deployments
- Ensure only one OpenHashDB instance is running per database
- Check file permissions on the database directory
- Verify firewall settings allow P2P traffic
- Check if ports are available and not blocked
- Ensure proper network connectivity between peers
- Reduce write buffer size in storage configuration (see the sketch after this list)
- Implement regular garbage collection
- Monitor chunk sizes and adjust if necessary
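If the pressure comes from LevelDB's write buffer, it can be reduced when opening the database. A sketch assuming direct use of goleveldb (how OpenHashDB's storage layer exposes this option may differ):

package main

import (
    "fmt"

    "github.com/syndtr/goleveldb/leveldb"
    "github.com/syndtr/goleveldb/leveldb/opt"
)

func main() {
    // Use a 16 MiB write buffer instead of the 64 MB default cited above.
    db, err := leveldb.OpenFile("./openhash.db", &opt.Options{
        WriteBuffer: 16 * opt.MiB,
    })
    if err != nil {
        fmt.Println("open failed:", err)
        return
    }
    defer db.Close()
    fmt.Println("database opened with reduced write buffer")
}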
Enable verbose logging:
./openhash --verbose add myfile.txt
Check daemon logs:
./openhash daemon > openhash.log 2>&1
We welcome contributions to OpenHashDB! Please follow these guidelines:
- Fork the repository
- Create a feature branch
- Make your changes with appropriate tests
- Submit a pull request with a clear description
- Install Go 1.21+
- Clone the repository
- Install dependencies:
go mod download
- Run tests:
go test ./...
- Build:
go build -o openhash .
[License information would go here]
For support and questions:
- Create an issue on the project repository
- Check the documentation and troubleshooting guide
- Review existing issues for similar problems
Future enhancements planned:
- IPFS compatibility bridge
- Enhanced storage backends (S3, distributed filesystems)
- Web-based management interface
- Advanced analytics and monitoring
- Smart prefetching based on usage patterns
- Enhanced security features and access controls
OpenHashDB now includes advanced peer discovery capabilities and a Distributed Hash Table (DHT) for efficient content lookup across the network. These enhancements significantly improve the system's ability to connect to peers and locate content, even when it's not directly stored on connected nodes.
Bootnodes serve as initial connection points for nodes joining the OpenHashDB network. This feature provides reliable network bootstrapping beyond local mDNS discovery.
You can specify bootnodes when starting any OpenHashDB command using the --bootnode flag:
# Single bootnode
./openhash daemon --bootnode /ip4/192.168.1.100/tcp/4001/p2p/12D3KooWExample...
# Multiple bootnodes (comma-separated)
./openhash daemon --bootnode /ip4/192.168.1.100/tcp/4001/p2p/12D3KooWExample...,/ip4/192.168.1.101/tcp/4001/p2p/12D3KooWAnother...
# Add content with bootnode connection
./openhash add myfile.txt --bootnode /ip4/192.168.1.100/tcp/4001/p2p/12D3KooWExample...
When no bootnodes are specified, OpenHashDB will attempt to connect to a set of hardcoded, well-known bootnodes maintained by the project. This ensures out-of-the-box connectivity to the OpenHashDB network without requiring manual configuration.
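Under the hood, connecting to a bootnode is a standard libp2p dial. A sketch using go-libp2p directly (the multiaddr is a placeholder; substitute a real bootnode address):

package main

import (
    "context"
    "fmt"

    "github.com/libp2p/go-libp2p"
    "github.com/libp2p/go-libp2p/core/peer"
)

func main() {
    host, err := libp2p.New()
    if err != nil {
        panic(err)
    }
    defer host.Close()

    // Placeholder address -- replace <peer-id> with the bootnode's real ID.
    ai, err := peer.AddrInfoFromString("/ip4/192.168.1.100/tcp/4001/p2p/<peer-id>")
    if err != nil {
        fmt.Println("invalid multiaddr:", err)
        return
    }
    if err := host.Connect(context.Background(), *ai); err != nil {
        fmt.Println("bootstrap connect failed:", err)
    }
}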
The integrated Kademlia-based DHT enables efficient content discovery across the network. When content is added to OpenHashDB, the node announces itself as a provider for that content's hash. Other nodes can then query the DHT to find providers for specific content.
- Content Announcement: When you add content, your node automatically announces to the DHT that it can provide that content
- Content Discovery: When requesting content not available locally, your node queries the DHT to find providers
- Automatic Retrieval: Once providers are found, your node automatically retrieves the content from available peers
- Local Caching: Retrieved content is automatically cached locally for future access
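A sketch of that announce/discover cycle using go-libp2p-kad-dht directly (illustrative; OpenHashDB performs these steps internally and its exact wiring may differ):

package main

import (
    "context"
    "fmt"

    "github.com/ipfs/go-cid"
    "github.com/libp2p/go-libp2p"
    dht "github.com/libp2p/go-libp2p-kad-dht"
    "github.com/multiformats/go-multihash"
)

func main() {
    ctx := context.Background()
    host, err := libp2p.New()
    if err != nil {
        panic(err)
    }
    kad, err := dht.New(ctx, host)
    if err != nil {
        panic(err)
    }
    if err := kad.Bootstrap(ctx); err != nil {
        panic(err)
    }

    // Derive a DHT key from the content's SHA-256 digest.
    mh, _ := multihash.Sum([]byte("file contents"), multihash.SHA2_256, -1)
    key := cid.NewCidV1(cid.Raw, mh)

    // Announce: this node can provide the content.
    if err := kad.Provide(ctx, key, true); err != nil {
        fmt.Println("provide:", err) // fails until the node has DHT peers
    }

    // Discover: stream up to 5 peers that announced the same key.
    for p := range kad.FindProvidersAsync(ctx, key, 5) {
        fmt.Println("provider:", p.ID)
    }
}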
You can monitor DHT status through the REST API:
curl http://localhost:8080/network
This returns detailed network information including:
- Connected peers
- DHT status and peer count
- Node addresses and peer ID
- K-bucket information
The enhanced OpenHashDB network architecture now includes:
- libp2p Host: Core networking layer for peer-to-peer connections
- Kademlia DHT: Distributed hash table for content discovery
- Bootnode Integration: Reliable network bootstrapping
- mDNS Discovery: Local network peer discovery
- Content Replication: Automatic content distribution across peers
The DHT integration provides several performance improvements:
- Efficient Content Lookup: O(log N) lookup complexity instead of exhaustive peer searches
- Network Resilience: Content remains accessible even if original providers go offline
- Automatic Load Distribution: Popular content is automatically replicated across multiple nodes
- Reduced Network Traffic: Targeted queries instead of broadcast searches
The DHT implementation includes several security features:
- Content Verification: All retrieved content is verified against its hash before use
- Peer Authentication: libp2p's built-in peer authentication prevents impersonation
- Rate Limiting: DHT queries are rate-limited to prevent abuse
- Tamper Detection: Merkle DAG structure ensures content integrity
OpenHashDB now supports an --api flag for CLI commands that resolves database access conflicts when a daemon is running. This feature allows you to interact with a running OpenHashDB daemon via its REST API instead of directly accessing the database files.
When the OpenHashDB daemon is running, it maintains an exclusive lock on the database files to ensure data consistency. This means that CLI commands like add, get, list, and view cannot directly access the database and will fail with errors like:
Error: failed to initialize storage: failed to open database: The process cannot access the file because it is being used by another process.
The --api flag enables CLI commands to communicate with the running daemon through its REST API instead of directly accessing the database. This provides several benefits:
- No database conflicts: Commands work seamlessly while daemon is running
- Network operations: Can interact with remote OpenHashDB instances
- Consistent behavior: Same commands work in both direct and API modes
- Automatic fallback: Uses default localhost URL when no URL is specified
# Start the daemon
./openhash daemon
# In another terminal, use API mode for commands
./openhash add myfile.txt --api http://localhost:8080
./openhash list --api http://localhost:8080
./openhash get <hash> --api http://localhost:8080
./openhash view <hash> --api http://localhost:8080
When using --api with an empty string, OpenHashDB automatically uses the default localhost URL based on the --api-port setting:
# These are equivalent when --api-port is 8080 (default)
./openhash list --api ""
./openhash list --api http://localhost:8080
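The resolution rule can be summarized in a few lines of Go (a hypothetical helper mirroring the documented behavior, not OpenHashDB's actual code):

package main

import "fmt"

// apiBaseURL mirrors the documented fallback: an empty --api value resolves
// to localhost on the configured --api-port.
func apiBaseURL(flagValue string, apiPort int) string {
    if flagValue == "" {
        return fmt.Sprintf("http://localhost:%d", apiPort)
    }
    return flagValue
}

func main() {
    fmt.Println(apiBaseURL("", 8080))                          // http://localhost:8080
    fmt.Println(apiBaseURL("http://remote-server:8080", 8080)) // explicit URL wins
}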
You can interact with OpenHashDB instances running on other machines:
# Connect to remote OpenHashDB instance
./openhash add myfile.txt --api http://192.168.1.100:8080
./openhash list --api http://remote-server:8080
# Daemon running on custom port
./openhash daemon --api-port 9090
# CLI commands using custom port
./openhash add myfile.txt --api http://localhost:9090
add:
- Uploads files via multipart form data to the /upload/file endpoint
- Displays upload confirmation with hash, size, and filename
- Folder uploads via API are not yet implemented (use daemon mode)
get:
- Retrieves content information via the /info/<hash> endpoint
- Downloads and displays small text files automatically
- Shows file metadata including size, MIME type, and content preview
list:
- Fetches the content list via the /list endpoint
- Displays all stored content with metadata
- Shows file/folder icons and creation timestamps
view:
- Currently behaves identically to the get command in API mode
- Retrieves and displays content information and preview
API mode includes comprehensive error handling:
$ ./openhash list --api http://localhost:8080
Error: API connection failed: failed to connect to API at http://localhost:8080: dial tcp: connect: connection refused
$ ./openhash get invalid-hash --api http://localhost:8080
Error: content not found: invalid-hash
API requests include reasonable timeouts to prevent hanging operations.
| Aspect | Direct Mode | API Mode |
|---|---|---|
| Speed | Fastest (direct DB access) | Slightly slower (HTTP overhead) |
| Concurrency | Single process only | Multiple concurrent clients |
| Network | Local only | Local and remote access |
| Daemon Required | No | Yes |
| Database Locks | Exclusive access needed | No conflicts |
Use Direct Mode when:
- No daemon is running
- Maximum performance is required
- Working with local files only
- Single-user scenarios
Use API Mode when:
- Daemon is already running
- Multiple users need access
- Remote access is required
- Building applications that integrate with OpenHashDB
The API mode makes OpenHashDB CLI commands suitable for integration with scripts and applications:
#!/bin/bash
# Example script using API mode
# Start daemon if not running
if ! curl -s http://localhost:8080/health > /dev/null; then
./openhash daemon &
sleep 3
fi
# Upload files using API mode
for file in *.txt; do
./openhash add "$file" --api ""
done
# List all content
./openhash list --api ""
- Daemon not running: Ensure the daemon is started before using API mode
- Wrong port: Verify the daemon is running on the expected port
- Firewall blocking: Check that the API port is accessible
- Network connectivity: Ensure network connection to remote instances
# Check if daemon is running
curl http://localhost:8080/health
# Check daemon logs
./openhash daemon --verbose
# Test API connectivity
./openhash list --api http://localhost:8080 --verbose