Description
I have attached to this ticket a WIT file that describes a generic interface for graph database operations. This interface can be implemented by various providers, either by emulating features not present in a given provider, utilizing the provider's native support for a feature, or indicating an error if a particular combination is not natively supported by a provider.
The intent of this WIT specification is to allow developers of WASM components (on wasmCloud, Spin, or Golem) to leverage graph database capabilities to build graph-powered applications, knowledge graphs, and relationship analysis services in a portable and provider-agnostic fashion.
This ticket involves constructing implementations of this WIT interface for the following providers:
- Neo4j: The leading property graph database with comprehensive Cypher query language support, ACID transactions, and rich schema management capabilities.
- ArangoDB: A multi-model database with strong graph capabilities, collection-based data organization, and AQL query language support.
- JanusGraph: A distributed, highly scalable graph database with pluggable storage backends and comprehensive Gremlin traversal support.
These implementations must be written in Rust and compile to WASM Components (WASI 0.2.3 only, since Golem does not yet support WASI 0.3). The standard Rust toolchain for WASM component development can be used (see cargo component and the Rust component examples in this and other Golem repositories).
Additionally, these implementations should incorporate custom durability semantics using the Golem durability API and the Golem host API. This approach ensures that durability is managed at the level of individual graph operations (create-vertex, create-edge, query execution, transaction commit/rollback), providing a higher-level and clearer operation log, which aids in debugging and monitoring. See golem:llm and golem:embed for more details and durable implementations in this same repository.
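To make "durability at the level of individual graph operations" concrete, here is a minimal sketch of the idea. It deliberately does not use the Golem durability API or the Golem host API: `OpLog` and `durable_op` are hypothetical stand-ins that only illustrate the pattern of persisting one log entry per graph operation and serving that entry back during replay instead of contacting the database again.

```rust
/// Hypothetical in-memory operation log, for illustration only.
/// This is NOT the Golem durability API.
#[derive(Debug, Default)]
pub struct OpLog {
    /// (operation name, serialized result) pairs in execution order.
    entries: Vec<(String, Result<String, String>)>,
    /// Index of the next entry to consume while replaying.
    cursor: usize,
}

impl OpLog {
    /// Rewind the cursor so that subsequent calls are served from the log.
    pub fn start_replay(&mut self) {
        self.cursor = 0;
    }

    /// True while previously persisted entries remain to be replayed.
    pub fn is_replaying(&self) -> bool {
        self.cursor < self.entries.len()
    }

    /// Run one graph operation (for example "create-vertex") durably.
    /// In live mode the closure is executed and its result is logged.
    /// In replay mode the logged result is returned and the closure is skipped.
    pub fn durable_op<F>(&mut self, name: &str, live: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        if self.is_replaying() {
            let (logged_name, logged_result) = self.entries[self.cursor].clone();
            if logged_name != name {
                return Err(format!(
                    "replay diverged: log has '{}', code ran '{}'",
                    logged_name, name
                ));
            }
            self.cursor += 1;
            return logged_result;
        }
        let result = live();
        self.entries.push((name.to_string(), result.clone()));
        self.cursor = self.entries.len();
        result
    }

    /// Names of all logged operations, useful for debugging and monitoring.
    pub fn operation_names(&self) -> Vec<&str> {
        self.entries.iter().map(|(name, _)| name.as_str()).collect()
    }
}
```

With this shape, the log reads as a sequence of graph operations (create-vertex, create-edge, commit) rather than a sequence of low-level HTTP calls, which is the property the ticket asks for. A real implementation would delegate persistence and replay detection to Golem.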
The final deliverables associated with this ticket are:
- Neo4j implementation: A WASM Component (WASI 0.2.3), named `graph-neo4j.wasm`, with a full test suite and custom durability implementation at the level of graph operations.
- ArangoDB implementation: A WASM Component (WASI 0.2.3), named `graph-arangodb.wasm`, with a full test suite and custom durability implementation at the level of graph operations.
- JanusGraph implementation: A WASM Component (WASI 0.2.3), named `graph-janusgraph.wasm`, with a full test suite and custom durability implementation at the level of graph operations.
Note: If you have a strong recommendation to swap out one or two of these with other popular / common graph databases, then as long as you get permission beforehand, that's okay with me. However, we definitely need Neo4j and ArangoDB.
These components will require runtime configuration, notably connection strings, authentication credentials, database names, and endpoint URLs. For configuring this information, the components can use environment variables for now (in the future, they will use wasi-runtime-config, but Golem does not support this yet, whereas Golem has good support for environment variables).
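As a rough sketch of environment-based configuration, the snippet below builds a Rust mirror of part of the `connection-config` record from a key lookup. The variable names (`GRAPH_HOSTS`, `GRAPH_PORT`, `GRAPH_DATABASE`, `GRAPH_USER`, `GRAPH_PASSWORD`, `GRAPH_TIMEOUT_SECONDS`) are assumptions made for illustration; the ticket does not prescribe them, and each provider may need its own set.

```rust
use std::env;

/// Rust mirror of a subset of the WIT `connection-config` record.
#[derive(Debug, Clone, PartialEq)]
pub struct ConnectionConfig {
    pub hosts: Vec<String>,
    pub port: Option<u16>,
    pub database_name: Option<String>,
    pub username: Option<String>,
    pub password: Option<String>,
    pub timeout_seconds: Option<u32>,
}

/// Build a config from any key lookup, so the parsing logic can be tested
/// without touching the process environment. Variable names are hypothetical.
pub fn config_from_lookup<F>(lookup: F) -> Result<ConnectionConfig, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Hosts are required and given as a comma-separated list.
    let hosts: Vec<String> = lookup("GRAPH_HOSTS")
        .ok_or_else(|| "GRAPH_HOSTS is not set".to_string())?
        .split(',')
        .map(|h| h.trim().to_string())
        .filter(|h| !h.is_empty())
        .collect();
    if hosts.is_empty() {
        return Err("GRAPH_HOSTS contains no hosts".to_string());
    }

    // Optional numeric settings must parse cleanly when present.
    let port = match lookup("GRAPH_PORT") {
        Some(raw) => Some(
            raw.parse::<u16>()
                .map_err(|e| format!("invalid GRAPH_PORT '{}': {}", raw, e))?,
        ),
        None => None,
    };
    let timeout_seconds = match lookup("GRAPH_TIMEOUT_SECONDS") {
        Some(raw) => Some(
            raw.parse::<u32>()
                .map_err(|e| format!("invalid GRAPH_TIMEOUT_SECONDS '{}': {}", raw, e))?,
        ),
        None => None,
    };

    Ok(ConnectionConfig {
        hosts,
        port,
        database_name: lookup("GRAPH_DATABASE"),
        username: lookup("GRAPH_USER"),
        password: lookup("GRAPH_PASSWORD"),
        timeout_seconds,
    })
}

/// Read the same settings from the process environment.
pub fn config_from_env() -> Result<ConnectionConfig, String> {
    config_from_lookup(|key| env::var(key).ok())
}
```

Keeping the parsing behind a lookup function should also make a later move to wasi-runtime-config a matter of swapping the lookup, although that is a design guess rather than a requirement of the ticket.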
Moreover, the Rust components need to be tested within Golem to ensure compatibility with Golem 1.2.x.
This WIT has been designed by examining and comparing the APIs of Neo4j, ArangoDB, NebulaGraph, Amazon Neptune, TigerGraph, and JanusGraph. However, given that there are no implementations yet, it is possible the provided WIT is not the optimal abstraction across all these providers. Therefore, deviations from the proposed design can be made. However, to be accepted, any deviation must be fully justified and deemed by Golem core contributors to be an improvement over the original specification.
Implementation Guidelines
Each provider implementation should handle the following key mapping considerations:
- Vertex Types: Map the `vertex-type` field appropriately (to labels for Neo4j, collections for ArangoDB, vertex labels for JanusGraph, etc.)
- Transaction Semantics: Implement native transactions where supported, or emulate via sequential operations with appropriate error handling
- Schema Management: Utilize native schema capabilities where available, or return `unsupported-operation` errors for unsupported schema operations
- Query Language Support: Route queries through the generic query interface using each provider's native query language (Cypher, AQL, Gremlin, etc.)
- Error Mapping: Map provider-specific errors to the unified `graph-error` enumeration
- Property Type Conversion: Handle conversion between the unified property type system and provider-specific type systems
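The error-mapping and unsupported-operation guidelines above can be sketched as follows. The enum mirrors only a subset of the WIT `graph-error` variant, and the status-to-variant table is an assumption for a provider reached over HTTP; real providers often report the precise cause in a provider-specific error code in the response body, which a full implementation would inspect first.

```rust
/// Rust mirror of a subset of the WIT `graph-error` variant.
#[derive(Debug, Clone, PartialEq)]
pub enum GraphError {
    UnsupportedOperation(String),
    AuthenticationFailed(String),
    AuthorizationFailed(String),
    InvalidQuery(String),
    TransactionConflict,
    Timeout,
    ResourceExhausted(String),
    ServiceUnavailable(String),
    InternalError(String),
}

/// Map an HTTP status returned by a provider endpoint to the unified error.
/// The mapping below is an illustrative assumption, not a provider contract.
pub fn map_http_status(status: u16, body: &str) -> GraphError {
    match status {
        400 => GraphError::InvalidQuery(body.to_string()),
        401 => GraphError::AuthenticationFailed(body.to_string()),
        403 => GraphError::AuthorizationFailed(body.to_string()),
        408 => GraphError::Timeout,
        409 => GraphError::TransactionConflict,
        429 => GraphError::ResourceExhausted(body.to_string()),
        503 => GraphError::ServiceUnavailable(body.to_string()),
        _ => GraphError::InternalError(format!(
            "unexpected HTTP status {}: {}",
            status, body
        )),
    }
}

/// Build the error returned when a provider has no native or emulated
/// support for an operation (for example a schema call on a schema-free store).
pub fn unsupported(provider: &str, operation: &str) -> GraphError {
    GraphError::UnsupportedOperation(format!(
        "{} does not support {}",
        provider, operation
    ))
}
```

Centralizing the mapping in one function per provider keeps the durable operation log free of provider-specific error shapes.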
Testing Requirements
Each implementation must include comprehensive test suites covering:
- Basic CRUD operations (vertex/edge creation, retrieval, update, deletion)
- Transaction lifecycle (begin, commit, rollback)
- Schema operations (type definition, index creation, constraint management)
- Query execution with various complexity levels
- Traversal operations (pathfinding, neighborhood exploration)
- Error handling for unsupported operations
- Connection management and configuration
- Durability semantics verification
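As an illustration of what a transaction-lifecycle test checks, here is a small in-memory fake with begin, commit, rollback, and is-active semantics. `FakeGraph` and `FakeTransaction` are hypothetical types invented for this sketch; a real test suite would run the same assertions against the provider components inside Golem.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory graph used only to illustrate lifecycle checks.
#[derive(Debug, Default)]
pub struct FakeGraph {
    committed: HashMap<u64, String>,
    next_id: u64,
}

/// Hypothetical transaction that buffers writes until commit.
pub struct FakeTransaction<'g> {
    graph: &'g mut FakeGraph,
    pending: Vec<(u64, String)>,
    active: bool,
}

impl FakeGraph {
    pub fn begin_transaction<'g>(&'g mut self) -> FakeTransaction<'g> {
        FakeTransaction { graph: self, pending: Vec::new(), active: true }
    }

    /// Number of vertices visible outside any transaction.
    pub fn vertex_count(&self) -> usize {
        self.committed.len()
    }
}

impl<'g> FakeTransaction<'g> {
    /// Buffer a new vertex and return its id.
    pub fn create_vertex(&mut self, vertex_type: &str) -> Result<u64, String> {
        if !self.active {
            return Err("transaction is no longer active".to_string());
        }
        let id = self.graph.next_id;
        self.graph.next_id += 1;
        self.pending.push((id, vertex_type.to_string()));
        Ok(id)
    }

    /// Make all buffered writes visible and end the transaction.
    pub fn commit(&mut self) -> Result<(), String> {
        if !self.active {
            return Err("transaction is no longer active".to_string());
        }
        for (id, vertex_type) in self.pending.drain(..) {
            self.graph.committed.insert(id, vertex_type);
        }
        self.active = false;
        Ok(())
    }

    /// Discard all buffered writes and end the transaction.
    pub fn rollback(&mut self) -> Result<(), String> {
        if !self.active {
            return Err("transaction is no longer active".to_string());
        }
        self.pending.clear();
        self.active = false;
        Ok(())
    }

    pub fn is_active(&self) -> bool {
        self.active
    }
}
```

The properties worth asserting are that rolled-back writes never become visible, that committed writes do, and that any operation on a finished transaction fails.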
package golem:graph@1.0.0;
/// Core data types and structures unified across graph databases
interface types {
/// Universal property value types that can be represented across all graph databases
variant property-value {
null-value,
boolean(bool),
int8(s8),
int16(s16),
int32(s32),
int64(s64),
uint8(u8),
uint16(u16),
uint32(u32),
uint64(u64),
float32(f32),
float64(f64),
%string(string),
bytes(list<u8>),
// Temporal types (unified representation)
date(date),
time(time),
datetime(datetime),
duration(duration),
// Geospatial types (unified GeoJSON-like representation)
point(point),
linestring(linestring),
polygon(polygon),
// Collection types
%list(list<property-value>),
map(list<tuple<string, property-value>>),
set(list<property-value>),
}
/// Temporal types with unified representation
record date {
year: u32,
month: u8, // 1-12
day: u8, // 1-31
}
record time {
hour: u8, // 0-23
minute: u8, // 0-59
second: u8, // 0-59
nanosecond: u32, // 0-999,999,999
}
record datetime {
date: date,
time: time,
timezone-offset-minutes: option<s16>, // UTC offset in minutes
}
record duration {
seconds: s64,
nanoseconds: u32,
}
/// Geospatial types (WGS84 coordinates)
record point {
longitude: f64,
latitude: f64,
altitude: option<f64>,
}
record linestring {
coordinates: list<point>,
}
record polygon {
exterior: list<point>,
holes: option<list<list<point>>>,
}
/// Universal element ID that can represent various database ID schemes
variant element-id {
%string(string),
int64(s64),
uuid(string),
composite(list<property-value>),
}
/// Property map - consistent with insertion format
type property-map = list<tuple<string, property-value>>;
/// Vertex representation
record vertex {
id: element-id,
vertex-type: string, // Primary type (collection/tag/label)
additional-labels: list<string>, // Secondary labels (Neo4j-style)
properties: property-map,
}
/// Edge representation
record edge {
id: element-id,
edge-type: string, // Edge type/relationship type
from-vertex: element-id,
to-vertex: element-id,
properties: property-map,
}
/// Path through the graph
record path {
vertices: list<vertex>,
edges: list<edge>,
length: u32,
}
/// Direction for traversals
enum direction {
outgoing,
incoming,
both,
}
/// Comparison operators for filtering
enum comparison-operator {
equal,
not-equal,
less-than,
less-than-or-equal,
greater-than,
greater-than-or-equal,
contains,
starts-with,
ends-with,
regex-match,
in-list,
not-in-list,
}
/// Filter condition for queries
record filter-condition {
property: string,
operator: comparison-operator,
value: property-value,
}
/// Sort specification
record sort-spec {
property: string,
ascending: bool,
}
}
/// Error handling unified across all graph database providers
interface errors {
use types.{element-id};
/// Comprehensive error types that can represent failures across different graph databases
variant graph-error {
// Feature/operation not supported by current provider
unsupported-operation(string),
// Connection and authentication errors
connection-failed(string),
authentication-failed(string),
authorization-failed(string),
// Data and schema errors
element-not-found(element-id),
duplicate-element(element-id),
schema-violation(string),
constraint-violation(string),
invalid-property-type(string),
invalid-query(string),
// Transaction errors
transaction-failed(string),
transaction-conflict,
transaction-timeout,
deadlock-detected,
// System errors
timeout,
resource-exhausted(string),
internal-error(string),
service-unavailable(string),
}
}
/// Connection management and graph instance creation
interface connection {
use errors.{graph-error};
use transactions.{transaction};
/// Configuration for connecting to graph databases
record connection-config {
// Connection parameters
hosts: list<string>,
port: option<u16>,
database-name: option<string>,
// Authentication
username: option<string>,
password: option<string>,
// Connection behavior
timeout-seconds: option<u32>,
max-connections: option<u32>,
// Provider-specific configuration as key-value pairs
provider-config: list<tuple<string, string>>,
}
/// Main graph database resource
resource graph {
/// Create a new transaction for performing operations
begin-transaction: func() -> result<transaction, graph-error>;
/// Create a read-only transaction (may be optimized by provider)
begin-read-transaction: func() -> result<transaction, graph-error>;
/// Test connection health
ping: func() -> result<_, graph-error>;
/// Close the graph connection
close: func() -> result<_, graph-error>;
/// Get basic graph statistics if supported
get-statistics: func() -> result<graph-statistics, graph-error>;
}
/// Basic graph statistics
record graph-statistics {
vertex-count: option<u64>,
edge-count: option<u64>,
label-count: option<u32>,
property-count: option<u64>,
}
/// Connect to a graph database with the specified configuration
connect: func(config: connection-config) -> result<graph, graph-error>;
}
/// All graph operations performed within transaction contexts
interface transactions {
use types.{vertex, edge, path, element-id, property-map, property-value, filter-condition, sort-spec, direction};
use errors.{graph-error};
/// Transaction resource - all operations go through transactions
resource transaction {
// === VERTEX OPERATIONS ===
/// Create a new vertex
create-vertex: func(vertex-type: string, properties: property-map) -> result<vertex, graph-error>;
/// Create vertex with additional labels (for multi-label systems like Neo4j)
create-vertex-with-labels: func(vertex-type: string, additional-labels: list<string>, properties: property-map) -> result<vertex, graph-error>;
/// Get vertex by ID
get-vertex: func(id: element-id) -> result<option<vertex>, graph-error>;
/// Update vertex properties (replaces all properties)
update-vertex: func(id: element-id, properties: property-map) -> result<vertex, graph-error>;
/// Update specific vertex properties (partial update)
update-vertex-properties: func(id: element-id, updates: property-map) -> result<vertex, graph-error>;
/// Delete vertex (and optionally its edges)
delete-vertex: func(id: element-id, delete-edges: bool) -> result<_, graph-error>;
/// Find vertices by type and optional filters
find-vertices: func(
vertex-type: option<string>,
filters: option<list<filter-condition>>,
sort: option<list<sort-spec>>,
limit: option<u32>,
offset: option<u32>
) -> result<list<vertex>, graph-error>;
// === EDGE OPERATIONS ===
/// Create a new edge
create-edge: func(
edge-type: string,
from-vertex: element-id,
to-vertex: element-id,
properties: property-map
) -> result<edge, graph-error>;
/// Get edge by ID
get-edge: func(id: element-id) -> result<option<edge>, graph-error>;
/// Update edge properties
update-edge: func(id: element-id, properties: property-map) -> result<edge, graph-error>;
/// Update specific edge properties (partial update)
update-edge-properties: func(id: element-id, updates: property-map) -> result<edge, graph-error>;
/// Delete edge
delete-edge: func(id: element-id) -> result<_, graph-error>;
/// Find edges by type and optional filters
find-edges: func(
edge-types: option<list<string>>,
filters: option<list<filter-condition>>,
sort: option<list<sort-spec>>,
limit: option<u32>,
offset: option<u32>
) -> result<list<edge>, graph-error>;
// === TRAVERSAL OPERATIONS ===
/// Get adjacent vertices through specified edge types
get-adjacent-vertices: func(
vertex-id: element-id,
direction: direction,
edge-types: option<list<string>>,
limit: option<u32>
) -> result<list<vertex>, graph-error>;
/// Get edges connected to a vertex
get-connected-edges: func(
vertex-id: element-id,
direction: direction,
edge-types: option<list<string>>,
limit: option<u32>
) -> result<list<edge>, graph-error>;
// === BATCH OPERATIONS ===
/// Create multiple vertices in a single operation
create-vertices: func(vertices: list<vertex-spec>) -> result<list<vertex>, graph-error>;
/// Create multiple edges in a single operation
create-edges: func(edges: list<edge-spec>) -> result<list<edge>, graph-error>;
/// Upsert vertex (create or update)
upsert-vertex: func(
id: option<element-id>,
vertex-type: string,
properties: property-map
) -> result<vertex, graph-error>;
/// Upsert edge (create or update)
upsert-edge: func(
id: option<element-id>,
edge-type: string,
from-vertex: element-id,
to-vertex: element-id,
properties: property-map
) -> result<edge, graph-error>;
// === TRANSACTION CONTROL ===
/// Commit the transaction
commit: func() -> result<_, graph-error>;
/// Rollback the transaction
rollback: func() -> result<_, graph-error>;
/// Check if transaction is still active
is-active: func() -> bool;
}
/// Vertex specification for batch creation
record vertex-spec {
vertex-type: string,
additional-labels: option<list<string>>,
properties: property-map,
}
/// Edge specification for batch creation
record edge-spec {
edge-type: string,
from-vertex: element-id,
to-vertex: element-id,
properties: property-map,
}
}
/// Schema management operations (optional/emulated for schema-free databases)
interface schema {
use types.{property-value};
use errors.{graph-error};
use connection.{edge-type-definition};
/// Property type definitions for schema
enum property-type {
boolean,
int32,
int64,
float32,
float64,
%string,
bytes,
date,
datetime,
point,
%list,
map,
}
/// Index types
enum index-type {
exact, // Exact match index
range, // Range queries (>, <, etc.)
text, // Text search
geospatial, // Geographic queries
}
/// Property definition for schema
record property-definition {
name: string,
%type: property-type,
required: bool,
unique: bool,
default-value: option<property-value>,
}
/// Vertex label schema
record vertex-label-schema {
label: string,
properties: list<property-definition>,
/// Container/collection this label maps to (for container-based systems)
container: option<string>,
}
/// Edge label schema
record edge-label-schema {
label: string,
properties: list<property-definition>,
from-labels: option<list<string>>, // Allowed source vertex labels
to-labels: option<list<string>>, // Allowed target vertex labels
/// Container/collection this label maps to (for container-based systems)
container: option<string>,
}
/// Index definition
record index-definition {
name: string,
label: string, // Vertex or edge label
properties: list<string>, // Properties to index
%type: index-type,
unique: bool,
/// Container/collection this index applies to
container: option<string>,
}
/// Schema management resource
resource schema-manager {
/// Define or update vertex label schema
define-vertex-label: func(schema: vertex-label-schema) -> result<_, graph-error>;
/// Define or update edge label schema
define-edge-label: func(schema: edge-label-schema) -> result<_, graph-error>;
/// Get vertex label schema
get-vertex-label-schema: func(label: string) -> result<option<vertex-label-schema>, graph-error>;
/// Get edge label schema
get-edge-label-schema: func(label: string) -> result<option<edge-label-schema>, graph-error>;
/// List all vertex labels
list-vertex-labels: func() -> result<list<string>, graph-error>;
/// List all edge labels
list-edge-labels: func() -> result<list<string>, graph-error>;
/// Create index
create-index: func(index: index-definition) -> result<_, graph-error>;
/// Drop index
drop-index: func(name: string) -> result<_, graph-error>;
/// List indexes
list-indexes: func() -> result<list<index-definition>, graph-error>;
/// Get index by name
get-index: func(name: string) -> result<option<index-definition>, graph-error>;
/// Define edge type for structural databases (ArangoDB-style)
define-edge-type: func(definition: edge-type-definition) -> result<_, graph-error>;
/// List edge type definitions
list-edge-types: func() -> result<list<edge-type-definition>, graph-error>;
/// Create container/collection for organizing data
create-container: func(name: string, container-type: container-type) -> result<_, graph-error>;
/// List containers/collections
list-containers: func() -> result<list<container-info>, graph-error>;
}
/// Container/collection types
enum container-type {
vertex-container,
edge-container,
}
/// Container information
record container-info {
name: string,
%type: container-type,
element-count: option<u64>,
}
/// Get schema manager for the graph
get-schema-manager: func() -> result<schema-manager, graph-error>;
}
/// Generic query interface for database-specific query languages
interface query {
use types.{vertex, edge, path, property-value};
use errors.{graph-error};
use transactions.{transaction};
/// Query result that maintains symmetry with data insertion formats
variant query-result {
vertices(list<vertex>),
edges(list<edge>),
paths(list<path>),
values(list<property-value>),
maps(list<list<tuple<string, property-value>>>), // For tabular results
}
/// Query parameters for parameterized queries
type query-parameters = list<tuple<string, property-value>>;
/// Query execution options
record query-options {
timeout-seconds: option<u32>,
max-results: option<u32>,
explain: bool, // Return execution plan instead of results
profile: bool, // Include performance metrics
}
/// Query execution result with metadata
record query-execution-result {
result: query-result,
execution-time-ms: option<u32>,
rows-affected: option<u32>,
explanation: option<string>, // Execution plan if requested
profile-data: option<string>, // Performance data if requested
}
/// Execute a database-specific query string
execute-query: func(
transaction: borrow<transaction>,
query: string,
parameters: option<query-parameters>,
options: option<query-options>
) -> result<query-execution-result, graph-error>;
}
/// Graph traversal and pathfinding operations
interface traversal {
use types.{vertex, edge, path, element-id, direction, filter-condition};
use errors.{graph-error};
use transactions.{transaction};
/// Path finding options
record path-options {
max-depth: option<u32>,
edge-types: option<list<string>>,
vertex-types: option<list<string>>,
vertex-filters: option<list<filter-condition>>,
edge-filters: option<list<filter-condition>>,
}
/// Neighborhood exploration options
record neighborhood-options {
depth: u32,
direction: direction,
edge-types: option<list<string>>,
max-vertices: option<u32>,
}
/// Subgraph containing related vertices and edges
record subgraph {
vertices: list<vertex>,
edges: list<edge>,
}
/// Find shortest path between two vertices
find-shortest-path: func(
transaction: borrow<transaction>,
%from: element-id,
to: element-id,
options: option<path-options>
) -> result<option<path>, graph-error>;
/// Find all paths between two vertices (up to limit)
find-all-paths: func(
transaction: borrow<transaction>,
%from: element-id,
to: element-id,
options: option<path-options>,
limit: option<u32>
) -> result<list<path>, graph-error>;
/// Get k-hop neighborhood around a vertex
get-neighborhood: func(
transaction: borrow<transaction>,
center: element-id,
options: neighborhood-options
) -> result<subgraph, graph-error>;
/// Check if path exists between vertices
path-exists: func(
transaction: borrow<transaction>,
%from: element-id,
to: element-id,
options: option<path-options>
) -> result<bool, graph-error>;
/// Get vertices at specific distance from source
get-vertices-at-distance: func(
transaction: borrow<transaction>,
source: element-id,
distance: u32,
direction: direction,
edge-types: option<list<string>>
) -> result<list<vertex>, graph-error>;
}
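As one example of property type conversion, the sketch below renders the `datetime` record from the `types` interface as an ISO 8601 string, a form that provider query languages commonly accept for temporal values. The Rust structs mirror the WIT records; the function name `to_iso8601` and the choice to emit `Z` for a zero offset and no suffix for a missing offset are assumptions made for illustration.

```rust
/// Rust mirrors of the WIT `date`, `time`, and `datetime` records.
pub struct Date {
    pub year: u32,
    pub month: u8,
    pub day: u8,
}

pub struct Time {
    pub hour: u8,
    pub minute: u8,
    pub second: u8,
    pub nanosecond: u32,
}

pub struct Datetime {
    pub date: Date,
    pub time: Time,
    pub timezone_offset_minutes: Option<i16>,
}

/// Render a `Datetime` as an ISO 8601 string, validating the field ranges
/// documented in the WIT comments. Calendar validity (for example
/// February 30) is not checked in this sketch.
pub fn to_iso8601(dt: &Datetime) -> Result<String, String> {
    let (d, t) = (&dt.date, &dt.time);
    if !(1..=12).contains(&d.month) || !(1..=31).contains(&d.day) {
        return Err(format!("invalid date: month {}, day {}", d.month, d.day));
    }
    if t.hour > 23 || t.minute > 59 || t.second > 59 || t.nanosecond > 999_999_999 {
        return Err("invalid time component".to_string());
    }
    let mut out = format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}.{:09}",
        d.year, d.month, d.day, t.hour, t.minute, t.second, t.nanosecond
    );
    match dt.timezone_offset_minutes {
        // No offset: treated here as a local (zone-less) datetime.
        None => {}
        Some(0) => out.push('Z'),
        Some(offset) => {
            let sign = if offset < 0 { '-' } else { '+' };
            // Widen before abs() so i16::MIN cannot overflow.
            let abs = i32::from(offset).abs();
            out.push_str(&format!("{}{:02}:{:02}", sign, abs / 60, abs % 60));
        }
    }
    Ok(out)
}
```

The reverse direction (parsing provider output back into `datetime`) needs the same care, since providers differ in whether they preserve the offset and in how many fractional digits they return.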