Description
This ticket involves implementing the golem:search interface for several major document and vector search providers. This WIT interface defines a unified abstraction over full-text and metadata-based search functionality, enabling developers to interact with a consistent and well-typed API regardless of provider differences.
The interface supports indexed document storage, structured schema definition, full-text and filtered search, result highlighting, faceting, and pagination. It is designed to degrade gracefully when providers do not support a particular capability, using optional fields and structured error variants.
This task is to implement the interface as a series of WASM components (WASI 0.23) in Rust, following Golem conventions for component development, durability integration, and structured error handling.
Note: The golem:search interface was created after analyzing the WIT-based APIs of leading search systems. If you find improvements or simplifications that could be made, you’re encouraged to propose them with justification.
Target Providers
The following providers are prioritized for implementation:
- ElasticSearch: Popular distributed search engine, powerful full-text capabilities, supports scroll-based pagination and flexible schema.
- OpenSearch: AWS-backed fork of ElasticSearch with extended features including index lifecycle management, snapshots, and cluster APIs.
- Algolia: Developer-friendly hosted search API optimized for instant search and relevance tuning, with support for filters, pagination, and ranking.
- Typesense: Lightweight open-source search engine focused on simplicity and speed, supports schema enforcement, vector fields, and filters.
- Meilisearch: Modern, fast, and open-source search engine with support for faceting, typo tolerance, and ranked search, now includes vector support.
Deliverables
Each provider must be implemented as a standalone WASM component with full test coverage and integration with Golem’s durability APIs.
Component Artifacts
search-elastic.wasm
search-opensearch.wasm
search-algolia.wasm
search-typesense.wasm
search-meilisearch.wasm
Each component must:
- Fully implement the golem:search interface per the WIT spec
- Compile cleanly with cargo component targeting WASI 0.23
- Use environment variables for configuration and authentication
- Integrate Golem durability for consistent and resumable execution
- Handle unsupported features using search-error.unsupported or option<T> fields
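The last requirement can be sketched as follows. This is only an illustration, assuming a hand-written Rust mirror of the search-error variant (a real component would use the bindings generated from the WIT file) and a hypothetical Capabilities struct that is not part of the spec. A missing operation returns Unsupported, while missing optional data becomes None.

```rust
/// Hand-written mirror of the WIT `search-error` variant (assumption:
/// a real component would use the generated bindings instead).
#[derive(Debug, Clone, PartialEq)]
pub enum SearchError {
    IndexNotFound,
    InvalidQuery(String),
    Unsupported,
    Internal(String),
    Timeout,
    RateLimited,
}

/// Hypothetical capability flags describing what a provider can do.
pub struct Capabilities {
    pub explicit_index_creation: bool,
    pub scores: bool,
}

/// A whole operation the provider lacks is reported as `Unsupported`.
pub fn create_index(caps: &Capabilities, name: &str) -> Result<(), SearchError> {
    if !caps.explicit_index_creation {
        return Err(SearchError::Unsupported);
    }
    if name.is_empty() {
        return Err(SearchError::InvalidQuery("index name is empty".to_string()));
    }
    // A real implementation would call the provider's API here.
    Ok(())
}

/// Optional data the provider lacks is mapped to `None`, not to an error.
pub fn hit_score(caps: &Capabilities, raw_score: f64) -> Option<f64> {
    if caps.scores {
        Some(raw_score)
    } else {
        None
    }
}
```

Keeping the two cases separate lets callers distinguish "this call can never work on this provider" from "this call worked but carries less detail".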
Testing Requirements
All components must be tested for:
- Index creation and deletion (if supported)
- Document insert, update, delete, and retrieval
- Full-text search with filters, sorting, and pagination
- Highlighted results and facet metadata (where supported)
- Schema inspection and validation
- Search streaming behavior and pagination correctness
- Graceful fallback for unsupported operations
- Error mappings: invalid input, rate limiting, timeouts, network failures
- Integration with Golem durability and config handling
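For the error-mapping item, one possible shape is a single function that turns an HTTP status code into a search-error case, so it can be unit-tested without a live provider. The mapping below is an assumption made for illustration, not something the spec prescribes; in particular, treating every 404 as index-not-found will be wrong for providers that also return 404 for missing documents. SearchError is a hand-written mirror of the WIT variant.

```rust
/// Hand-written mirror of the WIT `search-error` variant (assumption:
/// a real component would use the generated bindings instead).
#[derive(Debug, Clone, PartialEq)]
pub enum SearchError {
    IndexNotFound,
    InvalidQuery(String),
    Unsupported,
    Internal(String),
    Timeout,
    RateLimited,
}

/// Assumed mapping from an HTTP status code and response body to a
/// `SearchError`. Providers differ, so each component would likely
/// need its own adjustments.
pub fn map_http_status(status: u16, body: &str) -> SearchError {
    match status {
        400 | 422 => SearchError::InvalidQuery(body.to_string()),
        404 => SearchError::IndexNotFound,
        408 | 504 => SearchError::Timeout,
        429 => SearchError::RateLimited,
        501 => SearchError::Unsupported,
        other => SearchError::Internal(format!("HTTP {other}: {body}")),
    }
}
```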
Configuration via Environment Variables
Common
- SEARCH_PROVIDER_ENDPOINT
- SEARCH_PROVIDER_TIMEOUT (default: 30)
- SEARCH_PROVIDER_MAX_RETRIES (default: 3)
- SEARCH_PROVIDER_LOG_LEVEL
Provider-Specific Examples
- ALGOLIA_APP_ID, ALGOLIA_API_KEY
- MEILISEARCH_API_KEY
- ELASTIC_PASSWORD, ELASTIC_CLOUD_ID
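A minimal sketch of loading the common variables, with the stated defaults applied when a variable is absent. The struct and function names are hypothetical. The sketch also assumes that the timeout is expressed in seconds and that the endpoint is mandatory, neither of which the ticket states. The lookup is passed in as a closure so the parsing logic can be tested without touching the real environment.

```rust
/// Hypothetical holder for the common settings.
#[derive(Debug, Clone, PartialEq)]
pub struct SearchConfig {
    pub endpoint: String,
    pub timeout_secs: u64, // assumption: the unit is seconds
    pub max_retries: u32,
    pub log_level: Option<String>,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Missing(&'static str),
    Invalid { key: &'static str, value: String },
}

/// Parses `key` if it is set, otherwise falls back to `default`.
fn parse_or_default<F, T>(lookup: &F, key: &'static str, default: T) -> Result<T, ConfigError>
where
    F: Fn(&str) -> Option<String>,
    T: std::str::FromStr,
{
    match lookup(key) {
        None => Ok(default),
        Some(raw) => {
            let parsed = raw.trim().parse::<T>();
            parsed.map_err(|_| ConfigError::Invalid { key, value: raw })
        }
    }
}

/// Builds the config from any key -> value lookup.
pub fn load_config<F>(lookup: F) -> Result<SearchConfig, ConfigError>
where
    F: Fn(&str) -> Option<String>,
{
    // Assumption: the endpoint has no sensible default and must be set.
    let endpoint = lookup("SEARCH_PROVIDER_ENDPOINT")
        .ok_or(ConfigError::Missing("SEARCH_PROVIDER_ENDPOINT"))?;
    Ok(SearchConfig {
        endpoint,
        timeout_secs: parse_or_default(&lookup, "SEARCH_PROVIDER_TIMEOUT", 30u64)?,
        max_retries: parse_or_default(&lookup, "SEARCH_PROVIDER_MAX_RETRIES", 3u32)?,
        log_level: lookup("SEARCH_PROVIDER_LOG_LEVEL"),
    })
}

/// Convenience wrapper that reads the process environment.
pub fn load_config_from_env() -> Result<SearchConfig, ConfigError> {
    load_config(|key| std::env::var(key).ok())
}
```

Provider-specific variables such as ALGOLIA_API_KEY could be read through the same lookup closure in each component.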
Graceful Degradation Strategy
The interface leverages option<T> and search-error.unsupported to enable partial implementations:
- Providers that don’t support index creation can return unsupported
- Schema-inspecting APIs may return empty or inferred schema info
- Facets, highlights, or document scores may be omitted if not available
- Streaming search can fall back to paginated batches internally
- Provider-specific features can be safely ignored unless explicitly declared in provider-params
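The streaming fallback mentioned above could look roughly like the sketch below: an iterator that fetches one page at a time and hands out hits individually. Everything here is illustrative. PagedHits is a hypothetical name, SearchHit is a trimmed mirror of the WIT record, pages are assumed to be 0-based, and the error type is simplified to String.

```rust
use std::collections::VecDeque;

/// Trimmed, hand-written mirror of the WIT `search-hit` record.
#[derive(Debug, Clone, PartialEq)]
pub struct SearchHit {
    pub id: String,
    pub score: Option<f64>,
}

/// Emulates a stream of hits on top of a paginated search call.
pub struct PagedHits<F> {
    fetch_page: F, // called as fetch_page(page, per_page)
    page: u32,     // assumption: 0-based page numbers
    per_page: u32,
    buffer: VecDeque<SearchHit>,
    exhausted: bool,
}

impl<F> PagedHits<F>
where
    F: FnMut(u32, u32) -> Result<Vec<SearchHit>, String>,
{
    pub fn new(per_page: u32, fetch_page: F) -> Self {
        PagedHits { fetch_page, page: 0, per_page, buffer: VecDeque::new(), exhausted: false }
    }
}

impl<F> Iterator for PagedHits<F>
where
    F: FnMut(u32, u32) -> Result<Vec<SearchHit>, String>,
{
    type Item = Result<SearchHit, String>;

    fn next(&mut self) -> Option<Self::Item> {
        // Refill only when the buffer is drained and more pages may exist.
        if self.buffer.is_empty() && !self.exhausted {
            match (self.fetch_page)(self.page, self.per_page) {
                Ok(hits) => {
                    // An empty or short page is treated as the last one.
                    if hits.is_empty() || (hits.len() as u32) < self.per_page {
                        self.exhausted = true;
                    }
                    self.page += 1;
                    self.buffer.extend(hits);
                }
                Err(e) => {
                    // Surface the error once, then stop.
                    self.exhausted = true;
                    return Some(Err(e));
                }
            }
        }
        self.buffer.pop_front().map(Ok)
    }
}
```

One design consequence is that a result set whose size is an exact multiple of per_page costs one extra request, which returns an empty page and ends the iteration.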
This work enables robust and interoperable search functionality across multiple ecosystems, paving the way for plug-and-play search capabilities in the Golem platform.
package golem:search@1.0.0;

/// Core types and error handling for universal search interfaces
interface types {
    /// Common structured errors for search operations
    variant search-error {
        index-not-found,
        invalid-query(string),
        unsupported,
        internal(string),
        timeout,
        rate-limited,
    }

    /// Identifier types
    type index-name = string;
    type document-id = string;
    type json = string;

    /// Document payload
    record doc {
        id: document-id,
        content: json,
    }

    /// Highlight configuration
    record highlight-config {
        fields: list<string>,
        pre-tag: option<string>,
        post-tag: option<string>,
        max-length: option<u32>,
    }

    /// Advanced search tuning
    record search-config {
        timeout-ms: option<u32>,
        boost-fields: list<tuple<string, f32>>,
        attributes-to-retrieve: list<string>,
        language: option<string>,
        typo-tolerance: option<bool>,
        exact-match-boost: option<f32>,
        provider-params: option<json>,
    }

    /// Search request
    record search-query {
        q: option<string>,
        filters: list<string>,
        sort: list<string>,
        facets: list<string>,
        page: option<u32>,
        per-page: option<u32>,
        offset: option<u32>,
        highlight: option<highlight-config>,
        config: option<search-config>,
    }

    /// Search hit
    record search-hit {
        id: document-id,
        score: option<f64>,
        content: option<json>,
        highlights: option<json>,
    }

    /// Search result set
    record search-results {
        total: option<u32>,
        page: option<u32>,
        per-page: option<u32>,
        hits: list<search-hit>,
        facets: option<json>,
        took-ms: option<u32>,
    }

    /// Field schema types
    enum field-type {
        text,
        keyword,
        integer,
        float,
        boolean,
        date,
        geo-point,
    }

    /// Field definition
    record schema-field {
        name: string,
        // `type` is a WIT keyword, so the field name is escaped with `%`
        %type: field-type,
        required: bool,
        facet: bool,
        sort: bool,
        index: bool,
    }

    /// Index schema
    record schema {
        fields: list<schema-field>,
        primary-key: option<string>,
    }
}

/// Unified search interface
interface core {
    use types.{
        index-name, document-id, doc, search-query, search-results,
        search-hit, schema, search-error
    };

    // Index lifecycle
    create-index: func(name: index-name, schema: option<schema>) -> result<_, search-error>;
    delete-index: func(name: index-name) -> result<_, search-error>;
    list-indexes: func() -> result<list<index-name>, search-error>;

    // Document operations
    upsert: func(index: index-name, doc: doc) -> result<_, search-error>;
    upsert-many: func(index: index-name, docs: list<doc>) -> result<_, search-error>;
    delete: func(index: index-name, id: document-id) -> result<_, search-error>;
    delete-many: func(index: index-name, ids: list<document-id>) -> result<_, search-error>;
    get: func(index: index-name, id: document-id) -> result<option<doc>, search-error>;

    // Query
    search: func(index: index-name, query: search-query) -> result<search-results, search-error>;
    stream-search: func(index: index-name, query: search-query) -> result<stream<search-hit>, search-error>;

    // Schema inspection
    get-schema: func(index: index-name) -> result<schema, search-error>;
    update-schema: func(index: index-name, schema: schema) -> result<_, search-error>;
}
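
The search-query record carries page, per-page, and offset, but the spec does not say how they interact. The sketch below shows one possible resolution into a from/size window of the kind ElasticSearch-style providers expect. It rests on assumptions that would need to be confirmed: pages are 1-based, an explicit offset takes precedence over page, and the default page size is 10. Pagination, Window, and resolve_window are hypothetical names.

```rust
/// Pagination-related subset of the WIT `search-query` record.
#[derive(Debug, Default, Clone)]
pub struct Pagination {
    pub page: Option<u32>,
    pub per_page: Option<u32>,
    pub offset: Option<u32>,
}

/// Provider-facing window: skip `from` hits, then return up to `size`.
#[derive(Debug, PartialEq)]
pub struct Window {
    pub from: u32,
    pub size: u32,
}

/// Assumed default page size; the spec does not define one.
pub const DEFAULT_PER_PAGE: u32 = 10;

pub fn resolve_window(p: &Pagination) -> Window {
    let size = p.per_page.unwrap_or(DEFAULT_PER_PAGE);
    let from = match (p.offset, p.page) {
        // Assumption: an explicit offset wins over a page number.
        (Some(offset), _) => offset,
        // Assumption: pages are 1-based; page 0 is treated like page 1.
        (None, Some(page)) => page.saturating_sub(1).saturating_mul(size),
        (None, None) => 0,
    };
    Window { from, size }
}
```

For example, page 3 with per-page 20 resolves to a window starting at hit 40 under these assumptions.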