This project is a gRPC server written in Rust.
It can store multiple text fragments inside audio files and perform lightning-fast semantic search over them. Among other things, it can serve as the retriever component in a RAG (Retrieval-Augmented Generation) system.
Note: the project is not finished; improvements will be added as soon as possible.
- Index building: creating an HNSW search index from vector representations.
- Semantic search: performing fast vector similarity search on stored text fragments.
- Parallel processing: optimized search using parallel processing.
- No database required: all data is stored locally in WAV audio files and JSON metadata.
A preloaded local model is used to create embeddings (vector representations of text), without connecting to external AI APIs. For example, you can use the following models:
- EN

To use them, download the following files and put them into the `./model` directory of the project:
- `model.onnx`
- `config.json`
- `tokenizer.json`
- `tokenizer_config.json`
- `special_tokens_map.json`
In `config.yaml` the values for the following fields are set:

Auth
- `username` - username (for basic auth).
- `password` - password (for basic auth).

Server
- `host` - host to run the gRPC server on.
- `port` - port to run the gRPC server on.

Logging
- `log_level` - log/trace level.

RateLimit
- `capacity` - maximum number of tokens (bucket capacity).
- `refill_rate` - number of tokens added per refill interval (`refill_interval_ms`).
- `refill_interval_ms` - duration of the refill interval (in milliseconds).

App
- `model_dir` - directory to store model files (example: `./model`).
- `audio_dir` - directory to store audio files (example: `./output/audio`).
- `output_dir` - parent directory for indexing results (example: `./output`).
- `index_path` - path to the search index file (example: `./output/hnsw.idx`).
- `storage_path` - path to the file with metadata (example: `./output/storage.json`).
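Putting the fields together, a `config.yaml` might look like the sketch below. All values here are hypothetical examples, and the exact key names and casing should be checked against the project's config parser:

```yaml
Auth:
  username: admin        # hypothetical credentials
  password: secret

Server:
  host: 0.0.0.0
  port: 9090

Logging:
  log_level: info

RateLimit:
  capacity: 20           # bucket capacity
  refill_rate: 10        # tokens added per interval
  refill_interval_ms: 1000

App:
  model_dir: ./model
  audio_dir: ./output/audio
  output_dir: ./output
  index_path: ./output/hnsw.idx
  storage_path: ./output/storage.json
```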
The structure of the `/output` directory after building the index:

```
output/
├── audio/
│   ├── batch_0.wav   # First 500 text chunks
│   ├── batch_1.wav   # Next 500 text chunks
│   └── ...
├── hnsw.idx          # Search index with embeddings
└── storage.json      # Metadata and batch information
```
- Audio encoding: 16-bit WAV files (mono), 48 kHz sampling rate.
- Batch size: 500 text fragments per audio file (configurable in audio.rs).
- Embedding model: any embedding model can be used (examples above).
- Search algorithm: HNSW with cosine similarity + fallback parallel linear search.
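The fallback linear search can be sketched as a brute-force scan with cosine similarity. This is a minimal illustration, not the project's actual implementation; the real version runs in parallel (e.g. with a library such as rayon):

```rust
// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

// Brute-force fallback: score every stored vector against the query,
// sort by similarity descending, and keep the best `top_k`.
fn linear_search(query: &[f32], stored: &[Vec<f32>], top_k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine_similarity(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(top_k);
    scored
}

fn main() {
    let stored = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let results = linear_search(&[1.0, 0.0], &stored, 2);
    println!("{:?}", results); // best match first
}
```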
Structure of an audio file `batch_n` (after encoding):
- First, the number of chunks (4 bytes)
- Then, for each chunk:
  - Length (4 bytes)
  - Data (N bytes)
- Trailing zeros (500)
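The layout above can be sketched as follows. This is an illustration only; the byte order of the length fields is an assumption and should be verified against the actual encoder in `audio.rs`:

```rust
// Serialize text chunks into the batch byte layout described above.
// Assumption: length fields are little-endian u32 (verify against audio.rs).
fn encode_batch(chunks: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    // Number of chunks (4 bytes).
    out.extend_from_slice(&(chunks.len() as u32).to_le_bytes());
    for chunk in chunks {
        let data = chunk.as_bytes();
        // Per-chunk length (4 bytes), then the data itself.
        out.extend_from_slice(&(data.len() as u32).to_le_bytes());
        out.extend_from_slice(data);
    }
    // Trailing zeros (500 bytes of padding).
    out.extend(std::iter::repeat(0u8).take(500));
    out
}

fn main() {
    let bytes = encode_batch(&["hello", "world"]);
    // 4 (count) + 2 * (4 + 5) + 500 (padding) = 522 bytes
    println!("{}", bytes.len());
}
```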
Rate limiting uses the Token Bucket algorithm.
It is worth noting that this algorithm allows a burst when tokens have accumulated (the bucket is full).
It is currently implemented via the third-party crate `rater`.
The rate limit is applied to all routes in total.
To calculate the resulting RPS, use the formula `refill_rate * 1000 / refill_interval_ms`.
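As a worked illustration of the formula (the values below are hypothetical, not project defaults):

```rust
// Sustained requests-per-second allowed by the token bucket refill:
// refill_rate tokens every refill_interval_ms milliseconds.
fn rps(refill_rate: u64, refill_interval_ms: u64) -> f64 {
    refill_rate as f64 * 1000.0 / refill_interval_ms as f64
}

fn main() {
    // e.g. 10 tokens added every 500 ms -> 20 RPS sustained
    println!("{}", rps(10, 500));
}
```

Note that `capacity` only bounds the burst size; the sustained rate depends solely on the refill parameters.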
To send a request to the server, take `text_indexer.proto` (from the `./proto` directory) and use it in your client.
You can check the functionality via, for example, Postman.
- `authorization: Basic <base64_token>` - `username:password` authentication token in base64 format.
- `correlation-id: <id>` - identifier for request tracing (if not specified, the server will generate its own).
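The Basic auth token is simply `username:password` encoded with standard base64. A minimal sketch is shown below; the credentials are hypothetical, and in practice you would use a base64 crate rather than hand-rolling the encoder:

```rust
const ALPHABET: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard base64 encoding (hand-rolled here for illustration only).
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::new();
    for chunk in input.chunks(3) {
        // Pack up to 3 bytes into a 24-bit group.
        let b = [chunk[0], *chunk.get(1).unwrap_or(&0), *chunk.get(2).unwrap_or(&0)];
        let n = ((b[0] as u32) << 16) | ((b[1] as u32) << 8) | b[2] as u32;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Pad with '=' when fewer than 3 input bytes remain.
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

fn main() {
    // Hypothetical credentials matching the Auth section of config.yaml.
    let token = base64_encode(b"user:pass");
    println!("authorization: Basic {token}");
}
```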
If the text file is located on the server, you can specify the path to it:

```json
{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "file_path": "./articles_on_various_topics.txt",
  "chunk_size": 150
}
```
Otherwise, you can pass the text directly in the request:

```json
{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "content": "text in base64 format",
  "chunk_size": 150
}
```
As a result, the server will return JSON of the following form:

```json
{
  "id": "123e4567-e89b-12d3-a456-426614174000"
}
```
- `authorization: Basic <base64_token>` - `username:password` authentication token in base64 format.
- `correlation-id: <id>` - identifier for request tracing (if not specified, the server will generate its own).
```json
{
  "id": "123e4567-e89b-12d3-a456-426614174001",
  "query": "Scientific discoveries of the Hubble Space Telescope",
  "top_k": 5,
  "min_similarity": 0.3
}
```
As a result, the server will return JSON of the following form:

```json
{
  "id": "123e4567-e89b-12d3-a456-426614174001",
  "results": [
    {
      "text": "One of the Hubble Space Telescope's major discoveries is evidence that the expansion of the Universe is accelerating, driven by dark energy.",
      "score": 0.9029404520988464
    },
    {
      "text": "The Hubble Space Telescope has captured amazing star formation in nebulae such as Orion and Aquila.",
      "score": 0.8357565402984619
    },
    {
      "text": "Launched in 1990, the Hubble Space Telescope has become one of the most important instruments in the history of astronomy.",
      "score": 0.7911539673805237
    },
    {
      "text": "In three decades of operation, Hubble has helped to clarify the age of the Universe and prove the existence of dark matter.",
      "score": 0.7869136929512024
    },
    {
      "text": "In some cultures, crows are considered a symbol of wisdom, and science backs up that reputation.",
      "score": 0.3231840753555298
    }
  ]
}
```
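The `top_k` and `min_similarity` parameters act as a filter followed by a truncation over scored results. The sketch below illustrates the request semantics only; it is not the server's actual code:

```rust
// Keep only results scoring at least `min_similarity`,
// then return the best `top_k` of them, highest score first.
fn select_results(
    mut scored: Vec<(String, f32)>,
    top_k: usize,
    min_similarity: f32,
) -> Vec<(String, f32)> {
    scored.retain(|(_, score)| *score >= min_similarity);
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.truncate(top_k);
    scored
}

fn main() {
    let scored = vec![
        ("hubble discovery".to_string(), 0.90),
        ("crows".to_string(), 0.32),
        ("unrelated".to_string(), 0.10),
    ];
    // With min_similarity = 0.3, the 0.10 result is dropped before truncation.
    let out = select_results(scored, 5, 0.3);
    println!("{}", out.len()); // 2
}
```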
- To install Rust on Unix-like systems (macOS, Linux, ...), run the command below in a terminal. After the download completes, you will have the latest stable version of Rust for your platform, as well as the latest version of Cargo.

```shell
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

- Run the following command in the terminal to verify. If the installation (step 1) was successful, you will see something like `cargo 1.88.0 ...`.

```shell
cargo --version
```
- Clone the project from GitHub, open it, and run the following commands.

Check that the code compiles (without running it):

```shell
cargo check
```

Build and run the project (in release mode, with optimizations):

```shell
cargo run --release
```
Note: if you are on Windows, see the instructions here.
To deploy the project locally in Docker, you need to:

- Make sure the Docker daemon is running.
- Make sure the embedding model is present in the `./model` directory of the project (files downloaded and added).
- Open a terminal in the root of the project and build the image, for example:
  `docker build -t disorder-server .`
- After the image is built, run the container, for example:
  `docker run --rm -p 9090:9090 disorder-server`
- Enjoy using the service.
This project was inspired by memau, a project that stores data in audio files.
This project is licensed under the MIT License or Apache License 2.0, your choice.