A high-performance autocomplete library implemented in Rust, inspired by the Python fast-autocomplete library.
- Directed Word Graph (DWG): Efficient data structure for fast prefix matching
- Levenshtein Edit Distance: Fuzzy matching with configurable distance thresholds
- LFU Cache: Least Frequently Used caching for improved performance
- Synonyms Support: Handle word synonyms and partial matches
- Unicode Support: Full Unicode character support
- Thread-Safe: Built with Rust's safety guarantees
- High Performance: Optimized for speed and memory efficiency
Add this to your Cargo.toml:
[dependencies]
fast-autocomplete-rs = "0.1.0"
use fast_autocomplete_rs::{AutoComplete, WordInfo};
use std::collections::HashMap;
fn main() {
    // Create a dictionary of words
    let mut words = HashMap::new();
    words.insert("book".to_string(), WordInfo::new());
    words.insert("burrito".to_string(), WordInfo::new());
    words.insert("pizza".to_string(), WordInfo::new());
    words.insert("pasta".to_string(), WordInfo::new());

    // Create autocomplete instance
    let autocomplete = AutoComplete::new(words, None, None, None, None, None);

    // Search for autocomplete suggestions
    let results = autocomplete.search("b", 3, 3);
    println!("Results: {:?}", results);
}
use fast_autocomplete_rs::{loader, AutoComplete};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load words from CSV file
    let words = loader::load_words_from_csv("path/to/words.csv")?;

    // Load synonyms from JSON file
    let synonyms = loader::load_synonyms_from_json("path/to/synonyms.json")?;

    // Create autocomplete instance
    let autocomplete = AutoComplete::new(
        words,
        Some(synonyms),
        None,
        None,
        None,
        None,
    );

    let results = autocomplete.search("query", 3, 5);
    println!("Results: {:?}", results);
    Ok(())
}
After installing, you can use the Python bindings:
from fast_autocomplete_rs import AutoComplete, WordInfo, create_simple_autocomplete
ac = AutoComplete()
results = ac.search("hello", max_cost=2, size=10)
print(results)
The main autocomplete struct.
pub fn new(
    words: HashMap<String, WordInfo>,
    synonyms: Option<HashMap<String, Vec<String>>>,
    full_stop_words: Option<Vec<String>>,
    valid_chars_for_string: Option<&str>,
    valid_chars_for_integer: Option<&str>,
    valid_chars_for_node_name: Option<&str>,
) -> Self
pub fn search(&self, word: &str, max_cost: usize, size: usize) -> Vec<Vec<String>>
- word: The search query
- max_cost: Maximum Levenshtein edit distance for fuzzy matching
- size: Maximum number of results to return
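To make max_cost concrete, the sketch below computes the Levenshtein distance with the standard two-row dynamic-programming table and keeps only the candidates within a threshold. It is an illustration, not the crate's internal code: levenshtein and within_cost are hypothetical names that are not part of the fast-autocomplete-rs API, and the linear scan over candidates is an assumption made for brevity.

```rust
/// Levenshtein edit distance between two strings, counted in Unicode scalar values.
/// Illustrative only; not the crate's internal implementation.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] is the distance between the first i-1 chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];

    for i in 1..=a.len() {
        curr[0] = i;
        for j in 1..=b.len() {
            let substitution = prev[j - 1] + usize::from(a[i - 1] != b[j - 1]);
            let deletion = prev[j] + 1;
            let insertion = curr[j - 1] + 1;
            curr[j] = substitution.min(deletion).min(insertion);
        }
        // The row just filled becomes the previous row for the next iteration.
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Hypothetical helper: keeps the candidates within `max_cost` edits of `query`.
fn within_cost<'a>(query: &str, candidates: &[&'a str], max_cost: usize) -> Vec<&'a str> {
    candidates
        .iter()
        .copied()
        .filter(|candidate| levenshtein(query, candidate) <= max_cost)
        .collect()
}
```

With these definitions, levenshtein("barrito", "burrito") is 1 (one substitution), so the misspelling stays within any max_cost of 1 or more.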
Represents metadata for a word in the dictionary.
pub struct WordInfo {
    pub context: HashMap<String, String>,
    pub count: usize,
    pub original_key: Option<String>,
}
- load_words_from_csv(path: &str): Load words from a CSV file
- load_synonyms_from_json(path: &str): Load synonyms from a JSON file
- load_stop_words_from_json(path: &str): Load stop words from a JSON file
- autocomplete_factory(...): Factory function to create AutoComplete from files
- create_simple_autocomplete(): Create a simple autocomplete with basic words
use fast_autocomplete_rs::{AutoComplete, WordInfo};
use std::collections::HashMap;
let mut words = HashMap::new();
words.insert("book".to_string(), WordInfo::new());
words.insert("burrito".to_string(), WordInfo::new());
words.insert("pizza".to_string(), WordInfo::new());
words.insert("pasta".to_string(), WordInfo::new());
let autocomplete = AutoComplete::new(words, None, None, None, None, None);
// Search for 'b'
let results = autocomplete.search("b", 3, 3);
// Returns: [["book"], ["burrito"]]
// Search for 'bu'
let results = autocomplete.search("bu", 3, 3);
// Returns: [["burrito"]]
// Search for 'barrito' (misspelling)
let results = autocomplete.search("barrito", 3, 3);
// Returns: [["burrito"]] (fuzzy match)
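The prefix lookups above can be pictured with a plain trie, a simpler relative of the Directed Word Graph. The sketch below is a stand-in under stated assumptions, not the crate's data structure: Trie, TrieNode, insert, complete, and collect are hypothetical names, and results come back in lexicographic order rather than ranked by count.

```rust
use std::collections::BTreeMap;

/// One node per character; `is_word` marks the end of an inserted word.
#[derive(Default)]
struct TrieNode {
    children: BTreeMap<char, TrieNode>,
    is_word: bool,
}

/// Hypothetical stand-in for a word graph; not the crate's DWG.
#[derive(Default)]
struct Trie {
    root: TrieNode,
}

impl Trie {
    fn insert(&mut self, word: &str) {
        let mut node = &mut self.root;
        for ch in word.chars() {
            node = node.children.entry(ch).or_default();
        }
        node.is_word = true;
    }

    /// Returns up to `size` words starting with `prefix`, in lexicographic order.
    fn complete(&self, prefix: &str, size: usize) -> Vec<String> {
        // Walk down to the node that represents the prefix.
        let mut node = &self.root;
        for ch in prefix.chars() {
            match node.children.get(&ch) {
                Some(next) => node = next,
                None => return Vec::new(),
            }
        }
        let mut results = Vec::new();
        let mut buffer = prefix.to_string();
        collect(node, &mut buffer, size, &mut results);
        results
    }
}

/// Depth-first walk that gathers complete words below `node`.
fn collect(node: &TrieNode, buffer: &mut String, size: usize, results: &mut Vec<String>) {
    if results.len() >= size {
        return;
    }
    if node.is_word {
        results.push(buffer.clone());
    }
    for (ch, child) in &node.children {
        buffer.push(*ch);
        collect(child, buffer, size, results);
        buffer.pop();
    }
}
```

After inserting the four example words, complete("b", 3) yields book and burrito, and complete("bu", 3) yields only burrito. Fuzzy matches such as barrito need an edit-distance step on top of this walk.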
The Rust implementation is designed to improve on the performance of the Python version in several ways:
- Memory Efficiency: Rust's zero-cost abstractions and memory safety
- Concurrent Access: Thread-safe design with Arc<Mutex<>> for shared state
- Optimized Algorithms: Efficient DWG traversal and fuzzy matching
- Caching: LFU cache for frequently accessed results
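As an illustration of the caching idea, here is a minimal LFU cache that maps a query string to its results and evicts the least frequently read entry when full. It is a sketch, assuming a simple hit counter per entry and a linear scan on eviction; LfuCache and its methods are hypothetical and do not reflect the crate's actual cache.

```rust
use std::collections::HashMap;

/// Minimal LFU cache: evicts the entry with the fewest hits when full.
/// Eviction scans every entry (O(n)), which is fine for a sketch only.
struct LfuCache {
    capacity: usize,
    // key -> (cached results, number of successful reads)
    entries: HashMap<String, (Vec<String>, u64)>,
}

impl LfuCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: HashMap::new() }
    }

    /// Returns the cached value and bumps its hit count.
    fn get(&mut self, key: &str) -> Option<&Vec<String>> {
        let entry = self.entries.get_mut(key)?;
        entry.1 += 1;
        Some(&entry.0)
    }

    /// Inserts or replaces a value, evicting the least frequently read key if needed.
    fn put(&mut self, key: &str, value: Vec<String>) {
        if self.capacity == 0 {
            return;
        }
        if let Some(entry) = self.entries.get_mut(key) {
            // Existing key: replace the value and keep its hit count.
            entry.0 = value;
            return;
        }
        if self.entries.len() >= self.capacity {
            let victim = self
                .entries
                .iter()
                .min_by_key(|(_, (_, hits))| *hits)
                .map(|(k, _)| k.clone());
            if let Some(k) = victim {
                self.entries.remove(&k);
            }
        }
        self.entries.insert(key.to_string(), (value, 0));
    }
}
```

A production cache would typically track frequency buckets so that eviction is constant time; the scan here keeps the example short.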
Feature | Python | Rust |
---|---|---|
DWG | ✔ | ✔ |
Fuzzy Matching | ✔ | ✔ |
LFU Cache | ✔ | ✔ |
Synonyms | ✔ | ✔ |
Unicode | ✔ | ✔ |
Thread-Safe | ✗ | ✔ |
Performance | Good | Excellent |
Python Bindings | — | ✔ |
Contributions are welcome! Please open issues or pull requests. See CONTRIBUTING.md if available.
This project is licensed under the MIT License - see the LICENSE file for details.
This project is inspired by the excellent fast-autocomplete Python library by Sep Dehpour.
- AutoComplete::new(words, synonyms, stop_words, valid_chars_for_string, valid_chars_for_integer, valid_chars_for_node_name)
- AutoComplete::search(word, max_cost, size)
- AutoComplete::get_word_context(word)
- AutoComplete::update_count_of_word(word, count, offset)
- AutoComplete::get_count_of_word(word)
- WordInfo: Metadata for words (context, count, original key)
- RsAutoComplete: Python class for autocomplete
  - .search(word, max_cost, size)
  - .get_word_context(word)
  - .update_count_of_word(word, count, offset)
  - .get_count_of_word(word)
- RsWordInfo: Python class for word metadata
- create_simple_autocomplete(): Python function for a simple instance
- load_words_from_csv(path), load_synonyms_from_json(path): Python functions for loading data
You can load words, synonyms, and stop words from files:
from fast_autocomplete_rs import load_words_from_csv, load_synonyms_from_json
words = load_words_from_csv('words.csv')
synonyms = load_synonyms_from_json('synonyms.json')
Or in Rust:
let words = loader::load_words_from_csv("words.csv").unwrap();
let synonyms = loader::load_synonyms_from_json("synonyms.json").unwrap();
- The library is thread-safe (uses Arc<Mutex<...>> for concurrency).
- Full Unicode support and normalization for all input and search terms.
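The Arc<Mutex<...>> pattern mentioned above can be sketched with the standard library alone. The example below shares a map of word counts between threads; it shows the general technique, not how the crate organizes its internal state, and count_in_parallel is a hypothetical function.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical example: every thread increments the count of each word once.
fn count_in_parallel(words: &[&str], threads: usize) -> HashMap<String, usize> {
    // Arc gives shared ownership across threads; Mutex serializes mutation.
    let counts: Arc<Mutex<HashMap<String, usize>>> = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counts = Arc::clone(&counts);
            // Each thread gets its own owned copy of the word list.
            let words: Vec<String> = words.iter().map(|w| w.to_string()).collect();
            thread::spawn(move || {
                for word in words {
                    // The lock is held only for the duration of one update.
                    let mut guard = counts.lock().unwrap();
                    *guard.entry(word).or_insert(0) += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Clone the map out while holding the lock, then release it.
    let result = counts.lock().unwrap().clone();
    result
}
```

Calling count_in_parallel(&["book", "burrito", "book"], 4) counts book eight times and burrito four times, because each of the four threads processes the full list.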