AstroSayan/fast-autocomplete-rs

Fast Autocomplete Rust

A high-performance autocomplete library implemented in Rust, inspired by the Python fast-autocomplete library.

Features

  • Directed Word Graph (DWG): Efficient data structure for fast prefix matching
  • Levenshtein Edit Distance: Fuzzy matching with configurable distance thresholds
  • LFU Cache: Least Frequently Used caching for improved performance
  • Synonyms Support: Handle word synonyms and partial matches
  • Unicode Support: Full Unicode character support
  • Thread-Safe: Shared state is wrapped in Arc<Mutex<...>> so an instance can be used across threads
  • High Performance: Optimized for speed and memory efficiency
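
To make the prefix-matching idea behind the DWG feature concrete, here is a minimal sketch of a plain character trie built with the standard library only. It is not the crate's implementation: `Node`, `insert`, and `with_prefix` are hypothetical names chosen for illustration, and a real directed word graph may share more structure than this tree does.

```rust
use std::collections::BTreeMap;

/// One node of a simplified prefix tree (hypothetical type, not from the crate).
#[derive(Default)]
struct Node {
    children: BTreeMap<char, Node>,
    is_word: bool,
}

impl Node {
    /// Insert a word one Unicode scalar value at a time.
    fn insert(&mut self, word: &str) {
        let mut node = self;
        for ch in word.chars() {
            node = node.children.entry(ch).or_default();
        }
        node.is_word = true;
    }

    /// Return every stored word that starts with `prefix`, in sorted order.
    fn with_prefix(&self, prefix: &str) -> Vec<String> {
        let mut node = self;
        for ch in prefix.chars() {
            match node.children.get(&ch) {
                Some(next) => node = next,
                None => return Vec::new(),
            }
        }
        let mut out = Vec::new();
        let mut buf = prefix.to_string();
        node.collect(&mut buf, &mut out);
        out
    }

    /// Depth-first walk that records each complete word below this node.
    fn collect(&self, buf: &mut String, out: &mut Vec<String>) {
        if self.is_word {
            out.push(buf.clone());
        }
        for (ch, child) in &self.children {
            buf.push(*ch);
            child.collect(buf, out);
            buf.pop();
        }
    }
}

fn main() {
    let mut root = Node::default();
    for word in ["book", "burrito", "pizza", "pasta"] {
        root.insert(word);
    }
    println!("{:?}", root.with_prefix("b")); // ["book", "burrito"]
}
```

Because the walk follows `char` values rather than bytes, the same sketch handles non-ASCII words without extra work.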

Installation

Add this to your Cargo.toml:

[dependencies]
fast-autocomplete-rs = "0.1.0"

Quick Start

Basic Usage

use fast_autocomplete_rs::{AutoComplete, WordInfo};
use std::collections::HashMap;

fn main() {
    // Create a dictionary of words
    let mut words = HashMap::new();
    words.insert("book".to_string(), WordInfo::new());
    words.insert("burrito".to_string(), WordInfo::new());
    words.insert("pizza".to_string(), WordInfo::new());
    words.insert("pasta".to_string(), WordInfo::new());

    // Create autocomplete instance
    let autocomplete = AutoComplete::new(words, None, None, None, None, None);
    
    // Search for autocomplete suggestions
    let results = autocomplete.search("b", 3, 3);
    println!("Results: {:?}", results);
}

Loading from CSV Files

use fast_autocomplete_rs::{loader, AutoComplete};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load words from CSV file
    let words = loader::load_words_from_csv("path/to/words.csv")?;
    
    // Load synonyms from JSON file
    let synonyms = loader::load_synonyms_from_json("path/to/synonyms.json")?;
    
    // Create autocomplete instance
    let autocomplete = AutoComplete::new(
        words,
        Some(synonyms),
        None,
        None,
        None,
        None,
    );
    
    let results = autocomplete.search("query", 3, 5);
    println!("Results: {:?}", results);
    
    Ok(())
}

Python Usage

After installing, you can use the Python bindings:

from fast_autocomplete_rs import create_simple_autocomplete

# Build an instance preloaded with a basic word list
ac = create_simple_autocomplete()
results = ac.search("hello", max_cost=2, size=10)
print(results)

API Reference

AutoComplete

The main autocomplete struct.

Constructor

pub fn new(
    words: HashMap<String, WordInfo>,
    synonyms: Option<HashMap<String, Vec<String>>>,
    full_stop_words: Option<Vec<String>>,
    valid_chars_for_string: Option<&str>,
    valid_chars_for_integer: Option<&str>,
    valid_chars_for_node_name: Option<&str>,
) -> Self

Search Method

pub fn search(&self, word: &str, max_cost: usize, size: usize) -> Vec<Vec<String>>
  • word: The search query
  • max_cost: Maximum Levenshtein edit distance for fuzzy matching
  • size: Maximum number of results to return
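
To show what `max_cost` bounds, the sketch below computes the Levenshtein edit distance between two strings with the standard two-row dynamic programme. The function names `levenshtein` and `within_cost` are hypothetical, and how the crate applies the threshold internally is an assumption here, not something this README specifies.

```rust
/// Levenshtein distance over Unicode scalar values, using two rolling rows.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the processed prefix of `a` and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Hypothetical helper: accept a candidate only if it is within `max_cost` edits.
fn within_cost(query: &str, candidate: &str, max_cost: usize) -> bool {
    levenshtein(query, candidate) <= max_cost
}

fn main() {
    println!("{}", levenshtein("barrito", "burrito")); // 1
    println!("{}", within_cost("barrito", "burrito", 3)); // true
}
```

Under this reading, the misspelling "barrito" is one substitution away from "burrito", so any `max_cost` of 1 or more would admit it.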

WordInfo

Represents metadata for a word in the dictionary.

pub struct WordInfo {
    pub context: HashMap<String, String>,
    pub count: usize,
    pub original_key: Option<String>,
}
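
As an illustration of how the `count` field could influence results, the sketch below ranks words by descending count. The `WordInfo` here is a local stand-in that mirrors the fields listed above, not the crate's type, and `rank_by_count` is a hypothetical helper. Whether the crate orders results this way is an assumption.

```rust
use std::collections::HashMap;

/// Local stand-in mirroring the documented fields (not the crate's type).
#[allow(dead_code)] // `context` and `original_key` are unused in this sketch
#[derive(Debug, Clone, Default)]
struct WordInfo {
    context: HashMap<String, String>,
    count: usize,
    original_key: Option<String>,
}

/// Order words by descending count, breaking ties alphabetically.
fn rank_by_count(words: &HashMap<String, WordInfo>) -> Vec<String> {
    let mut ranked: Vec<(&String, usize)> =
        words.iter().map(|(word, info)| (word, info.count)).collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    ranked.into_iter().map(|(word, _)| word.clone()).collect()
}

fn main() {
    let mut words = HashMap::new();
    for (word, count) in [("book", 5), ("burrito", 9), ("pizza", 5)] {
        words.insert(word.to_string(), WordInfo { count, ..WordInfo::default() });
    }
    println!("{:?}", rank_by_count(&words)); // ["burrito", "book", "pizza"]
}
```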

Loader Functions

  • load_words_from_csv(path: &str): Load words from a CSV file
  • load_synonyms_from_json(path: &str): Load synonyms from a JSON file
  • load_stop_words_from_json(path: &str): Load stop words from a JSON file
  • autocomplete_factory(...): Factory function to create AutoComplete from files
  • create_simple_autocomplete(): Create a simple autocomplete with basic words

Examples

Example 1: Basic Autocomplete

use fast_autocomplete_rs::{AutoComplete, WordInfo};
use std::collections::HashMap;

let mut words = HashMap::new();
words.insert("book".to_string(), WordInfo::new());
words.insert("burrito".to_string(), WordInfo::new());
words.insert("pizza".to_string(), WordInfo::new());
words.insert("pasta".to_string(), WordInfo::new());

let autocomplete = AutoComplete::new(words, None, None, None, None, None);

// Search for 'b'
let results = autocomplete.search("b", 3, 3);
// Returns: [["book"], ["burrito"]]

// Search for 'bu'
let results = autocomplete.search("bu", 3, 3);
// Returns: [["burrito"]]

// Search for 'barrito' (misspelling)
let results = autocomplete.search("barrito", 3, 3);
// Returns: [["burrito"]] (fuzzy match)

Performance

The Rust implementation is designed to be faster than the Python version, although no benchmark figures are published here. It relies on:

  • Memory Efficiency: No garbage collector, with Rust's zero-cost abstractions and compile-time memory safety
  • Concurrent Access: Thread-safe design with Arc<Mutex<>> for shared state
  • Optimized Algorithms: Efficient DWG traversal and fuzzy matching
  • Caching: LFU cache for frequently accessed results
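
The LFU policy mentioned above can be sketched as follows: each entry carries a hit counter, and when the cache is full the entry with the fewest hits is evicted. This is a deliberately simple version with a linear-time eviction scan. `LfuCache`, `get`, and `put` are hypothetical names, and the crate's own cache may be structured differently.

```rust
use std::collections::HashMap;

/// Minimal LFU cache keyed by query string (hypothetical type, not from the crate).
struct LfuCache {
    capacity: usize,
    // Each value is stored together with its hit counter.
    entries: HashMap<String, (Vec<String>, u64)>,
}

impl LfuCache {
    fn new(capacity: usize) -> Self {
        LfuCache { capacity, entries: HashMap::new() }
    }

    /// Look up a key and bump its hit counter on success.
    fn get(&mut self, key: &str) -> Option<&Vec<String>> {
        let entry = self.entries.get_mut(key)?;
        entry.1 += 1;
        Some(&entry.0)
    }

    /// Insert or overwrite a key, evicting the least frequently used entry if full.
    fn put(&mut self, key: &str, value: Vec<String>) {
        if self.capacity == 0 {
            return;
        }
        if !self.entries.contains_key(key) && self.entries.len() >= self.capacity {
            let victim = self
                .entries
                .iter()
                .min_by_key(|(_, entry)| entry.1)
                .map(|(k, _)| k.clone());
            if let Some(k) = victim {
                self.entries.remove(&k);
            }
        }
        // Preserve the hit count when overwriting an existing key.
        let hits = self.entries.get(key).map_or(0, |entry| entry.1);
        self.entries.insert(key.to_string(), (value, hits));
    }
}

fn main() {
    let mut cache = LfuCache::new(2);
    cache.put("b", vec!["book".to_string()]);
    cache.put("p", vec!["pizza".to_string()]);
    let _ = cache.get("b"); // "b" now has one hit, "p" has none
    cache.put("bu", vec!["burrito".to_string()]); // evicts "p"
    println!("{}", cache.get("p").is_none()); // true
}
```

A production cache would usually track frequencies in buckets to avoid scanning every entry on eviction; the scan is kept here for brevity.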

Comparison with Python Version

Feature         | Python | Rust
DWG             |        |
Fuzzy Matching  |        |
LFU Cache       |        |
Synonyms        |        |
Unicode         |        |
Thread-Safe     |        |
Performance     | Good   | Excellent
Python Bindings |        |

Contributing

Contributions are welcome. Please open an issue or a pull request, and follow CONTRIBUTING.md if the repository includes one.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

This project is inspired by the excellent fast-autocomplete Python library by Sep Dehpour.

API Summary

Rust API

  • AutoComplete::new(words, synonyms, full_stop_words, valid_chars_for_string, valid_chars_for_integer, valid_chars_for_node_name)
  • AutoComplete::search(word, max_cost, size)
  • AutoComplete::get_word_context(word)
  • AutoComplete::update_count_of_word(word, count, offset)
  • AutoComplete::get_count_of_word(word)
  • WordInfo: Metadata for words (context, count, original key)

Python API

  • RsAutoComplete: Python class for autocomplete
    • .search(word, max_cost, size)
    • .get_word_context(word)
    • .update_count_of_word(word, count, offset)
    • .get_count_of_word(word)
  • RsWordInfo: Python class for word metadata
  • create_simple_autocomplete(): Python function for a simple instance
  • load_words_from_csv(path), load_synonyms_from_json(path): Python functions for loading data

Loader Utilities

You can load words, synonyms, and stop words from files:

from fast_autocomplete_rs import load_words_from_csv, load_synonyms_from_json
words = load_words_from_csv('words.csv')
synonyms = load_synonyms_from_json('synonyms.json')

Or in Rust:

use fast_autocomplete_rs::loader;

let words = loader::load_words_from_csv("words.csv").unwrap();
let synonyms = loader::load_synonyms_from_json("synonyms.json").unwrap();

Thread Safety & Unicode

  • The library is thread-safe (uses Arc<Mutex<...>> for concurrency).
  • Full Unicode support and normalization for all input and search terms.
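
The `Arc<Mutex<...>>` pattern named above can be sketched with the standard library alone. The example shares a word-count map between worker threads, loosely analogous to concurrent count updates. `count_in_parallel` is a hypothetical function, and the crate's internal state layout is not shown in this README.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Increment the count of `word` from several threads and return the final total.
fn count_in_parallel(word: &str, workers: usize, bumps: usize) -> usize {
    let counts: Arc<Mutex<HashMap<String, usize>>> = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread gets its own handle to the same underlying map.
            let counts = Arc::clone(&counts);
            let word = word.to_string();
            thread::spawn(move || {
                for _ in 0..bumps {
                    // The lock is held only for the duration of one update.
                    let mut guard = counts.lock().unwrap();
                    *guard.entry(word.clone()).or_insert(0) += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = counts.lock().unwrap().get(word).copied().unwrap_or(0);
    total
}

fn main() {
    println!("{}", count_in_parallel("pizza", 4, 100)); // 400
}
```

A single mutex serializes all access, which is simple and safe but can become a bottleneck under heavy read traffic; `RwLock` is a common alternative when reads dominate.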
