Memento Filter is the first range filter to support dynamic datasets, constant time operations, and a theoretically optimal false positive rate under any of workload, all at the same time.
The master branch in the repository contains an evaluate.sh script. This
script downloads the required datasets, compiles Memento filter, sets the
benchmarks up, runs them, and plots the results. To ease the reproducibility
process, this branch also provides a Dockerfile that spins up a Ubuntu Docker
container with the requirements installed and runs evaluate.sh to gather the
results. One may choose to run the experiments in Docker, or to run the
experiments natively. We provide instructions on how to do both. We advise
using a Linux machine with at least 32GB of RAM.
Simply run the following commands:
git clone https://github.com/n3slami/Memento_Filter.git
cd Memento_Filter
docker build -t memento_eval .
mkdir -p ../paper_results/figures/ && docker run -v ../paper_results/figures:/usr/local/paper_results/figures -it memento_evalThese commands first clone the repository and builds the required Docker image.
Then, it creates a directory next to the cloned repository where the
generated figures will be placed and spins up a Docker container to run the
experiments. One the container is done running, the generated figures will be
placed in the paper_results/figures directory. The figures will be numbered
to match the paper.
Important: By default, this Docker container takes multiple days to finish
running. To run the experiments on a smaller scale that completes in ~10
minutes, comment out the final line in the Dockerfile and repeat the build
and run process above.
To use the evaluate.sh script, ensure that the following dependencies are
satisfied:
- CMake >=3.30.8 (Used to compile the baselines)
- Boost >=1.67.0 (Used in some of the baselines, notably Grafite)
- libssl-dev >=1:3.2.4 (Used in WiredTiger)
- python3-dev >=3.13.5 (Used in WiredTiger)
- python3-env >=3.13.5 (Used to plot the experimental results only)
- Python 3 >=3.7 (Used to plot the experimental results only)
- Latex full (Used to plot the experimental results only) Then, running
git clone https://github.com/n3slami/Memento_Filter.git
cd Memento_Filter
bash evaluate.sh -s largewill reproduce the results in the paper, creating a paper_results directory
next to the cloned repository with a figures subdirectory containing the
generated figures. The figures will be numbered to match the paper. The script
may take several days to run due to the large scale of the experiments. To
conduct a smaller-scale evaluation, the evaluate.sh script accepts the -s
position argument with the following options:
tiny # (The default. Takes ~10 minutes to run. Works with datasets of size 200K.)
small # (Takes ~1 hour to run. Works with datasets of size 2M.)
medium # (Takes ~1 day to run. Works with datasets of size 20M.)
large # (Takes multiple days to run. Works with datasets of size 200M.)For example, executing bash evaluate.sh -s tiny runs the experimental suite
with datasets that are 1000 times smaller than the original. This option is the
default so that one can ensure the suite is working properly without having to
wait a long amount of time for the execution to finish.
The evaluate.sh also allows one to only generate a subset of the figures
using the -f option. Running bash evaluate.sh -h provides more information
on how to use this script.
The reproducibility.md file explains the details of
how the evaluate.sh script works internally, as well as how one can manually
emulate the process.
This is a header-only library and does not need to be installed. Just clone the repository with
git clone https://github.com/n3slami/Memento_Filter.git
cd Memento_Filterand switch to the header-only branch. You can use Memento filter by including
include/memento.cpp in your project.
The examples/simple.cpp file shows how to index and query a vector of random
integers with Memento filter.
This repository has the following branches:
- The
masterbranch hosts the dynamic only implementation of Memento filter, and does not contain the test suite for the expandability and B-Tree experiments. - The
expandablebranch hosts the expandable implementation of Memento filter, as well as the test suite for the expandability and B-Tree experiments. - The
header-onlybranch hosts a C++ header file with a templated implementation of Memento filter in theincludedirectory. The template allows one to choose between a standard dynamic Memento filter or an expandable one.
After cloning the repository and all its submodules with
git clone --recurse-submodules -j8 https://github.com/n3slami/Memento_Filter.git
cd Memento_Filterbuild the project with CMAKE
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j8The benchmarks will be placed in build/bench/. Use the evaluate.sh script
as described above to reproduce the results in the paper, and see
reproducibility.md for details on its inner
workings.
This project is licensed under the GPLv3 License - see the LICENSE file for details.