masroore/pybloomer

pybloomer

pybloomer is a Python 3-compatible fork of pybloomfiltermmap by @axiak.

The goal of pybloomer is simple: to provide a fast, simple, scalable, correct library for Bloom filters in Python.

Why pybloomer?

This module implements a Bloom filter in Cython (ANSI C) that's fast and uses memory-mapped files for better scalability.

There are a few reasons to use this module:

  • It natively uses memory-mapped (mmap) files.
  • It is fast (see benchmarks).
  • It natively supports the set-like operations you would expect from a Bloom filter.
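
To make the memory-mapped idea concrete, here is a minimal pure-Python sketch of a Bloom filter whose bit array lives in an mmap-backed file. It is not pybloomer's implementation (which is written in Cython); the class name `TinyMmapBloom`, the default sizes, and the SHA-256 double-hashing scheme are all assumptions chosen for illustration.

```python
import hashlib
import mmap
import os
import tempfile


class TinyMmapBloom:
    """Toy Bloom filter (hypothetical, for illustration) backed by an mmap'd file."""

    def __init__(self, path, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        num_bytes = (num_bits + 7) // 8
        # Create and zero-fill the backing file unless one of the right size exists.
        is_new = not os.path.exists(path) or os.path.getsize(path) != num_bytes
        self._file = open(path, "w+b" if is_new else "r+b")
        if is_new:
            self._file.truncate(num_bytes)
        # Map the file into memory; writes to the map are backed by the file.
        self._bits = mmap.mmap(self._file.fileno(), num_bytes)

    def _positions(self, item):
        # Double hashing: derive all bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )

    def close(self):
        self._bits.flush()
        self._bits.close()
        self._file.close()


# Usage: add items, close the filter, then reopen the same file.
path = os.path.join(tempfile.mkdtemp(), "fruits.bloom")
bloom = TinyMmapBloom(path)
for fruit in ("apple", "pear", "orange"):
    bloom.add(fruit)
bloom.close()

reopened = TinyMmapBloom(path)
found_after_reopen = "orange" in reopened
reopened.close()
```

Because the bit array is a file mapping rather than a Python object, the filter's contents survive closing and reopening, and the operating system decides which pages are held in RAM.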

Installation

To install pybloomer, use the Python 3 version of pip:

    $ pip install pybloomer

Quickstart

Here’s a quick example:

    >>> import pybloomer
    >>> fruits = pybloomer.BloomFilter(capacity=10000000, error_rate=0.01, filename='/tmp/fruits.bloom')
    >>> fruits.update(('apple', 'pear', 'orange', 'apple'))
    >>> len(fruits)
    3
    >>> 'mike' in fruits
    False
    >>> 'orange' in fruits
    True
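
For a rough sense of what `capacity` and `error_rate` imply, the textbook Bloom filter formulas give the bit-array size and number of hash functions for a target false-positive rate. The sketch below applies those standard formulas; the helper name `bloom_parameters` is hypothetical, and pybloomer's internal sizing may differ.

```python
import math


def bloom_parameters(capacity, error_rate):
    """Return (num_bits, num_hashes) from the textbook Bloom filter formulas."""
    # m = -n * ln(p) / (ln 2)^2
    num_bits = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to the nearest whole hash function
    num_hashes = max(1, round(num_bits / capacity * math.log(2)))
    return num_bits, num_hashes


# Same parameters as the quickstart example above.
num_bits, num_hashes = bloom_parameters(10_000_000, 0.01)
print(f"{num_bits / 8 / 2**20:.1f} MiB, {num_hashes} hash functions")
```

Under these formulas, ten million items at a 1% error rate need about 9.6 bits per item, or roughly 11.4 MiB in total.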

To create an in-memory filter, simply omit the file location:

    >>> cake_ingredients = pybloomer.BloomFilter(capacity=1000, error_rate=0.1)

Caveat: in-memory filters cannot be persisted to disk.

Documentation

Current docs are available at pybloomer.rtfd.io.

Contributions and development

Suggestions, bug reports, and patches are welcome!

When contributing, you should set up an appropriate Python 3 environment and install the dependencies listed in requirements-dev.txt.

Package installation depends on a generated pybloomer.c file, which requires the Cython module to be installed in your current environment.

License

pybloomer is released under the MIT License. See the LICENSE file for details.
