Skip to content

A Bloom filter is a probabilistic data structure used to test whether an element is a member of a set. It is highly space-efficient and allows for quick insertions and membership tests. However, it has a small probability of false positives but no false negatives

Notifications You must be signed in to change notification settings

souravkayal/BloomFilter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

BloomFilter



Bollm filter is space efficient probablistic data structure which is invented in 1970 by Burton Howard Bloom. It comes with risk of false positive result but not false negative. I.e It can indicate that the value is present where it is not actually present but it never otherwise.

A Few real systems which use bloom filter.

In distributed system :

Distributed system like Cassandra uses bloom filter to avoid checking data on SSTable for a partition being requested. The detail can be read here - https://cassandra.apache.org/doc/latest/cassandra/operating/bloom_filters.html

Hbase uses bloom filter to check whether a line or data information is present inside a file or not - https://hbase.apache.org/2.2/devapidocs/org/apache/hadoop/hbase/util/BloomFilter.html

In Cache:

Redis use bloom filter and it's capability to implemented probabilistic data structure called RedisBloom. - https://redis.io/docs/stack/bloom/

In Spell check:

A full compiled bloom filter can be used to check existence of word.

Implementation using c# .NET

Adding value to bloom filter:

Add() method is implemented to insert value in bloom filter. The algorithm will generate hash of input value by using SHA1 hashing algorithm and set specific bit in filter.

Time Complexciy : O(n)
Space Complexcity O(1)

void Add(string item)

Check existence of value in filter:

Need to call Contains (string item) method to check existence of value against with filter. It will return true if there is chance to presence otherwise false.

bool Contains(string item)

Time Complexciy : O(n)
Space Complexcity O(1)

Add and check presence of value in bloom filter

    public static void Main(String[]args)
    {

        // Create a Bloom filter with an expected item count of 1000 and false positive rate of 0.01
        BloomFilter bloomFilter = new BloomFilter(1000, 0.01);

        // Add an item to the filter
        bloomFilter.Add("item1"); 
        bloomFilter.Add("item2");

        // Check if an item exists in the filter
        bool item1Exists = bloomFilter.Contains("item1"); 
        bool item3Exists = bloomFilter.Contains("item3");
        
        Console.WriteLine("Item 1 exists: " + item1Exists);
        Console.WriteLine("Item 3 exists: " + item3Exists);

        Console.ReadLine();
    }


Inject hashing algoritam while creating object of bloom filter

  BloomFilter bloomFilterWithSHA = new BloomFilter(1000, 0.01, new HSHA1());

About

A Bloom filter is a probabilistic data structure used to test whether an element is a member of a set. It is highly space-efficient and allows for quick insertions and membership tests. However, it has a small probability of false positives but no false negatives

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages