Magnus Skjegstad

A stand-alone Java Bloom filter

Some years ago I needed a Bloom filter implementation in Java. At the time, most implementations where either written for a special purpose or relied on other libraries. I decided to write my own Bloomfilter-class which did not depend on anything outside the standard JDK.

This post gives some examples of how the implementation can be used. If you want to go straight to the source code, it is available here. You can also download BloomFilter.java directly if you don't need the whole project.

To create an empty Bloom filter, call the constructor with the required false positive probability and the number of elements you expect to add to the Bloom filter.

double p_fp = 0.1; // False positive probability
int n = 100; // Expected number of elements

BloomFilter<String> bf = new BloomFilter<String>(p_fp, n);

The constructor chooses a length and number of hash functions which will provide the given false positive probability (approximately). Note that if you insert more elements than the number of expected elements you specify, the actual false positive probability will rapidly increase.

There are several other constructors available which provide different levels of control of how the Bloom filter is initialized. You can also specify the Bloom filter parameters directly (bits per element, number of hash functions and number of elements).

After the Bloom filter has been created, new elements may be added using the add()-method.

bf.add("foo");

To check whether an element has been stored in the Bloom filter, use the contains()-method.

bf.contains("foo"); // returns true

Keep in mind that the accuracy of this method depends on the false positive probability. It will always return true for elements which have been added to the Bloom filter, but it may also return true for elements which have not been added. The accuracy can be estimated using the expectedFalsePositiveProbability()-method.

Put together, here is the full example.

double p_fp = 0.1; // False positive probability
int n = 100; // Expected number of elements

BloomFilter<String> bf = new BloomFilter<String>(p_fp, n);

bf.add("foo");

if (bf.contains("foo")) { // Always returns true
    System.out.println("Bloom filter contains foo!"); 
    System.out.println("Probability of a false positive: " + 
         bf.expectedFalsePositiveProbability());
}

if (bf.contains("bar")) { // Should return false, but could return true
    System.out.println("There was a false positive.");
}

The source code is currently hosted at GitHub.