C++ Bloom Filter Library

 www.partow.net  .: Home :.   .: Links :.   .: Search :.   .: Contact :. 


Description

The C++ Bloom Filter Library is a simple to use, easy to integrate and efficient implementation of the probabilistic Bloom filter data structure. The bloom_filter class provides the following set of capabilities:

  • Optimal parameter selection based on expected false positive rate.
  • Union, intersection and difference operations between bloom filters.
  • Compression of in-use table (increase of false positive probability vs space)
  • Portable and efficient source code implementation.
  • Single header implementation, no building required. No external dependencies

Bloom Filter Library License

Free use of the C++ Bloom Filter Library is permitted under the guidelines and in accordance with the MIT License.

Compatibility

The C++ Bloom Filter Library implementation is fully compatible with the following C++ compilers:

  • GNU Compiler Collection (3.5+)
  • Clang/LLVM (1.1+)
  • Microsoft Visual Studio C++ Compiler (7.1+)
  • Intel® C++ Compiler (8.x+)
  • AMD Optimizing C++ Compiler (1.2+)
  • Nvidia C++ Compiler (19.x+)
  • PGI C++ (10.x+)
  • IBM XL C/C++ (9.x+)
  • C++ Builder (XE4+)

Download


Simple Bloom Filter Example

The following examples will demonstrate the following aspects of the library:

  • Instantiate and configure a Bloom filter
  • Add some strings and integers to the Bloom filter
  • Query the Bloom filter for membership of the previously added strings and integers
  • Query the Bloom filter for membership of string and integers that were NOT previously added to the Bloom filter (potential false positives)
#include <iostream>
#include <string>
#include <vector>

#include "bloom_filter.hpp"

int main()
{

   bloom_parameters parameters;

   // How many elements roughly do we expect to insert?
   parameters.projected_element_count = 1000;

   // Maximum tolerable false positive probability? (0,1)
   parameters.false_positive_probability = 0.0001; // 1 in 10000

   // Simple randomizer (optional)
   parameters.random_seed = 0xA5A5A5A5;

   if (!parameters)
   {
      std::cout << "Error - Invalid set of bloom filter parameters!" << std::endl;
      return 1;
   }

   parameters.compute_optimal_parameters();

   //Instantiate Bloom Filter
   bloom_filter filter(parameters);

   const std::vector<std::string> str_list =
   {
      "KNWz", "nVoC", "rPT1", "HxSZ", "sVHa", "cQZH", "F4ch", "8iif", "GSUN", "kYYN",
      "dwew", "kc7u", "aeIL", "V9WP", "s7Ac", "uXar", "o0y8", "VWvW", "HRJ0", "pKRG",
      "OMsl", "bDEq", "kRhP", "L5UM", "TRXh", "EFQ2", "vn7J", "QWpO", "oyOZ", "7NkU",
      "BaeV", "WMn7", "7bo9", "Oj1x", "Tq5d", "Og1v", "5l4N", "BPa7", "pkMo", "MsnR",
      "Ix2p", "Sr79", "wKCj", "Hfls", "r9wN", "UjUC", "wrap", "0Ryg", "fbyC", "g3HO"
   };

   // Insert into Bloom Filter
   {
      // Insert some strings
      for (const auto& str : str_list)
      {
         filter.insert(str);
      }

      // Insert some numbers
      for (std::size_t i = 0; i < 100; ++i)
      {
         filter.insert(i);
      }
   }

   // Query Bloom Filter
   {
      // Query the existence of already inserted strings
      for (const auto& str : str_list)
      {
         if (filter.contains(str))
         {
            std::cout << "BF contains: " << str << std::endl;
         }
         else
         {
            std::cout << "BF erroneously does not contain: " << str << std::endl;
         }
      }

      // Query the existence of already inserted numbers
      for (std::size_t i = 0; i < 100; ++i)
      {
         if (filter.contains(i))
         {
            std::cout << "BF contains: " << i << std::endl;
         }
         else
         {
            std::cout << "BF erroneously does not contain: " << i << std::endl;
         }
      }

      // Query the existence of invalid strings
      for (const auto& str : str_list)
      {
         if (filter.contains(str + "xyz"))
         {
            std::cout << "BF falsely contains: " << str + "xyz" << std::endl;
         }
      }

      // Query the existence of invalid numbers
      for (int i = -1; i > -100; --i)
      {
         if (filter.contains(i))
         {
            std::cout << "BF falsely contains: " << i << std::endl;
         }
      }
   }

   return 0;
}




© Arash Partow. All Rights Reserved.