(This work can be paid for by the BUIP062 (devpool) development, please contact me first to work out details)
The giga_perf branch in the github BitcoinUnlimited repo contains a class similar to a bloom filter but optimized for speed, located in fastfilter.cpp and fastfilter.h. To move this work into the dev and release branch the following needs to be done:
Please read https://en.wikipedia.org/wiki/Bloom_filter to understand the context of these requirements.
1. Add constructor where "k" and "m" (# of bits in the bloom filter array) are specified. "m" must be a power of 2.
2. Add constructor where the # of items to add to the bloom filter and the probability of a collision are specified. Calculate "k" and "m" from those as described in: https://en.wikipedia.org/wiki/Bloom_filter. Round "m" up to a power of 2.
3. Fix the "insert", "contains", and "checkAndSet" functions to handle a user-defined "k" value. As is done for k=1, efficiently pull each "hash" from a different location in the input data (nothing fancy is needed, it could be just the next few bytes) rather than running a hash function to generate it.
4. Add an option in the constructor that enables the following functionality: XOR each "hash" with a value randomly chosen at object construction time. This stops an attacker from being able to deliberately create collisions.
5. The current fastfilter uses bitwise operations like & rather than %, since "m" is a power of 2. Don't lose this efficiency!
6. Make the needed changes to CRollingFastFilter to support different sized filters (should be easy, its a 25 line class).
7. add a c++ boost test-framework-compatible unit test for CFastFilter and CRollingFastFilter. See src/test for tons of examples of unit tests.
The giga_perf branch in the github BitcoinUnlimited repo contains a class similar to a bloom filter but optimized for speed, located in fastfilter.cpp and fastfilter.h. To move this work into the dev and release branch the following needs to be done:
Please read https://en.wikipedia.org/wiki/Bloom_filter to understand the context of these requirements.
1. Add constructor where "k" and "m" (# of bits in the bloom filter array) are specified. "m" must be a power of 2.
2. Add constructor where the # of items to add to the bloom filter and the probability of a collision are specified. Calculate "k" and "m" from those as described in: https://en.wikipedia.org/wiki/Bloom_filter. Round "m" up to a power of 2.
3. Fix the "insert", "contains", and "checkAndSet" functions to handle a user-defined "k" value. As is done for k=1, efficiently pull each "hash" from a different location in the input data (nothing fancy is needed, it could be just the next few bytes) rather than running a hash function to generate it.
4. Add an option in the constructor that enables the following functionality: XOR each "hash" with a value randomly chosen at object construction time. This stops an attacker from being able to deliberately create collisions.
5. The current fastfilter uses bitwise operations like & rather than %, since "m" is a power of 2. Don't lose this efficiency!
6. Make the needed changes to CRollingFastFilter to support different sized filters (should be easy, its a 25 line class).
7. add a c++ boost test-framework-compatible unit test for CFastFilter and CRollingFastFilter. See src/test for tons of examples of unit tests.