This is an exciting project! Congratulations
@Peter Tschipper!!
I had always assumed that Bloom filters would be really complicated and hurt my brain, so I didn't actually dig into them until I saw this thread on Reddit. I realize now that they are a very easy-to-understand and elegant data structure!
Just to make sure I understand, am I correct that this is how Node A gets the thin block and any missing transactions from Node B?:
- Node A encodes the transactions in its mempool using a Bloom filter.
- A Bloom filter is just an array that contains
m (initially-empty) slots.
- Node A figures out which slots in the array to set to "1" by passing each transaction in its mempool through a special hash function that "maps" each transaction to
k of the slots.
- For example, if the node's mempool contained Transactions
x,
y and
z, then these transactions might get mapped into the bloom filter as shown below (from Wikipedia):
- When Node A makes its "get_data()" request to download the thin block from Node B, it sends this Bloom filter at the same time.
[OK, let's imagine that the block actually contains Transactions x, y and w....]
- Node B hashes Transaction
x and confirms that all of the bits that it "maps to" have been set in the Bloom filter. It now knows for sure that Node A has Transaction
x.
- Node B hashes Transaction
y and again confirms that Node B has Transaction
y as well (all its bits were set).
- Node B hashes Transaction
w but this time finds that one of the bits that
w maps to is NOT set. This means that Node A must NOT know about Transaction
w.
- Node B thus sends the thin block (containing 64-bit hashes of all the TXs) to Node A along with Transaction
w in full.
- Node A receives the thin block and the missing transaction and can verify that it has the complete block information!
Questions:
1. [theory] Am I correct in assuming that Node B needs to cross-check each TX in the block against Node A's Bloom filter?
[I.e., there is no way to take the intersection of two Bloom filters or something faster than checking for elements one by one]
2. [implementation] How many bits long is the Bloom filter used for Xtreme thin blocks (i.e., what is
m)?
3. [implementation] What are the hash functions used?
4. [implementation] How many bits in the Bloom filter does each transaction map to (i.e., what is
k)?