Xthin is Robust to Hash Collisions

What are your thoughts on Xthin?

  • Xthin is nearly perfect.

    Votes: 9 24.3%
  • Xthin would significantly improve network propagation but further improvements are possible.

    Votes: 28 75.7%
  • Xthin would neither help nor hinder the network.

    Votes: 0 0.0%
  • Xthin requires important changes before it would even be safe for use on the network (as per Gmax).

    Votes: 0 0.0%
  • Xthin is fundamentally broken.

    Votes: 0 0.0%

  • Total voters
    37

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
Xthin is Robust to Hash Collisions

We have been thrilled by the popularity of Bitcoin Unlimited's article series on Xtreme Thin Blocks (Part 1, Part 2, Part 3, Part 4). To date, the articles have received over 14,700 unique views and over 5,500 reads. A byproduct of this excitement was confusion in the social media about the way the Xthin protocol deals with collisions of the short-hashes it uses to create its thin blocks. The purpose of this post is to explain that the protocol has always dealt with hash collisions in a safe way and that the purported "attack" described by Gregory Maxwell is only a minor nuisance that would neither hurt Bitcoin Unlimited nodes nor give any meaningful advantage to the perpetrator.

TL/DR: Optimistic Mining completely breaks Maxwell's 'attack.' But even without that, the other arguments in this post leave it busted and bruised.

On behalf of the Bitcoin Unlimited Research & Development Team, I invite Mr. Maxwell to make his "Xthin Collision Tool" public [1,2,3]. Both theory and empirical testing show that the Xthin protocol is robust to such hash collisions.

Hash Collisions and the Birthday Paradox

Currently, Xthin builds thin blocks using the first 8 bytes (64-bits) of each transaction ID. In the future, we intend to dynamically reduce this by using feedback to target a certain collision rate. But let's perform our analysis assuming 64-bit hashes.



The probability that any two transactions share the same first 64 bits is 1 / 2^64, or 1 part in about 18,000,000,000,000,000,000. To get a feel for the size of this number, it is approximately equal to the total number of grains of sand on earth, according to this estimate from the University of Hawaii.

Despite this enormous improbability, it is actually feasible (although difficult) to find two bitcoin transactions that share the same first 64-bits using only a consumer-grade computer, custom software, and some patience. What makes it feasible is related to something known as the Birthday Paradox. An example of this paradox is illustrated in the figure below: in Mathematica I generated 1,000 (10^3) numbers between 1 and 1,000,000 (10^6) at random and found that two of those numbers were the same. What are the odds of that!? Well if you repeat this experiment, you'll find a "collision" like this about 40% of the time. The fact that the odds are greater than what you might have expected is the reason it's referred to as a paradox.



The first part of the purported attack is to put your computer to work finding pairs of colliding double-spent transactions. To find a pair, you don't have to generate 2^64 transactions, but only about 10 billion (2^32/0.4) instead, as per the birthday paradox. 10 billion transactions consumes ~3 TB of disk space; producing 10 billion signed transactions and sorting through them to find a colliding pair--although not fast--is not impossibly-slow either. Indeed, finding a pair of transactions with the same first 64 bits is feasible.

So what is the purported attack?

The attack described by Maxwell imagines that a miner would generate a bunch of double spent colliding transaction pairs and continually propagate them across the network. Other miners would include up to one transaction from each pair in their blocks (they are valid, after all) and then--when their blocks are propagated outwards across the network--Bitcoin Unlimited nodes with the other half of the double-spent pair in mempool would encounter two transactions with the same short hash and wouldn't immediately know which one to pick.

But that's not easy

Even with the colliding transactions in hand, it is still difficult to pull off because you can't easily control which miners/nodes have which halves of the pairs. The attacker doesn't have a "God's-eye view" of the network.

But let's assume the attacker does have a God's-eye view and can successfully pull of this attack. It still doesn't hurt a Bitcoin Unlimited node. The node simply requests a "thicker" thin block built from longer hashes. Our testing described in Part 2 of our Xthin article series showed that an extra round trip adds another 0.6 sec to the mean propagation time for a Bloom filter false positive. An extra round trip due to a hash collision would add a similar amount of time; but let's call it 1 second to be conservative. Therefore, if a BU node encountered a collision, it would take perhaps 1.7 seconds rather than 0.7 seconds to download the block from its peer.

This is not a problem for non-mining nodes. As pointed out in the Cornell position paper, what matters is that non-mining nodes are able to consistently receive blocks within the 600 second block time--an extra second on a few hops does not meaningfully threaten a node's ability to keep up with the blockchain.

[continued below]
 
Last edited:

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
[continued from above]

Does the miner gain an advantage?

Not really. Xthin is intended for block relay across the P2P network. Miners presently rely mostly on the centralized Corallo Relay Network, and in they future we hope they use a decentralized method built into the client such as Bitcoin Unlimited's eXpedited service (which is a service that results in faster block propagation times than Xthin but at the expense of increased bandwidth consumption). Adding a second to the per-hop propagation time between certain pairs of non-mining nodes does not significantly affect the time it takes blocks to propagate to miners--thus a miner has little motivation to conduct this "attack."

But what if the effect is not zero?

Instead of assuming that slowing block propagation to nodes has zero effect on block propagation to miners, let's assume it has some effect (perhaps miners begin to use Xthin between each other prior to the release of eXpedited): let's assume that adding 1 second to the per-hop block propagation times between non-mining nodes also adds 1 second to the block relay time to miners (although I suspect it would be significantly less than that).

How much advantage does the miner gain by slowing down his competitors' blocks by 1 second? To answer that, we use the Andresen orphaning factor (also explored here):

1 - e^-(1 sec / 600 sec) = 0.167%.
The attacking miner would have a 0.167% advantage. A miner controlling 10% of the hash rate earns 10%*6*24*25 = 360 BTC per day. At a price of $500 / BTC, that's $180,000 per day in revenue. By successfully pulling off this attack, the expectation value of the miner's revenue would increase by 0.167%*180,000 = $300 per day. Including variance (luck), instead of averaging $180,000 +/- $45,000 per day, he'll average $180,300 +/- $45,000 per day instead. If the cost of the attack [1] is less than $300 per day (unlikely IMO), it might be worthwhile to attempt (although it still wouldn't hurt the network).

Iterative Prisoner's Dilemma Games

Maxwell's collision attack against Bitcoin Unlimited is unlikely to be profitable even if it succeeds; however, other types of attacks (the most well-known being the Selfish Mining Attack carefully described by Eyal and Sirer in their influential paper) actually could be profitable. Yet for some reason, even these attacks do not seem to occur.

There is a general class of attacks against Bitcoin that appear to be valid if (a) a single miner is considered in isolation, and (b) a narrow definition of rational behaviour is assumed. To first order, this class of attack can be modelled as an iterative prisoner's dilemma. What these attacks have in common is that it is in the group's best interested for every miner to cooperate, yet each miner can increase his expected revenue regardless of what the other miners do by defecting. This is illustrated in the decision table below.



If the miner defects and the other miners do not, then he is ahead by 1 - C where C is the cost of the attack (assuming C is less than 1). Similarly, if he defects and the other miners also defect, he is still ahead by 1 - C. Therefore, a rational miner will always defect if the attack cost is less than the gain, and the Nash Equilibrium is that all miners defect. The result is that all miners are worse off than if none had defected, while Bitcoin chugs along fine.

So why doesn't this happen? A large body of research (e.g., see here) has shown that humans act super-rationally instead of rationally in these types of iterative prisoner's dilemma games. That is, each member of the group learns that by cooperating they are better off than by attacking. Members that do attack on a given "round" of the game are punished by the others on subsequent rounds. Cooperative players tend to beat "nasty players." Bitcoin mining adds another data point here: to date there has not been a documented case of Selfish Mining.

Optimistic Mining Kills the 'Attack' Completely

If the arguments above haven't convinced you that the attack is silly and pointless, this argument (courtesy of Tom Zander) will: Any miner can already gain a greater advantage by simply mining off the quickly-propagated block headers. Conversely, any miner can overcome the 'attack' by optimistic mining off the block headers.

Incremental Progress

In the unlikely case that this "attack" is seen in the wild (and in the even unlikelier case that it causes even a small problem), it is straightforward to eliminate it completely by salting the transactions hashes. However, this is a tradeoff as it comes at the cost of increased code complexity and CPU bandwidth.

Let's make incremental progress and deal with obstacles as they arise.

[1] The cost of the attack includes the cost to find the collisions, the transactions fees consumed, and the change to the net present value of the miner's goodwill and its future revenue streams.
 
Last edited:

cypherdoc

Well-Known Member
Aug 26, 2015
5,257
12,994
Crickets
 

solex

Moderator
Staff member
Aug 22, 2015
1,558
4,693
@Peter R
Well done on a thorough and considered treatment of the hash-collision attack.
 
  • Like
Reactions: bluemoon

theZerg

Moderator
Staff member
Aug 28, 2015
1,012
2,327
Is that your blog post?
 

AdrianX

Well-Known Member
Aug 28, 2015
2,097
5,797
bitco.in
So what is the purported attack?

The attack described by Maxwell imagines that a miner would generate a bunch of double spent colliding transaction pairs and continually propagate them across the network. Other miners would include up to one transaction from each pair in their blocks (they are valid, after all) and then--when their blocks are propagated outwards across the network--Bitcoin Unlimited nodes with the other half of the double-spent pair in mempool would encounter two transactions with the same short hash and wouldn't immediately know which one to pick.
After reading Maxmells comment on the blog post referenced below I just wanted to further my understanding.
Am I correct in my understanding that almost all blocks today propagate in under 30 seconds as an general approximation?

If so am I correct in assuming even if Maxwell's attack couldn't be circumvented, a miner still need to mine and propagate a block to the majority of the network in order to make a double spend in the proceeding block to invalidate it?

What's the risk amusing miners are using Xthin and BU and the statistical research on block propagation times is accurate, this attack would be valid only for blocks that were mined withing 1.8 seconds of each other provided the attacker was constantly transmitting a bunch of double spent colliding transaction pairs across the network.

am I understanding this correctly?
 
Last edited:

Peter Tschipper

Active Member
Jan 8, 2016
254
357
@AdrianX Yes you are right...to come at it from another angle. Consider that the first time we put out the idea, we were going to just use the full Tx hash, a normal thinblock anyway. Had we done that, I think most people would have thought that was pretty good and certainly much better than what we have now. So under the attack scenario, that's all we do, we request a normal thinkblock. The only thing is we lose a little bit of time for the extra round trip. But we could still do, what, 10 or 20 such round trips over the GFC before a normal block could be downloaded!
 
  • Like
Reactions: AdrianX