Gold collapsing. Bitcoin UP.

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
@Mengerian @MarkBLundeberg

I think we're getting somewhere.

BLOCK VALIDATION

Let N be the number of transactions in a block and M be the number of workers. Let's think about block validation as three different tasks:
  1. Checking that the transactions in the block are valid and that their inputs are not already spent
  2. Checking that the transactions are ordered correctly
  3. Verifying the Merkle root in the block header
Task 1 is highly parallelizable regardless of the order. Please let me know if you disagree, because this seems really obvious to me. The complexity class is O(N) and can be easily split up over M workers. We could validate millions of transactions per second today just by throwing more hardware (in parallel) at the task.

Task 2 is obviously parallelizable for LTOR, with only boundary conditions communicated between the workers. The complexity class is O(N) and can be split up over M workers.

Task 2 for TTOR is still O(N), I believe, but it is not obvious to me how to parallelize it. I suspect it can be, though (I think @jtoomim said something about this).

Task 3 is parallelizable for both TTOR and LTOR. The complexity class is O(N log N).
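
As a concrete illustration of why Task 2 is trivially parallel under LTOR, here is a toy sketch of my own (not code from any client; the coinbase, which ABC's CTOR exempts from the sort, is ignored here). Each worker independently checks that its slice of the txid list is sorted, and the only cross-worker information needed is the comparison at each slice boundary:

[CODE]
from concurrent.futures import ProcessPoolExecutor

def slice_is_sorted(txids):
    """True if one worker's slice of txids is in strict lexical order."""
    return all(txids[i] < txids[i + 1] for i in range(len(txids) - 1))

def block_is_ltor_ordered(txids, workers=4):
    """Task 2 under LTOR: each worker checks its own slice in parallel,
    then only the slice boundaries are compared afterwards."""
    n = len(txids)
    step = max(1, -(-n // workers))               # ceil(n / workers)
    slices = [txids[i:i + step] for i in range(0, n, step)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        if not all(pool.map(slice_is_sorted, slices)):
            return False
    # Boundary condition: last txid of each slice < first txid of the next.
    return all(a[-1] < b[0] for a, b in zip(slices, slices[1:]))
[/CODE]

(ProcessPoolExecutor needs the usual if __name__ == "__main__": guard when run as a script; a real node would of course do this with worker threads in C++, but the shape of the problem is the same.)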

BLOCK CREATION

For block creation, we need to

A. Continually accept transactions into mempool
B. Maintain the required order
C. Update the Merkle root for each new block candidate

Task A has complexity O(1) (per insertion)

Task B has complexity O(log N) per insertion for LTOR, and I would argue O(1) for TTOR. It could be made O(1) for TTOR because transactions have to enter your mempool in topological order anyway, so you can probably get the required ordering nearly for free.

Task C has complexity O(N) per insertion for LTOR and O(log N) for TTOR. It's O(log N) for TTOR because you just tack new leaves onto the end of the Merkle tree; you don't need to rebuild the entire tree. We discussed this yesterday (@shadders, @imaginary_username).
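
To make the Task C claim concrete, here is a toy sketch of my own (not how any client actually builds templates) of the append-only case: by keeping only the right-edge "frontier" of the Merkle tree, appending a transaction at the end touches O(log N) nodes, whereas splicing a leaf into the middle of the list, as LTOR can require, invalidates most of the tree.

[CODE]
import hashlib

def h(left, right):
    """One internal Merkle node: double SHA256 of the concatenated children."""
    return hashlib.sha256(hashlib.sha256(left + right).digest()).digest()

class AppendOnlyMerkle:
    """Keeps only the right-edge 'frontier' of the tree, so appending a new
    leaf at the end (the TTOR/AOR case) costs at most O(log N) hashes."""

    def __init__(self):
        self.frontier = []   # frontier[i] = completed subtree of height i, or None

    def append(self, leaf_hash):
        node, height = leaf_hash, 0
        # Merge with completed subtrees of the same height, carrying upward.
        while height < len(self.frontier) and self.frontier[height] is not None:
            node = h(self.frontier[height], node)
            self.frontier[height] = None
            height += 1
        if height == len(self.frontier):
            self.frontier.append(None)
        self.frontier[height] = node

    def root(self):
        """Fold the pending subtrees right-to-left into a single root.
        (Simplified: this does not reproduce Bitcoin's exact duplicate-last-node
        rule for odd counts, which doesn't change the O(log N) point.)"""
        node = None
        for subtree in self.frontier:
            if subtree is not None:
                node = subtree if node is None else h(subtree, node)
        return node
[/CODE]

With 2.5 million leaves that's about 21 hashes per append, versus re-hashing on the order of the entire tree when a leaf lands in the middle.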

----------

TTOR wins on Tasks B and C, and possibly loses on Task 2. It ties with LTOR on the rest.

But LTOR + Merklix would tie with TTOR on Tasks B and C and possibly win on Task 2. I also think LTOR + Merklix would be simpler to implement and faster by a constant factor than TTOR.

I still think LTOR is not worth it, but I think LTOR + Merklix might be worth it. Does Merklix actually work like @deadalnix thinks? That sounds very interesting.
 

Mengerian

Moderator
Staff member
Aug 29, 2015
536
2,597
@Peter R OK Great, glad we are getting somewhere!

I'll have to read through all your categorizations more carefully, but it looks largely in line with my understanding.
Does Merklix actually work like @deadalnix thinks? That sounds very interesting.
Regarding Merklix, yes, I think it should work as advertised.

I'll explain it based on my understanding... hopefully someone can jump in and correct me if I'm wrong about some detail.

Basically, the "Merklix tree" is almost the same as a Merkle tree. It's might actually be closer to how one imagines the Merkle tree when you first hear about it.

Let's start with what's the same:
  • It's a binary tree. Each node has two sub-nodes
  • The nodes are hashes of the concatenation of the two sub-node hashes
So this part is exactly the same as a Merkle tree. What this means is that Merkle proofs leading to transactions would be exactly the same as they are currently. In other words, SPV wallets should continue to function with a Merklix tree the same as they do with the current Merkle tree, without needing any modification.

Now what's different:
  • The transactions are arranged in the tree according to the first digits of their hashes. This is a rough description, but basically you can imagine it like this: at the first bifurcation at the top level of the tree, everything to the left leads to transactions beginning with "0", and everything to the right leads to transactions beginning with "1". If you go to the left branch, at the next bifurcation everything to the left begins with "00", and everything to the right begins with "01". And so on down the tree. (This description is slightly inaccurate, because if there aren't any transactions beginning with a particular digit, that path won't exist. So, for example, if there are no transactions beginning with "1", then the top-level bifurcation would represent the split of paths to "00" and "01".)
  • Because the tree is built based on which transaction hashes are in it, the path lengths can vary. This is different from the current Merkle tree, built from the bottom up, where all path lengths are the same. You can also see that the maximum possible path length is determined by how many transactions share matching SHA256 prefix digits. (This makes it possible to figure out the longest path lengths that could be generated in an attack based on grinding tx hashes. It would be something less than the number of matching prefix digits achieved in Bitcoin mining, which is around 70. So attack-created Merklix paths would be less than ~70 deep.)
  • As another by-product of the layout, every Merklix path ends in a transaction. This is different from the current Merkle tree, where the right side of the tree can have many "dangling" branches if there isn't exactly a power-of-2 number of transactions in the block. This also means that the average Merklix path will be shorter than the average Merkle path, for a given number of transactions in a block.
Anyway, I hope that helps give an intuitive understanding of how the Merklix tree is constructed. Now that I have written it down, it's more words than I imagined. So let me know if some part is unclear, and I'll try to clarify as best I can!

The net effect of all this is that the cost of any transaction insertion anywhere in the tree is just updating one Merkle path. And the work of updating the paths is sub-dividable, since you can separate the tree into sections based on the transaction hash prefixes and then easily combine the sections.
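
For anyone who prefers code to prose, here is a toy sketch of my reading of the idea (my own illustration, not @deadalnix's implementation; in particular it keeps single-child internal nodes along shared prefixes instead of collapsing empty paths, and it omits the per-node hash caching that is what makes an insertion cost only one path of re-hashing):

[CODE]
import hashlib

EMPTY = b"\x00" * 32        # placeholder hash for a missing child (my assumption)

def dsha(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def bit(txid, i):
    """i-th bit of the txid, most significant bit first."""
    return (txid[i // 8] >> (7 - i % 8)) & 1

class Node:
    def __init__(self, txid=None):
        self.left = None     # subtree for bit == 0
        self.right = None    # subtree for bit == 1
        self.txid = txid     # set only on leaves

def insert(node, txid, depth=0):
    """Insert a txid; only nodes along one root-to-leaf path are touched."""
    if node is None:
        return Node(txid=txid)
    if node.txid is not None:                     # hit a leaf: split it
        if node.txid == txid:
            return node
        split = Node()
        split = insert(split, node.txid, depth)   # re-seat the old leaf
        return insert(split, txid, depth)
    if bit(txid, depth) == 0:
        node.left = insert(node.left, txid, depth + 1)
    else:
        node.right = insert(node.right, txid, depth + 1)
    return node

def root_hash(node):
    """Leaves hash to their txid; internal nodes to the double SHA256 of
    their two children, exactly as an ordinary Merkle node would."""
    if node is None:
        return EMPTY
    if node.txid is not None:
        return node.txid
    return dsha(root_hash(node.left) + root_hash(node.right))

# Example: build a tree from three fake txids and print its root.
txids = [dsha(bytes([i])) for i in range(3)]
root = None
for t in txids:
    root = insert(root, t)
print(root_hash(root).hex())
[/CODE]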
 
MarkBLundeberg

Feb 27, 2018
30
94
An aside on CTOR+Merklix trees ... the current CTOR for ABC always places the coinbase first, regardless of its txid. Only the non-coinbase transactions get lexically sorted.

Does this mean that the left branch under the merkle root will be just the coinbase, and the right branch will be all other transactions (as merklix)?

If so, it sounds to me like this would be a very happy situation for covert asicboost, since the extranonce would be as close as possible to the root.
 

Mengerian

Moderator
Staff member
Aug 29, 2015
536
2,597
@MarkBLundeberg You mean how it would be with Merklix?

Yeah, if Merklix were to be implemented, I expect the coinbase would be in a special branch to the left, to preserve compatibility with any other software that expects that. It would also probably make sense to include some other information in that left-branch (left-then-right path), such as block-size, and transaction count. These are both currently communicated in the network protocol, so the Merklix-metadata would be buildable by receiving nodes without any change to the network protocol (https://en.bitcoin.it/wiki/Block).

The Merklix tree with all the non-coinbase transactions would be under the right branch.

Caveat: These are all just speculative ideas based on what I personally imagine would make sense.
 

cypherdoc

Well-Known Member
Aug 26, 2015
5,257
12,995
The net effect of all this is that the cost of any transaction insertion anywhere in the tree is just updating one Merkle path. And the work of updating the paths is sub-dividable, since you can separate the tree into sections based on the transaction hash prefixes and then easily combine the sections.
IOW, this is to facilitate parallel processing? Can you draw out an example of how a Merklix tree would be subdivided between 4 CPU cores? I was under the impression that this subdivision would be into relatively equal parts, and I'm having trouble visualizing how such a tree could be divided into 4 parts, as an example.
As predicted, mining continues to decentralize across the world:

https://news.bitcoin.com/russia-developing-own-mining-pools/
 
  • Like
Reactions: AdrianX
MarkBLundeberg

Feb 27, 2018
30
94
@MarkBLundeberg
Yeah, if Merklix were to be implemented, I expect the coinbase would be in a special branch to the left, to preserve compatibility with any other software that expects that. It would also probably make sense to include some other information in that left-branch (left-then-right path), such as block-size, and transaction count. These are both currently communicated in the network protocol, so the Merklix-metadata would be buildable by receiving nodes without any change to the network protocol (https://en.bitcoin.it/wiki/Block).

The Merklix tree with all the non-coinbase transactions would be under the right branch.

Caveat: These are all just speculative ideas based on what I personally imagine would make sense.
It's good to hear that such a new structure could retain compatibility with SPV and light wallet clients.

I hope a lot of thought goes into this hypothetical change, because adding a new system for calculating the merkle root will mean permanent technical debt for full nodes and anyone else reading the full blockchain. They will always need to support building old-style merkle trees (for checking old blocks) but also must always support building merklix trees (for new blocks)... hopefully there will not be a third format for a very long time.

(In contrast, with a mere change in ordering rules like removal of TTOR, there is no permanent debt. After a few months we can effectively declare 'the blockchain was AOR all along' and there will be no need to keep the code that checks the old order.)
 

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
@Mengerian

Thanks for the info about the Merklix trees. That all makes sense.

What is not obvious to me, however, is that insertions and deletions are guaranteed to be O(log N). Deadalnix says they are in his article, and gives an example, but I guess I'm just not convinced that this property always holds. Another thing that is not obvious to me is that exactly one representation of the Merklix tree exists for a given set of transactions. I'd like to see proofs of both.

What is nice about TTOR (or AOR) is that insertions in the Merkle tree are O(log N) when transactions are only appended. This property is lost with LTOR (at least until a change from Merkle tree -> Merklix tree is made).
 
  • Like
Reactions: sickpig
MarkBLundeberg

Feb 27, 2018
30
94
@Peter R

Do you think the gradual building approach is a necessity?

Consider the following approach where you just build blocks from scratch every time:
  1. Miner receives a block from network, validates it, and then decides to build on that block.
  2. The miner immediately makes an empty block candidate just to give his ASIC farm something useful to chew on.
  3. The miner starts preparing a filled block candidate from scratch, taking N log N time. He then sends it to the ASIC farm.
  4. As new transactions roll in, the miner periodically repeats #3 -- making new bigger block candidates from scratch.
I'm not sure what downsides this has compared to the gradual building approach, aside from the gradual approach being smoother (transactions can always be included even if received just one second before a block is found).

(For pools: since ABC's CTOR always puts the coinbase transaction first, it only takes log(N) time to calculate a new merkle path for each change in the coinbase's extranonce.)
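
On that last point, a sketch of my own (using the ordinary Merkle tree, the same trick Stratum-style pools rely on) of why an extranonce roll is cheap when the coinbase is pinned at position 0: compute the coinbase's merkle branch once per template, then each extranonce change only folds the new coinbase hash back up that branch.

[CODE]
import hashlib

def dsha(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def coinbase_merkle_branch(leaves):
    """Once per block template: the sibling hashes on the path from leaf 0
    (the coinbase) to the Merkle root, duplicating the last node of
    odd-length levels as Bitcoin does."""
    branch, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        branch.append(level[1])        # sibling of the leftmost node
        level = [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return branch

def root_for_coinbase(coinbase_hash, branch):
    """Per extranonce change: ~log2(N) double-SHA256 calls, no tree rebuild."""
    node = coinbase_hash
    for sibling in branch:
        node = dsha(node + sibling)
    return node
[/CODE]

For 2.5 million transactions that's a branch of 22 hashes per extranonce change instead of rebuilding millions of nodes.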
 
  • Like
Reactions: adamstgbit

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
@MarkBLundeberg

Yes, with LTOR it becomes more costly to add new transactions to the block template, so this will be done less frequently compared to TTOR, all else held constant.

Is the difference significant? Does it matter? I don't know.

But I think it's important to understand how LTOR leads to different properties than TTOR (or AOR). The fact that it's more costly to rebuild the Merkle tree as new transactions are added to a block candidate with LTOR is one of these differences.
 
  • Like
Reactions: MarkBLundeberg
MarkBLundeberg

Feb 27, 2018
30
94
I had another thought regarding the interaction of: graphene, ordering, and concurrency.

Let's say we stayed with TTOR and allowed miners to opt in to a TTOR-respecting canonical ordering (like Gavin's order), so that for these special blocks the ordering info could be communicated in one byte. OK, great, the ordering bandwidth problem is solved. But it's not clear to me that these weird orderings will scale, due to the awkward interaction of two sort priorities. Gavin's sorting algorithm, for example, seems to only be possible in single-threaded mode, and from what I can see it has worst-case performance of O(N^2) (consider a block that contains a single long chain of dependent transactions).
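
To make that worst case concrete, here is a toy approximation of my own of a TTOR-respecting canonical sort (I'm not claiming this is exactly Gavin's algorithm): repeatedly emit the lexically smallest transaction whose in-block parents have already been emitted. With a block that is one long chain of dependent transactions, only one transaction becomes ready per pass, so this naive form does O(N^2) work and doesn't obviously split across threads.

[CODE]
def canonical_ttor_sort(txids, parents):
    """txids: the in-block txids; parents[t]: the set of t's in-block parents.
    Emits the lexically smallest 'ready' tx each round.  A block that is a
    single dependency chain forces N passes over the remaining set: O(N^2)."""
    remaining, emitted, order = set(txids), set(), []
    while remaining:
        ready = [t for t in remaining if parents[t] <= emitted]
        nxt = min(ready)          # lexical tie-break; blocks are acyclic,
        order.append(nxt)         # so 'ready' is never empty
        emitted.add(nxt)
        remaining.remove(nxt)
    return order
[/CODE]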
 

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
Yes, there seem to be a lot of unanswered questions! On the topic of Graphene, a lot of people take it for granted that Graphene will be the block propagation technique of the future. But I'm not even convinced of that. For example, Xthin takes more bandwidth, but the computations needed to reconstruct the full block are simple and fast. I can imagine a bandwidth vs. block-reconstruction trade-off existing. How costly is the IBLT set reconciliation calculation? Can this calculation be easily parallelized?
 

Mengerian

Moderator
Staff member
Aug 29, 2015
536
2,597
Thanks for the info about the Merklix trees. That all makes sense.
No Problem :)
What is not obvious to me, however, is that insertions and deletions are guaranteed to be O(log N). Deadalnix says they are in his article, and gives an example, but I guess I'm just not convinced that this property always holds. Another thing that is not obvious to me is that exactly one representation of the Merklix tree exists for a given set of transactions. I'd like to see proofs of both.
It shouldn't be too difficult to provide proofs of those things. As a start, maybe I will try to write down a "specification" to describe the Merklix tree precisely. I think it's fairly simple conceptually, but describing it precisely might also help me understand it better.
What is nice about TTOR (or AOR) is that insertions in the Merkle tree are O(log N) when transactions are only appended. This property is lost with LTOR (at least until a change from Merkle tree -> Merklix tree is made).
Yeah, one of the nice things about Merklix is that it has the same property, but for inserting or removing a transaction anywhere in the tree. The other interesting property is that when you provide an inclusion proof (i.e., a Merkle path leading to a transaction), that inclusion proof also includes the necessary information to remove the transaction from the Merklix tree. And if you provide an exclusion proof (proof that a transaction is absent from the Merklix tree), that proof contains the information needed to add it to the tree.
 

AdrianX

Well-Known Member
Aug 28, 2015
2,097
5,797
bitco.in
Overall I generally agree with how you described things, but there are a few points where I see it a bit differently. These could maybe explain why we have come to different conclusions on LTOR.
Whatever is happening, why can't people build prototypes and test them in a variety of ways?

Why do we need to commit to an irreversible path now before we have data to analyze?
 

Norway

Well-Known Member
Sep 29, 2015
2,424
6,410
@solex
Do you have a tentative date for the next BU vote?

I want to create a debate around BUIP 101 in time before the vote and see the arguments from all sides play out.
 

jtoomim

Active Member
Jan 2, 2016
130
253
Task C has complexity O(N) per insertion for LTOR and O(log N) for TTOR. It's O(log N) for TTOR because you just tack new leaves onto the end of the Merkle tree; you don't need to rebuild the entire tree.
If you insert a transaction at position 1 (after the coinbase), that will change every node in the tree. The amount of hashing needed to do this is equal to 1 + 0.5 + 0.25 + 0.125 + ... = 2 times the summed length of the txids, multiplied by two because we do two rounds of SHA256. For a 1 GB block with 2.5 million transactions, that's 320 MB of hashing. SHA256 can do around 300 MB/s on a single core, so this will take about 1 second without any parallelization. This is a highly parallelizable problem, so we can easily get that down to 0.1 seconds for a 1 GB block if we want to. Mining hardware generally prefers to switch jobs every 10-60 seconds, so 0.1-1.0 seconds out of a 15 second typical job is insignificant.

This discussion of adding transactions to an existing block template is mostly irrelevant. This is not part of the latency-critical code path and will not affect orphan rates at all. It's also not at all how the current code works, and I'm not aware of anybody having seriously proposed implementing it. A full template reconstruction needs to be fast enough to handle the latency-critical path after a new block is published on the network. If it's fast enough for that situation, it will also be fast enough that you can re-run it a few dozen times during a 10 minute block interval.
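
For reference, the arithmetic above written out (same numbers as in the post):

[CODE]
txs          = 2_500_000       # transactions in a ~1 GB block
hash_bytes   = 32              # bytes per txid / internal node
tree_factor  = 2               # 1 + 1/2 + 1/4 + ... levels above the leaves
sha_rounds   = 2               # double SHA256
bytes_hashed = txs * hash_bytes * tree_factor * sha_rounds   # 320,000,000 bytes
seconds      = bytes_hashed / 300e6                          # ~1.07 s at 300 MB/s per core
[/CODE]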
 

adamstgbit

Well-Known Member
Mar 13, 2016
1,206
2,650
Whatever is happening, why can't people build prototypes and test them in a variety of ways?

Why do we need to commit to an irreversible path now before we have data to analyze?
If Bitcoin ABC had someone like CSW going around saying things like "i dont give a shit! we did it, it works, its the best, i'm the best, CTOR ftw, don't like it??? piss off!!!"
everyone would feel a lot better about CTOR... seriously, that kind of thing inspires confidence!

But if you're not going to BS a bit to put everyone at ease, then you'll need A LOT of hard, concrete data.
And I bet there has been a bunch of testing and the results are good, they just haven't done a good job advertising that data. Probably they ran the tests, said "fuck ya it works wonderfully", and threw away the results and kept coding!

Anyway... there is still some time to provide a good analysis and convince everyone that CTOR is the way to go.
 

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
If you insert a transaction at position 1 (after the coinbase), that will change every node in the tree. The amount of hashing needed to do this is equal to 1 + 0.5 + 0.25 + 0.125 + ... = 2 times the summed length of the txids, multiplied by two because we do two rounds of SHA256. For a 1 GB block with 2.5 million transactions, that's 320 MB of hashing. SHA256 can do around 300 MB/s on a single core, so this will take about 1 second without any parallelization. This is a highly parallelizable problem, so we can easily get that down to 0.1 seconds for a 1 GB block if we want to. Mining hardware generally prefers to switch jobs every 10-60 seconds, so 0.1-1.0 seconds out of a 15 second typical job is insignificant.

This discussion of adding transactions to an existing block template is mostly irrelevant. This is not part of the latency-critical code path and will not affect orphan rates at all. It's also not at all how the current code works, and I'm not aware of anybody having seriously proposed implementing it. A full template reconstruction needs to be fast enough to handle the latency-critical path after a new block is published on the network. If it's fast enough for that situation, it will also be fast enough that you can re-run it a few dozen times during a 10 minute block interval.
I was talking about adding transactions to the end of the block, in which case you do not need to rebuild the entire Merkle tree. However, if you insert the transaction in a random location in the block (like you would with LTOR), then you do need to rebuild most of the tree.

That said, I agree with you that this will probably never matter. I was worried for a while that it might matter at high throughput levels, but I'm not worried anymore. There is no need to create a new block template every M new transactions (as you pointed out). Instead, a new block template can be created every T seconds, and then the O(N^2) scaling term I was worried about never occurs.