Gold collapsing. Bitcoin UP.

Peter Tschipper

Active Member
Jan 8, 2016
254
357
There are a lot of good arguments for and against, but my main concern is that we really don't know, other than theoretically, what the performance hit will be. I haven't seen any real test results, in particular with very large blocks. And not only block processing but also block generation (CreateNewBlock()) and mempool behavior. If we're going to lose performance in one area, it would be good to know what the gains are in others (if any), in order to make a sound assessment.
 

Richy_T

Well-Known Member
Dec 27, 2015
1,085
2,741
I disagree. We should actually leave the rules exactly as they are right now. Think about it: basically, Bitcoin acts as this global distributed notary that just irregularly stamps anything that comes in. The transactions coming in are in natural order. If not, that can only be because of a double spend, or because the endpoints have withheld, for whatever reason, a bunch of dependent transactions and are too lazy to bring them into their natural order again.
Note that in my suggestion, the default would likely be the natural order, as Tom suggests, simply because that's what arises from doing things the mempool way. So "in the meantime" nothing has changed. All that needs to happen is that the node needs to be able to cope with different orderings of transactions, which is a little extra code, but if everyone continues passing around naturally ordered blocks, the strategy for processing those blocks is no more computationally expensive.

So what I am saying is that if we are going to make a change, it should be to remove the requirement for any particular ordering rather than changing from one to the other. Then we can have multiple options in the marketplace and miners will eventually move to the one that proves most advantageous to them rather than one handed down by devs. Control by dev is the Core way. Allowing choice and the market to act is the BU way.

Since Tom is in the natural ordering camp (which I am also softening to FWIW), perhaps he can answer these questions:
  1. How much extra work would it be to add the ability to process non-naturally ordered blocks to your code?
  2. Would that code significantly affect processing time for a naturally ordered block?
I think we can make the path forward easier without throwing the baby out with the bath water. Ironically, this may even make retaining natural order easier as with ABC's current steamroller effect, natural ordering could lose the dev argument but win in the market.
 
Last edited:

awemany

Well-Known Member
Aug 19, 2015
1,387
5,054
@Peter Tschipper : I think it should then be a requirement for those who want to make the change to provide some data. And I really have to concur with @Tom Zander here: that is seriously lacking. And what you say is my view as well: we're changing something for which we don't yet have a clear picture of where it will go. So we should rather stick with what we have. Plus I think there are decent arguments that it makes things worse.


@Richy_T :

All that needs to happen is that the node needs to be able to cope with different ordering of transactions which is a little extra code but if everyone continues passing around natural order blocks, the strategy for processing those blocks is no more computationally expensive.
I see. My answer here would be: "Let's not add code that has little to no benefit". It would only make things more complex and break assumptions, and fewer assumptions might also mean fewer optimizations are possible down the road.

So what I am saying is that if we are going to make a change, it should be to remove the requirement for any particular ordering rather than changing from one to the other.
I see, understood.

---

I took the freedom to also make this a reddit post to get some wider feedback:

https://old.reddit.com/r/btc/comments/96p95k/a_critique_of_canonical_transaction_ordering/
 

Richy_T

Well-Known Member
Dec 27, 2015
1,085
2,741
I see. My answer here would be: "Let's not add code that has little to no benefit". It would only make things more complex and break assumptions, and fewer assumptions might also mean fewer optimizations are possible down the road.
I'm fine with that, but it seems that things are pushing ahead. I just think some parties are hyper-focused on their answer rather than widening the scope and looking for the "correct" answer, and thus we will end up with "mistakes were made".
 

Tom Zander

Active Member
Jun 2, 2016
208
455
WRT ordering, I think it's unfair to say that dependency order is a natural order that comes for free because that assumes a particular evaluation algorithm. In particular a sequential 0 to N processing order is required to get ordering validation for "free".

But it's not free when 3 to 16 other cores are sitting idle.
The natural ordering is the ordering we get from the mempool, i.e. what the miner puts in a block. We can call it natural because no sorting needs to be applied: the mempool is already required to establish the relations between zero-conf transactions that spend each other.
So if B spends A, then the natural order is to put A first. And a miner creating a block doesn't spend any CPU sorting the output of the mempool; the mempool already has this info.

The part you talk about is validation. And validation requires you to process tx A before you process Tx B.

Now, we know that transactions in a block mostly do not spend each other. Let's say that 99% of the transactions don't spend outputs from the same block; they only spend outputs from prior blocks. What is pretty trivial to do is to find those 99% of transactions and process them in parallel, making those 16 or 32 cores all work hard.
The rest of the transactions just need to be done on the same core because they need to be done serially.

This is implemented and works pretty nicely.
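A minimal sketch of that split, in simplified C++ (Tx and its fields are illustrative placeholders, not the types of any real node): one forward pass sends transactions that spend nothing from the same block to the parallel workers and keeps the rest in a small serial batch.

```cpp
// One forward pass over a naturally ordered block: transactions with no
// in-block dependency go to the parallel set, the rest to the serial set.
// Tx is a simplified placeholder type.
#include <string>
#include <unordered_set>
#include <vector>

struct Tx {
    std::string txid;                 // hash of this transaction
    std::vector<std::string> spends;  // txids referenced by its inputs
};

struct ValidationPlan {
    std::vector<const Tx*> parallel;  // no in-block dependencies: any core, any order
    std::vector<const Tx*> serial;    // spends an in-block output: keep block order
};

ValidationPlan splitNaturallyOrderedBlock(const std::vector<Tx>& block) {
    ValidationPlan plan;
    std::unordered_set<std::string> seenInBlock;  // txids encountered so far
    for (const Tx& tx : block) {
        bool dependsOnBlock = false;
        for (const std::string& prev : tx.spends) {
            if (seenInBlock.count(prev)) { dependsOnBlock = true; break; }
        }
        (dependsOnBlock ? plan.serial : plan.parallel).push_back(&tx);
        seenInBlock.insert(tx.txid);  // natural order: parents always come first
    }
    return plan;
}
```

Because a naturally ordered block guarantees that parents precede children, the single pass and the already-seen set are all that is needed to find the ~99% that can go wide.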
---
@theZerg, @Tom Zander : Speaking about more efficient algorithms. I don't know whether you talked about it, Tom, but it seems to me that an approach that does "partial order first, then secondarily sorts by TXID" would allow, as far as I can see, for a quite beneficial validation algorithm:
This is basically what I suggested to the ABC people:
https://github.com/bitcoincashorg/bitcoincash.org/pull/94#discussion_r208819983

They seem to ignore me.
 

Tom Zander

Active Member
Jun 2, 2016
208
455
Since Tom is in the natural ordering camp (which I am also softening to FWIW), perhaps he can answer these questions:
  1. How much extra work would it be to add the ability to process non-naturally ordered blocks to your code?
  2. Would that code significantly affect processing time for a naturally ordered block?
You would have to sort them so that transactions creating outputs come before transactions spending those outputs.
The problem is that you will end up hitting one of various bottlenecks in order to do this fast for a really big block. The main result is that your memory usage goes up (all the transaction txids have to be created and made searchable) and your validation time goes up due to this sort.

I have experience putting data into large hashmaps (I worked with OpenStreetMap data); there is a cutoff point where working with a hashmap gets exponentially slower the more data you insert. This is why, if you have natural sorting, you will be able to get the data out much, much faster than if you have no sorting information yet.

Maybe a different approach will explain this better:

In the case of a naturally ordered block, the only thing you need to do is pick out all the transactions that DON'T spend outputs of the transactions that came before.

In the case of a block sorted some other way (i.e. one that removed natural sorting) you can't just iterate over the block, since tx 1 might spend tx N-1, or vice versa. So you first need to create a txid index of the entire block. Then you need to iterate over the block and check whether a transaction spends one of the other transactions. If it does, you need to sort it into a list you process serially.
What's left at the end you can process in parallel.

So with natural sorting you can start validating and updating the UTXO immediately, across all your threads. Without that sorting you need to sort the transactions first.
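For contrast, a sketch of the extra work for a block without dependency ordering, again with illustrative types only: a first pass builds the whole-block txid index (the extra memory cost mentioned above), a second pass partitions the transactions, and the serial batch still has to be put into dependency order before it can be processed.

```cpp
// Two passes over a block with no dependency ordering: build the whole-block
// txid index first, then partition. The serial batch additionally needs a
// dependency (topological) sort before it can be run sequentially.
// Tx is the same simplified placeholder as in the earlier sketch.
#include <string>
#include <unordered_set>
#include <vector>

struct Tx {
    std::string txid;
    std::vector<std::string> spends;  // txids referenced by its inputs
};

struct ValidationPlan {
    std::vector<const Tx*> parallel;
    std::vector<const Tx*> serial;    // still unordered: must be sorted by dependency
};

ValidationPlan splitUnorderedBlock(const std::vector<Tx>& block) {
    // Pass 1: make every txid in the block searchable.
    std::unordered_set<std::string> inBlock;
    inBlock.reserve(block.size());
    for (const Tx& tx : block) inBlock.insert(tx.txid);

    // Pass 2: anything spending an in-block output goes to the serial batch.
    ValidationPlan plan;
    for (const Tx& tx : block) {
        bool dependsOnBlock = false;
        for (const std::string& prev : tx.spends) {
            if (inBlock.count(prev)) { dependsOnBlock = true; break; }
        }
        (dependsOnBlock ? plan.serial : plan.parallel).push_back(&tx);
    }
    return plan;
}
```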

Notice that the paper from Vermorel suggests skipping the sorting and just doing a first loop over the block to do all the inserts. There are various speed problems with this, and I am pretty certain that they never actually coded such a solution. It doesn't sound like it will be fast.
Also important to note here: the LevelDB classes used in ABC today can't be used for this as blocks grow, because they store an entire block's changes in memory before finally flushing them to disk. At some point there will be too many changes in one block to fit in memory, and if your protocol is designed with the idea that everything will happen in memory, you will get royally fucked once that is no longer possible.
 
  • Like
Reactions: awemany

SPAZTEEQ

Member
Apr 16, 2018
40
24
awemany on reddit: "But the issue here is that the blockchain is meant for validation first, and comfortable viewing second, especially when we want to scale this up." If everybody agrees with this, is that not reason alone to take a step back?
 
  • Like
Reactions: Richy_T and awemany

awemany

Well-Known Member
Aug 19, 2015
1,387
5,054
OK, so for fairness, /u/tomtomtom7 has a great point that there is a validation scheme (which he mentioned and which I had forgotten about) that can avoid dealing with the partial transaction order in an explicit sense. By doing the outputs first and then the inputs, explicit dependency checking can be avoided, and I was wrong when I said that such checking seems to be required in any scheme. However, this partial order checking, as far as I can see, also comes (basically) free with his validation scheme (a simple less-than check per input consumed in the second stage), so removing the transaction order is a moot point here as well.
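For concreteness, a rough sketch of that outputs-then-inputs idea with the less-than check, using illustrative types only (not tomtomtom7's actual code): phase one records the in-block position of every transaction (in the full scheme this is where its outputs would be inserted), phase two walks the inputs, and the less-than check rejects any input whose in-block source does not come earlier.

```cpp
// Outputs-then-inputs sketch with the less-than check: phase one records where
// in the block each transaction sits, phase two consumes inputs and rejects
// any in-block spend whose source does not come earlier. Types are
// illustrative only.
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct TxIn { std::string prevTxid; uint32_t prevIndex; };
struct Tx   { std::string txid; std::vector<TxIn> vin; };

bool checkPartialOrderOutsThenIns(const std::vector<Tx>& block) {
    // Phase 1: block position of every transaction, keyed by txid.
    std::unordered_map<std::string, std::size_t> creatorPos;
    for (std::size_t i = 0; i < block.size(); ++i)
        creatorPos[block[i].txid] = i;

    // Phase 2: for each input that spends an output created in this block,
    // the creating transaction must appear earlier (the less-than check).
    // Inputs spending older outputs are resolved against the UTXO set
    // instead (omitted here).
    for (std::size_t i = 0; i < block.size(); ++i) {
        for (const TxIn& in : block[i].vin) {
            auto it = creatorPos.find(in.prevTxid);
            if (it != creatorPos.end() && !(it->second < i))
                return false;  // spends an output created at or after its own position
        }
    }
    return true;
}
```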

Now, the natural order is there and it is very cheap to have as it arises, well, naturally in Bitcoin.

So, with the outs-then-ins scheme, are we sure that throwing away this information won't harm us?

Things like cache locality come to my mind here. Basically, if you keep the transaction partial ordering in place, long strings of transactions, mentioned as a desirable case in Vermorel's paper, might consume each other's outputs and inputs within a single validation worker's local memory and thus not cause traffic on the shared memory bus. Tom Zander's analysis seems to hint at effects like this as well, if I understand him correctly?

With a random order, that is not the case anymore. The outputs and inputs come from anywhere and correspondingly, traffic on a shared memory bus (whether more permanent solid state or ephemeral memory) might become a major issue!

And I'd also like to state that the partial order isn't merely a heuristic. It is a currently enforced and still enforceable property of the blockchain!

I really think we should think this through another couple of times. I also think this is an interesting field to think about hardware validation, e.g. FPGAs. @Peter R, have you looked into this?
 

Tom Zander

Active Member
Jun 2, 2016
208
455
> Things like cache locality come to my mind here. Basically, if you keep the transaction partial ordering in place, long strings of transactions, mentioned as a desirable case in Vermorel's paper, might consume each other's outputs and inputs within a single validation worker's local memory and thus not cause traffic on the shared memory bus. Tom Zander's analysis seems to hint at effects like this as well, if I understand him correctly?

Yes, in my parallel validation scheme an output consumed in the same block never even hits the UTXO, so we never flush it to disk and we never have to retrieve it again from disk. That disk roundtrip is the ultimate cache-locality breakage, and it is exactly what you will create with the scheme proposed by Vermorel and now again by tomtomtom7.

Notice that actual validation of transactions is practically speaking not an issue. Schemes like xthin and graphene depend on the fact that you already have a locally validated transaction in your mempool. Which means you skip the actual validation of most of the transactions. The only thing that will really be causing you pain in parallel validation is parallel access to your UTXO. And this is the mother of all cache-locality problems as it is guaranteed that most of the utxo will live on disk (hopefully SSD).
 

awemany

Well-Known Member
Aug 19, 2015
1,387
5,054
So, to recap the discussion from today:

Yes, in my parallel validation scheme an output consumed in the same block never even hits the UTXO, so we never flush it to disk and we never have to retrieve it again from disk. That disk roundtrip is the ultimate cache-locality breakage, and it is exactly what you will create with the scheme proposed by Vermorel and now again by tomtomtom7.
But this could be avoided by creating a temporary table with the outputs of the block and then, when consuming the inputs, consulting this table as a cache first and, only on a miss, going to the UTXO set on disk.

But what cannot be avoided is cross-worker traffic if the workers all have their own output tables. With TXID order randomization, they suddenly have to synchronize with each other on a common table.

Which, again, could be in main memory and be a shared, cached view on top of the real UTXO set. But it requires quite a lot of common traffic.
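A small sketch of that two-tier lookup, with hypothetical BlockOutputTable and UtxoStore stand-ins (a real UTXO store would be disk-backed, e.g. LevelDB): an input is resolved against the in-block output table first and only falls through to the UTXO set on a miss.

```cpp
// Two-tier input lookup: the in-block output table acts as a cache in front
// of the persistent UTXO set. BlockOutputTable and UtxoStore are hypothetical
// stand-ins, not the storage layer of any real node.
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

struct Coin { /* value, script, ... omitted */ };

struct OutPoint {
    std::string txid;
    uint32_t index;
    bool operator==(const OutPoint& o) const { return txid == o.txid && index == o.index; }
};

struct OutPointHash {
    std::size_t operator()(const OutPoint& p) const {
        return std::hash<std::string>()(p.txid) ^ (std::hash<uint32_t>()(p.index) << 1);
    }
};

// Outputs created by transactions in the block currently being validated.
using BlockOutputTable = std::unordered_map<OutPoint, Coin, OutPointHash>;

struct UtxoStore {
    // In-memory map to keep the sketch self-contained; in a real node this
    // lookup may hit disk (SSD), which is the expensive path.
    std::unordered_map<OutPoint, Coin, OutPointHash> coins;
    std::optional<Coin> fetch(const OutPoint& op) const {
        auto it = coins.find(op);
        if (it == coins.end()) return std::nullopt;
        return it->second;
    }
};

// Resolve an input: in-block table first, UTXO set only on a miss.
std::optional<Coin> resolveInput(const OutPoint& op,
                                 const BlockOutputTable& blockOutputs,
                                 const UtxoStore& utxo) {
    auto it = blockOutputs.find(op);
    if (it != blockOutputs.end()) return it->second;  // created earlier in this block
    return utxo.fetch(op);                            // fall back to the UTXO set
}
```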

I think the onus should be on those wanting to drop or change the transaction order to demonstrate that doing so results in less resource usage than keeping it.

Given the feedback from today, I still think and stand by my point that the paper has brought forward all the wrong reasons why lexicographic transaction order is supposedly important, and my point that there is absolutely no need to create the natural order, as it is naturally simply there, stands as before.

However, one should definitely keep the outputs-then-inputs scheme in mind when discussing implementations and how they might be impacted by a change of the tx ordering. And I definitely have to concede the point that no explicit checking of order is necessary in tomtomtom7's scheme. But adding such a check seems to be quite low cost, and, again, the benefits from being able to assume partial ordering and/or the access patterns that follow from it could, from my perspective, very well outweigh the small extra cost of doing the explicit check.

IMO, we need more data and maybe a simulation with data from real blocks (along the lines of "How many main memory accesses can be saved for which worker chunk size and block size?") to see the impact of either approach.
 
Last edited:

jtoomim

Active Member
Jan 2, 2016
130
253
Hi @awemany and @Tom Zander. Reading your comments and objections on the topic of CBO inspired me to write this piece:

Canonical block order, or:
How I learned to stop worrying and love the DAG

https://medium.com/@j_73307/canonical-block-order-or-dbe3ac48bcd3

As for needing to sort the transactions at some point in order to generate the CBO, we get that for free with Graphene. Sorting is extremely cheap with well-distributed data like hashes: a radix sort beats the theoretical O(n log n) comparison-sort bound there. Many unsorted hashtables even give us sorting for free.
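To illustrate how cheap that sort is, here is a plain LSD radix sort over 32-byte txids (illustrative only, not taken from any node implementation): one counting pass per byte, 256 buckets, no comparisons.

```cpp
// LSD radix sort over 32-byte txids: one stable counting pass per byte,
// 256 buckets, no comparisons. Illustrative sketch only.
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Txid = std::array<uint8_t, 32>;

void radixSortTxids(std::vector<Txid>& txids) {
    std::vector<Txid> buffer(txids.size());
    // Least-significant byte first, so walk from byte 31 down to byte 0.
    for (int byte = 31; byte >= 0; --byte) {
        std::size_t counts[257] = {0};
        for (const Txid& id : txids) ++counts[id[byte] + 1];               // histogram
        for (std::size_t i = 1; i < 257; ++i) counts[i] += counts[i - 1];  // bucket offsets
        for (const Txid& id : txids) buffer[counts[id[byte]]++] = id;      // stable scatter
        txids.swap(buffer);
    }
}
```

Each pass is linear in the number of txids, which is why the sort cost stays small next to the hashing and signature checks a node is already doing.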
 

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
I was happy with the vigorous yet polite technical debate today and yesterday. It's almost like old times :)

The pro-lexicographical ordering side does not seem to be consistent on the reason for this change.
  1. Some say the change is to make block validation faster because one does not need to ensure that transactions are in a physically-causal order.*
  2. Some say the change is to improve block propagation because the order information does not need to be transmitted (e.g., graphene is improved).
  3. Some say the change is to make it easier to prove that a given transaction is not in a given block (instead of downloading the entire block and scanning for the transaction, one needs only to confirm that a transaction with a smaller ordering index is adjacent in a block to a transaction with a larger ordering index).
If this were a truly important change, I think its proponents would be consistent on the reason for the change. Take the block size limit hard fork as an example: the reason we went to bigger blocks was to allow for more transactions per second. There was one simple reason that everyone could understand.




I'm skeptical that #1 is an obvious reason. If it were, I think it would be easy to provide convincing data that shows how removing the causality requirement would massively improve block validation time. What I've seen so far is that the improvements would be small at best, and would make block validation slower for large blocks at worst (cf. @Tom Zander's arguments).

I'm skeptical that #2 is an obvious reason. 95% (or maybe 99%?) of the transactions in a block can already be sorted lexicographically by miners if they choose. This means that miners already have the ability to transmit blocks with Graphene very nearly as efficiently as they could with enforced lexicographic ordering. I can do the math if people are interested, but I'm confident the bandwidth savings are not significant in the big picture. (Further, we don't yet know whether Graphene will outperform Xthin in the real world.)

I'm skeptical that #3 is an obvious reason. I had never heard anyone clamouring for this "feature" until it turned out to be a byproduct of the change to lexicographic ordering.

Then what is the reason for this change?

@jtoomim tells us what the reason is in his Medium article:

"First, it’s a more elegant and beautiful system. Being able to delete a bunch of code because you did some math on paper is one of the most beautiful things an engineer can do."
In other words, the primary reason for this change is aesthetics.

Like I said a few days ago, if developers played God they would redesign the human eye to remove the blind spot. It's a change with a small upside (a more "elegant design") and an unknown downside. It's also a change that Nature would never make: our eyes work well enough already.


* By "physical causality" I mean that if Tx1 has Coin A as its input and Coin B as its output, and if Tx2 has Coin B as its input and Coin C as its output, that Tx1 must come before Tx2 in the block.

 

jtoomim

Active Member
Jan 2, 2016
130
253
The primary reason is definitely #2. #1 was originally thought to be an equally important reason, but we recently (this week?) realized that the same parallel algorithm can be used without the fork, just with a little more code complexity. #3 is a gee-that's-kinda-neat-I-wonder-if-we-can-use-that thing. Nobody has a use for it yet, and it might never be used, but it's still a small theoretical improvement.

I was not listing those reasons in terms of importance, but rather in terms of how technical the explanations were. This was intended to be an easy-to-read piece, which means I had to be careful with how I spent my audience's attention. Frontloading heavy details tends to make people's eyes glaze over.

That said, aesthetics are not to be underestimated. Beautiful and elegant systems tend to be the most maintainable. If a system has a huge number of arbitrary details that simply need to be memorized in order to be used, then it's very difficult for new developers to join the community and start to become productive.

With Graphene, encoding the order of the remaining 5% of transactions takes as much data as encoding the unordered set of all 100%. And that's only when the miner is cooperating. What happens when you get a selfish miner in adversarial conditions? They might want to have a block that propagates well to some people but not to others. In this circumstance, they can just randomize the order to make it as difficult as possible to transmit, and preload it on special servers near their friends.

If Bitcoin only works when nobody is attacking it, it will never survive a major international war, much less internet trolls.
 
Last edited:

cypherdoc

Well-Known Member
Aug 26, 2015
5,257
12,995
as @solex inferred, what i see happening is that the BCH protocol is already in the active process of ossifying. and this is in spite of the scheduled q6mo hard forks, which i think are necessarily going to be abandoned. i know that comes as bad news to the more eager beaver BCH devs around here wanting to insert their favored code changes, but that is what i see happening. everyone is disagreeing with everyone as to how to move forward, and i don't think this is a bad thing. in contrast, it's a good thing, especially if you understand Bitcoin as Sound Money. if you don't, then you need to understand that money users/merchants need confidence, via stability, reliability, and predictability, that the rules won't change out from under them suddenly unless absolutely necessary to protect its money function.

imo, now is the time to move on removing the block limit entirely. and not via some complicated mechanism as coded in BU with EB, AD, BV, FG, or MG. just remove it. it's probably our last chance to do so as the last major change to consensus rules. as to @Peter R's skepticism that the protocol can't handle it, i think the market will ensure that it can.

we've already resoundingly disproved Greg's "unlimited elastic demand theory" as it applies to larger blocksize supply over the last year. in fact, neither the spammers nor the data dumpers came out to fill/bloat blocks as he predicted, as both the minfee and miner soft limits, not to mention lack of economic demand, have deterred any such attempts to attack BCH via bloat blocks. this is great news. the one or two attempts to fill 8MB blocks lasted just a few hours and were resoundingly defeated by the miners via their soft limits (and only lined their pockets with fees, as predicted).

any technical limitations of the current protocol under an unlimited blocksize scenario will be handled by the above, while over time tech improvements will progressively allow larger and larger blocks. i think the window to remove the limit is closing quickly as ossification is actively occurring. simply removing the limit is the simplest way to do this, as opposed to more complicated mathematical adaptive methods, such as Bitpay's, that might be gamed.

edit: some of you may scream "the 32MB message limit!". sure, that needs to be improved/raised and certainly will, but who's going to risk the tx fees to try and spam a 33MB block while also having the minfee and miner soft limits cut them off? the protocol needn't enforce limits when the miners and even non-mining full nodes are certainly capable themselves of doing so.
 
Last edited:

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
@cypherdoc I agree with most of your post, but why not just program in something like BIP101 to automatically lift the limit at an exponential pace (in BCH's case, we wouldn't stop at 8 GB but would keep doubling the limit forever)? I don't think you're going to get miners to agree to accept and build on top of arbitrarily large blocks.