Gold collapsing. Bitcoin UP.

theZerg

Moderator
Staff member
Aug 28, 2015
1,012
2,327
When considering block construction, it needs to be argued that some change C gives miner A a big advantage over miner B. Otherwise, the work to construct the block just becomes part of the block's POW and difficulty will adjust accordingly.

Since block construction is in the pools, hashers have no advantage. Do pools gain an advantage given C? First, understand that pools can mine empty blocks so we are talking about an expected loss of a fraction of the average transaction fees (not block reward) per block.

Generally speaking, the cost per unit of performance of machines is a fast exponential -- what I mean is that once you go beyond the high-end mass-produced machine, it costs a fortune to achieve marginal gains. But relative to the cost of hashing power, the cost of a high-end mass-produced machine is loose change.

This idea is more important than transaction ordering. In general, I think we should pack more processing into block construction if doing so reduces processing for, or provides functionality to, block consumers. For example, think about UTXO commitments. We should choose an expensive commitment construction if doing so is very beneficial to SPV wallets and full nodes.

tl;dr: I have a hard time seeing how additional block construction time benefits one pool over another; even if it did, the other pool could upgrade its hardware; and finally, such a benefit would be marginal -- roughly (advantage in seconds / 600) * average tx fees per block.
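As a rough back-of-the-envelope illustration of how small that margin is (the 0.5 BCH fee total and 2-second advantage below are purely hypothetical numbers, not measurements):

Code:
    # Hypothetical figures only: assume ~0.5 BCH of total fees per block and a
    # 2-second head start from faster block construction.
    fees_per_block = 0.5        # BCH, assumed average total tx fees in a block
    advantage_seconds = 2.0     # assumed construction-time advantage
    block_interval = 600.0      # target seconds per block

    expected_gain = (advantage_seconds / block_interval) * fees_per_block
    print(expected_gain)        # ~0.00167 BCH per block -- loose change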
 

Mengerian

Moderator
Staff member
Aug 29, 2015
536
2,597
There's lots to say about the Canonical Order (aka lexicographical order) issue. But to start, I just want to step back and talk about the context of the whole discussion. The purpose of moving to Canonical Order is to design a system that can handle massive scaling in the future. The fact that it may help with block propagation (e.g. Graphene) or block validation is just a surface manifestation of solid fundamental design.
> The pro-lexicographical ordering side does not seem to be consistent on the reason for this change.
>   1. Some say the change is to make block validation faster because one does not need to ensure that transactions are in a physically-causal order.*
>   2. Some say the change is to improve block propagation because the order information does not need to be transmitted (e.g., Graphene is improved).
>   3. Some say the change is to make it easier to prove that a given transaction is not in a given block (instead of downloading the entire block and scanning for the transaction, one needs only to confirm that a transaction with a smaller ordering index is adjacent in a block to a transaction with a larger ordering index).
> If this were a truly important change, I think its proponents would be consistent on the reason for the change. Take the block size limit hard fork as an example: the reason we went to bigger blocks was to allow for more transactions per second. There was one simple reason that everyone could understand.
These different issues are not a random grab-bag of assorted justifications. The real reason to move to Canonical Order is that it's a better design for the data structure. The different benefits listed are just symptoms that flow from good design. This is a common characteristic of well-designed systems: you create a clean and logical design for one reason, and then it turns out that it also improves many seemingly unrelated areas.

The relationship between transactions in a causal DAG, and the timestamping of transactions via assembly into blocks are two orthogonal concepts. So it makes the design inelegant to have a spurious dependency on causal order within the timestamping structure. When you start looking at how to create a canonical order while also keeping causal order, you end up with complicated arrangements. Then if you want to add other components to the system, for example parallelized block construction, you end up with an explosion in complexity.

On the other hand, if you look at how one would design the algorithm to handle block validation in non-causally ordered blocks, you find that it performs far better not only for scaling, which we expect, but also for things like long chains of unconfirmed transactions in a block (which has become an issue for memo.cash).

I will try to spend some time to address some of the specific arguments that have been raised in the last couple weeks, but for now I just wanted to lay out my general view on the topic.

The benefits of moving to Canonical Order are myriad. But the immediate benefits are just symptoms of the fact that it's a better design which will provide a solid foundation for growth far into the future.
 

Tom Zander

Active Member
Jun 2, 2016
208
455
@Mengerian

You state this is to make things better for scaling in the long term, ignoring that in actual tested (coded) fact, as well as in theoretical design, the opposite has been shown.

Can you please follow due diligence before making such statements? Or, at minimum, work with the person (me) who has been giving you examples of actual code where your theory just doesn't work. The "better data structure" is demonstrably not better.

What does it take for you to start challenging your own assumptions?

For instance, this kind of statement:

> So it makes the design inelegant to have a spurious dependency on causal order within the timestamping structure

This "spurious dependency" is nothing of the sort. There is no dependency, and the ordering is a natural consequence of the flow of time.
Calling the design inelegant is clearly in the eye of the beholder; in Satoshi's and my world (an Einstein quote comes to mind too) there is a beauty in leaving as much as possible unspecified. Keep the ruleset as simple as possible, but not simpler.

As quotes go, this one is relevant too:
Try not to become a man of success, but rather try to become a man of value. -- Albert Einstein
 
  • Like
Reactions: lunar

jtoomim

Active Member
Jan 2, 2016
130
253
@Tom Zander
> You have not understood proof of work and Bitcoin in general if you think a slow-to-propagate block is an attack on Bitcoin.
In two recent posts, plus some follow-up and preceding discussion, Peter R and I went through the math for the outcast attack, a variant of selfish mining that uses slow-to-propagate blocks in conjunction with accelerated propagation to unwitting collaborators in order to gain extra revenue by effectively isolating a small portion of the network and exposing it to orphan risk. Please review the math there before being dismissive of the concept. If you need a conceptual overview of how it works, I suggest this post.

@awemany

Changes to the block order do not affect mempool ingress (AcceptToMemoryPool, or ATMP). The same algorithms we use now would also be used after a canonical block order fork, whether that order be lexical or topological.

All you do is receive a transaction, and *first* check/spend its inputs with atomic swaps. Afterward, you check that the outputs don't spend more than the sum of the inputs. If the inputs or the outputs don't check out, you undo any UTXO spends (swaps) that you might have made, then make a note of which missing txids this transaction depends on and set it aside in your orphan/doublespend pool. If everything checks out, you finally add the outputs to the UTXO cache overlay, then add the transaction to the memory pool and block priority queues, and also check the orphan pool for transactions which might depend on it. Lastly, for each input, you check the mempool itself to see whether this input's parent is itself unconfirmed. If so, you make a note of that somewhere so that you can efficiently assemble transaction packages, do CPFP calcs, and quickly assemble blocks.
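A minimal sketch of that flow (toy Python, not real node code; the data structures and names are illustrative, and script/signature checks are omitted):

Code:
    # Illustrative sketch of the ATMP flow described above; not node code.
    from collections import namedtuple

    Utxo = namedtuple("Utxo", "value")
    Tx = namedtuple("Tx", "txid inputs output_values")  # inputs: list of (txid, index)

    def accept_to_memory_pool(tx, utxo_cache, mempool, orphan_pool):
        # 1. Atomically claim (spend) each input's UTXO from the shared cache.
        claimed = []
        for outpoint in tx.inputs:
            utxo = utxo_cache.pop(outpoint, None)  # atomic claim of one UTXO
            if utxo is None:
                # Missing parent or already spent: roll back and set the tx aside.
                # (A real node would distinguish missing parents from double-spends.)
                for op, u in claimed:
                    utxo_cache[op] = u
                orphan_pool[tx.txid] = tx
                return False
            claimed.append((outpoint, utxo))

        # 2. Check that the outputs don't spend more than the sum of the inputs.
        if sum(tx.output_values) > sum(u.value for _, u in claimed):
            for op, u in claimed:  # undo the UTXO spends
                utxo_cache[op] = u
            return False

        # 3. Add this transaction's outputs to the UTXO cache overlay.
        for i, value in enumerate(tx.output_values):
            utxo_cache[(tx.txid, i)] = Utxo(value)

        # 4. Admit to the mempool, then retry any orphans that were waiting on it.
        #    (Noting unconfirmed parents for CPFP/package assembly is omitted here.)
        mempool[tx.txid] = tx
        for orphan in list(orphan_pool.values()):
            if any(parent == tx.txid for parent, _ in orphan.inputs):
                del orphan_pool[orphan.txid]
                accept_to_memory_pool(orphan, utxo_cache, mempool, orphan_pool)
        return True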

This algorithm should be concurrency-safe. The shared cache can have two major types of intermediate states:

(1) Some of the input spends have been executed, but not others, and no outputs have been inserted yet. These spends may be reversed later and the UTXOs restored. This state is not a problem for ATMP. This situation only has conflicts if two transactions being processed in parallel are also double-spends -- that is, their inputs spend the same UTXOs. Since atomic swaps are being used, no UTXO can be spent twice, even in parallel situations. It is possible to craft two transactions such that either transaction would be valid if processed by itself, but both could be marked invalid if processed simultaneously, such as if tx A's inputs were in order 1,2,3 but tx B's inputs were in order 3,2,1. Marking both as a double-spend would be slightly incorrect, but this could be checked for, if needed, in a later serial processing step, or maybe just ignored. (Just ignoring these errors allows for block propagation outcast attacks by reducing mempool synchrony between nodes, but I'm not sure how exploitable this is.) In summary, (1)'s concurrency conflicts only affect double-spends, and will only result in false negatives. No invalid transactions will be admitted.
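A toy illustration of that false-negative case (no real threads; the interleaving is hard-coded to show the worst case, and dict.pop stands in for a compare-and-swap on a concurrent hashtable):

Code:
    # Toy model: UTXOs 1, 2, 3 exist. Tx A spends them in order 1,2,3;
    # tx B spends them in order 3,2,1. We interleave the two claim sequences.
    utxos = {1: "coin1", 2: "coin2", 3: "coin3"}
    claims = {"A": [], "B": []}

    def try_claim(tx, outpoint):
        """Atomically claim one UTXO (stand-in for a CAS on a shared hashtable)."""
        utxo = utxos.pop(outpoint, None)
        if utxo is None:
            return False
        claims[tx].append((outpoint, utxo))
        return True

    # Worst-case interleaving: each tx grabs some inputs, then hits the other's.
    for tx, outpoint in [("A", 1), ("A", 2), ("B", 3), ("A", 3), ("B", 2)]:
        ok = try_claim(tx, outpoint)
        print(tx, outpoint, "claimed" if ok else "conflict")

    # Both A and B hit a conflict, so both get rolled back and flagged as
    # double-spends (a false negative), but no UTXO was ever spent twice and
    # no invalid transaction was admitted.
    for tx in ("A", "B"):
        for outpoint, utxo in claims[tx]:
            utxos[outpoint] = utxo  # roll back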

(2) Some of the outputs have been inserted, but not others; these outputs are known to be for valid transactions that are being committed. Parallel insert conflicts aren't possible, since you would need to insert two UTXOs into the same (txid, offset) pair, and that can only happen if it's the same txid. The only possible failure is if a parent's outputs have not yet been added by the time the child's inputs are being processed in another thread. This would have the same effect as if the child were processed before the parent, so it presents no new obstacles -- just stick the tx in the orphan pool and trigger it to be reprocessed once its parents are done.

As for the terminology of canonical: I have always been using "canonical order" to mean "in some universally agreed-upon order," and have used "lexical" or "lexicographic" to refer to sorting by txid. All lex proposals are canonical, but not all canonical proposals are lexical. I don't think avoiding the CBO term is a good idea. I think it's just important to distinguish between topological CBOs and lexical CBOs, and make sure that when you're criticizing non-topological sorting, you're clear that you're criticizing lexical order and not necessarily CBOs in general.
 
Last edited:
  • Like
Reactions: throwaway

sickpig

Active Member
Aug 28, 2015
926
2,541
Bitcoin Unlimited - Bitcoin Cash edition 1.4.0.0 has just been released

https://www.reddit.com/r/btc/comments/98ajic/bitcoin_unlimited_bitcoin_cash_edition_1400_has/

No consensus changes; this is the list of relevant features/things done:


  • Graphene Relay: A protocol for efficiently relaying blocks across a blockchain's network (experimental, turned off by default, set use-grapheneblocks=1 to turn it on, spec draft)
  • blocksdb: Add leveldb as an alternative storage method for blocks and undo data (experimental, on-disk blocksdb data formats may change in subsequent releases, turned off by default)
  • Double Spend Relaying
  • BIP 135: Generalized version bits miner voting
  • Clean up shadowing/thread clang warn
  • Update depends libraries
  • Rework of the Bitcoin fuzzer command line driver tool
  • Add stand alone cpu miner to the set of binaries (useful to showcase the new mining RPC calls, provides a template for development of mining pool software, and is valuable for regtest/testnet mining)
  • Cashlib: create a shared library to make creating wallets easier (experimental, this library factors useful functionality out of bitcoind into a separate shared library that is callable from higher level languages. Currently supports transaction signing, additional functionality TBD)
  • Improve QA machinery (travis mainly)
  • Port Hierarchical Deterministic wallet (BIP 32)
  • Add space-efficient mining RPC calls that send only the block header, coinbase transaction, and merkle branch: getminingcandidate, submitminingsolution (see the sketch below)
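For those curious, a rough sketch of how a pool might drive those two calls; the exact JSON field names below ("id", "nonce") are assumptions for illustration, so check the RPC help and the spec for the real schema:

Code:
    # Illustrative pool-side flow; field names are assumptions, not the real schema.
    import json, subprocess

    def rpc(method, *params):
        cmd = ["bitcoin-cli", method] + [p if isinstance(p, str) else json.dumps(p)
                                         for p in params]
        out = subprocess.check_output(cmd)
        return json.loads(out) if out.strip() else None

    candidate = rpc("getminingcandidate")
    # The candidate carries only header fields, the coinbase tx, and the merkle
    # branch, so its size is independent of how many transactions are in the block.
    nonce = 12345  # placeholder for the winning nonce found by the hashers
    rpc("submitminingsolution", {"id": candidate["id"],  # assumed field names
                                 "nonce": nonce})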
 

Tom Zander

Active Member
Jun 2, 2016
208
455
> Peter R and I went through the math for the outcast attack, a variant of selfish mining that uses slow-to-propagate blocks
The Selfish Mining (SM) concept is anything but new, and while it was very insightful years ago, today we know how it actually affects miner incentives (it doesn't have a substantial effect).

Coming up with a new technical usage of the same SM concepts doesn't add anything. It certainly doesn't justify breaking further Bitcoin features just to avoid an attack that has been theoretically possible for 10 years but still has not been seen.

If you use a theoretical miner collusion attack in the shape of selfish mining as a rationale to support a Bitcoin Cash hard fork, you don't understand Bitcoin.
Basic fact: miner collusion cannot be fixed in the protocol, for the simple reason that miners can freely choose their communication channels.


In the farce that the transaction-reordering HF code is becoming, we've now seen these excuses for why it's somehow "good":

  • Selfish Mining
  • Faster for parallel validation
  • Faster for parallel block creation
  • Faster block relay
  • Elegance
  • (I probably missed some)
Followed up with armchair algorithms that use terminology like "should be fast".

Without actual running code and empirical (reproducible) results for these claims, your arguments are hollow, because the actual empirical data we have doesn't support your statements. So obviously your ideas of how things work are lacking experience and insight.

Don't hard fork Bitcoin Cash based on mere ideas of what may work (and has actually been proven wrong). That's just plainly irresponsible.

Edit: and it's a bit sad; several others and I have given alternative solutions that give you just what you want and improve life for all. But there has so far not been an iota of compromise, or even acknowledgement that improvements are needed.
 
Last edited:
  • Like
Reactions: Richy_T and lunar

jtoomim

Active Member
Jan 2, 2016
130
253
The outcast attack variant of selfish mining we described:

- Does not require collusion. You can manipulate other miners into assisting your selfish mining efforts without their consent or knowledge by controlling who receives your blocks quickly and who does not.
- Has no minimum hashrate threshold for effectiveness.
- Can be addressed with a soft fork, and is unrelated to the topological vs lexical sort discussion.

The outcast attack is significantly more effective than old-style block withholding selfish mining (BWSM), since with BWSM you have to pit the selfish miner against the entire network at once, but with the outcast attack the miner can effectively partition the network however they see fit. This eliminates the 33% hashrate threshold needed for BWSM to be effective; with the outcast attack, you just need to target a group of pools that together have less hashrate than you (ideally half as much).

It *is* a new issue insofar as old block propagation protocols had worst-case performance that was closer to typical performance, transaction fees discouraged attacks using unpublished transactions, and the blocksize limit kept the worst-case scenario from being very bad. In particular, the ability to confound Graphene with transaction reordering is a zero-cost method of manipulating block propagation. Now that those things are changing, we ought to more carefully evaluate whether it is a risk. To me, the ability of a large pool to increase its profits by 7% to 23% at everyone else's expense seems significant and worthy of concern.

It is also an old issue. Back in 2015-2016, we saw unequal orphan rate distributions due to the use of spy mining in China. Although it was not intentional, it had the same effect as the outcast attack. Even though China's internet connectivity was the cause of the problem, it was the non-Chinese pools (mostly Slush, Eligius, and GHash.io) that suffered. Some of the evidence of this can be seen in Organ of Corti's blog posts (e.g. this table). Better evidence used to exist on Lightsword's https://poolbench.antminer.link, but no archives exist.
 

Tom Zander

Active Member
Jun 2, 2016
208
455
@jtoomim the outcast attack is trivial to fix using header-first mining (that removes all advantages for individual miners), which is actually used by many pools today, I've been told.

Also, you ignored the part of my post that put forward the point that communication channels are chosen by the miners. To use or not to use a specific propagation technique for certain miners (friend or foe) is their prerogative. This means, and I repeat this, that you can't fix the protocol, as they can always just NOT use the "fixed" relay method.

Both of these mean that your attack is already fixed and, second, that the hard fork would not have avoided any cheating in the first place.

And you also keep ignoring the fact that every single item in the long list of "advantages" of reordering transactions has been shown by more experienced developers to not actually be an advantage, even in theory. You keep ignoring that there is no actual implementation that benefits from those advantages; as such, the advantages themselves are unproven and cannot be compared with alternatives.

The burden of proof on a hard fork lies on the side of the people making the suggestion to change the protocol.

And lastly, it would be really nice to get an actual conversation going on the suggested ways to improve the proposal.
 

jtoomim

Active Member
Jan 2, 2016
130
253
> To use or not to use a specific propagation technique for certain miners (friends or foe) is their prerogative.

Transaction reordering affects the second hop, and the third, and the fourth, etc. This gives the block assembler and miner the ability to control which communication channels are effective for far longer than would otherwise be the case. A miner can make a block that is transmitted efficiently only on the first hop between their own servers, by using a non-standard protocol that accounts for their block's strange order, and is then transmitted inefficiently forever thereafter. The second hop can be between the attacker's servers and other poolservers on the same LAN, preventing the inefficiency from slowing things. But after that, delays occur.

> @jtoomim the outcast attack is trivial to fix using header-first mining

For now, yes, but it comes at the expense of transaction fees. Transaction fees should eventually be the dominant source of miner revenue, making header-first mining a temporary stopgap. Effective, to be sure.

> You keep ignoring that there is no actual implementation that benefits from those advantages,

I AM NOT ADVOCATING FOR HARD FORKING BEFORE SEEING IMPLEMENTATIONS. I do not like the November timeline, and I want to see benchmarks before we commit to a fork. All I'm arguing is that it's worth looking into a lexical CBO, because I think it would be neat and efficient and would solve a few potential issues (UTXO inserts become sequential, sharding becomes a little easier, block propagation becomes faster). If the lexical sorting makes a good implementation too difficult or slower, then we shouldn't do it.

> and its a bit sad, me and several others have given alternative solutions that are giving you just what you want and improve life for all. ... And last it would be really nice to get an actual conversation going on the suggested ways to improve the proposal.

Are you referring to Gavin's proposed sorting order? By lowest txid spent by an input in each tx? I responded to that, and said that it's overall okay, though I noted what would likely be a few minor performance disadvantages for Gavin's sorting versus lexical sorting.

I don't have infinite time, so I haven't been able to respond to every last comment that people have made. Sorry. If there's anything specific you want addressed that you think wasn't addressed, feel free to raise it again and I'll try to address it when I get a chance.

> The burden of proof on a hard fork lies on the side of the people making the suggestion to change the protocol.

Yep, totally.
 
Last edited:

Mengerian

Moderator
Staff member
Aug 29, 2015
536
2,597
Hello,
> You state this is to make things better for scaling in the long term, ignoring that in actual tested (coded) fact, as well as in theoretical design, the opposite has been shown.
Are you referring to the Flowee code you showed me? Or awemany's model? In either case, I don't think this constitutes being "proven". It's just a demonstration of behavior under a particular scenario, with a particular set of assumptions.

Canonical ordering is a change that lets algorithms be re-worked to make best use of the fact that transactions are always ordered, which has many advantages.

I'm not sure where tomtomtom7 is these days, but apparently Bitcrust greatly benefits from canonical ordering. So it seems that the immediate benefit is based on the code architecture. The idea of canonical ordering is that it enables code design that is suitable for massive (many orders of magnitude) future scaling.

> What does it take for you to start challenging your own assumptions?
>
> For instance, this kind of statement:
>
> > So it makes the design inelegant to have a spurious dependency on causal order within the timestamping structure
>
> This "spurious dependency" is nothing of the sort. There is no dependency, and the ordering is a natural consequence of the flow of time.
I try to always challenge my own assumptions and think about things from as many angles as I can.

Here you talk about the "flow of time". But that is only relevant if you are processing things sequentially, in order. When you want to deal with things in parallel, having to deal with the causal order is a huge stumbling block that makes things more difficult. The timestamping in blocks conceptually treats transactions as a set, and constraining that set according to causal dependencies is not necessary. The causal dependencies are already captured in the transaction inputs pointing to previous outputs.
 

awemany

Well-Known Member
Aug 19, 2015
1,387
5,054
@Mengerian, @Tom Zander:

The whole block chain as it is now forms a poset. Due to the nature of posets, any fixed interval that you extract from it, whether it is a block or not, will again be a poset.

If you are convinced that lexical order is the most efficient way to go for validation, you should be able to sort the whole blockchain (BCH or BTC) into lexical order, apply your outer OTI validation algorithm, and be faster with your algorithm than doing the same with the current partial order.

I believe this won't happen, due to the nature of the chain, which is a poset, with inputs consuming earlier outputs, and thus locality that can be exploited to reduce UTXO set traffic in this partial order, but not in lexical order.

But I am happy to be proven wrong. I think this is a great test for you guys to make with your code.
 

Tom Zander

Active Member
Jun 2, 2016
208
455
> Are you referring to the Flowee code you showed me? Or awemany's model? In either case, I don't think this constitutes being "proven".

Interesting point of view.

Let me quote the definition of the scientific method:

The principles and empirical processes of discovery and demonstration considered characteristic of or necessary for scientific investigation, generally involving the observation of phenomena, the formulation of a hypothesis concerning the phenomena, experimentation to test the hypothesis, and development of a conclusion that confirms, rejects, or modifies the hypothesis.

The fact is that we have created actual scientific evidence. We have done the investigation. And to scientifically prove any hypothesis, these steps are required.
Stating that you don't think this constitutes proof means you should either start to gather scientific evidence of your own, or make it clear that you are biased and/or don't want to do science-based research. There really isn't a third option.

> The idea of canonical ordering is that it enables code design that is suitable for massive (many orders of magnitude) future scaling.

In case this is news, I'll share this again. As a developer, I've been upgrading the codebase (now in Flowee) to do exactly that. I have shared the results on occasion, and many orders of magnitude are being reached. Unlike people like tomtomtom7, I'm just a person who doesn't make presentations until I have actually reached my goals (nothing bad against tomas, I'm just different).
I do write the occasional blog post.
You can benefit from this research, as it's all open source. I don't know many other projects that do this (yes, tomtomtom7 has done some interesting stuff, but it's slow going), so maybe don't just discard the results we actually have.
 

awemany

Well-Known Member
Aug 19, 2015
1,387
5,054
I'd like to add to my above post one observation, which just shows me that we're operating in a very complex landscape with many unknown unknowns still:

If you do std::sort(..) on a million transactions and compare by TXID (lexical ordering), this will take about half a second on a decent CPU.

If you do that once before you destroy the order for Graphene transmission, and then again on the other end to be able to validate the block, that becomes 1 s.

That's not a lot, but consider this: the order information for a million transactions takes up 20 bits x 1 million = 20 Mbit.

So if your network link has a bandwidth of more than 20 Mbit/s, it becomes faster to just transmit the block order along rather than destroy and recreate it.
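A quick sanity check of that break-even point, using the same assumed numbers (20 bits of order information per transaction and roughly 0.5 s per sort):

Code:
    # Break-even link speed at which shipping the order beats sorting twice.
    n_tx = 1_000_000
    bits_per_tx = 20                   # assumed size of each order index
    order_bits = n_tx * bits_per_tx    # 20,000,000 bits = 20 Mbit

    sort_time_total = 2 * 0.5          # sort once per side, ~0.5 s each

    breakeven_bps = order_bits / sort_time_total
    print(breakeven_bps / 1e6, "Mbit/s")   # 20.0 -- above this, just send the order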

If you assume that sorting larger sets becomes superlinearly (e.g. O(n log n)) more expensive, sending the order along becomes relatively more attractive for larger blocks.

Of course, the sorting can be improved (e.g. parallel sorting). But then there's also the question of validation speed due to outputs being consumed, which I have explored a bit in http://www.github.com/awemany/valorder, and whether poset or lexical ordering becomes faster in parallel.

For a single CPU, it is O(n log n) for lexical vs. O(n) (albeit with higher constants) for poset order.

Which isn't exactly a strong argument for lexical ordering either.

This altogether seems like a pretty complex arrangement with many different variables to me, and I think we should be careful not to change something which we haven't sufficiently understood yet.

I freely admit that I don't understand all the issues around using different sortings yet.

But I also decidedly assert that this goes for the proponents of CTOR as well.
 

bitsko

Active Member
Aug 31, 2015
730
1,532
If the relevant ecosystem participants do not move toward removal of the limit as the protocol slowly ossifies (as it grows), the risk of another sustained chain split increases.


There will always be a market for an unrestricted in scale blockchain. There may not be a market for multiple restricted blockchains.

What are the technical issues with miner soft limits and orphaning?

#nolimit
 

Tom Zander

Active Member
Jun 2, 2016
208
455
The conclusion of awemany mirrors my own, and I like that @jtoomim agrees, as do @Peter R and many others.

The conclusion is simply that while it's awesome that more people are concerned and occupied with scaling, the fact remains that it is far too early to make hard fork changes. Much research is needed to see the effects of the various possible algorithms. Actual real-life data is needed to see if the market actually needs a hard consensus rule on this topic.
 

awemany

Well-Known Member
Aug 19, 2015
1,387
5,054
@jtoomim:

Just briefly on the block withholding / slow-to-propagate attacks: I think that with improvements in parallel validation, and with miners being aware that their connectivity is very important, this all seems solvable. I absolutely agree on being careful.

On the issue of mempool validation: I am very aware that mempool ingress is not block propagation. But that's the origin of my 'adapter' comment in the whole debate: if you necessarily have partial order on ingress and lexical order on validation, the re-sorting has to happen somewhere. It was also one of my key criticisms that Vermorel et al. basically want to get around partial ordering, argue as if that is possible in general, and then in the end argue for CTOR in blocks, pretty much. Red herrings all around.

Furthermore, here's something that came to my mind today which I think is a very strong argument for also dealing with the partial order on the mempool-ingress front end:

The block chain is in a state of machine consensus whenever a certain poset out of the many possible posets has been chosen. Independent of the orderings, the chain, taken as a whole, has to form a poset.

Now, you can apply OTI to mempool validation. You can take transactions in chunks and parallel-process them. But as I said above, this will only give you information on validity at a certain granularity.

And here it becomes very interesting: Assume you get transactions "A" to "T" on mempool ingress like this (it actually doesn't matter whether it is mempool ingress or block validation, but the question I am posing here is rather interesting in the mempool ingress situation):

QBOPSAJLFECGNIMDTKRH

Assume that this is in partial order, except for C being dependent on D and thus being out of order. Assume (just for the sake of the argument) that each transaction depends on the one preceding it.
With lexical order validation and four workers, they'd see something like this:
[ABCDE][FGHIJ][KLMNO][PQRST]

If you do OTI validation on this, you'll get a failure, but you do not know exactly where the problem is. You need to backtrack through the transaction set to find it.

With any (including OTI) validation on the poset, you'd fill your workers with:

[QBOPS][AJLFE][CGNIM][DTKRH]

And out of the four workers, the third diff to the UTXO set will fail while the first two succeed, which will tell you right away that the problem is somewhere in [CGNIM].

And now it becomes very interesting: You can do an O(log n) search to drill down into your offending set and exactly find the offending transaction. Try that with lexical ordering!

This means that ingress into partially ordered mempools can be parallelized in a very nice way:

You have n transactions and m workers, so your time to process becomes O(n / m). If you have transaction trouble (which you will have every time you want to collect transactions together into blocks), you can drill into this with another O(log n) step, so you'll end up at something like O(n / m * log n) at most for finding a fraction of your poset that is all good. Not too bad, is it? Granted, m might more likely be the thickness of the pipe to your UTXO set storage (which can be sharded just as well with posets as with lexical ordering!), but the general principle remains.

The partial order is just partial, yes. There are many of them possible. But that doesn't mean at all that it is worthless. Quite far from it. Partial ordering is actually enough to allow you to drill down with a binary search into broken transaction sets to find the offending transactions!

If you know a fancy algorithm that does this for lexical order, I am all ears.
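To make the chunk-and-bisect idea concrete, here is a toy sketch (the "validity" check below is a stand-in for applying a chunk's OTI diff to the UTXO set, and all the names are made up for illustration):

Code:
    # Toy model: the ingress stream is the partial order (each tx depends on the
    # one before it in the stream), except C, which depends on the later D.
    ingress = list("QBOPSAJLFECGNIMDTKRH")
    deps = {tx: ingress[i - 1] for i, tx in enumerate(ingress) if i > 0}
    deps["C"] = "D"  # the single out-of-order dependency

    def chunk_valid(prefix, chunk):
        """Stand-in for applying a chunk's UTXO diff on top of an already-validated
        prefix: every dependency must come from the prefix or the chunk itself."""
        available = set(prefix) | set(chunk)
        return all(deps.get(tx) is None or deps[tx] in available for tx in chunk)

    def find_offender(prefix, chunk):
        """O(log n) drill-down: bisect a failing chunk to localize the bad tx."""
        while len(chunk) > 1:
            mid = len(chunk) // 2
            left, right = chunk[:mid], chunk[mid:]
            if not chunk_valid(prefix, left):
                chunk = left
            else:
                prefix, chunk = prefix + left, right
        return chunk[0]

    chunks = [ingress[i:i + 5] for i in range(0, len(ingress), 5)]
    prefix = []
    for chunk in chunks:            # in reality the chunks run in parallel workers
        if chunk_valid(prefix, chunk):
            prefix += chunk
        else:
            print("offending tx is", find_offender(prefix, chunk))  # -> C
            break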

(Also @Peter R., @Mengerian )
 
Last edited:

Peter R

Well-Known Member
Aug 28, 2015
1,398
5,595
> @Mengerian, @Tom Zander:
>
> The whole block chain as it is now forms a poset. Due to the nature of posets, any fixed interval that you extract from it, whether it is a block or not, will again be a poset.
>
> If you are convinced that lexical order is the most efficient way to go for validation, you should be able to sort the whole blockchain (BCH or BTC) into lexical order, apply your outer OTI validation algorithm, and be faster with your algorithm than doing the same with the current partial order.
>
> I believe this won't happen, due to the nature of the chain, which is a poset, with inputs consuming earlier outputs, and thus locality that can be exploited to reduce UTXO set traffic in this partial order, but not in lexical order.
>
> But I am happy to be proven wrong. I think this is a great test for you guys to make with your code.
This is a great experiment! Treat the entire blockchain as one big 160 GB block. Give one node that huge block in the current order, give another node the block in lexical order. In a fair race, which node would finish validating this huge block first?

When viewed from this perspective, I think most people would bet that the naturally-ordered block would validate first. If the proponents of lexical ordering can demonstrate that the node given the lexically-ordered block would win instead, that would be a counterintuitive result that might change a lot of people's minds.
 

jtoomim

Active Member
Jan 2, 2016
130
253
> If you are convinced that lexical order is the most efficient way to go for validation, you should be able to sort the whole blockchain (BCH or BTC) into lexical order, apply your outer OTI validation algorithm, and be faster with your algorithm than doing the same with the current partial order.
You don't need to sort into lexical order to do this. You can just do OTI on the topologically sorted data; adding an extra check on order isn't too difficult. Consequently, I think you'd end up using OTI as the algorithm for both lexically and topologically sorted blocks.
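A minimal sketch of what that could look like (toy Python, not ABC's code; amounts and scripts are ignored): insert all outputs first, then spend all inputs, with an optional order check for topologically sorted blocks.

Code:
    # Toy outputs-then-inputs (OTI) validation sketch; not Bitcoin ABC code.
    from collections import namedtuple

    Tx = namedtuple("Tx", "txid inputs n_outputs")  # inputs: list of (txid, index)

    def oti_validate(block_txs, utxo_set, require_topological=False):
        position = {tx.txid: i for i, tx in enumerate(block_txs)}

        # Pass 1 (outputs): insert every transaction's outputs. Order-independent,
        # so this pass parallelizes (and shards) easily.
        for tx in block_txs:
            for n in range(tx.n_outputs):
                utxo_set.add((tx.txid, n))

        # Pass 2 (inputs): spend every input exactly once.
        for i, tx in enumerate(block_txs):
            for outpoint in tx.inputs:
                if outpoint not in utxo_set:
                    return False          # missing parent or double-spend
                utxo_set.remove(outpoint)
                # Optional order check: in a topologically sorted block, an
                # in-block parent must appear before its child.
                if require_topological and position.get(outpoint[0], -1) > i:
                    return False
        return True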

@Peter R : Please don't use the term "natural ordering". It's meaningless. There are a massive number of topological orderings for any given block, and no one of them is more natural than the others. The specific order we get automatically from our block assembly algorithm is just one of those which the particular implementation we're using gives us, and is about as artificial as it gets. The correct term that you're looking for is "topological ordering" or a more specific term, like "Gavin's topological ordering."

@Peter R: As for your poll, I suspect there is an optimal chunk size for parallel processing using the OTI algorithm, and I believe that optimal chunk size is not 160 GB. There are relatively few output cancellations within a single block (< 10%?), but there are far more between blocks.

In other news, I now have a modified version of Bitcoin ABC's OTI validation algorithm running on pre-fork blocks with parallelizable order checking. I'm finishing up unit tests for it now -- apparently, there were no test cases that examined transaction orders. Once I'm done, I'll publish it as a PR and then get to work on a proof-of-concept parallelization of this validation using OpenMP and a concurrent hashtable. I probably only have the rest of today to work on this, so I don't know if I'll finish the parallelization attempt.