I'd like to understand this debate with respect to parallelization better. Let's just look at block validation for now:
Thanks for laying this all out so clearly
@Peter R, that helps.
Overall I generally agree with how you described things, but there are a few points where I see it a bit differently. These could maybe explain why we have come to different conclusions on LTOR.
Regardless of the way transactions are ordered, validating all of the transactions in the block seems to be highly parallelizable (with one proviso being not _too_ many chained transactions, in which case topological seems better).
I'm not sure about this. Because the technique for parallelizing validation ("outputs-then-inputs") also automatically handles chained transactions, it seems to me that long chains would be easily handled by the parallel algorithm, and checking the topological order would just be an extra difficult-to-parallelize step. I'm not sure how it would all work out with many small transaction chains, versus long transaction chains. Maybe
@awemany has some insight into this.
With TTOR or LTOR, there's an additional check I need to make: does the order of the transactions satisfy the ordering rule? My hunch here is that LTOR will be better, because each worker can check its local ordering--as long as all the local orderings are correct, and the boundary conditions between the workers are correct (I'm a physicist not a computer scientist, so that's probably the wrong term), then the total block order is correct. With TTOR, I think the check would be more complicated to parallelize. I don't know whether this advantage will be significant. (Also, of course AOR is even better here, because then this check isn't required at all.)
I basically agree with the order checking difficulty can be ranked AOR > LTOR > TTOR. I also agree with you description of why checking order with LTOR is easily parallelizable. I think the important thing about this, though, is that the difference between AOR and LTOR is basically insignificant, whereas the difference between LTOR and TTOR
is significant.
Also, I notice a big difference in our thinking when you say: "I think the check would be more complicated to parallelize. I don't know whether this advantage will be significant."
Because, the way I see it, to reach very large scale, parallelizing basically trumps everything else. Let's say something like subchains increases the ceiling on feasible block sizes by incrementally building (or validating) blocks during the 10-minute window. But if the process is inherently "linear" you still have a ceiling, even if it's 100 times higher. You can't really throw more hardware at the problem, because there is ultimately a technical limit of how quickly each step can be done.
Parallelizing, on the other hand, means you can break the task into many smaller chunks, and use more hardware. So even if the parallel process requires, say, twice as much computation as a linear process, you can handle the same workload with twice the hardware. And if it's really parallel at all levels, there no real technical limit to the workload it can handle. It turns it from a technical problem into an economic problem, which automatically solves itself with more usage.
As a hypothetical example, maybe the linear process will need computing resource worth 0.01 cents per transaction, but the parallel process will need 0.02 cents to process a transaction. Either way, it's cheap enough to not be a problem, and the parallel system can keep adding computing resources as usage increases, without ever running into a ceiling.
(By the way, I don't think parallel processes will actually use twice the resources, it's just a hypothetical example to make the point)
Looking at the whole picture, after LTOR is in place, Merkle tree structure is the main remaining non-parallelizable data structure. Maybe changing the Merkle tree to a Merklix tree will look like a good way to go, but I imagine that will have to go trough lots of analysis and debate. If it turns out to be infeaible to do that for some reason, I guess we'll have to assess whether the benefits of CTOR for propagation outweigh the benefits of incrementally building the Merkle tree, and potentially revert LTOR if not.
[EDIT: Just after I posted, I see that you you just posted a comment about Merklix trees. I agree that there is some synergy between the two, and make a compelling roadmap if considered together. But a case can be made for LTOR with or without Meklix]
To completely remove the scale ceiling, we want to turn the technical limits into something that can be solved through economics. The way to do this is to parallelize everything.