Gold collapsing. Bitcoin UP.

Peter R · Aug 19, 2018

@jtoomim: by natural order I mean the order the blockchain is already in. There is only one possible ordering because it's an event that's already happened.
[doublepost=1534714431][/doublepost]More broadly, I think the term "natural ordering" is useful. It could be defined precisely as the order in which transactions entered a node's mempool. Indeed, my node's natural ordering may be different than your node's natural ordering, but I don't think that makes the concept less useful.

jtoomim · Aug 19, 2018

awemany said:
So if your network link has a bandwidth of >20MBit/s, it becomes faster to just transmit the block order along rather than destroy and recreate it.

You're forgetting that you need to upload that data to 20 peers in parallel. That makes it about 400 MBit/s of network goodput. That is quite difficult to achieve on a global network of internet nodes with TCP no matter how much you spend on servers.

@Peter R Perhaps that would be better termed "historical order" or "current order"? Nature is not arbitrary, but history is.

If your definition of "natural order" is the order in which transactions enter mempool, then there is a different natural order for each node on the network. This again becomes an arbitrary order based on location and connectivity. This order also is very different from the block order.

Definition of nature:
1. the phenomena of the physical world collectively, including plants, animals, the landscape, and other features and products of the earth, as opposed to humans or human creations.
2. the basic or inherent features of something, especially when seen as characteristic of it.

I don't think sense #1 is relevant, as it's human-created. The examples of natural orders that you proposed are not inherent features of the transaction graph, so I don't think that fits either.

awemany · Aug 19, 2018

jtoomim said:
You don't need to sort into lexical order to do this. You can just do OTI on the topologically sorted data. Adding an extra check on order isn't too difficult. Consequently, I think you'd end up using OTI as the algorithm for both lexical and topological sorted blocks.

Yes, I thought I made that clear. I get that. See also my simulation that explores both cases using OTI - and which makes me strongly suspect OTI on the poset will be faster due to UTXO cancellations. Which in turn makes me a sceptic on changing the order to CTOR or anything else prematurely.
But we both agree on this (waiting a bit and postponing this change for now) being the best course of action. But maybe you have an ear at ABC?

You're forgetting that you need to upload that data to 20 peers in parallel. That makes it about 400 MBit/s of network goodput. That is quite difficult to achieve on a global network of internet nodes with TCP no matter how much you spend on servers.

I was just waiting for this criticism

But do we know that some UDP multicast protocol won't be the eventual mode for block transmission?

I don't think we really do. And that's why I am so strongly arguing to proceed carefully in this area as well.

Furthemore, with weakblocks (which I hope will work at the 20..30s level or so), that would be cut down by about that factor 20 again ...

Looking forward to your PR. I have something brewing in this area now as well. Aligned to what I've been saying on this recently.

Tom Zander · Aug 19, 2018

jtoomim said:
You don't need to sort into lexical order to do this.

No, no, no.

The entire point that the hard forkers make is that they need to force 100% of the people to actually use lexical order.
I agree that the proposed algorithm doesn't require this. But the people in ABC keep stating that it does by not retracting or even commenting on the hard fork requirement.

I have to say that the discussion is degrading if you ask people to not use "natural ordering".

jtoomim said:
apparently, there were no test cases that examined transaction orders

Thats because it uses a natural ordering. Just like you don't have a rule that your swimming pool requires gravity or your car requires friction, similarly a node doesn't have a rule or check for natural ordering. The entire point is that natural phenomena don't need to be checked, they are instead used to make things actually work.

The whole idea to remove natural ordering reminds me of this joke:

awemany · Aug 19, 2018

@Peter R Perhaps that would be better termed "historical order" or "current order"? Nature is not arbitrary, but history is.

If your definition of "natural order" is the order in which transactions enter mempool, then there is a different natural order for each node on the network. This again becomes an arbitrary order based on location and connectivity. This order also is very different from the block order.

I think I was the one who introduced this wording of calling it the natural order. Let me defend this: Yes I know this is not an unique order. But it is the order the block chain operates in its natural state. It is natural in the sense that transactions are naturally forming a DAG. Which, as I might remind you, was a key point of one of your recent blog posts

I am not transacting with the future or sending it back to the past.

Have you read my comment on Vermorel's paper regarding an omniscient observer? The argument that OTI works because it is a DAG also means that an omniscient observer will always observe the natural order (or an natural order) of transactions.

Of course, we could start nitpicking here about a natural vs. the natural order now. Given that there's the O(log n) drill down that is possible with an order (but only is aligned with the underlying parts of the algorithm in the case of partial ordering, unless someone proves me otherwise!), it fulfills the criteria for "ordering" in this important sense.

Also, we seem to start to agree on (I guess?) that the only place where this extra entropy that is in the natural order is really of interest is in block transmission. I am not 100% on that either, but it seems to be a common point between us. Even if we disagree on the merits of CTOR. Just pointing that out.

jtoomim · Aug 19, 2018

@awemany I mentioned some of the problems with trying to cancel UTXOs at the thread level in a parallel algorithm on reddit, but I never received a reply so I'm not sure if you read it:

I don't talk with the ABC folks much, and don't have an ear there. I've spent way more time talking with you guys.

@Tom Zander: I took a look at your Flowee block validation code. Correct me if I'm wrong in my description of how it works:

First, you scan blocks to find blocks that contain wallet transactions. You ignore all other blocks or process them only cursorily. You do not build a UTXO database for all transactions, as you're not making a full node.

Next, you check your transactions in more detail. If its inputs are UTXOs from a previous block, you go back to that block and check the outputs (right?). If not, then you know the UTXOs must be from the current block. You know that the UTXOs must come earlier in the block than your transaction, so you scan in order until you find them. This means you'd have to scan on average 1/2 of 1/2 of the block (your tx is probably halfway through).

Correct so far?

If this is the case, it seems to me that lexical sorting would actually be an advantage for you, not a disadvantage. Instead of scanning the block in order, you could just do a binary search to find the transaction generating that output. This speeds up your algorithm from O(n/4) to O(log2 n). If you wanted to check that transaction too, you could repeat the process for it, until you get to a transaction from another block. At that point, you need to know the height in which that transaction appeared, which requires indexing.

Anyway, nice chatting with you all. I'm going to duck out of the thread for a bit and try to finish up some code without distractions.
[doublepost=1534715728][/doublepost]P.S.: @awemany What's wrong with the term "topological"? I think it describes exactly the thing you're trying to describe without any of the confusion of "natural".

awemany · Aug 19, 2018

@jtoomim: No I didn't read it before (sorry), though just answered. You listed my solution in your solution set.

On the term "Topological": Nothing is wrong with that, just that I like less syllables

Peter R · Aug 19, 2018

jtoomim said:
@Peter R: As for your poll, I suspect there is an optimal chunk size for parallel processing using the OTI algorithm, and I believe that optimal chunk size to not be 160 GB. There are relatively few output cancellations within a single block (< 10%?), but there are far more between.

This thought experiment is helpful and seeing the downside of the lexical approach. I'm quite confident the block with the current ordering has the advantage.

It may be true that today there are few cancellations within a single block, but when blocks are 100 GB, maybe there will be significantly more (think fast micro payments systems or automated trading algos, etc).

The bottom line is that lexical ordering means trade-offs. We need to better understand those trade-offs before we make a change the alters the nature of the blockchain (it would no longer have the property that @awemany pointed out that any contiguous chunk of transactions in the blockchain also forms a poset).

jtoomim · Aug 20, 2018

https://github.com/Bitcoin-ABC/bitcoin-abc/pull/244 has pre-fork OTI validation. No parallelization yet, though. It will still be interesting to see if it affects performance at all.

Trying to decide if I should do a parallel-with-finegrained-locks version first or just go for broke with a concurrent hashtable. I'm leaning towards doing a locked parallel version first to ensure correctness before hacking CCoinsViewCache all to hell.

Peter R · Aug 20, 2018

@awemany

I re-read the Vermorel CTOR paper again this evening. I know you pointed out the problems with this section, but I couldn't resist posting these three paragraphs. It gives the impression that the transactions you admit into mempool need to be topologically sorted to create a block candidate and that this is some big problem. But the only way they can enter your mempool in the first place is in a topological order. Furthermore, the paper makes it seem like appending new transactions to your block candidate as they come in while maintaining the topological order is "an open research problem." But you can just tack the new transactions onto the end of your block. The order is already topological.

They say you can't have your cake and eat it too. This is true, but you also can't eat your cake before you have it. If someone ate his cake, causality requires that he must have previously had his cake. There is no risk that the "eating" and the "having" occurred in the wrong order.

Or consider one's family tree. It is a directed acyclic graph not unlike transactions in a block. The order the people in one's family tree were born is a topological order. There is no way it can be non-topological because someone cannot have a child until after they themselves were born. When your children have children, you don't need to worry that a child might be born in a non-topological order. If things like that were possible, we'd have bigger problems like your nephew becoming your great uncle and you becoming your own grandfather!

"Son, make sure you practice safe sex. If you have a child in a non-topological order it could lead to a logical paradox that destroys the universe."

Tom Zander · Aug 20, 2018

jtoomim said:
@Tom Zander: I took a look at your Flowee block validation code. Correct me if I'm wrong in my description of how it works:

First, you scan blocks to find blocks that contain wallet transactions. You ignore all other blocks or process them only cursorily. You do not build a UTXO database for all transactions, as you're not making a full node.

I'd be glad to explain that C++ code to you at one point if we meet in real life. Bit tricky on a forum.
Bottom line is that that code you linked to doesn't have a wallet, it is the test application to test the transaction ordering in a multi-threaded utxo database. It does build a UTXO database of all transactions.

jtoomim said:
Instead of scanning the block in order, you could just do a binary search to find the transaction generating that output.

This misunderstands the nature of Bitcoin Blocks. They don't come indexed. You can't jump to the 10th or 100th transaction without at least once serially walking through the entire block. The various var-length integers and generally speaking the variable length of each transaction is the cause of that.
Based on this simple misunderstanding your approach would already not work (faster).

jtoomim · Aug 20, 2018

I see, I misread what you were testing for on your initial scan. And yeah, you can't do a binary search without an index, duh. Thanks for pointing that out.

But since you're going to be processing all transactions anyway, I don't see why the outputs-then-inputs algorithm can't work for you. Are you concerned that doing two passes will be slower, or is it something else?

Tom Zander · Aug 20, 2018

Peter R said:
I re-read the Vermorel CTOR paper again this evening.

I found it hard to get through because of the many false assumptions it makes. The amount of mind-twisting it goes through to get to its conclusion.
The authors of that paper have forever been dragged through the mud. I know they are being paid for it, but this is their personal names there. Their employer isn't even mentioned.

jtoomim said:
But since you're going to be processing all transactions anyway, I don't see why the outputs-then-inputs algorithm can't work for you. Are you concerned that doing two passes will be slower, or is it something else?

The first pass of inserting all will be practically all serial and you lose all advantages of parallel processing.

If you fall to understand why inserting a huge amount of items into a database will end up being all serial, try it or ask any database or even data-structure expert. Especially mention that not all items will fit in memory.

awemany · Aug 20, 2018

@Mengerian, @Tom Zander , @deadalnix , @Peter R :

I think one should at least make the effort to try and look how Canonical ordering compatible with current validation rules would look like. I have therefore spent some effort to implement that:

https://github.com/BitcoinUnlimited/BitcoinUnlimited/pull/1275
[doublepost=1534764164][/doublepost]

The first pass of inserting all will be practically all serial and you lose all advantages of parallel processing.

If you fall to understand why inserting a huge amount of items into a database will end up being all serial, try it or ask any database or even data-structure expert. Especially mention that not all items will fit in memory.

I think we want to move to shards? They would benefit from parallel insertion. I have no objection against shards (Though I DO have an objection against any attempts to "shard geographically" for political reasons. Are there plans to go this route?).

However, I still fail to see the advantage of lexical over partial order for this.

The about only advantage I can see is that transaction routing becomes a bit easier for the store the new UTXOs part. But all the UTXO consumption is as chaotic as before - and you give up the advantage of partial UTXO cancellation in your workers.

And this is all likely going to be io-bound anyways ... and partial cancellation plays out its advantage especially in this scenario.

Tom Zander · Aug 20, 2018

UTXO sharding is an interesting concept that in my opinion is still a couple of product-iterations (or optimization iterations) away. There are much more obvious optimizations to be made today and those will take our focus for some time.
At this point I don't see any focus on sharding being helpful. Maybe in the future that will change. Until then I'd follow the prematur -optimization-is-the-root-of-all-evil rule.

As long as most nodes still use levelDB for the UTXO, with all the changes from a block (both out outputs and new outputs) being stored in memory until the block finishes validation we should likely not look at things like sharding

This approach makes sure that when you have too many inputs and/or outputs your node will run out of memory.

ps. completely agreed on geographically. The assumption I made was that it would be based on the first byte of txid.

micropresident · Aug 20, 2018

Block construction is done by feerate.

Peter R said:
@awemany

I re-read the Vermorel CTOR paper again this evening. I know you pointed out the problems with this section, but I couldn't resist posting these three paragraphs. It gives the impression that the transactions you admit into mempool need to be topologically sorted to create a block candidate and that this is some big problem. But the only way they can enter your mempool in the first place is in a topological order. Furthermore, the paper makes it seem like appending new transactions to your block candidate as they come in while maintaining the topological order is "an open research problem." But you can just tack the new transactions onto the end of your block. The order is already topological.

They say you can't have your cake and eat it too. This is true, but you also can't eat your cake before you have it. If someone ate his cake, causality requires that he must have previously had his cake. There is no risk that the "eating" and the "having" occurred in the wrong order.

Or consider one's family tree. It is a directed acyclic graph not unlike transactions in a block. The order the people in one's family tree were born is a topological order. There is no way it can be non-topological because someone cannot have a child until after they themselves were born. When your children have children, you don't need to worry that a child might be born in a non-topological order. If things like that were possible, we'd have bigger problems like your nephew becoming your great uncle and you becoming your own grandfather!

"Son, make sure you practice safe sex. If you have a child in a non-topological order it could lead to a logical paradox that destroys the universe."

Peter R · Aug 20, 2018

Thanks for joining the discussion.

"Block construction is done by feerate." -- @micropresident

There is no requirement that block construction be done by fee rate. It feels to me as though you're optimizing for a future where you believe certain assumptions will hold, but when the future arrives it may look very different from what you expected. In such a future, those optimizations could hurt rather than help.

This comment of yours is a great example. If you imagine a future where block space is constrained by the protocol, then it absolutely does make sense to eject low-fee transactions in order to make room for higher-fee paying transactions. But if you imagine a future where the block size limit is maintained far above demand, every transaction that enters your mempool (b/c is pays a sufficient fee and meets your other requirements) can immediately go into your block candidate without hurting your bottom line. There is no need to eject lower-fee transactions because you earn more profit by including those transactions than by not including them.

In fairness, I can make two counterpoints to my argument above:

1. The curve that describes the marginal cost to include another transaction is probably a slightly concave-up function of block size, and so you could probably make an argument that while 1 sat/byte transactions are profitable for 1 MB, it requires 1.1 sat / byte for 10 MB. [I think this effect will be negligible until orphan rates are >10%, which I doubt we'll ever see.]

2. A large miner may intentionally delay very low fee transactions in order to create a market for timely confirmation, and such ordering by fee rate might be useful. [To me it would be a good thing if it costs them a bit to pick and choose like this, however].

The way I see it, we really have no idea yet. We should not make a fundamental change to the structure of the blockchain until we have solid evidence that it's a truly helpful change.

freetrader · Aug 20, 2018

Everyone seems to be playing hardball now.
ABC 0.18.0 with November upgrades released and includes CTOR + OP_CHECKDATASIG, but excludes the unfinished nChain/CG opcodes.
So this is confirming trajectory to split, I guess Vin Armani is one step closer to winning his $10 bet.

I'm displeased with ABC pushing this out while analysis of parallelization impacts is still pending completion (and ABC themselves did not publish such thorough analysis). This isn't the right way to treat the community imo.

Remains to be seen who adopts the new client.

SPAZTEEQ · Aug 20, 2018

I would suggest this is in keeping with past behaviour. I understand to be displeased, but since a level of dispute is inevitable, my feeling is the sooner the better, get it done now. Look at it as they are daring others to split the chain, that it is a calculated move. This has been brewing since before the last fork. Nobody was prepared to split yet, and now we are here.

The discourse here is amazing, passionate. I think there is enough smarts to make this thing work as intended. I refer to the broader picture (esp. miners). Well anyways, here's hoping...

AdrianX · Aug 20, 2018

Nobody was prepared to split yet, and now we are here.

Who wants to split the BCH chain and over what? ABC has a president of pushing their way or the high way, it has not mattered until now, do they want to split?

I think someone may want leverage influence and split the chain (and it's not CSW), but I don't think the chain wants to split over rather pointless differences.

Gold collapsing. Bitcoin UP.

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Well-Known Member

Well-Known Member

Active Member

Well-Known Member

Active Member

Active Member

Active Member

Well-Known Member

Active Member

New Member

Well-Known Member

Moderator

Member

Well-Known Member