BUIP033: (passed) Parallel Validation

Peter Tschipper

Active Member
Jan 8, 2016
BUIP033: Parallel Validation
Proposer: Peter Tschipper
Submitted on: 10/22/2016

Summary:

Parallel Validation is essentially a simple concept. Rather than validating each block within the main processing thread, we instead create a separate thread to do the block validation. If more than one block arrives to be processed, we create yet another thread. There are currently up to 4 parallel block processing threads available, making a big-block DDOS attack impossible. Furthermore, if an attacker were somehow able to jam all 4 processing threads and another block arrived, the processing of the largest block would be interrupted, allowing the smaller block to proceed, unless the larger block has the most proof of work. So in such a case only the smallest blocks and the block with the most proof of work are allowed to finish.

If multiple blocks are processing at the same time, then when one of them wins the race to complete, the other processing threads are interrupted and the winner updates the UTXO and advances the chain tip. The interrupted blocks are still stored on disk in the event of a re-org.
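To illustrate the preemption rule described above, here is a minimal standalone sketch; the types and the `SelectThreadToInterrupt()` helper are hypothetical stand-ins, not the actual Bitcoin Unlimited code:

```cpp
// Sketch: when all validation slots are busy and a new block arrives,
// pick the largest in-flight block to interrupt -- unless it carries
// the most accumulated proof of work, in which case nothing is stopped.
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <vector>

struct RunningValidation {
    uint64_t nBlockSize;            // serialized size of the block being checked
    uint64_t nChainWork;            // simplified: accumulated work as an integer
    std::atomic<bool> fQuit{false}; // set to ask the validation thread to abort
};

// Returns the validation to interrupt, or nullptr if none may be stopped.
RunningValidation* SelectThreadToInterrupt(const std::vector<RunningValidation*>& vRunning)
{
    uint64_t nMostWork = 0;
    RunningValidation* pLargest = nullptr;
    for (RunningValidation* p : vRunning) {
        nMostWork = std::max(nMostWork, p->nChainWork);
        if (!pLargest || p->nBlockSize > pLargest->nBlockSize)
            pLargest = p;
    }
    // Never interrupt the block with the most proof of work.
    if (pLargest && pLargest->nChainWork >= nMostWork)
        return nullptr;
    return pLargest;
}
```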


Design:

The following describes the internal mechanisms used to achieve parallel validation.

2a) Script Check Queues: A total of four script check queues are created, each with its own thread group, which are used to validate signatures. Each new block that arrives is assigned one of those queues during the validation process.
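As a rough sketch of this arrangement (the class and helper below are illustrative stand-ins for the BU internals), assuming four queues that each own a small pool of worker threads:

```cpp
#include <atomic>
#include <thread>
#include <vector>

struct CScriptCheckQueueSketch {
    std::vector<std::thread> threads;  // this queue's own thread group
    std::atomic<bool> fInUse{false};   // currently claimed by a validating block?
    explicit CScriptCheckQueueSketch(int nWorkers) {
        for (int i = 0; i < nWorkers; ++i)
            threads.emplace_back([] { /* pull and verify signature checks */ });
    }
    ~CScriptCheckQueueSketch() {
        for (std::thread& t : threads) t.join();
    }
};

static std::vector<CScriptCheckQueueSketch*> vQueues; // sized to 4 at startup

// Called when a new block begins validation: claim a free queue.
CScriptCheckQueueSketch* ClaimScriptCheckQueue()
{
    for (CScriptCheckQueueSketch* q : vQueues) {
        bool fExpected = false;
        if (q->fInUse.compare_exchange_strong(fExpected, true))
            return q; // this block now owns the queue for its signature checks
    }
    return nullptr; // all four queues busy
}
```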

2b) Semaphores: There are two semaphores for managing block validations: one sized at 4 for new blocks, and one sized at 32 for IBD. Although there are only 4 script queues available, we still allow 32 IBD threads to run concurrently, since that gives the main processing thread a chance to retrieve other blocks while the current ones complete their validation.
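A minimal sketch of the two semaphores, assuming a simple counting semaphore (BU has its own semaphore class; the one here is a stand-in):

```cpp
#include <condition_variable>
#include <mutex>

class CountingSemaphoreSketch {
    std::mutex m;
    std::condition_variable cv;
    int nCount;
public:
    explicit CountingSemaphoreSketch(int n) : nCount(n) {}
    void wait() { // block until a validation slot is free, then take it
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return nCount > 0; });
        --nCount;
    }
    void post() { // release the slot when validation ends
        std::lock_guard<std::mutex> lock(m);
        ++nCount;
        cv.notify_one();
    }
};

static CountingSemaphoreSketch semNewBlocks(4); // concurrent new-block validations
static CountingSemaphoreSketch semIBD(32);      // concurrent IBD validations
```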

2c) Locking: The parallel processing is made possible first by having separate threads for validation, but then largely by managing the `cs_main` locks in the `ConnectBlock()` function, found in the `src/main.cpp` source file. During `CheckInputs()` we do not have to hold the lock on `cs_main`, which allows other competing threads to continue their own validation; the reason is that each validation thread uses its own view of the UTXO, and the script check queues have their own internal locking mechanism. Furthermore, when it comes time to wait for the script threads to finish, we also do not need to hold the `cs_main` locks during the `control.Wait()`.
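The locking pattern is roughly the following (`cs_main`, `CheckInputs()` and `control.Wait()` are the real names; everything else in this sketch is simplified scaffolding):

```cpp
#include <mutex>

std::recursive_mutex cs_main_sketch; // stand-in for the global cs_main lock

bool ConnectBlockSketch(/* block, thread-private UTXO view, script queue control */)
{
    std::unique_lock<std::recursive_mutex> lock(cs_main_sketch);

    // ... setup that still requires the lock ...

    lock.unlock(); // release cs_main: CheckInputs() runs against this
                   // thread's private UTXO view, and the script check
                   // queue has its own internal locking
    // bool fOk = CheckInputs(...);
    // control.Wait(); // join the signature check results, also lock-free

    lock.lock();   // re-acquire before touching shared chain state

    // ... the winning thread flushes its view and advances the tip ...
    return true;
}
```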

Although this unlocking and relocking of `cs_main` causes some overhead, it is not invoked during the mining process, but only when we receive a new block from an external source which needs validation. It is designed primarily to prevent the big block DDOS attack from jamming the main processing thread.

2d) Interrupts: If multiple blocks are competing to win the validation race, or if all 4 script queues are in use and another new block arrives, there are several mechanisms we use to stop one or all of the competing threads. We first send a message to quit the script threads, which prevents them from completing their verification, and then issue a standard thread interrupt. Also, if one block has finished and advanced the tip, the other concurrent threads will see that the tip has advanced and will exit their validation threads.
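Sketched below with illustrative names (only the general shape follows the description above):

```cpp
#include <atomic>

std::atomic<int> nChainHeightSketch{0}; // stand-in for the shared tip height

struct ValidationCtxSketch {
    std::atomic<bool> fQuitScriptThreads{false}; // polled by script workers
    int nBlockHeight = 0;                        // height this block would reach
};

// First mechanism: tell the losing block's script threads to stop; in BU
// this is followed by a standard thread interrupt on the worker thread.
void InterruptValidation(ValidationCtxSketch& ctx)
{
    ctx.fQuitScriptThreads = true;
}

// Second mechanism: a losing thread notices the winner already advanced
// the tip to (or past) its own height and exits on its own.
bool TipAdvancedPastUs(const ValidationCtxSketch& ctx)
{
    return nChainHeightSketch.load() >= ctx.nBlockHeight;
}
```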

2e) Temp view cache: Each processing thread has its own temporary view of the UTXO which it can use to pre-validate the inputs and ensure that the block is valid (as each input is checked, the UTXO must be updated before the next input can be checked, because the current input often depends on some previous input in the same block). When and if a processing thread wins the validation race, it flushes its temporary, now updated view of the UTXO to the base view, which then updates the UTXO on disk. This is key to having several threads of validation running concurrently, since we cannot have multiple threads all updating the same UTXO base view at the same time.
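A toy model of the per-thread view (loosely modelled on Bitcoin's `CCoinsViewCache`; the types and the outpoint-to-value map here are purely illustrative):

```cpp
#include <cstdint>
#include <map>
#include <string>

struct BaseViewSketch {
    std::map<std::string, int64_t> coins; // outpoint -> value (simplified)
    void ApplyDelta(const std::map<std::string, int64_t>& delta) {
        for (const auto& kv : delta) coins[kv.first] = kv.second;
    }
};

struct ThreadLocalViewSketch {
    BaseViewSketch* pBase;                // the shared base view (pcoinsTip)
    std::map<std::string, int64_t> delta; // this thread's private changes

    // Each input is checked against base+delta, and each spend updates
    // delta so later inputs in the same block can see it.
    // Only the thread that wins the race ever calls Flush().
    void Flush() {
        pBase->ApplyDelta(delta);
        delta.clear();
    }
};
```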

2f) nSequenceId: In order to have the correct `pindexMostWork` we must update the `nSequenceId` an additional time after the winning block updates the UTXO and advances the chain tip. We can no longer rely only on the `pindexMostWork` returned from the `CBlockIndexWorkComparator()`, as was previously the case, because it may not necessarily point to the winning block. So we swap the `nSequenceId` values between the winning and losing blocks such that the winning block has the lowest `nSequenceId` and the losing blocks' values are bumped up by one, while at the same time keeping their relative ordering.
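A sketch of the swap (`nSequenceId` is the real field; the struct and helper are illustrative):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct CBlockIndexSketch { int32_t nSequenceId; };

void PromoteWinner(CBlockIndexSketch* pWinner,
                   std::vector<CBlockIndexSketch*>& vLosers)
{
    int32_t nLowest = pWinner->nSequenceId;
    for (const CBlockIndexSketch* p : vLosers)
        nLowest = std::min(nLowest, p->nSequenceId);

    // Losers that sorted ahead of the winner are bumped up by one,
    // which keeps their relative ordering intact.
    for (CBlockIndexSketch* p : vLosers)
        if (p->nSequenceId < pWinner->nSequenceId)
            p->nSequenceId += 1;

    pWinner->nSequenceId = nLowest; // the winner now sorts first
}
```

For example, if the winner held id 7 and the losers held 3, 5 and 9, the losers become 4, 6 and 9 and the winner takes 3.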


IBD and new blocks:

Parallel Validation is only in effect when new blocks arrive. In other words, the `fIsChainNearlySyncd` flag must be true, indicating that IBD has been completed.

Although parallel validation is not in operation during IBD, the blocks that arrive and need validating are still processed in a separate thread. This frees up the main processing thread and allows requests for more blocks to be made while the other threads continue processing.
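A minimal sketch combining these two points (`fIsChainNearlySyncd` is the real flag; the handler itself is hypothetical):

```cpp
#include <thread>

bool fIsChainNearlySyncd = false; // stand-in for the real global flag

void HandleIncomingBlockSketch(/* CBlock& block */)
{
    if (fIsChainNearlySyncd) {
        // post-IBD: full parallel validation, up to 4 concurrent blocks
    } else {
        // IBD: still validate off the main thread so it can keep
        // requesting more blocks, but without the PV race machinery
        std::thread([] { /* validate the block */ }).detach();
    }
}
```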


How is mining affected:

Mining is not affected by Parallel Validation. When new blocks are created locally they bypass parallel validation; in other words, the `cs_main` locks are not repeatedly unlocked and relocked, allowing the validation process to complete as quickly as possible. Whether parallel validation is invoked depends on the boolean `fParallel`: when set to `true`, parallel validation is in effect, and when `false`, as in the case of generating a new block, it is turned off.

NOTE: Miners will still use parallel validation if a block arrives from an external source. It is only turned off when validating a block they mine themselves.
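To make the gating concrete, a simplified sketch (the function and its signature are illustrative; in BU the `fParallel` boolean is threaded through the actual validation calls):

```cpp
// fParallel gates the lock-dropping behaviour described above.
bool ProcessBlockSketch(bool fMinedLocally)
{
    bool fParallel = !fMinedLocally; // locally mined blocks bypass PV
    if (fParallel) {
        // external block: validate in its own thread and release/retake
        // cs_main around CheckInputs() and control.Wait()
    } else {
        // our own block: hold cs_main throughout so validation completes
        // as quickly as possible
    }
    return true;
}
```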
 

Inca

Moderator
Staff member
Aug 28, 2015
I suppose the thing about bitcoin block validation is that the 'account state' is contained in the UTXO set up to and including the previous block, which prevents multiple blocks from being processed in parallel to speed up synchronisation?

What you are proposing, enabling a thread for each block instance which arrives, is an interesting idea, but how long does it take for the main loop to process a block (in a blocking manner)? What benefit is there to processing 4 blocks simultaneously or milliseconds apart if only the first or smallest will win? Is it right that a small block with a later timestamp should beat a much larger block which arrived earlier with a longer block validation time?

As an anti-DDOS measure I think it has merit.

Peter
 

priestc

Member
Nov 19, 2015
Parallel validation in threads is a good idea, but I don't think doing it by block is the best way. If you have 8 CPUs there should be 8 threads, each validating part of a block. In other words, it shouldn't take multiple blocks coming in for parallel validation to kick in. All blocks should take advantage of parallelization.

Also, the next step after thread based parallelization should be machine based parallelization, which should *really* allow for massive scaling gains. With threads, you're pretty much limited by how many CPUs the machine has. With machine based parallelization, you can hook multiple machines together to make a "validating cluster", which can have an unlimited amount of computing power.

Imagine firing up a 1000-machine validating cluster, each machine with 8 CPUs... You could validate an 8GB block by splitting it into 1000 pieces, each node getting 8MB. Then each CPU gets its own thread and does 1MB worth of transactions. It would take the same amount of time to validate an 8GB block as it does to validate a 1MB block today.
 

freetrader

Moderator
Staff member
Dec 16, 2015
Is it a consensus rule that transactions in a block must be validated in sequence (because of possible dependencies between transactions in said block)?

If so, splitting the block up for parallel validation might need to take into account if there are any such tx chains in the block...
 

Peter Tschipper

Active Member
Jan 8, 2016
@priestc @Christoph Bergmann Due to the nature of the UTXO, a block cannot be split up and validated in parallel the way you mention. The purpose of doing parallel block validations for new blocks is to allow us to take the block size cap off and get to much larger blocks without the possibility of being attacked by a rogue miner. There is also the additional benefit of taking the block validation out of the main processing thread; currently in bitcoin, when a block arrives, everything must stop until the block is fully processed. That doesn't happen with parallel validation, and the main thread can continue to process transactions while the block is being validated.

We can then also leverage this for transactions and do parallel transaction validation, which prevents queues from backing up if we receive a long-to-validate transaction or a group of them. This will be needed in the future if we ever want to get to much higher transaction rates...

Keep in mind, though, that the central purpose of this right now is to get to bigger blocks: to give the miners confidence that the system cannot be disrupted.

As for the 4-block scenario: that is something that will never happen but has to be in place as a stopgap. It would be impossible for someone to create 4 attack blocks at the same time IMO, but if they could, the measures are there to prevent disruption. Keep in mind that with XTHIN and XVAL, block validations are typically less than 100 ms...it's only the attack blocks we have to concern ourselves with, and it's very unlikely we'll see any new blocks validating in parallel at all under normal conditions.
 

priestc

Member
Nov 19, 2015
> Is it a consensus rule that transactions in a block must be validated in sequence (because of possible dependencies between transactions in said block)?
>
> If so, splitting the block up for parallel validation might need to take into account if there are any such tx chains in the block...
You don't do *all* of the validation in the threads, you only do the CPU-intensive parts, which is validating that the signatures are correct. Once all signatures are found to be valid, the block is put back together and sent to the main thread, where each of the UTXOs is checked for validity, which is the part that needs to be done in order.
 

Peter Tschipper

Active Member
Jan 8, 2016
@priestc Yes, we only do the CPU-intensive parts in parallel. We already do the sigs in their own parallel threads as well (this is the same way it works today). So to visualize that: we have parallel blocks validating, and each parallel block is also using a parallel thread group to take care of the sig checks. The way it's coded now, first each block checks its inputs; each block has its own temporary view of the UTXO to play with, so at the point where we check inputs we don't have to hold the cs_main lock anymore. Then, when the inputs have been checked, we wait for the sig check threads to finish; again, we don't have to hold the cs_main lock for that. So as we're checking inputs, the sigs are also getting validated in a separate thread group, the number of threads for which depends on how many CPUs you're running.
 

priestc

Member
Nov 19, 2015
If you already have the CPU intensive stuff in their own thread, then what is the point of this BIP? Adding more threads isn't going to squeeze more performance out of a CPU that is already being saturated...
 
Christoph Bergmann

@Peter Tschipper

So the aim of this BUIP is

1) "doing parallel block validations for new blocks" to prevent being attacked by a rouge miner. Can you explain?

2) "taking the block validation out of the main procesing thead". This enabled the main thread of Bitcoin Software to continue " process transactions while the block is being validated." Is it the current practice that Bitcoin stopp everything untill it has validated a new block? Does this BUIP not decrease security if new transactions are processed before a block is validated (the validation doesnt know the new blockchain?)
 

Peter Tschipper

Active Member
Jan 8, 2016
@Christoph Bergmann

To answer your questions:

1) The large-block "rogue miner" attack is really the main reason we have a block size cap. Neutralizing this attack is a big step toward finally taking the cap off, or at least, through Emergent Consensus, moving toward much larger blocks without the fear that someone might jam the network with a large or long-to-validate block.

2) Yes, the current practice in bitcoin is to stop all processing until a new block is validated, and no, new transactions have no impact on the currently validating block, since transactions only reside in the memory pool and do not affect the blockchain.
 
Christoph Bergmann

Ok, thx. Just to make sure:

1.) It is about the "monster transaction" attack (building a transaction that needs minutes or hours to be validated)? And parallel validation enables nodes to continue their work while hanging on this big transaction?

Maybe we should in future do an AMA before we vote on a BUIP ... especially if the BUIP is technically more complicated. After these answers I think I know more or less the purpose of this BUIP, and I completely support it.
 

priestc

Member
Nov 19, 2015
I still don't understand what this change actually does. Validation is a CPU limited operation. If there are already threads keeping all cores busy during validation, then what is the point of this change?

Maybe the threads make it so that you can download a new block while another block is being processed? If that is the case, then it should be called "parallel block downloading", not "parallel block validation".

By the way, if you have multiple blocks validating in parallel at the same time, all of the blocks will validate more slowly because they have to contend for the same CPU resources.

Imagine if an attacker sends a node 4 huge blocks at the same time, each with the first 99% being valid and the last 1% being invalid. It wouldn't matter whether these blocks were validated in parallel or in sequence; the amount of time it takes to determine they are invalid would be the same.

I think the mistake you are making is assuming that making 4 threads will multiply your CPU power by 4x. Regardless of how many threads you make, CPU resources are limited. The only time making threads increases CPU throughput is when they allow one thread to read from disk (or network) while another uses the CPU.

Also, please explain how this solves the "monster tx problem". Again, you can't simply launch a new thread expecting it to multiply CPU resources.
 

Peter Tschipper

Active Member
Jan 8, 2016
@priestc I think there's some confusion between doing parallel block validation and doing parallel validation within a block. Parallel validation within the block, such as with the sigs or possibly the inputs, is for the purpose of speeding up the validation process...I think that's clear and I think you wouldn't argue with that. However, validating blocks in parallel serves an entirely different purpose...and yes, you are right, if two blocks validate in parallel then each will take longer to validate.

The problem in Bitcoin right now is that when a block arrives and starts validating, everything basically stops until that block has finished. No txns, no inv messages, no other blocks can get through. So this is where the attack vector opens up...if someone can jam the processing thread with a huge block or a very long-to-validate block, then there is no way we can increase the block size without opening up this vector to abuse. So by allowing blocks (and I mean only newly mined blocks, this isn't for IBD) to validate in parallel, we neutralize this attack...if someone tries to jam the processing thread with a big block, well then fine, go ahead and try, because when a competing block shows up it will validate first and you will only end up getting orphaned. Also, keep in mind, validating blocks in parallel would be a very rare event. Under normal conditions we'll probably never see two blocks show up at the same time. So in the end, by having this parallel process in place, we can safely raise the block size cap to any size without the threat of this attack succeeding.
 

priestc

Member
Nov 19, 2015
I didn't realize the software can't service other inv messages while a block is being downloaded/validated. I am only familiar with the protocol, not the implementation. Your BUIP should be clearer about this. The mechanism for allowing a smaller block to "beat" a bigger block is essential to the goal of having unconstrained block size. Without parallel validation, there is no mechanism for "too big" blocks to be orphaned, and a bias towards blocks that may be too big may exist. With parallel validation, no such bias will occur.

As a side note, I think BIPs/BUIPs should be written for a non-technical audience. I think your BUIP is too formally written. I say save the formality for the actual code. The BUIP should be written in plain English so that everyone, technical and non-technical, can understand what the change does.
 

freetrader

Moderator
Staff member
Dec 16, 2015
> The BUIP should be written in plain english so that everyone, technical and non-technical can understand what the change does.

I think a BUIP should consist of several presentation layers, the first one being what you describe above.

Lower-level layers should get into the specifics in detail though.
 

priestc

Member
Nov 19, 2015
The only people who care about the technical details are the ones who have enough technical knowledge to look at the actual code and understand the BIP on that level. If you can read the code, you don't care what the BIP says (other than as a high-level starting point for understanding the code). There is no reason to describe the contents of the code in the body of the BIP; that serves no purpose but to bloat the text and potentially push out the words that better describe the intent of the change.
 

lunar

Well-Known Member
Aug 28, 2015
> The only people who care about the technical details are the ones who have enough technical knowledge to look at the actual code and understand the BIP on that level.
Disagree. Any parameter that changes or has the potential to change the economic incentives underpinning the network should be highlighted and made clear for open discussion.

*obviously that's not so important in this case, but generally speaking