The problem is that all peers need to use the same parameters;
otherwise the attacker just claims he runs the lowest-end nodes.
Also, nonce protection consumes CPU and no bandwidth, and that is what creates the leverage: the attacker needs DIFF times more calculations than the receiver.
I wrote an encrypted UDP network with onion routing and PoW protection, so this brings back memories. The CPU really is a lot faster than UDP packets, so you can do a surprising amount of PoW per packet. Since we are constrained on space and every byte counts, using a 32-bit nonce for the PoW would make sense. I used the SaM hash, but any pure-CPU algo would work.
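A minimal sketch of the 32-bit nonce idea, in Python. SHA-256 stands in for SaM here, and the names `pow_hash`, `meets_diff` and `find_nonce`, as well as the "hash below 2^64 / diff" acceptance rule, are assumptions made for illustration, not the scheme I actually shipped.

```python
import hashlib
import struct


def pow_hash(payload: bytes, nonce: int) -> int:
    """Hash the payload plus a 32-bit nonce (SHA-256 as a stand-in for SaM)."""
    digest = hashlib.sha256(payload + struct.pack("<I", nonce)).digest()
    # Use the first 8 bytes of the digest as a 64-bit integer.
    return int.from_bytes(digest[:8], "big")


def meets_diff(payload: bytes, nonce: int, diff: int) -> bool:
    """Receiver side: one hash. Roughly 1 in `diff` nonces passes."""
    return pow_hash(payload, nonce) < (1 << 64) // diff


def find_nonce(payload: bytes, diff: int) -> int:
    """Sender side: try nonces until one passes, about `diff` hashes on average."""
    for nonce in range(1 << 32):
        if meets_diff(payload, nonce, diff):
            return nonce
    raise ValueError("no 32-bit nonce satisfies this diff")
```

The leverage is visible in the two functions: `find_nonce` loops, `meets_diff` hashes once, and the nonce itself costs only 4 bytes on the wire.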
For the tuning, I would suggest a global network diff, which could even be set to OFF, or to the lowest setting where the nonce just acts as a checksum, so it won't even take any extra space assuming you are using a checksum already. Not sure if 16 bits is enough, but if 65536 iterations of the hash function is more than will ever be needed, then the checksum/nonce can be as small as 16 bits.
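One way to get a feel for whether 16 bits is enough: assuming the hash behaves like a uniform random function, each nonce passes independently with probability 1/diff, so the chance that no nonce in the whole space works can be estimated directly. The function name below is hypothetical.

```python
def prob_no_valid_nonce(diff: int, nonce_bits: int) -> float:
    """Chance that not a single nonce in a nonce_bits-wide space meets diff,
    assuming each nonce passes independently with probability 1/diff."""
    return (1.0 - 1.0 / diff) ** (2 ** nonce_bits)


if __name__ == "__main__":
    for diff in (256, 4096, 65536):
        print(diff, prob_no_valid_nonce(diff, 16))
```

Under that assumption a 16-bit nonce is comfortable only while the diff stays well below 65536. At diff = 65536 roughly a third of packets would have no valid nonce at all, and the sender would have to vary something else in the packet.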
On testnet the network diff could be adjusted manually to calibrate the values against a typical cross section of nodes. Once the magic diffs are calibrated, some sort of attack detection could be added to really boost the diff during an active attack. And actually, with a lookup table the nonce diff can be encoded in 8 bits; 256 different levels of protection seems plenty.
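The lookup table could be as simple as the sketch below. The geometric scale (diff doubling every 8 levels) is purely an assumption for illustration; the real entries would be the calibrated values from testnet.

```python
# Hypothetical table: one byte in the header selects one of 256 diffs.
# The diff doubles every 8 levels, so the table spans 1 up to just under 2**32.
DIFF_TABLE = [round(2 ** (level / 8)) for level in range(256)]


def diff_for_level(level: int) -> int:
    """Map the 8-bit level carried in the packet to the actual diff."""
    if not 0 <= level <= 255:
        raise ValueError("level must fit in one byte")
    return DIFF_TABLE[level]
```

With this scale the lowest few levels all map to a diff of 1, which is the "nonce just acts as a checksum" setting.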
A simple adaptive method that comes to mind would only require that each node be able to notice it is being attacked. Then it would increment the diff to the next leverage level. If it detects it is not being attacked, it downshifts all the way down to the global diff, one step at a time. Not sure how frequently it should change; the Goldilocks principle would apply: not too fast, not too slow.
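The stepping rule above can be sketched as a tiny controller. The class and method names are hypothetical, and how a node decides it is `under_attack` is left out on purpose, since that detection is the hard part.

```python
class DiffController:
    """Steps the local diff level up under attack and back down when quiet."""

    def __init__(self, global_level: int, max_level: int) -> None:
        if not 0 <= global_level <= max_level <= 255:
            raise ValueError("need 0 <= global_level <= max_level <= 255")
        self.global_level = global_level  # floor: the network-wide diff
        self.max_level = max_level        # ceiling: the global max
        self.level = global_level

    def update(self, under_attack: bool) -> int:
        """Call once per adjustment interval; returns the new level."""
        if under_attack:
            self.level = min(self.level + 1, self.max_level)
        else:
            self.level = max(self.level - 1, self.global_level)
        return self.level
```

How often `update` gets called is the "not too fast, not too slow" knob.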
Now, dynamically changing the nonce diff breaks "contact" with all the other nodes, but if another bit or two is allocated in the packet, a node can use it to signal the other nodes that it shifted its nonce diff. Ah, yes, my problem was that I changed keypairs, so I lost all contact in some situations. Here it is plaintext, so the diff used can just be another part of the packet header.
OK, here is a quite simple protocol:
0. Get global minimum/maximum nonce diff (probably hardcoded per release, but need to be careful about backwards and forwards compatibility)
sending packets
1. add nonce and nonce diff to each packet
2. if being attacked, increase nonce diff
3. if not being attacked, decrease nonce diff
receiving
1. verify nonce satisfies diff and that diff is above global minimum and below max
2. average diff changes from peers and update attack status
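The steps above can be sketched as follows. Everything concrete here is an assumption for illustration: the 5-byte header layout (1-byte diff level, 4-byte nonce), the power-of-two level scale, the min/max constants, SHA-256 as the hash, and the averaging rule with its `margin` parameter.

```python
import hashlib
import struct

GLOBAL_MIN_LEVEL = 0    # hypothetical per-release constants
GLOBAL_MAX_LEVEL = 20
HEADER = struct.Struct("<BI")  # 1-byte diff level, 4-byte nonce


def _passes(level: int, nonce: int, payload: bytes) -> bool:
    """True if the hash of header plus payload meets a diff of 2**level."""
    digest = hashlib.sha256(HEADER.pack(level, nonce) + payload).digest()
    return int.from_bytes(digest[:8], "big") < (1 << 64) >> level


def build_packet(payload: bytes, level: int) -> bytes:
    """Sending step 1: attach the diff level and a nonce that satisfies it."""
    for nonce in range(1 << 32):
        if _passes(level, nonce, payload):
            return HEADER.pack(level, nonce) + payload
    raise ValueError("no 32-bit nonce satisfies this level")


def verify_packet(packet: bytes):
    """Receiving step 1: return (level, payload), or None if rejected."""
    if len(packet) < HEADER.size:
        return None
    level, nonce = HEADER.unpack_from(packet)
    if not GLOBAL_MIN_LEVEL <= level <= GLOBAL_MAX_LEVEL:
        return None
    payload = packet[HEADER.size:]
    if not _passes(level, nonce, payload):
        return None
    return level, payload


def peers_signal_attack(own_level: int, peer_levels, margin: float = 1.0) -> bool:
    """Receiving step 2: a raised average peer level counts as an attack signal."""
    if not peer_levels:
        return False
    return sum(peer_levels) / len(peer_levels) >= own_level + margin
```

Sending steps 2 and 3 would feed the result of `peers_signal_attack`, together with local detection, into whatever raises or lowers the level passed to `build_packet`. The `margin` is one possible guard against a single noisy peer flipping the attack status.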
Need to be careful about false positives, as the above will propagate attack status from one node across the network. But this allows the entire network to adaptively change the nonce diff to rather extreme levels if it is a sustained large-scale attack. The larger the attack, the higher the nonce diff goes, so it would ramp up the diff until the attacker can't keep up, regardless of how much CPU he gets (subject to global max limits).
Essentially, under attack all nodes go into full red alert and each packet might take 400% CPU to retain realtime bandwidth; it would also be possible to sacrifice some throughput to really boost the nonce diff.
If possible, such adaptive protection should be used for all p2p comms, as currently there is basically no protection against DoS attacks in any crypto. TCP connections offer some protection, but once the attacker takes the additional step of establishing a valid bitcoin peer, the TCP "protection" is nothing.
James
P.S. By occasionally posting info into the blockchain, all nodes using the nonce protection can coordinate without requiring that all such nodes have paths to each other. Of course, best would be if the miner put a byte into the coinbase script or any other place that has no effect on blockchain status.