@Mengerian: Understood. My point was rather that I think Bitcoin operating in a mode of constantly 'bootstrapping itself' (new nodes joining from the UTXO set plus partial history, and even the scenario of the initial history eventually being forgotten) is, as far as I can see, as good as the current scenario of full nodes having all transactions — maybe full nodes minus some small epsilon.
@awemany I see value in sharding, but I don't think this is the appropriate place to debate it anyway; this proposal isn't about sharding. I'll just point out that the assumption you make — "Dividing up the load means dividing up the transaction generator graph, and making the slices of your pie so that not too many edges are sticking out of your slice." — is overly restrictive. You can create sqrt(n) shards of sqrt(n) elements each, for a total of sqrt(n)^2 = n elements. Even if 100% of the edges stick out of their shard (the worst case), a node has to make sqrt(n) requests outside its shard and serve sqrt(n) requests from other shards. In such a situation the workload for one node scales as sqrt(n), and the number of nodes required to maintain a given level of security also scales as sqrt(n). There are ways to do better, but that should be enough to show there is value there.
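The counting argument above can be put into a toy model. This is just back-of-the-envelope arithmetic, not anything from the actual proposal; the function name and numbers are mine:

```python
import math

def worst_case_workload(n: int) -> int:
    """Toy model of the sqrt(n) sharding argument: n elements are split
    into sqrt(n) shards of sqrt(n) elements each. In the worst case every
    local element has an edge leaving the shard, so a node makes sqrt(n)
    outgoing requests and (by symmetry) serves sqrt(n) incoming ones.
    Total per-node work is O(sqrt(n)) instead of O(n)."""
    s = math.isqrt(n)   # shard size == number of shards == sqrt(n)
    outgoing = s        # worst case: one external request per local element
    incoming = s        # other shards querying us, symmetric by assumption
    return outgoing + incoming

# worst_case_workload(10_000) -> 200; worst_case_workload(1_000_000) -> 2000,
# i.e. a 100x increase in n costs only a 10x increase in per-node work.
```

So even under the 100%-edges-out assumption, per-node load grows with the square root of the total load rather than linearly.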
Point taken.
But I still don't see how you maintain a given level of security with just those sqrt(n) nodes. Each of those nodes also holds only sqrt(n) of the elements (a 1/sqrt(n) fraction of the data), and under the assumption that you want equal security (~ the same number of full nodes validating each piece of data), you'd need sqrt(n) times the number of nodes?
I guess I was arguing somewhat along the wrong lines for the right reason (and I have a feeling of déjà vu, as if we argued the same things here before): with sharding, only a fraction of the data is stored and transmitted per shard. But with random distribution across the shards, all I can assume is that any given sharded node stores a 1/(no-of-shards) fraction of my data.
With some kind of clustering, I'd know more about the shards I need, so that would change the full node security question.
For example, each person storing their own UTXOs (+ the Merkle branches leading to their coins) could be considered, in a way, an extreme form of sharding that exploits clustering.
But that would still mean that each transaction has to propagate through the full network so that everyone stays on the same page regarding the new UTXO set in the next block. However, I definitely see value in this (because it would remove the psychological roadblock to scaling that is ever-increasing storage requirements). But I can't see how you'd get there without still using the full bandwidth (and then some, to transmit the Merkle branches from the UTXO set to the other nodes).
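To make the "Merkle branches leading to their coins" idea concrete, here is a minimal sketch of building and verifying such a branch. The helper names are mine; the double SHA-256 and the duplicate-the-last-hash handling of odd levels follow Bitcoin's Merkle tree convention, but this is an illustration, not any proposal's actual scheme:

```python
import hashlib

def h(data: bytes) -> bytes:
    # Bitcoin-style double SHA-256.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root_and_branch(leaves, index):
    """Return (root, branch): the Merkle root over `leaves` and the
    list of sibling hashes linking leaves[index] up to that root.
    The branch has ~log2(n) hashes -- that's the per-coin overhead
    a self-storing user would keep and transmit."""
    level = [h(x) for x in leaves]
    branch = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])      # Bitcoin convention: duplicate last
        branch.append(level[index ^ 1])  # sibling of our current node
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch

def verify_branch(leaf, index, branch, root):
    """Recompute the root from a leaf and its sibling hashes."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# A user keeps only their own UTXO plus its branch, not the whole set:
utxos = [b"utxo-%d" % i for i in range(5)]
root, branch = merkle_root_and_branch(utxos, 3)
assert verify_branch(utxos[3], 3, branch, root)
```

The point about bandwidth stands either way: the branches are small per coin, but they are extra data on top of the transactions themselves.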
But maybe I am still too stupid here to see the scenario where you'd truly gain from random sharding. Can you make a realistic scenario?