BUIP017 (passed): Datastream Compression

Peter Tschipper

Active Member
Jan 8, 2016
254
357
BUIP017: Datastream Compression
Proposer: Peter Tschipper
Submitted on: 03/20/2016


Summary

Bandwidth is an issue for users who run nodes in regions where bandwidth is expensive and subject to caps; in addition, network latency in some regions can be quite high. By compressing data we can reduce daily bandwidth use significantly while at the same time speeding up the transmission of data throughout the network. This will encourage users to keep their nodes running longer and allow for more peer connections with less need for bandwidth throttling. It may also encourage people in areas of marginal internet connectivity to run nodes where in the past they could not, while improving the performance of downloading historical blocks, bloom filters, xthins and transactions. Furthermore, by bundling or concatenating transactions we can achieve much higher compression and transmission rates than would normally be the case.


Design

  • Using a service bit, compression and decompression can be turned on/off completely at both ends with a configuration setting. It is important to be able to easily turn off compression and decompression as a fallback mechanism. Using a service bit also makes the code fully compatible with any nodes that do not currently support compression, since the service bit must be turned on at both ends for compressed data to be sent and then decompressed at the other end. Nodes that do not present the correct service bit will therefore receive data in the standard uncompressed format.
  • All blocks will be compressed as they compress well enough, even below 500 bytes.
  • Transactions below 500 bytes do not compress well and will be sent uncompressed unless they can be further concatenated.
  • Multiple transaction requests that are in queue will be concatenated when possible. This further reduces bandwidth needs and speeds the transfer of large requests for many transactions, such as with MSG_FILTERED_BLOCK requests, or when the system gets busy and is flooded with transactions.
  • Although unlikely, if compression fails for any reason then blocks and transactions will be sent uncompressed. Therefore, even with compression turned on, a node has to be able to handle both compressed and uncompressed data from another peer.
  • By abstracting the compression and decompression code into class CDataStream, compression can in the future be applied easily to any additional datastream.
  • The LZO1x compression library does not compress to the extent that Zlib does, but it is clearly the better performer, particularly as data sizes grow, while still providing very good compression (see the sketch after this list).
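
To make the flow concrete, here is a minimal sketch of how a serialized payload might be run through LZO1X-1 before being written to the wire. The lzo1x_1_compress call and LZO1X_1_MEM_COMPRESS constant are from the LZO library; the surrounding helper and its name are hypothetical, not the BUIP's actual CDataStream code:

    // Hypothetical helper: compress a serialized payload with LZO1X-1.
    // Returns false (so the caller sends the original bytes) if compression
    // fails or does not actually shrink the data.
    #include <vector>
    #include <lzo/lzo1x.h>

    bool CompressPayload(const std::vector<unsigned char>& in,
                         std::vector<unsigned char>& out)
    {
        if (lzo_init() != LZO_E_OK)          // normally done once at startup
            return false;

        // Worst-case output size documented by LZO.
        out.resize(in.size() + in.size() / 16 + 64 + 3);

        std::vector<unsigned char> wrkmem(LZO1X_1_MEM_COMPRESS);
        lzo_uint outlen = out.size();
        int rc = lzo1x_1_compress(in.data(), in.size(),
                                  out.data(), &outlen, wrkmem.data());
        if (rc != LZO_E_OK || outlen >= in.size())
            return false;                    // fall back to uncompressed

        out.resize(outlen);
        return true;
    }

The size check at the end mirrors the fallback behaviour described above: if the compressed form is no smaller, the original bytes are sent instead.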

Test Results

Current test results show roughly 20% compression when blocks and transactions are compressed, and up to 27% when transactions are concatenated. In addition there is a decent performance improvement, particularly when there is latency on the network.

Compression results for blocks
range: data size range
ubytes: average size of uncompressed blocks
cbytes: average size of compressed blocks
ctime: average time to compress
dtime: average time to decompress
cmp%: compression percentage
datapoints: number of data points taken

range        ubytes   cbytes   ctime   dtime   cmp%    datapoints
0-250b          215      189   0.001   0.000   12.41        79498
250-500b        440      405   0.001   0.000    7.82        11903
500-1KB         762      702   0.001   0.000    7.83        10448
1KB-10KB       4166     3561   0.001   0.000   14.51        50572
10KB-100KB    40820    31597   0.005   0.001   22.59        75555
100KB-200KB  146238   106320   0.015   0.001   27.30        25024
200KB-300KB  242913   175482   0.025   0.002   27.76        20450
300KB-400KB  343430   251760   0.034   0.003   26.69         2069
400KB-500KB  457448   343495   0.045   0.004   24.91         1889
500KB-600KB  540736   424255   0.056   0.007   21.54           90
600KB-700KB  647851   506888   0.063   0.007   21.76           59
700KB-800KB  749513   586551   0.073   0.007   21.74           48
800KB-900KB  859439   652166   0.086   0.008   24.12           39
900KB-1MB    952333   725191   0.089   0.009   23.85           78

Compression results for transactions without concatenation
range: transaction size range
ubytes: average size of uncompressed transactions
cbytes: average size of compressed transactions
cmp%: compression percent
datapoints: number of data points taken

range        ubytes   cbytes    cmp%   datapoints
0-250b          220      227   -3.16        23780
250-500b        356      354    0.68        20882
500-600b        534      505    5.29         2772
600-700b        653      608    6.95         1853
700-800b        757      649   14.22          578
800-900b        822      758    7.77          661
900-1KB         954      862    9.69          906
1KB-10KB       2698     2222   17.64         3370
10KB-100KB    15463    12092   21.80        15429

From the table above you can see that transactions don't compress well below 500 bytes; however, most transactions fall into the sub-500-byte range. The next step is therefore to bundle those smaller transactions whenever there are multiple getdata requests in queue for a peer. Doing so yields very good compression ratios, on par with compressing blocks, which makes sense since blocks are for the most part composed of concatenated transactions.
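
As a rough illustration of the bundling step, assuming a queue of already-serialized transactions for a peer and reusing the hypothetical CompressPayload helper sketched earlier (the send functions and the queuedTxs container are placeholders, not real code from the implementation):

    // Hypothetical sketch: concatenate the serialized transactions queued
    // for a peer and compress them as one unit, so small transactions that
    // would not compress on their own can still benefit.
    std::vector<unsigned char> bundle;
    for (const std::vector<unsigned char>& txbytes : queuedTxs)  // queuedTxs assumed
        bundle.insert(bundle.end(), txbytes.begin(), txbytes.end());

    std::vector<unsigned char> compressed;
    if (bundle.size() >= 500 && CompressPayload(bundle, compressed))
        SendCompressedTxBundle(compressed, queuedTxs.size());    // placeholder
    else
        SendEachUncompressed(queuedTxs);                         // placeholder

The receiving side would decompress the bundle and deserialize the transactions back out one by one, since each serialized transaction is self-delimiting.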

Some examples follow.

The best case seen so far was the following, where 175 transactions were concatenated before being compressed. That yielded a 20% compression ratio, but it doesn't take into account the savings from the 174 message headers that are no longer needed (24 bytes each) or the 174 TCP ACKs of 52 bytes each, which yields an additional 76*174 = 13224 bytes, making the overall bandwidth savings 32%.

2015-11-18 01:09:09.002061 compressed data from 79890 to 67426 txcount:175
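
For clarity, one way to arrive at the 32% figure from that log line:

    (79890 - 67426) + 13224 = 25688 bytes saved
    25688 / 79890 ≈ 32%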

However, most transaction aggregates right now are in the 2-to-10-transaction range, such as the following:

2015-11-17 21:08:28.469313 compressed data from 3199 to 2876 txcount:10

Even here the 10% saving is far better than the nothing we would get without bundling, and adding the 76 bytes saved for each of the 9 bundled transactions gives a total of roughly 20% bandwidth savings for transactions that would otherwise not be compressible. As transaction rates rise there will be more opportunity for this type of transaction compression, and hence bigger savings in the future.

From the data it is clear that bundling and then compressing transactions will be of great benefit as transaction rates rise, saving approximately 30% of overall bandwidth through the combination of compression and the elimination of message headers and TCP ACKs, while at the same time reducing overall network chatter.

Backward Compatibility

Lacking the correct service bit, older clients will continue to receive standard uncompressed data and will be fully compatible with this change.

Fall Back

It is important to be able to entirely and easily turn off compression and decompression as a fallback mechanism. This can be done with a simple bitcoin.conf setting.
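
The BUIP does not fix the exact option name; purely as an illustration, the setting might look something like this (the name below is hypothetical):

    # bitcoin.conf -- hypothetical option name, for illustration only;
    # the BUIP does not specify the exact flag.
    compressdatastreams=0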

Deployment

This enhancement does not require a hard or soft fork.

Future Strategies

In the future we could offer other compression libraries such as LZOx-999, Zlib or some other promising new compression algorithm. Although they would not be as fast as LZOx-1, they could offer higher compression ratios for those who wish to save maximum bandwidth at the expense of additional time spent compressing and decompressing.

Copyright

This document is placed in the public domain.
 

cypherdoc

Well-Known Member
Aug 26, 2015
5,257
12,994
is this yet another optimization occurring onchain? what's going on here?
 
  • Like
Reactions: solex and sickpig

_mr_e

Active Member
Aug 28, 2015
159
266
I'm sure core already does this better in one way or another. No need to bother.
 

solex

Moderator
Staff member
Aug 22, 2015
1,558
4,693
_mr_e said:
I'm sure core already does this better in one way or another. No need to bother.
Actually they don't have anything like it, but they have rejected this idea under their guiding principle: "No matter how good something is, it's no good until it's perfect."

Props to @Peter Tschipper for another excellent proposal.
 
  • Like
Reactions: Norway

freetrader

Moderator
Staff member
Dec 16, 2015
2,806
6,088
Great proposal Peter!
I'm going to have to come up with a Chuck Norris meme for you if you keep up this rate of optimization :)
 
  • Like
Reactions: Norway

Chronos

Member
Mar 6, 2016
56
44
www.youtube.com
I see a fair amount of real-world test results in this BUIP. What's the current status of the code? Is it ready for a PR?

Also, how often does transaction-concatenation occur in the wild, and what are its downsides?
 
  • Like
Reactions: Norway

Peter Tschipper

Active Member
Jan 8, 2016
254
357
@Chronos the code was originally written for Core v11.x... I have to port it over, which shouldn't take long, but I want to spend some time testing the effects on bloom filters.

Tx concatenation happens more as tx rates rise... the faster they come in, the more they get concatenated.

There are no downsides to concatenation that I know of.
 
  • Like
Reactions: Norway

Peter Tschipper

Active Member
Jan 8, 2016
254
357
@christoph yes and yes, but very little of both...see the test results in the BUIP.

LZOx-1 ends up being a slight net win when there is no latency on the network and about a 10-20% improvement in performance over the internet: faster performance in spite of the compression/decompression time. Other libraries such as Zlib or LZOx-999 are not nearly as fast at compression/decompression and do not scale as well, but they have better compression profiles. That's why I chose LZOx-1 to start with... after we have had some success with this, maybe we can offer other options for those who prefer maximum bandwidth savings over speed.
 
  • Like
Reactions: solex

Peter Tschipper

Active Member
Jan 8, 2016
254
357
@Chronos It was never formally rejected and a PR is still open there ( https://github.com/bitcoin/bitcoin/pull/6973 ), but I would say it was more ignored after I started the BIP process. However, Jeff Garzik seemed warm to the idea and put me on to testing LZO, which ended up being a good suggestion. Originally I was using Zlib-6, which would now be my second choice: a little slower with better compression, but not as scalable.
 

sickpig

Active Member
Aug 28, 2015
926
2,541
@Peter Tschipper

Let's say then that it was not rejected but something worse: suspended indefinitely in the limbo of the NIH universe.
 
  • Like
Reactions: theZerg

Chronos

Member
Mar 6, 2016
56
44
www.youtube.com
Good conversation in that PR link. I love this quote from gmax.

Matt's relay network protocol can achieve much better compaction (e.g. sending a 1MB block in a couple kilobytes); and considering zlib's security non-track record... this doesn't excite me greatly.
I've heard this line before: Fast Relay Network is so good, let's not improve anything! What I don't understand is how you can say "small blocks because bandwidth" at the same time. :)

Anyway, I see comments that compression is best during transaction reuse. Assuming that no inputs or outputs are reused in a block, are the bandwidth savings still comparable?
 
  • Like
Reactions: dlareg and 8up

Peter Tschipper

Active Member
Jan 8, 2016
254
357
@Chronos Certainly, if you re-use data you should get better compression, but in all the testing I did I only used production data. I don't really know how much data is duplicated in a block; I've never looked at it.

BTW, as with small files, sometimes the data doesn't compress well and ends up being slightly larger than the original because of the additional metadata; in those cases we don't send the compressed data but the original. It doesn't happen often, but it can, so the logic is in there.
 
  • Like
Reactions: Chronos

jl777

Active Member
Feb 26, 2016
279
345
If the compression is used only after it is verified to work, and only with other nodes that support compression, I don't see any issue other than the extra CPU used for the codec process. Since in almost all cases bandwidth is the more precious resource, this is fine as long as it can be disabled.

With iguana, I am processing data at 50MB/sec to 100MB/sec, and at those speeds the compression/decompression times start to be significant, so for parallel sync I would turn off the codec. But once it is caught up, the codec makes sense in all but the most extreme cases, i.e. an ancient computer on a 100mbps+ network.

Though smartphones on unlimited wifi and limited battery (CPU) would also want to turn it off.

Don't get me wrong: compressing the wire data is of course a way to save bandwidth with few negatives. I would suggest adding a codec id byte before the compressed data; that will allow nodes that have lots of CPU available relative to bandwidth to just try all the possible compression algos and use the one that works best.
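
A sketch of what such a codec id byte might look like; the values and names here are made up for illustration and are not part of the BUIP:

    // Hypothetical codec identifier prepended to the payload, so a receiver
    // knows how (or whether) the bytes that follow were compressed.
    enum CodecId : unsigned char {
        CODEC_NONE      = 0x00,  // payload sent uncompressed
        CODEC_LZO1X_1   = 0x01,  // fast LZO compression
        CODEC_LZO1X_999 = 0x02,  // slower, higher-ratio LZO compression
        CODEC_ZLIB      = 0x03,  // reserved for a possible zlib option
        // remaining values reserved for future codecs (e.g. an iguana codec)
    };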

With a GUI-exposed button to disable compression, that removes all negatives as long as it is secure.

James

P.S. Also make sure to construct buffer-overflow attack packets that try to explode the decompression, i.e. 1GB of zeroes compressed down to a small size.
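
On the decompression-bomb point, the usual defence is to carry the expected uncompressed size with the message, cap it, and use the bounds-checked LZO decompressor. A rough sketch, where the helper name and the size cap are assumptions rather than anything specified in the BUIP:

    // Hypothetical guard against decompression bombs: reject anything that
    // claims to expand beyond a sane limit, and use the "safe" LZO routine,
    // which never writes past the output buffer it is given.
    #include <vector>
    #include <lzo/lzo1x.h>

    static const size_t MAX_DECOMPRESSED_SIZE = 2 * 1000 * 1000;  // assumed cap

    bool DecompressPayload(const std::vector<unsigned char>& in,
                           size_t claimedSize,
                           std::vector<unsigned char>& out)
    {
        if (claimedSize == 0 || claimedSize > MAX_DECOMPRESSED_SIZE)
            return false;                     // oversized claim: drop it

        out.resize(claimedSize);
        lzo_uint outlen = out.size();
        int rc = lzo1x_decompress_safe(in.data(), in.size(),
                                       out.data(), &outlen, nullptr);
        if (rc != LZO_E_OK || outlen != claimedSize)
            return false;                     // corrupt data or lying peer

        out.resize(outlen);
        return true;
    }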
 
  • Like
Reactions: pebwindkraft

Peter Tschipper

Active Member
Jan 8, 2016
254
357
@jl777 Thanks for your input... good advice on the buffer attacks. I think we're covered there, but we'll test it out just the same.

On another note, I looked over the code again (it's been a while) and I had forgotten that I already had multiple compression schemes in place: basically LZOx-1 and LZOx-999. So we have both the fastest compression, LZOx-1, and, for those who don't mind losing a few tenths of a second, LZOx-999, which gives about 27% compression and is very close to what Zlib-6 will give.
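
Building on the earlier CompressPayload sketch, switching between the two LZO modes is essentially a matter of which compress call is made and how much working memory is allocated; the boolean flag below is an assumption for illustration:

    // Hypothetical variant that can use either LZO1X-1 (fast) or
    // LZO1X-999 (smaller output, more CPU time).
    #include <vector>
    #include <lzo/lzo1x.h>

    bool CompressPayload(const std::vector<unsigned char>& in,
                         std::vector<unsigned char>& out,
                         bool fHighCompression)           // flag name assumed
    {
        if (lzo_init() != LZO_E_OK)
            return false;

        out.resize(in.size() + in.size() / 16 + 64 + 3);
        std::vector<unsigned char> wrkmem(
            fHighCompression ? LZO1X_999_MEM_COMPRESS : LZO1X_1_MEM_COMPRESS);
        lzo_uint outlen = out.size();
        int rc = fHighCompression
            ? lzo1x_999_compress(in.data(), in.size(), out.data(), &outlen, wrkmem.data())
            : lzo1x_1_compress(in.data(), in.size(), out.data(), &outlen, wrkmem.data());
        if (rc != LZO_E_OK || outlen >= in.size())
            return false;                     // fall back to uncompressed

        out.resize(outlen);
        return true;
    }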
 

jl777

Active Member
Feb 26, 2016
279
345
Having 8 bits to indicate the type of data would allow for bitwise enabling of codec features and also an easy way to support uncompressed data within the codec path. By choosing various effective width/depth/dictionary-size settings, I think there can be half a dozen optimized compressions for the various types of blocks, so we can always get the best compression. No need to have just one codec, as one size definitely won't fit all.

Also, you might look at doing a string-substitution pass prior to the LZO compression. The txids, pubkeys and rmd160 hashes are not compressible at all, but they are often duplicated, so within a block each duplicated high-entropy item would be a gain of 18 to 30 bytes. More important is that the resulting indices are themselves compressible.

Given an up-to-date blockchain, it would be possible to get a canonical numbering of all high-entropy items. I think the typical block would compress to being half sigs data, and the other half would be 25% of its size, which can then be compressed a further 50%. But the sigs data can just go into a separate section so we don't even need to count its size at all, and it would be 90% compression (this is a segwit joke). I think after removing the pubkeys from the vin data it will shrink quite a bit, as most tx are pay-to-pubkey-hash, so maybe the vin data will become 30%? In that case we are looking at ~45% of the original size, which would let us have the same bandwidth usage for 2MB blocks as for 1MB blocks. But I would need somebody else to do the above as my workload is booked up for a while. I would be happy to collaborate on an iguana codec that would work even for a node that uses the current DB for the blockchain.
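
A toy sketch of the substitution idea, using 32-byte txids as the high-entropy items and a per-block lookup table built on the fly; all the names here are made up for illustration and this is not something the BUIP implements:

    // Toy sketch: replace repeated 32-byte txids with a 2-byte index into a
    // per-block table, so the compressor sees each high-entropy item only once.
    #include <array>
    #include <cstdint>
    #include <map>
    #include <vector>

    using TxId = std::array<unsigned char, 32>;

    struct SubstitutedBlock {
        std::vector<TxId>     table;    // unique txids, in first-seen order
        std::vector<uint16_t> indices;  // one index per txid occurrence
    };

    SubstitutedBlock SubstituteTxIds(const std::vector<TxId>& txids)
    {
        SubstitutedBlock out;
        std::map<TxId, uint16_t> seen;
        for (const TxId& id : txids) {
            auto it = seen.find(id);
            if (it == seen.end()) {
                it = seen.emplace(id, static_cast<uint16_t>(out.table.size())).first;
                out.table.push_back(id);
            }
            out.indices.push_back(it->second);  // a duplicate costs 2 bytes, not 32
        }
        return out;
    }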

In any case, please reserve one of the compression byte values for iguana

For iguana I was compressing the tmp files to disk, but I ended up going 100% uncompressed due to the slowdown that created; that is an extreme case, though.

Clearly compressing wire data is a standard thing to do and a requirement if you are encrypting anything. Having maximum-entropy data as input to encryption is usually expected or even required to maintain security, so a compression stage is a nice precursor to having p2p encrypted channels between nodes.

I am already doing that with the supernet overlay, but I can see some future bitcoin message that could benefit from encryption.

James

P.S. Are there small C libraries for the compression? Zlib is so bloated that I really don't like it much, as the entire iguana codebase compiles to a couple of MB now so it can load fast in a browser.