BUIP120: (passed) Replace LevelDB with an immutable stateless storage design

Jonathan Silverblood

Active Member
Nov 21, 2018
100
73
BUIP120: Replace LevelDB with an immutable stateless storage design
Submitted by: Jonathan Silverblood
Date: 2019/3/28
edit 2019/5/5 by solex: budget added

Summary
The purpose of this BUIP is to build a development branch of Bitcoin Unlimited's node software that uses an immutable stateless storage design (such as Bitcrust, CashDB or others) as the storage component for blockchain data, and to carry out a real-world study of the performance implications.

Proposal
This BUIP proposes that we use Bitcoin Unlimited funds to:
  • Pay for the development of a branch of Bitcoin Unlimited that implements an immutable stateless storage design as an alternate storage component.
  • Use this branch to study the performance and other practical implications.
The choice of software is left to the lead developer. It could mean adopting something that is already production-ready (like CashDB), building on something that has made significant progress but is not quite there yet (like Bitcrust-db), or designing something from scratch.

Motivation
Going forward, it is critical that we remove all significant performance bottlenecks. One such bottleneck is the LevelDB dependency currently used as the storage component for the Bitcoin Unlimited node software. This dependency on generic mutable storage is a remnant of the early Bitcoin implementations, and some people believe that it is practically impossible to achieve global scale with it.

Budget
A maximum expenditure cap of $20,000 is proposed for direct development costs incurred to deploy this change to the BU testnet.
Due to the uncertain scope of the work, this is an estimate recommended by solex.

Background
Thomas van der Wansem gave a presentation at the March 2018 Satoshi's Vision conference illustrating the structure of bitcrust-db. Since then, several other projects have emerged with similar designs (Flowee?, Verde?, CashDB?) and are reaching maturity.



Links
 

Jonathan Silverblood

This is a draft BUIP, and it is the first time I have written a formal one. I don't know in detail what BU can be expected to actually do should this be voted for (what does "pay for" mean in a practical context?). If someone with more experience within the organization could spend some time giving me feedback and easing me into the process, I'd appreciate that.
 

Griffith

Active Member
Jun 5, 2017
188
157
@Jonathan Silverblood
Inside the BUCash code, LevelDB is only used to manage the UTXO set, some minor metadata, and the block index. Actual block data (including transactions) is stored, by default, in sequential files. A while back I implemented one of @theZerg's ideas: block storage now goes through an API, so you can switch from sequential files to a different DB of your choice by essentially plugging it in.
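
As a rough illustration of what block storage behind a pluggable API can look like, here is a minimal sketch. It is not the actual BUCash interface: `BlockStore`, `SequentialFileStore`, `DictStore` and `store_and_fetch` are hypothetical names, the dict is only a stand-in for a real database, and Python is used for brevity rather than the node's own language.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class BlockStore(ABC):
    """Hypothetical interface: the caller only ever uses these two methods."""

    @abstractmethod
    def write_block(self, block_hash: bytes, raw_block: bytes) -> None:
        ...

    @abstractmethod
    def read_block(self, block_hash: bytes) -> bytes:
        ...


class SequentialFileStore(BlockStore):
    """Appends raw blocks to one flat file; an in-memory index maps
    block hash -> (offset, length)."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._index = {}
        open(path, "ab").close()  # make sure the file exists

    def write_block(self, block_hash: bytes, raw_block: bytes) -> None:
        with open(self._path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # current end of the file
            f.write(raw_block)
        self._index[block_hash] = (offset, len(raw_block))

    def read_block(self, block_hash: bytes) -> bytes:
        offset, length = self._index[block_hash]
        with open(self._path, "rb") as f:
            f.seek(offset)
            return f.read(length)


class DictStore(BlockStore):
    """Stand-in for "a different DB of your choice"; a dict plays the database."""

    def __init__(self) -> None:
        self._db = {}

    def write_block(self, block_hash: bytes, raw_block: bytes) -> None:
        self._db[block_hash] = raw_block

    def read_block(self, block_hash: bytes) -> bytes:
        return self._db[block_hash]


def store_and_fetch(store: BlockStore) -> bytes:
    """Caller code is identical no matter which backend is plugged in."""
    store.write_block(b"hash-1", b"first block")
    store.write_block(b"hash-2", b"second block")
    return store.read_block(b"hash-2")


with tempfile.TemporaryDirectory() as tmp:
    file_result = store_and_fetch(SequentialFileStore(os.path.join(tmp, "blk.dat")))
dict_result = store_and_fetch(DictStore())
```

Both backends return the same bytes to the caller, which is the property that would let an immutable store be swapped in and compared against the existing one.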

My request for this BUIP would be to clarify which storage you want to replace. The blocks themselves? The UTXO set? Block index data or other metadata? All of the above?

I am against storing the UTXO set in an immutable DB because of how often it changes and how much redundant data would be stored when we roll back a block.
 

Jonathan Silverblood

Given the presentation's explanation of which data by necessity needs to be mutable and which does not, and how that difference in mutability essentially gives you instant and free reorgs, I can only say I'm not quite sure.

Unless it significantly damages performance, I think all data that is stored should go through a properly documented API, so that the storage layer can be replaced and multiple implementations can compete.

I'm not a data durability expert, but from my viewpoint the only real drawbacks to not committing outputs/transactions to disk and keeping a reference list are performance and reliability in case of crashes. The immutable storage design as explained in CashID is very clear: in case of issues it is safe to write again, as the results can never become something that isn't desirable.
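
The "safe to write again" property can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the presentation, bitcrust-db or CashDB: it assumes records are keyed by the hash of their content and are never modified or deleted, and `ImmutableStore`, `put`, `get` and `appends` are hypothetical names.

```python
import hashlib


class ImmutableStore:
    """Hypothetical append-only store: each record is keyed by its own hash."""

    def __init__(self) -> None:
        self._records = {}  # key -> bytes, never overwritten or deleted
        self.appends = 0    # counts writes that actually added data

    def put(self, record: bytes) -> bytes:
        key = hashlib.sha256(record).digest()
        if key not in self._records:  # a repeated write is a no-op
            self._records[key] = record
            self.appends += 1
        return key

    def get(self, key: bytes) -> bytes:
        return self._records[key]


store = ImmutableStore()
tx = b"example transaction bytes"

first_key = store.put(tx)
# Simulate a crash followed by a blind retry of the same write.
second_key = store.put(tx)
```

Because the key is derived from the content, the retry yields the same key and leaves the stored data unchanged, so recovery code does not need to know whether the earlier write completed.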

I can go over the presentation and note down the types of data and how the presentation says each should be stored, but I'd rather not write a technical BUIP that limits the options for the implementation by forcing it to be exactly like the presentation. I would rather leave it open to some interpretation and use the presentation and the linked examples (Flowee, CashDB) as sources of inspiration.

The key idea is to embrace immutability and replace generic dependencies with something tailor-made for the use case, provided it is more performant, more resilient, or exhibits other positive traits that motivate keeping it in the long run.
 

Jonathan Silverblood

@solex I can't seem to edit this post anymore. I'd like this to be BUIP120.

I also can't edit the post titles of the other two, but I've assigned 118 to lookup features and 119 to send-to support. Could you update the titles for me?
 

solex

Moderator
Staff member
Aug 22, 2015
1,558
4,693
Certainly. Not sure why this problem is occurring.
_edit_ done now. I will put the titles in the Index...
 

theZerg

Moderator
Staff member
Aug 28, 2015
1,012
2,327
We have already reorganized the back end, so it should be able to accept other storage schemes more easily. I think that this would be interesting, practical research.

But the proposal should specify a maximum amount in USD to be authorized. While actual payments would happen on an hourly basis, a maximum would allow voters to understand what they are voting for.
 

Jonathan Silverblood

I agree with setting a maximum, but I am unfamiliar with BU's budgets, financial situation and common practices in this regard.

If someone who has more experience with this than me were to suggest a change to this BUIP it would be appreciated.

I would also like to know what the practical implications are, should this be voted through without a budget.
 

solex

@Jonathan Silverblood
Same applies here, as it does to BUIP121.
In the absence of a budget, volunteer work is relied upon, which can take a long time.
Budgets are usually estimates. This is a big project; however, as you detail, Thomas has done a substantial amount of work which I am sure he would welcome being re-used. I suggest $20k to get this advanced to an alpha version at least. I note that @theZerg did not suggest a budget figure. We can always come back for a top-up, if needed.
 

Jonathan Silverblood

@solex Can you update BUIP 120 and 121 to have a $20,000 budget cap, with a note that the number was suggested by someone other than me, and update BUIP 118 and 119 to clarify that they intentionally have no budget and thus rely on volunteer work?

This should make it clear through action where my priorities are. If 118 and/or 119 get voted through but no volunteers are available, we can, as you said, make another BUIP with a budget in the future if needed.

I can't seem to update them myself since the posts are now rather old.
 

solex

Certainly. I think this is a good approach.
Both updated now.
 

Griffith

@solex Do funds allocated for a given task that remain unspent after the task is completed return to the pool of unallocated funds? Or are they considered spent once allocated, so that they can be used at the discretion of the person whose task it was to employ them?

Specific example: I am assuming that funds allocated for a development task are removed from the unallocated pool and held by "The Developer" as allocated funds, because they will be the one to decide on task completion (based on code merge or something similar). In terms of this BUIP, in the scenario where 20k is allocated but only 10k consumed, does the remaining 10k return to the unallocated fund pool? Or is it now considered spent and allowed to be used by "The Developer" for something else at their discretion (which may include just returning it to the unallocated pool)?

The articles do not outline (or I could not find) what happens in this case.
 

solex

@Griffith
Unused funds revert to reserves to be used in future BUIPs. In your example, if 10k is spent and the BUIP objectives are discharged, then the 10k remainder is no longer available for discretionary use.
The spending of allocated funds is at the discretion of the elected officers. Usually, on development matters, this is by @theZerg.
 

jtoomim

Active Member
Jan 2, 2016
130
253
Do we have anybody in mind to actually implement this BUIP? Or is the hope that we can allocate the funds as a bounty, and hope that someone will come along and claim it?

I agree that there is a lot of room for improvement in the DB layer. However, I am not yet convinced that this BUIP will be sufficient to make that happen. Simply writing a new DB layer that is stateless would be enough to claim the bounty, but that would not necessarily perform better than LevelDB.

It would be nice if you at least included "allowing parallel readers" in the specification. It would also be desirable to mention that the DB needs to minimize the number of random-access disk reads per UTXO read, either via RAM caching or via careful packing of metadata into 4k blocks.
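
To make the 4k packing point concrete, here is a small sketch of the arithmetic. It assumes fixed-size records, which real UTXO entries are not, so it only illustrates the idea of never letting one record straddle two 4k blocks; `locate` and `pages_touched` are hypothetical helper names.

```python
from typing import Tuple

PAGE_SIZE = 4096  # the "4k blocks" mentioned above


def locate(record_index: int, record_size: int) -> Tuple[int, int]:
    """Return (page_number, offset_in_page) for a record in the packed
    layout, where records never cross a page boundary."""
    if not 0 < record_size <= PAGE_SIZE:
        raise ValueError("record must fit in one page")
    per_page = PAGE_SIZE // record_size
    return record_index // per_page, (record_index % per_page) * record_size


def pages_touched(record_index: int, record_size: int, packed: bool) -> int:
    """How many pages a single record read has to touch."""
    if packed:
        return 1  # by construction, see locate()
    # Unpacked layout: records are stored back to back with no padding.
    start = record_index * record_size
    end = start + record_size - 1
    return end // PAGE_SIZE - start // PAGE_SIZE + 1


# With 100-byte records, 40 fit in a page and 96 bytes per page are padding.
straddling = pages_touched(40, 100, packed=False)
aligned = pages_touched(40, 100, packed=True)
```

In the back-to-back layout record 40 spans bytes 4000 to 4099 and so needs two page reads, while the packed layout places it at the start of page 1 and needs one, at the cost of some padding per page.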

If we don't have someone specific in mind and are just posting a bounty, perhaps we should make the bounty contingent upon performance? E.g. 50% for meeting the design specifications (stateless, immutable, parallel), 40% for surpassing the existing LevelDB system in performance, and 10% for surpassing LevelDB in reliability.
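
Applied to the $20,000 cap proposed in this BUIP, that example split works out as in the sketch below. The milestone labels are shorthand for the three conditions above, and whether each tranche would be calculated against the full cap is an assumption.

```python
from fractions import Fraction

BUDGET_CAP_USD = 20_000  # maximum expenditure cap proposed in the BUIP

# Example split from the post above; labels are shorthand, not agreed terms.
milestones = {
    "design (stateless, immutable, parallel)": Fraction(50, 100),
    "faster than LevelDB": Fraction(40, 100),
    "more reliable than LevelDB": Fraction(10, 100),
}

# The shares must account for the whole budget.
assert sum(milestones.values()) == 1

payouts = {name: int(BUDGET_CAP_USD * share) for name, share in milestones.items()}

for name, usd in payouts.items():
    print(f"{name}: ${usd:,}")
```

That gives $10,000 for meeting the design specification, $8,000 for beating LevelDB on performance and $2,000 for beating it on reliability.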
 

solex

There is a dev budget authorised, should the BUIP pass. However, this is not an outright bounty to be paid to the first person who comes up with something satisfying the BUIP. There is no-one that this is targeted towards. The overall design approach needs to be agreed first by @theZerg, @Jonathan Silverblood and other developers who show an interest in, and capability of, getting involved. I agree in principle that splitting the budget as @jtoomim suggests makes sense.

I recall the discussion between Mike Hearn and the Core developers about LevelDB's high performance and reliability when that choice was first made. We certainly don't want to go backwards!
 

Jonathan Silverblood

I'd like to point out that completing the abstraction that allows for selectively choosing backends, and proving it to be compatible with a backend that meets the expectations set forward in the BUIP (including the expectations referenced in the presentation linked to), might be enough for now.

I'd be just as happy to see BU run with cashdb/bitcrust-db as storage components as I would to see completely new development. I don't expect the full budget to be used unless an entirely new or mostly new storage component has to be built specifically for BU.

Then again, I am new to BUIPs and BU's budget and have very little experience with how these budgets work.