Running a full node, database corruption?

Wolfbeast · Feb 2, 2017

Hey folks,

First post here, apologies if this isn't in the correct board. I just signed up to ask about a problem with the official Bitcoin Unlimited client (latest 1.0.0.1):
I want to help out the Bitcoin Unlimited network by running a full node, and as such went ahead and grabbed the appropriate client for my system (Windows x86_64 on Win 7), installed it and pointed it to a drive for the blockchain with plenty of space using -datadir= (and indexing transactions with -txindex 1) on the NTFS filesystem. I've run full nodes of other cryptocurrencies without problems the same way for years already.

After syncing most of the blockchain (about 40 weeks behind) the client crashed with "database corruption".
This is my main machine which is kept well-maintained and I know for a fact that the drive the blockchains are stored on does not have any physical defects.

The debug log tail:

2017-02-02 04:43:13 Acceptable block: ver:20000000 time:1473615256 size: 996591 Tx:209 Sig:346
2017-02-02 04:43:13 BLOCK_DOWNLOAD_WINDOW is 8 MAX_BLOCKS_IN_TRANSIT_PER_PEER is 1
2017-02-02 04:43:13 Acceptable block: ver:20000000 time:1473616356 size: 836143 Tx:1489 Sig:3156
2017-02-02 04:43:13 BLOCK_DOWNLOAD_WINDOW is 8 MAX_BLOCKS_IN_TRANSIT_PER_PEER is 1
2017-02-02 04:43:13 Acceptable block: ver:20000000 time:1473615734 size: 472467 Tx:1148 Sig:2695
2017-02-02 04:43:13 BLOCK_DOWNLOAD_WINDOW is 8 MAX_BLOCKS_IN_TRANSIT_PER_PEER is 1
2017-02-02 04:43:15 Acceptable block: ver:20000000 time:1473614458 size: 992058 Tx:841 Sig:1530
2017-02-02 04:43:15 BLOCK_DOWNLOAD_WINDOW is 8 MAX_BLOCKS_IN_TRANSIT_PER_PEER is 1
2017-02-02 04:43:15 Acceptable block: ver:20000000 time:1473614458 size: 992058 Tx:841 Sig:1530
2017-02-02 04:43:16 Pre-allocating up to position 0x400000 in rev00623.dat
2017-02-02 04:43:16 UpdateTip: new best=000000000000000004506cc5f22e0e863061e79ddf89f09a9a085b26747bd1fa height=429317 log2_work=85.256281 tx=155407440 date=2016-09-11 17:20:58 progress=0.936130 cache=68.7MiB(33743tx)
2017-02-02 04:43:16 Acceptable block: ver:20000000 time:1473614610 size: 174178 Tx:449 Sig:748
2017-02-02 04:43:16 UpdateTip: new best=000000000000000000a4d96fd93ececd63bab1470506d84a9aa449905cf807d3 height=429318 log2_work=85.256311 tx=155407889 date=2016-09-11 17:23:30 progress=0.936131 cache=69.0MiB(34343tx)
2017-02-02 04:43:16 Acceptable block: ver:20000000 time:1473614655 size: 38988 Tx:114 Sig:185
2017-02-02 04:43:16 UpdateTip: new best=00000000000000000307a9de6888b25984e976f5385208376eef44830662b7a3 height=429319 log2_work=85.25634 tx=155408003 date=2016-09-11 17:24:15 progress=0.936131 cache=69.2MiB(34490tx)
2017-02-02 04:43:16 Acceptable block: ver:20000000 time:1473615197 size: 612257 Tx:1351 Sig:3077
2017-02-02 04:43:17 UpdateTip: new best=000000000000000003a44705ae38d33083284463fdb26a3e160920e0fdced0b3 height=429320 log2_work=85.25637 tx=155409354 date=2016-09-11 17:33:17 progress=0.936134 cache=71.9MiB(36631tx)
2017-02-02 04:43:17 Acceptable block: ver:20000000 time:1473615256 size: 996591 Tx:209 Sig:346
2017-02-02 04:43:17 LevelDB read failure: Corruption: block checksum mismatch
2017-02-02 04:43:17 Corruption: block checksum mismatch
2017-02-02 04:44:27 UPnP Port Mapping successful.
2017-02-02 05:04:28 UPnP Port Mapping successful.
2017-02-02 05:24:29 UPnP Port Mapping successful.
2017-02-02 05:44:30 UPnP Port Mapping successful.
2017-02-02 06:04:31 UPnP Port Mapping successful.
2017-02-02 06:24:32 UPnP Port Mapping successful.
2017-02-02 06:44:33 UPnP Port Mapping successful.
2017-02-02 07:04:34 UPnP Port Mapping successful.
2017-02-02 07:24:35 UPnP Port Mapping successful.
2017-02-02 07:44:36 UPnP Port Mapping successful.
2017-02-02 07:55:36 Error reading from database: Database corrupted
2017-02-02 07:57:27

After which the client shut down. Trying to restart the client after seeing what happened results in the same message and client shutdown.
Is this a known problem? How do I recover from this (preferably without re-synching 100GB of data from the network again) and how do I prevent this kind of thing in the future?

Thanks in advance for your help.

sickpig · Feb 2, 2017

Sorry to hear about your problem.

From the log you posted, this message is quite telling:

[...]
2017-02-02 04:43:17 Acceptable block: ver:20000000 time:1473615256 size: 996591 Tx:209 Sig:346
2017-02-02 04:43:17 LevelDB read failure: Corruption: block checksum mismatch
2017-02-02 04:43:17 Corruption: block checksum mismatch
[...]
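For anyone who wants to locate such lines in a longer debug.log, here is a minimal Python sketch. The three patterns are copied from the log lines quoted in this thread; other client versions may word them differently, so treat the list as an assumption. The function name `find_corruption` is purely illustrative.

```python
# Patterns taken from the log excerpt in this thread (assumption: other
# versions of the client may phrase these messages differently).
CORRUPTION_PATTERNS = (
    "LevelDB read failure",
    "Corruption:",
    "Database corrupted",
)


def find_corruption(lines, context=1):
    """Return a list of (index, line, preceding_lines) for each matching line.

    `lines` is any sequence of log lines; `context` is how many lines
    before each hit to keep, so the last successful step stays visible.
    """
    hits = []
    for i, line in enumerate(lines):
        if any(pattern in line for pattern in CORRUPTION_PATTERNS):
            before = [l.rstrip("\n") for l in lines[max(0, i - context):i]]
            hits.append((i, line.rstrip("\n"), before))
    return hits


# Usage with the three lines quoted above.
sample = [
    "2017-02-02 04:43:17 Acceptable block: ver:20000000 time:1473615256 size: 996591 Tx:209 Sig:346",
    "2017-02-02 04:43:17 LevelDB read failure: Corruption: block checksum mismatch",
    "2017-02-02 04:43:17 Corruption: block checksum mismatch",
]
for index, line, before in find_corruption(sample):
    print(index, line)
```

To scan a real file, pass `open(path).readlines()` instead of `sample`; the path to debug.log depends on the -datadir you chose.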

This is what you can try to work around the issue:

1) Rebuild the database with -reindex.

2) Verify your hardware: a faulty CPU or faulty RAM can also lead to this kind of problem.

3) If hardware issues are present, using -par=1 along with -reindex slows down script verification, because only one thread will be used. That way the HW of your machine will be less stressed. If running with -reindex and -par=1 turns out to be successful, the probability that you have faulty HW is quite high.

One last note: reindexing can of course take quite a while, especially when using only one thread for script verification.
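As a rough illustration of steps 1 and 3, the sketch below only assembles the command line; it does not start the client. The flags -datadir, -txindex, -reindex and -par are the ones mentioned in this thread. The executable name `bitcoin-qt` and the data directory `D:/blockchain` are assumptions for the example, so substitute whatever your installation uses.

```python
def reindex_command(executable, datadir, single_thread=False, txindex=True):
    """Build the argument list for a one-off restart that rebuilds the databases.

    single_thread=True adds -par=1, which limits script verification to one
    thread (slower, but puts less load on the hardware).
    """
    cmd = [executable, f"-datadir={datadir}", "-reindex"]
    if txindex:
        # Keep the transaction index enabled, as in the original setup.
        cmd.append("-txindex=1")
    if single_thread:
        cmd.append("-par=1")
    return cmd


# Hypothetical executable name and path, for illustration only.
print(" ".join(reindex_command("bitcoin-qt", "D:/blockchain")))
print(" ".join(reindex_command("bitcoin-qt", "D:/blockchain", single_thread=True)))
```

The -reindex flag is only needed for that one start; later starts can go back to the usual options.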

Wolfbeast · Feb 2, 2017

As stated in my original post, this is a well-maintained workstation that does not have any HW faults (in fact the CPU is only a few months old, since I replaced it recently). If it had any hardware issues, they would certainly show up in my daily development activities, which strain it a lot harder than any bitcoin client ever would.

I'll try reindexing.

Wolfbeast · Feb 2, 2017

So far, so good. Reindexing didn't take any time whatsoever and it's back to synchronising with the network now for the remaining blocks I still needed to get. Chalking it up to a glitch - db drivers can fall over too, after all. I'll post again if I continue having these issues.

sickpig · Feb 2, 2017

Great news! I'm glad that your issue seems to be solved. Just keep us posted in case you run into any other problems.

Wolfbeast · Feb 2, 2017

Unfortunately it does seem that something completely screwed up, and I ended up having to clear out the downloaded blockchain data because it would not get any further than some point in Nov 2016 in its synchronisation (12 weeks out from completion). I've just had it sit there doing nothing for many hours; no processing or downloading going on, nothing in the log either explaining why -- even refreshing what peers it connected to didn't help (in case it was a peering issue).

I'm hoping that it will complete without issues this time, but it does seem rather unstable.
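When the log is silent during a stall like the one described above, one way to quantify it is to find the last UpdateTip entry and measure how long ago it was written. The Python sketch below is a minimal example that assumes the UpdateTip line format shown in the debug log earlier in this thread; the function names are illustrative, and the regular expression may need adjusting for other client versions.

```python
import re
from datetime import datetime, timedelta

# Matches the log timestamp, block height and sync progress of an UpdateTip
# line in the format quoted earlier in this thread (an assumption for other versions).
UPDATE_TIP = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UpdateTip: .*?height=(\d+) .*?progress=([0-9.]+)"
)


def last_tip(lines):
    """Return (logged_at, height, progress) for the last UpdateTip line, or None."""
    last = None
    for line in lines:
        match = UPDATE_TIP.match(line)
        if match:
            logged_at = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            last = (logged_at, int(match.group(2)), float(match.group(3)))
    return last


def stalled_for(lines, now):
    """Return how long ago the chain tip last advanced, or None if never seen."""
    tip = last_tip(lines)
    return None if tip is None else now - tip[0]


# Usage with one UpdateTip line copied from the log in the first post.
sample = [
    "2017-02-02 04:43:17 UpdateTip: new best=000000000000000003a44705ae38d33083284463fdb26a3e160920e0fdced0b3 height=429320 log2_work=85.25637 tx=155409354 date=2016-09-11 17:33:17 progress=0.936134 cache=71.9MiB(36631tx)",
]
print(last_tip(sample))
print(stalled_for(sample, datetime(2017, 2, 2, 7, 55, 36)) > timedelta(hours=1))
```

A gap of several hours between the last UpdateTip and the current time, with peers still connected, matches the symptom reported here; it does not by itself say what caused it.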
