
Bootstrap import ignores >300MB worth of already downloaded blk0001.dat #313

Open
TheButterZone opened this issue Aug 28, 2017 · 11 comments

Comments

@TheButterZone

TheButterZone commented Aug 28, 2017

I pulled my network cable to ensure it wasn't re-downloading what I'd already downloaded, and after putting bootstrap.dat in the same directory I went from 349,000 or so blocks down to 0. WTF?!

Since people either a) download the bootstrap after getting frustrated at slow P2P speeds with a partial blk0001.dat, or b) download and insert the bootstrap around first run, shouldn't the client be able to concatenate just the missing blocks automatically for the a) people?

@tryphe
Collaborator

tryphe commented Aug 28, 2017

That's how it works. You put it in the directory and your database goes bye-bye, because it isn't stored in any particular order (those files are just digested SSTs determined by your local leveldb settings).

@TheButterZone
Author

TheButterZone commented Aug 28, 2017

So I've wasted all that time and data cap downloading from peers that ground to a near-halt. And instead of overwriting the blk0001.dat file, it's just growing from where I left off, >300 MB, to now >500 MB. Will the duplicate data be cleared, or be clearable, out of it?

ETA: Maybe that series of

ERROR: ProcessBlock() : already have block

means it is just filling in blk0001.dat with the missing blocks present in the bootstrap, and not appending the ones that were already downloaded over P2P? It didn't seem like the file size started going up again until it went back to regular ProcessBlock, but I only caught it 5 seconds before the switchback.

ETA2: No, that doesn't seem right; it seemed like about 1 KB per block before the bootstrap, and the file's around 572 MB now, with 243739 one of the latest heights to scroll by in debug.log just now.

Maybe there should be a prompt on first run, then:
There are this many up-to-date peers: x
Would you like to download the blockchain from them, or from the bootstrap torrent or direct download (DDL)?

@dooglus
Collaborator

dooglus commented Aug 30, 2017

I've never seen any blocks get lost as a result of importing a bootstrap.dat file, and I've imported a lot of them.

It will scan the bootstrap.dat file looking for blocks you don't already have, counting up through all the blocks in the file. You'll see a bunch of "already have block" messages for the ones it skips; it won't re-import blocks you already have.
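Conceptually, the scan loop looks something like the sketch below. This is a minimal illustration, not the client's actual code: it assumes Bitcoin-style framing in bootstrap.dat (4-byte network magic, 4-byte little-endian length, raw block) and a double-SHA256 header hash; CLAM's real magic bytes and proof-of-stake header details will differ.

```python
import hashlib
import struct

MAGIC = bytes.fromhex("f9beb4d9")  # Bitcoin mainnet magic; CLAM's value differs

def block_hash(raw_block: bytes) -> bytes:
    # The hash of the 80-byte header identifies the block (Bitcoin-style)
    header = raw_block[:80]
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()[::-1]

def scan_bootstrap(path, known_hashes):
    """Count blocks skipped ("already have block") vs. newly imported."""
    already, new = 0, 0
    with open(path, "rb") as f:
        while True:
            magic = f.read(4)
            if len(magic) < 4:
                break  # clean end of file
            if magic != MAGIC:
                raise ValueError("stream out of sync")
            (size,) = struct.unpack("<I", f.read(4))
            raw = f.read(size)
            if block_hash(raw) in known_hashes:
                already += 1  # logged as "already have block", nothing re-imported
            else:
                new += 1      # copied into blk*.dat and imported into the database
    return already, new
```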

Recent versions of Bitcoin copy the new blocks they find in the bootstrap.dat file into the blkNNNNN.dat files before importing them into the database, so that may explain why you're seeing the blk*.dat file(s) grow more than you expect. I forget which version of Bitcoin the last release of the CLAM client was based on, but I suspect its behavior is similar.

I don't think there's any need for a warning. Importing the bootstrap.dat file doesn't delete anything, and doesn't use any bandwidth.

@dooglus
Collaborator

dooglus commented Aug 30, 2017

Here's an example from the log of a Bitcoin bootstrap.dat I'm importing at the moment:

2017-08-29 19:10:51 Pre-allocating up to position 0x1000000 in blk00136.dat
2017-08-29 19:10:52 Pre-allocating up to position 0x2000000 in blk00136.dat
2017-08-29 19:10:52 Pre-allocating up to position 0x3000000 in blk00136.dat
2017-08-29 19:10:53 Pre-allocating up to position 0x4000000 in blk00136.dat
2017-08-29 19:10:53 Pre-allocating up to position 0x5000000 in blk00136.dat
2017-08-29 19:10:54 Pre-allocating up to position 0x6000000 in blk00136.dat
2017-08-29 19:10:54 Pre-allocating up to position 0x7000000 in blk00136.dat
2017-08-29 19:10:55 Pre-allocating up to position 0x8000000 in blk00136.dat
2017-08-29 19:10:55 Leaving block file 136: CBlockFileInfo(blocks=597, size=134161673, heights=298291...298887, time=2014-04-29...2014-05-03)
2017-08-29 19:10:58 Pre-allocating up to position 0x1000000 in blk00137.dat
2017-08-29 19:10:58 Pre-allocating up to position 0x2000000 in blk00137.dat
2017-08-29 19:10:59 Pre-allocating up to position 0x3000000 in blk00137.dat
2017-08-29 19:10:59 Pre-allocating up to position 0x4000000 in blk00137.dat
2017-08-29 19:11:00 Pre-allocating up to position 0x5000000 in blk00137.dat
2017-08-29 19:11:01 Pre-allocating up to position 0x6000000 in blk00137.dat
2017-08-29 19:11:01 Pre-allocating up to position 0x7000000 in blk00137.dat
2017-08-29 19:11:02 Pre-allocating up to position 0x8000000 in blk00137.dat
2017-08-29 19:11:02 Leaving block file 137: CBlockFileInfo(blocks=590, size=134195296, heights=298888...299477, time=2014-05-03...2014-05-07)
2017-08-29 19:11:06 Pre-allocating up to position 0x1000000 in blk00138.dat
2017-08-29 19:11:06 Pre-allocating up to position 0x2000000 in blk00138.dat
2017-08-29 19:11:07 Pre-allocating up to position 0x3000000 in blk00138.dat
2017-08-29 19:11:07 Pre-allocating up to position 0x4000000 in blk00138.dat
2017-08-29 19:11:08 Pre-allocating up to position 0x5000000 in blk00138.dat
2017-08-29 19:11:09 Pre-allocating up to position 0x6000000 in blk00138.dat
2017-08-29 19:11:09 Pre-allocating up to position 0x7000000 in blk00138.dat
2017-08-29 19:11:10 Pre-allocating up to position 0x8000000 in blk00138.dat
2017-08-29 19:11:10 Loaded 50000 blocks from external file in 647735ms
2017-08-29 19:11:17 UpdateTip: new best=000000000000003887df1f29024b06fc2200b55f8af8f35453d7be294df2d214 height=250000 version=0x00000002 log2_work=71.012098 tx=21491097 date='2013-08-03 12:36:23' progress=0.085733 cache=0.1MiB(718txo)
2017-08-29 19:11:21 UpdateTip: new best=000000000000001b3f536a81be90d5cbe8b79c2c1df53d1f91540cf5cb5a7c58 height=250001 version=0x00000002 log2_work=71.012195 tx=21491225 date='2013-08-03 12:47:32' progress=0.085733 cache=0.2MiB(1412txo)
2017-08-29 19:11:22 UpdateTip: new best=000000000000006b0b79274e9cfdfeaa89196a2281bc92493b1a1e74f2eac087 height=250002 version=0x00000002 log2_work=71.012292 tx=21491463 date='2013-08-03 12:48:37' progress=0.085734 cache=0.3MiB(1947txo)
2017-08-29 19:11:24 UpdateTip: new best=0000000000000054502d8fc7843719bd20d6094ea9a3ea8e4f4a7b9862fb45c2 height=250003 version=0x00000002 log2_work=71.01239 tx=21491794 date='2013-08-03 13:00:11' progress=0.085736 cache=0.4MiB(2659txo)
2017-08-29 19:11:25 UpdateTip: new best=000000000000001ad8cc4aafb8db55b0e4444fad216ae63f26cbfe9adb6031a9 height=250004 version=0x00000002 log2_work=71.012487 tx=21491958 date='2013-08-03 13:07:53' progress=0.085736 cache=0.4MiB(2933txo)

All the 'Pre-allocating ...' messages happen while it copies the raw block data from bootstrap.dat into the various blk*.dat files; only once it has finished doing that does it start the UpdateTip stage.
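For what it's worth, those pre-allocation steps are simple: the client grows each blk*.dat file in fixed 16 MiB chunks (0x1000000) before appending copied block data, capping each file at 128 MiB (0x8000000, the last position in the log above). Here's a rough Python sketch of that chunked allocation; the constants come from the log, while the real client uses platform calls like fallocate where available rather than a plain truncate:

```python
CHUNK = 0x1000000          # 16 MiB allocation step, matching the log lines
MAX_BLOCKFILE = 0x8000000  # 128 MiB cap per blk*.dat file

def allocate(f, offset, length):
    """Grow file f so that [offset, offset + length) is backed by space."""
    f.seek(0, 2)                  # jump to the current end of file
    if offset + length <= f.tell():
        return                    # already big enough, no message logged
    new_size = min(-(-(offset + length) // CHUNK) * CHUNK, MAX_BLOCKFILE)
    print(f"Pre-allocating up to position {new_size:#x}")
    f.truncate(new_size)          # extend with zeros (sparse on most filesystems)
    # once the cap is reached, the real client "leaves" this file and
    # starts the next blkNNNNN.dat, as the log above shows
```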

@TheButterZone
Author

TheButterZone commented Aug 30, 2017

The only blk*.dat file in my ~/Library/Application Support/Clam directory and its subdirectories is blk0001.dat.

debug.log does not contain "Pre-allocating" at this point, so the behavior is divergent.

What was a waste of bandwidth was downloading >300 MB over P2P first, then having to download the bootstrap second, which contains the same >300 MB already downloaded over P2P. The quickest method should be selected first and stuck to, so you don't end up downloading anything twice.

"already have block" ran out of search results at 620695

Coming up on ProcessBlock... 1095000

@accttotech

accttotech commented Aug 30, 2017 via email

@dooglus
Collaborator

dooglus commented Aug 30, 2017

What was a waste of bandwidth was to download > 300 mb P2P first, then have to download the bootstrap second, which contains same > 300 mb already downloaded P2P

Yes. That's why in my bootstrap post I split the file into pieces, each containing 10k blocks:

I also made a series of 'partial' bootstrap files. Each one contains the block data for 10,000 blocks.

https://s3.amazonaws.com/dooglus/bootstrap-000.dat is blocks 0 through 9999
https://s3.amazonaws.com/dooglus/bootstrap-001.dat is blocks 10000 through 19999
https://s3.amazonaws.com/dooglus/bootstrap-002.dat is blocks 20000 through 29999
etc.

I'll add a new one for each new set of 10k blocks. Currently they go up to bootstrap-165.dat.

That way you can download just the pieces you need.
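Since each partial file covers a fixed 10,000-block range, working out which pieces you still need is just integer division on your current height. A small helper along these lines (the file naming comes from the list above; the 165 upper bound was current as of this comment):

```python
BLOCKS_PER_PIECE = 10_000
LAST_PIECE = 165  # bootstrap-165.dat was the newest at the time of writing

def remaining_pieces(current_height):
    """URLs for the partial bootstrap files from your current height onward."""
    first = current_height // BLOCKS_PER_PIECE
    return [
        f"https://s3.amazonaws.com/dooglus/bootstrap-{i:03d}.dat"
        for i in range(first, LAST_PIECE + 1)
    ]

# e.g. a node stuck around block 349,000 only needs pieces 034 onward:
print(remaining_pieces(349_000)[0])  # .../bootstrap-034.dat
```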

@TheButterZone
Author

TheButterZone commented Aug 30, 2017

I had tried the partial bootstrap starting at the range where the client left off, and the "current number of blocks" went from roughly 349,000 down to 0. Then I tried the full bootstrap. Back to 0. Importing a bootstrap shouldn't make it look like you've lost 100% of your progress to date.

@accttotech

accttotech commented Aug 30, 2017 via email

@TheButterZone
Author

Define "working". bootstrap.dat is 1.84GB, blk0001.dat is 1.43GB at 48 weeks behind.

@TheButterZone
Author

TheButterZone commented Sep 1, 2017

blk0001.dat has exceeded bootstrap.dat's size by 0.04 GB and counting, without being able to connect to the internet. Blocks from 06/01/17 are scrolling past now. I'm waiting for it to run out of bootstrap, then I'll start downloading the segments, renaming each to bootstrap.dat and restarting after each one completes (roughly the loop sketched below).
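A minimal sketch of that segment-by-segment loop, for anyone following along. The data directory is the macOS one mentioned earlier in the thread, and it assumes the client consumes a file named bootstrap.dat on startup, so you restart it by hand after staging each piece; everything else here is illustrative:

```python
import shutil
import urllib.request
from pathlib import Path

# macOS data directory, per this thread; adjust for other platforms
DATADIR = Path.home() / "Library/Application Support/Clam"

def stage_piece(i):
    """Download partial bootstrap i and stage it as bootstrap.dat."""
    url = f"https://s3.amazonaws.com/dooglus/bootstrap-{i:03d}.dat"
    with urllib.request.urlopen(url) as resp:
        with open(DATADIR / "bootstrap.dat", "wb") as out:
            shutil.copyfileobj(resp, out)
    # Now start the client, let it finish importing this piece,
    # shut it down, and stage the next piece.

# e.g. pieces 034 onward for a node stuck around block 349,000:
# for i in range(34, 166): stage_piece(i)  # restart the client between pieces
```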
