Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

Performance regression in [email protected] when adding files #3792

Closed
achingbrain opened this issue Aug 5, 2021 · 4 comments
Closed

Performance regression in [email protected] when adding files #3792

achingbrain opened this issue Aug 5, 2021 · 4 comments
Labels
need/triage Needs initial labeling and prioritization

Comments

@achingbrain
Copy link
Member

This seems linked to the new multiformats stack, adding files has got quite a bit slower:

Interop suite, [email protected]

  files
    ✔ returns an error when reading non-existent files
    ✔ returns an error when writing deeply nested files and the parents do not exist
    - uses raw nodes for leaf data
    ✔ errors when creating the same directory twice (74ms)
    ✔ does not error when creating the same directory twice and parents option is passed
    ✔ errors when creating the root directory
    has the same hashes for
      ✔ empty files
      ✔ small files
      ✔ big files (132ms)
      ✔ files that have had data appended (457ms)
      ✔ files that have had data overwritten (38ms)
      ✔ small files with CIDv1
      ✔ big files with CIDv1 (120ms)
      ✔ trickle DAGs (21537ms)
      ✔ rabin chunker (461ms)
      ✔ rabin chunker small chunks (13523ms)
      ✔ hamt shards (4765ms)
      ✔ updating mfs hamt shards (5850ms)

Interop suite, [email protected]

  files
    ✓ returns an error when reading non-existent files
    ✓ returns an error when writing deeply nested files and the parents do not exist
    - uses raw nodes for leaf data
    ✓ errors when creating the same directory twice
    ✓ does not error when creating the same directory twice and parents option is passed
    ✓ errors when creating the root directory
    has the same hashes for
      ✓ empty files
      ✓ small files
      ✓ big files (101ms)
      ✓ files that have had data appended (40ms)
      ✓ files that have had data overwritten (38ms)
      ✓ small files with CIDv1
      ✓ big files with CIDv1 (79ms)
      ✓ trickle DAGs (5157ms)
      ✓ rabin chunker (198ms)
      ✓ rabin chunker small chunks (3175ms)
      ✓ hamt shards (2360ms)
      ✓ updating mfs hamt shards (2607ms)

The Trickle DAG test in particular went from 5s to 21s, rabin chunker small chunks from 3s to 13s.

@achingbrain achingbrain added the need/triage Needs initial labeling and prioritization label Aug 5, 2021
@rvagg
Copy link
Member

rvagg commented Aug 6, 2021

Hey @achingbrain, I've spent a bit of time trying to figure out what's implicated here and I haven't made a ton of of progress, trying to use some flamegraphs to get to details, but it's pretty illusive!

Check out https://bafybeicgd356f3klytj6ggqkemmt6bdokggcwd72daik5bepgguhvdcff4.ipfs.dweb.link/

They're all generated just using the trickle DAGs test. There's a few sets there - the ones for http-client are generated from just the tests are purely from the client side, within the process running the tests. The ones for daemon are purely from the js-ipfs daemon during its lifecycle backing the tests (data collected on separate runs). The x100 versions are runs where I've cranked up createDataStream(10000) to createDataStream(1000000) to get activity focused on the bit we want to look at. At that level, on my machine, 55.0 runs 18133ms and 56.1 runs 51325ms, so quite a difference. But it's not obvious from the flamegraphs where the differential is. The only thing that stands out is that the daemon spends a lot more time in epoll_wait (turn on cpp, this is happening in libuv). I don't see anything obvious on the client end that would explain a difference in sending behaviour, although I don't see obvious evidence of client slowing down either which is a bit sus.

If you find trickle.js it's only 1.3% on 0.55.0 and 0.3% on 0.56.1, so I don't think the multiformats stack is to blame, at least as far as unixfs is concerned. But it's all pretty weird so 🤷. Maybe you can see more sense in here than is standing out to me, I'm not very familiar with the hapi stack, nor pino.

@achingbrain
Copy link
Member Author

achingbrain commented Aug 6, 2021

Thanks for looking into this. I’m AFK right now, but I’d try isolating the test code to run against ipfs-core instead of over http (eg remove hapi and the client etc from the equation) and increase the number of blocks - I decreased the arg to createDataStream to get the tests passing again, it was using the default value previously.

@rvagg
Copy link
Member

rvagg commented Aug 6, 2021

sorry, didn't do that but had in mind something else that I've just tried, skip the compare and just run the testHashesAreEqual for js and go separately with a bigger number: return testHashesAreEqual(js, createDataStream(1000000), options)

  • 0.55.0 JS 17608ms
  • 0.55.0 Go 376ms
  • 0.56.1 JS 49772ms
  • 0.56.1 Go 369ms

Delta is only on the JS side in this, likely on the server, which is not what that big gap for epoll_wait suggests. Maybe we're looking at streaming oddities.

There's a few too many layers in here for it to be obvious to me how to rewire this to work against ipfs-core directly. It might be easier just writing something entirely separate from the test facade in here that uses the same process to make that work? Probably won't have time for that just now, unless I get the itch over the weekend. It might be quicker for you to do that if you have the time.

@achingbrain
Copy link
Member Author

I think this has been caused by the removal of the filter that checked to see if a block was in the blockstore before adding it, oops. Fixed in #3801, performance is similar to the previous version with that change.

That said, I think there's still some fairly significant low-hanging fruit to be had in the importer, for example when writing directories/HAMTs out we persist the children sequentially, doing that in parallel (similar to what we do with individual files) would give us a nice speed up.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
need/triage Needs initial labeling and prioritization
Projects
None yet
Development

No branches or pull requests

2 participants