IPNS/IPFS gateway is down #495
We are investigating this issue. |
More info on what is problematic. This is what works:
but this fails:
and this also fails:
These things work on an IPFS node that I run without Docker. I'm pretty sure this is some Docker problem, and I don't really want to touch the Docker stuff. |
Is this the custom Dockerfile you use in your installation? https://github.com/ethereum/sourcify/blob/master/services/ipfs/Dockerfile.ipfs It seems it is not exposing any ports outside the container; see the official IPFS Docker image: https://github.com/ipfs/go-ipfs/blob/master/Dockerfile#L76 |
I think the important one is 4001; the other ports you probably don't want to expose, for security reasons. |
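For illustration, here is a hedged sketch of how only the swarm port could be published when running the image directly. The image tag, container name, and volume path are placeholders, not Sourcify's actual setup:

```shell
# Hypothetical invocation: publish only the libp2p swarm port (4001, TCP and UDP)
# to the host, keeping the API (5001) and gateway (8080) ports internal.
docker run -d --name ipfs_node \
  -v /srv/ipfs:/data/ipfs \
  -p 4001:4001 \
  -p 4001:4001/udp \
  ipfs/go-ipfs:latest
```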
@kuzdogan BTW, have you considered activating IPNS pubsub? https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#ipns-pubsub If I read the config correctly, it is not turned on. I have this option enabled in my client, but it also needs support from the publisher for it to work. |
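Per the experimental-features doc linked above, the feature is enabled with a daemon startup flag. This is a generic sketch, not the Sourcify deployment's actual launch command:

```shell
# IPNS over pubsub is experimental and must be enabled on BOTH the
# publishing node and the resolving node for the speedup to apply.
ipfs daemon --enable-namesys-pubsub
```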
I think this will help with the IPNS resolution issues (ethereum#495), as suggested by @wmitsuda.
We believe the issue was with the Docker container's network config: the node does not announce itself with its public IP and was therefore inaccessible. This should be addressed by 528c105 and subsequent commits. We just moved our servers physically, so we couldn't configure the firewall yet, but hopefully the networking issue will be sorted out this week. Pubsub should further improve resolution, but the node needs to be accessible first. Thanks for the input @wmitsuda @MysticRyuujin |
Good. I did some tests from my location trying to resolve the staging IPNS; I'm assuming the staging name is the one above. Running
It is curious that, every first time I try to resolve after restarting my local IPFS node, it resolves to
I have the IPNS pubsub option turned on in my receiving client, as required for it to work. Turning off pubsub, it goes back to resolving very slowly on every request:
However, it always resolves to |
However, getting individual files is still extremely slow; I couldn't even get a successful call. I tried it prepending the resolved hash, so this particular case (getting files) doesn't seem related to the IPNS issue. If you allow me a suggestion: are you using the default flatfs datastore? It seems pretty bad for the Sourcify use case (thousands of small files); have you considered using badgerds? My experience: when @ligi shared the entire production snapshot with me, I tried to do an
Then I converted my local installation to the badger datastore and it works like a charm: it took just a few minutes to add the entire repo. Reference: ipfs/kubo#4279 |
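As a hedged sketch of the badger conversion being suggested (the exact commands used weren't shown above; these are the standard go-ipfs mechanisms, assuming the separate ipfs-ds-convert tool is installed):

```shell
# For a fresh repo, badger can be selected at init time:
ipfs init --profile badgerds

# For an existing flatfs repo, apply the profile and convert in place
# (ipfs-ds-convert is a separate tool; back up the repo first):
ipfs config profile apply badgerds
ipfs-ds-convert convert
```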
**Updates**

It seems the IPNS issue is resolved with the
As examples, here are some CIDs from the folder, output of
Also, the current multiaddress of the staging node is:

**Diagnosis**

Regarding the ipfs-check tool @ligi shared: with the multiaddress of the staging node and different CIDs, I was nearly always getting the third check failed, "Could not find the multihash in the dht", except with the top-level directory CID shown by the up-to-date IPNS gateway (under "Index of: /ipns/..."). But even that one is not fetched any faster by the gateways.

**What I did**

Now it seems all CIDs are passing the ipfs-check tool test, and it seems to take less time to retrieve the files 🎉 If others also confirm their problems are resolved, I'd mark this closed. One thing we should keep an eye on is whether IPFS now consumes too many resources with these new settings. Also, our node has an hourly scheduled |
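One plausible fix for the announcement problem diagnosed above is to set the announce list explicitly so the dockerized node advertises the host's public address. This is a sketch only; the actual change in the commit wasn't quoted here, and 203.0.113.10 stands in for the host's real public IP:

```shell
# Make the node advertise the host's public address to the DHT
# instead of the container-internal one.
ipfs config --json Addresses.Announce \
  '["/ip4/203.0.113.10/tcp/4001", "/ip4/203.0.113.10/udp/4001/quic"]'
```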
@kuzdogan thanks for the feedback! There is a trick when using IPNS pubsub: according to the docs, it has to be enabled in BOTH client and server. So even with it enabled on the Sourcify node, users have to enable it on their local node if they want to take advantage of it. My local node has it enabled and I'm experiencing good timings with ipfs resolve; if I turn it off, it always takes seconds or forever. When using public gateways, my experience is like yours, very irregular; my guess is that they probably don't use pubsub, so we experience the same ugly performance as a local node without pubsub enabled. So I think the "resolve" part is solved for now, as long as users use a local node and opt in to pubsub, and yes, it is also marked as experimental :) Regarding getting files, I tried your command:
But right now it is not able to complete; my guess is that the directory changed and the hash was garbage collected. If I do an ls with the IPNS name as a prefix:
it is able to complete. More specifically, doing an ls on a specific contract:
seems to return quickly (1.1 seconds). Getting the
Note: I ran all those successful examples on my local node with IPNS pubsub enabled; if I disable it, I get the same unusable performance:
~27 seconds for the first run, 4-5 seconds after that. Regarding badgerds, I cannot vouch for it; it just happened to be the only way I found to pin the entire repo on my machine :) Did you apply all those changes to production already? I tested against the production IPNS: it seems ipfs resolve is as fast as on the staging site, but getting files is still very slow, can you confirm? |
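The timing comparison above can be reproduced with something like the following. The IPNS name is a placeholder, and the numbers will vary by network and cache state:

```shell
# With pubsub enabled, repeat resolves should return almost instantly;
# without it, each resolve falls back to the slow DHT lookup.
time ipfs resolve -r /ipns/<sourcify-ipns-name>
```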
Yes, the resolution seems all fine with pubsub, and from what I understood in discussions, it has become the de facto standard; plain DHT resolution has always been slow. Currently the production setting has pubsub enabled but
So I'd say it is expected that resolution is quick but file retrieval is either really slow or impossible. What I noticed is that, since many files/hashes are shared between staging and prod, retrieval can be quick at times. But when given a path from the prod IPNS, it takes too long:
Once we apply the changes and fix the NAT, I expect production to behave similarly to staging, i.e. file retrievals happening in reasonable times. I guess then we can mark this as resolved :) Thank you very much for the very informative input. I will keep you updated here on how it goes. |
Looking forward to it! |
I guess the real validation for this issue will be if I and @MysticRyuujin can pin the entire production hash from ipns in our machines. |
We pushed the changes to production and it seems to be working fine. It also passes the ipfs-check tool test. Here's the id
Looking forward to hearing good news :) 🤞 |
I just tried some manual tests and it seems pretty good! I'm now trying to pin the IPNS name; it seems to be progressing, no hiccups. I'll measure the time (I hope it takes less than a few hours, not days!) and report back. |
~44 min; however, my local repo had objects from a previous backup I pinned manually, which probably reduced the total time. I'll rename my current repo, do a pin from scratch, and measure the total time again. |
Pinning from scratch, badgerds: ~1 hour, ~5GB
Later today I'll try to pin using the default flatfs storage. I tried it earlier but stopped after about an hour, as it was slowing down my computer a little. I'll try again later today when I get back to my home office, let it run non-stop, and then post the numbers here. |
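A sketch of the pin-from-scratch measurement being described (the name is a placeholder; `--progress` prints fetched-block counts as the download proceeds):

```shell
# Resolve the IPNS name and recursively pin the whole directory tree,
# timing the complete download.
time ipfs pin add --progress /ipns/<sourcify-ipns-name>
```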
I'd say pretty good. This means the data availability problems seem to be resolved 🎉. Curious how much difference flatfs vs badgerds will make; thanks a lot for investigating. We are also considering adding an external pin for redundancy; let me know if you have any takes on this: #560 |
Results recreating the repo and using the default ipfs configuration (flatfs):
I gave up yesterday after ~6 hours, and as you can see it had downloaded roughly 10% of the data. Also, my computer slowed down noticeably, which made me stop the test. It would be good if others could replicate the test so we have more data points (@MysticRyuujin? 😀), but my personal conclusion is that for Sourcify-like data (volume/number of files) the default configuration is not good, and this should be documented somewhere so people willing to contribute pinning can prepare in advance. After I stopped the test yesterday, I tried to recreate my repo using badgerds; however, I noticed that it started to get stuck at different points each time:
I just tried it again now, hoping something went crazy yesterday due to repo creation/recreation and network peering, but I'm still getting stuck. Not sure if it is something on my side this time or something went wrong on the Sourcify IPFS side; are you able to pin from scratch without getting stuck right now? |
Hmm, is the command below supposed to work?
|
It seems it is back now :) |
Results from my second trial on pin from scratch + badgerds:
< 2:30 hours, still pretty affordable for home users with standard hardware. |
That's strange, this time my repo got smaller, ~3GB
No idea what happened this time. |
I started experimenting with the data and got an error when
Examining the raw data, it seems there are some contract files with ".." and "..." inside the filename. Not sure if this is a submitter mistake or a bug in the Sourcify code. Example:
Anyway, the IPFS client should handle this properly, so I reported it there. |
I guess we can close this issue. Feel free to reopen if anything comes up or comment on #560 |
it seems something is down again, I can't resolve the ipns name:
The public gateway at ipfs.io also can't resolve it. Not sure if it is just the name or the entire service is down. |
I opened another issue since this one is closed. |
Can't load the IPNS hash
k51qzi5uqu5dll0ocge71eudqnrgnogmbr37gsgl12uubsinphjoknl6bbi41p
from any IPFS gateway currently, even the one on the website, gateway.ipfs.io,
which seems weird since I've pinned the IPNS to my own nodes personally, with pubsub support.