get/import: could not perform a HEAD request #2600
Comments
p.s. I've also tried without |
@jorgeorpinel should it be |
🤦‍♂️ Oops. I forgot I changed the directory name from
$ dvc import --rev 0547f58 \
git@github.com:iterative/dataset-registry.git \
use-cases/cats-dogs
Importing 'use-cases/cats-dogs (git@github.com:iterative/dataset-registry.git)' -> 'cats-dogs'
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: ../../../../private/var/folders/_c/3mt_xn_d4xl2ddsx2m98h_r40000gn/T/tmpfnwm64lqdvc-repo/use-cases/cats-dogs, md5: b6923e1e4ad16ea1a7e2b328842d56a2.dir
Missing cache for directory '../../../../private/var/folders/_c/3mt_xn_d4xl2ddsx2m98h_r40000gn/T/tmpfnwm64lqdvc-repo/use-cases/cats-dogs'. Cache for files inside will be lost. Would you like to continue? Use '-f' to force. [y/n] y
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: ../../../../private/var/folders/_c/3mt_xn_d4xl2ddsx2m98h_r40000gn/T/tmpfnwm64lqdvc-repo/use-cases/cats-dogs, md5: b6923e1e4ad16ea1a7e2b328842d56a2.dir
WARNING: Cache 'b6923e1e4ad16ea1a7e2b328842d56a2.dir' not found. File 'cats-dogs' won't be created.
ERROR: failed to import 'use-cases/cats-dogs' from 'git@github.com:iterative/dataset-registry.git'. - output 'cats-dogs' does not exist
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
What does the "output 'cats-dogs' does not exist" error mean? |
Hmmmm... Apparently I didn't push that version of the |
@jorgeorpinel yes, please open a new UI issue! |
Done! #2602 |
So, I pushed the data to the remote now and checked that it actually exists on S3:
$ aws s3 ls s3://dvc-public/remote/dataset-registry/b6/
2019-10-05 01:51:13 6388 2f5c18d1af468fd41c979873a8404b
2019-10-05 01:51:41 22202 4ced1e881cc37c0e0673bafe6e789c
2019-10-12 19:10:03 161184 923e1e4ad16ea1a7e2b328842d56a2.dir <-- Bingo
2019-10-05 01:50:56 17450 efd10ab38ff17fa593e3b102d088ac
However, when I try to import it (into the same empty non-Git DVC project), the progress bar runs for a while, up to around 90%, then suddenly disappears and I get:
$ dvc import --rev 0547f58 \
git@github.com:iterative/dataset-registry.git \
use-cases/cats-dogs
Importing 'use-cases/cats-dogs (git@github.com:iterative/dataset-registry.git)' -> 'cats-dogs'
ERROR: failed to import 'use-cases/cats-dogs' from 'git@github.com:iterative/dataset-registry.git'. - could not perform a HEAD request
And nothing is downloaded. I've tried several times. My Internet connection is fine:
|
p.s. Here's the last part of the That one run failed at file
$ aws s3 ls s3://dvc-public/remote/dataset-registry/ad/b29c1de1624c53c808f1a15bd332ba
2019-10-05 01:51:44 22427 b29c1de1624c53c808f1a15bd332ba |
@iterative/engineering |
Can reproduce on my mac, but not on linux
|
Reproduction steps for Linux:
Number of max connections here needs to be changed to some big amount. For me 10k worked. |
It seems like we are hitting some limit here. |
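One limit worth checking is the per-process cap on open file descriptors, which is what the eventual fix in this thread points to. The following is only a minimal sketch for inspecting that cap, assuming a Unix-like system (the standard-library `resource` module is not available on Windows). The helper names `fd_limits` and `open_fd_count` are hypothetical and not part of DVC.

```python
import os
import resource  # standard library, Unix-only


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count():
    """Count descriptors currently open, or return None if the OS does not expose them.

    The count includes the descriptor used to list the directory itself.
    """
    for fd_dir in ("/proc/self/fd", "/dev/fd"):  # Linux, then macOS/BSD
        if os.path.isdir(fd_dir):
            return len(os.listdir(fd_dir))
    return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft limit: {soft}, hard limit: {hard}, currently open: {open_fd_count()}")
```

If the number of open descriptors climbs toward the soft limit while many requests are in flight, that would be consistent with sockets not being reused.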
Related: #2473 |
It seems that the problem is that, with every request sent, we are reserving a socket "through" example: and run:
|
I didn't have any problem pushing this whole directory (1800 images) from the source project though. I'm guessing probably |
|
p.s. I also just tried |
Little summary so far:
Possible way of handling the problem: |
Can reproduce this same bug on windows too :( |
EDIT: wrong issue, it was meant for #2589 |
https://requests.kennethreitz.org/en/master/user/advanced/ says that a session uses a connection pool by default. Changing to a session instead of calling requests.request directly made everything work for me, and I no longer see fluctuations in fd numbers. Will send a patch ASAP. Kudos @pared 🎉 |
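To illustrate why reusing connections matters here, below is a rough, dependency-free sketch. It is not DVC's code, and it deliberately swaps `requests` for the standard-library `http.client` so it runs anywhere. A throwaway local server counts accepted TCP connections while a client sends HEAD requests, first over a fresh connection per request and then over one kept-alive connection. All class and function names are made up for this example. With `requests`, the analogous change is to create one `requests.Session()` and call `session.head(url)` on it instead of issuing standalone requests.

```python
import http.client
import http.server
import threading


class HeadHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive, so a connection can be reused

    def do_HEAD(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


class CountingServer(http.server.ThreadingHTTPServer):
    """Local test server that counts how many TCP connections it accepts."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.accepted = 0

    def get_request(self):
        request = super().get_request()
        self.accepted += 1  # one increment per accepted socket
        return request


def head_without_reuse(host, port, paths):
    """Open and close a new connection (a new socket/fd) for every HEAD request."""
    for path in paths:
        conn = http.client.HTTPConnection(host, port)
        conn.request("HEAD", path)
        conn.getresponse().read()
        conn.close()


def head_with_reuse(host, port, paths):
    """Send every HEAD request over a single kept-alive connection."""
    conn = http.client.HTTPConnection(host, port)
    try:
        for path in paths:
            conn.request("HEAD", path)
            conn.getresponse().read()
    finally:
        conn.close()


def run_demo(n_requests=20):
    """Return (connections opened without reuse, connections opened with reuse)."""
    server = CountingServer(("127.0.0.1", 0), HeadHandler)
    host, port = server.server_address
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        paths = [f"/file-{i}" for i in range(n_requests)]
        head_without_reuse(host, port, paths)
        without_reuse = server.accepted
        head_with_reuse(host, port, paths)
        with_reuse = server.accepted - without_reuse
    finally:
        server.shutdown()
        server.server_close()
    return without_reuse, with_reuse


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the first strategy opens one socket per request, while the second opens a single socket for the whole batch. That difference is what keeps the descriptor count flat when many HEAD requests are issued.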
This way we are able to properly utilize automatic connection pools and not create new fds for each request, which overflows ulimit for max fds very quickly on mac and windows. Kudos @pared for investigating 🎉 Fixes iterative#2600 Signed-off-by: Ruslan Kuprieiev <[email protected]>
I can confirm it's fixed for me as well in DVC 0.63.4. Thanks!!! |
I'm trying to import a directory versioned in our own dataset registry project into an empty, non-Git DVC project, but getting this cryptic error:
The directory in question has file name b6923e1e4ad16ea1a7e2b328842d56a2.dir (see use-cases/cats-dogs.dvc of that version). The default remote is [configured](https://github.com/iterative/dataset-registry/blob/master/.dvc/config) to https://remote.dvc.org/dataset-registry (which is an HTTP redirect to the s3://dvc-public/remote/dataset-registry bucket). The file seems to be in the remote.
Am I just doing something wrong here (hopefully), or is dvc import broken?