get: trying to download a git-tracked directory fails #5356

Closed
skshetry opened this issue Jan 28, 2021 · 14 comments
Labels
bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP. research

Comments

@skshetry
Member

Bug Report

$ dvc get https://github.com/schacon/cowsay cows -v
2021-01-28 20:45:34,300 DEBUG: Creating external repo https://github.com/schacon/cowsay@None
2021-01-28 20:45:34,301 DEBUG: erepo: git clone 'https://github.com/schacon/cowsay' to a temporary dir
2021-01-28 20:45:37,886 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/udder.cow' to 'cows/udder.cow'
2021-01-28 20:45:37,886 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/elephant-in-snake.cow' to 'cows/elephant-in-snake.cow'
2021-01-28 20:45:37,887 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/moose.cow' to 'cows/moose.cow'
2021-01-28 20:45:37,887 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/vader-koala.cow' to 'cows/vader-koala.cow'
2021-01-28 20:45:37,888 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/mutilated.cow' to 'cows/mutilated.cow'
2021-01-28 20:45:37,888 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/default.cow' to 'cows/default.cow'
2021-01-28 20:45:37,888 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/turkey.cow' to 'cows/turkey.cow'
2021-01-28 20:45:37,889 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/stegosaurus.cow' to 'cows/stegosaurus.cow'
2021-01-28 20:45:37,890 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/vader.cow' to 'cows/vader.cow'
2021-01-28 20:45:37,891 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/tux.cow' to 'cows/tux.cow'
2021-01-28 20:45:37,891 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/supermilker.cow' to 'cows/supermilker.cow'
2021-01-28 20:45:37,891 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/kiss.cow' to 'cows/kiss.cow'
2021-01-28 20:45:37,892 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/mech-and-cow' to 'cows/mech-and-cow'
2021-01-28 20:45:37,892 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/dragon-and-cow.cow' to 'cows/dragon-and-cow.cow'
2021-01-28 20:45:37,893 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/luke-koala.cow' to 'cows/luke-koala.cow'
2021-01-28 20:45:37,893 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/flaming-sheep.cow' to 'cows/flaming-sheep.cow'
2021-01-28 20:45:37,894 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/squirrel.cow' to 'cows/squirrel.cow'
2021-01-28 20:45:37,895 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/www.cow' to 'cows/www.cow'
2021-01-28 20:45:37,896 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/bunny.cow' to 'cows/bunny.cow'
2021-01-28 20:45:37,897 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/cheese.cow' to 'cows/cheese.cow'
2021-01-28 20:45:37,897 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/sodomized.cow' to 'cows/sodomized.cow'
2021-01-28 20:45:37,900 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/skeleton.cow' to 'cows/skeleton.cow'
2021-01-28 20:45:37,900 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/kosh.cow' to 'cows/kosh.cow'
2021-01-28 20:45:37,900 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/cower.cow' to 'cows/cower.cow'
2021-01-28 20:45:37,901 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/milk.cow' to 'cows/milk.cow'
2021-01-28 20:45:37,905 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/bong.cow' to 'cows/bong.cow'
2021-01-28 20:45:37,905 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/ghostbusters.cow' to 'cows/ghostbusters.cow'
2021-01-28 20:45:37,906 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/dragon.cow' to 'cows/dragon.cow'
2021-01-28 20:45:37,907 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/beavis.zen.cow' to 'cows/beavis.zen.cow'
2021-01-28 20:45:37,909 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/sheep.cow' to 'cows/sheep.cow'
2021-01-28 20:45:37,910 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/moofasa.cow' to 'cows/moofasa.cow'
2021-01-28 20:45:37,910 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/eyes.cow' to 'cows/eyes.cow'
2021-01-28 20:45:37,912 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/three-eyes.cow' to 'cows/three-eyes.cow'
2021-01-28 20:45:37,913 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/satanic.cow' to 'cows/satanic.cow'
2021-01-28 20:45:37,913 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/stimpy.cow' to 'cows/stimpy.cow'
2021-01-28 20:45:37,914 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/ren.cow' to 'cows/ren.cow'
2021-01-28 20:45:37,914 DEBUG: Downloading '../../../../../tmp/tmpfxyquu5ydvc-clone/cows/meow.cow' to 'cows/meow.cow'
2021-01-28 20:45:37,931 DEBUG: Removing '/home/saugat/repos/iterative/dvc/.2PaKC9XmePtLojQ9YYkjWq'
2021-01-28 20:45:37,940 ERROR: unexpected error - decompressed data does not match expected size
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/saugat/repos/iterative/dvc/dvc/main.py", line 50, in main
    ret = cmd.run()
  File "/home/saugat/repos/iterative/dvc/dvc/command/get.py", line 31, in run
    return self._get_file_from_repo()
  File "/home/saugat/repos/iterative/dvc/dvc/command/get.py", line 37, in _get_file_from_repo
    Repo.get(
  File "/home/saugat/repos/iterative/dvc/dvc/repo/get.py", line 55, in get
    repo.repo_tree.download(from_info, to_info, follow_subrepos=False)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 393, in download
    return self._download_dir(
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 435, in _download_dir
    raise exc
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/saugat/repos/iterative/dvc/dvc/progress.py", line 129, in wrapped
    res = fn(*args, **kwargs)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 447, in _download_file
    self._download(  # noqa, pylint: disable=no-member
  File "/home/saugat/repos/iterative/dvc/dvc/tree/repo.py", line 415, in _download
    with self.open(from_info, "rb", **kwargs) as from_fobj:
  File "/home/saugat/repos/iterative/dvc/dvc/tree/repo.py", line 137, in open
    return tree.open(path_info, mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/git.py", line 59, in open
    return self.trie.open(key, mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/objects.py", line 74, in open
    return obj.open(mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/backend/dulwich.py", line 35, in open
    obj = self.repo[self.sha]
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/repo.py", line 737, in __getitem__
    return self.object_store[name]
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/object_store.py", line 121, in __getitem__
    type_num, uncomp = self.get_raw(sha)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/object_store.py", line 486, in get_raw
    return pack.get_raw(sha)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/pack.py", line 2057, in get_raw
    obj_type, obj = self.data.get_object_at(offset)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/pack.py", line 1304, in get_object_at
    unpacked, _ = unpack_object(self._file.read)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/pack.py", line 769, in unpack_object
    unused = read_zlib_chunks(read_some, unpacked, buffer_size=zlib_bufsize,
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/pack.py", line 252, in read_zlib_chunks
    raise zlib.error('decompressed data does not match expected size')
zlib.error: decompressed data does not match expected size
------------------------------------------------------------
2021-01-28 20:45:38,693 DEBUG: Version info for developers:
DVC version: 2.0.0a0+bb4604
---------------------------------
Platform: Python 3.9.1 on Linux-5.10.7-arch1-1-x86_64-with-glibc2.32
Supports: azure, gdrive, gs, http, https, s3, ssh, oss, webdav, webdavs, webhdfs
Cache types: symlink
Cache directory: tmpfs on tmpfs
Caches: local
Remotes: https
Workspace directory: ext4 on /dev/sda9
Repo: dvc, git

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2021-01-28 20:45:38,695 DEBUG: Analytics is disabled.

Note that the command partially creates the directory and downloads some of its entries, but then fails, leaving the directory in a partial state with .tmp files inside.
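One way to avoid leaving a half-populated directory behind is to stage the download in a sibling directory and rename it into place only after every entry has succeeded. The following is a minimal sketch of that idea, not DVC's actual implementation; `download_dir_atomically` and the `copy_entries` callback are hypothetical names introduced for illustration.

```python
import os
import shutil
import tempfile


def download_dir_atomically(copy_entries, dest):
    """Populate `dest` via `copy_entries(tmp_dir)`; leave nothing behind on failure.

    `copy_entries` is a hypothetical callback that writes files into the
    directory it is given. It stands in for the per-file download step.
    """
    parent = os.path.dirname(os.path.abspath(dest))
    # Stage in a sibling directory so the final rename stays on one filesystem.
    tmp_dir = tempfile.mkdtemp(prefix=".partial-", dir=parent)
    try:
        copy_entries(tmp_dir)
        # Expose the directory only once it is complete. This assumes `dest`
        # does not already exist as a non-empty directory.
        os.rename(tmp_dir, dest)
    except BaseException:
        # Any failure (including Ctrl-C) removes the staged files.
        shutil.rmtree(tmp_dir, ignore_errors=True)
        raise
```

With this shape, a failure in any single file download surfaces the original exception but leaves no `cows` directory and no stray temporary files in the workspace.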

@skshetry skshetry added bug Did we break something? p0-critical Critical issue. Needs to be fixed ASAP. labels Jan 28, 2021
@skshetry
Member Author

We should really start writing e2e tests, at least for get/import[-url]. That said, our existing tests should have caught this as well; I wonder what they missed.
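As a rough idea of what such an e2e check could assert, here is a sketch of a helper that compares the directory in the source repo with the directory produced by the download. `assert_same_tree` is a hypothetical name, not an existing DVC test utility, and it only uses the standard library.

```python
import filecmp
import os


def assert_same_tree(expected, actual):
    """Fail if `actual` is not a file-for-file copy of `expected`."""
    cmp = filecmp.dircmp(expected, actual)
    assert not cmp.left_only, f"missing: {cmp.left_only}"
    assert not cmp.right_only, f"unexpected (e.g. leftover temp files): {cmp.right_only}"
    # dircmp's own file comparison is shallow (stat-based); compare bytes explicitly.
    _, mismatch, errors = filecmp.cmpfiles(
        expected, actual, cmp.common_files, shallow=False
    )
    assert not mismatch and not errors, f"differs: {mismatch + errors}"
    for sub in cmp.common_dirs:
        assert_same_tree(os.path.join(expected, sub), os.path.join(actual, sub))
```

An e2e test could run the real `dvc get` command against a small local Git repo and then call this helper on the tracked directory and the downloaded copy, which would flag both missing entries and leftover .tmp files.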

@efiop
Contributor

efiop commented Jan 28, 2021

Might be related to my erepo patches, I'll check.

EDIT: ah, unlikely. Something else is going on.

@efiop
Contributor

efiop commented Jan 28, 2021

Hm, unable to reproduce. Are you getting that error consistently or from time to time?

Also, could you show pip freeze | grep dulwich?

@skshetry
Member Author

@efiop, I am getting it consistently. Does the above command work for you?

$ pip freeze | grep dulwich
dulwich==0.20.15
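The same check can be done from inside the Python environment that runs dvc, which rules out a mismatch between the `pip` on PATH and the interpreter in use. This is a small sketch using only the standard library; `installed_version` is a hypothetical helper name, and the package names are the three Git backends that come up in this thread.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The three Git backends that appear in this thread.
for name in ("dulwich", "GitPython", "pygit2"):
    print(f"{name}: {installed_version(name)}")
```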

@efiop
Contributor

efiop commented Jan 28, 2021

Yep, same dulwich version.

Running that dvc get dozens of times now, not able to reproduce yet 🤔

@skshetry
Member Author

Hmm, dvc installed with pip install git+https://github.com/iterative/dvc#egg=dvc also gives me the same error message.

It happens not only with the URL but also with a local repo.

@efiop
Contributor

efiop commented Jan 28, 2021

@skshetry I see that you are using tmpfs, could you try on a regular fs? Just wondering if it is too fast and that results in some race condition in dulwich or dvc.
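For context on the kind of race being suspected here: the traceback shows files being downloaded from worker threads, each ending in a read on the pack's file object. If several threads share one file object, a `seek()` followed by a `read()` is not atomic, and interleaving could produce exactly this sort of corrupted-zlib symptom. Whether that is the actual root cause is not established in this thread. The sketch below only illustrates the general pattern and one common mitigation (serializing seek+read with a lock); `SharedReader` and `demo` are hypothetical names, not dulwich or DVC code.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

RECORD = 4096  # size of each fixed-width record in the demo file


class SharedReader:
    """Reads fixed-size records from one file object shared by many threads."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._lock = threading.Lock()

    def read_record(self, index):
        # seek() followed by read() is two steps; without the lock another
        # thread can move the file position in between them.
        with self._lock:
            self._file.seek(index * RECORD)
            return self._file.read(RECORD)

    def close(self):
        self._file.close()


def demo(n_records=64, workers=8):
    """Return True if every record read concurrently came back intact."""
    # Record i is filled with the byte value i, so corruption is easy to spot.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        for i in range(n_records):
            tmp.write(bytes([i]) * RECORD)
    reader = SharedReader(tmp.name)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(reader.read_record, range(n_records)))
    finally:
        reader.close()
        os.unlink(tmp.name)
    return all(data == bytes([i]) * RECORD for i, data in enumerate(results))
```

An alternative to the lock is to give each thread its own file handle, which avoids contention at the cost of more open descriptors.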

@skshetry
Member Author

I'll close this issue for now, then. Will investigate the cause tomorrow.

@skshetry
Member Author

skshetry commented Jan 28, 2021

> I see that you are using tmpfs, could you try on regular fs?

Ahh, I was using a global cache directory when trying this out, but I removed it after creating this issue.

Here's the updated traceback:

$ dvc get https://github.com/schacon/cowsay cows -v
2021-01-28 22:53:28,418 DEBUG: Creating external repo https://github.com/schacon/cowsay@None
2021-01-28 22:53:28,419 DEBUG: erepo: git clone 'https://github.com/schacon/cowsay' to a temporary dir
2021-01-28 22:53:32,353 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/dragon-and-cow.cow' to 'cows/dragon-and-cow.cow'
2021-01-28 22:53:32,355 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/flaming-sheep.cow' to 'cows/flaming-sheep.cow'
2021-01-28 22:53:32,355 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/stegosaurus.cow' to 'cows/stegosaurus.cow'
2021-01-28 22:53:32,355 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/vader.cow' to 'cows/vader.cow'
2021-01-28 22:53:32,356 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/head-in.cow' to 'cows/head-in.cow'
2021-01-28 22:53:32,357 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/surgery.cow' to 'cows/surgery.cow'
2021-01-28 22:53:32,357 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/telebears.cow' to 'cows/telebears.cow'
2021-01-28 22:53:32,358 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/turtle.cow' to 'cows/turtle.cow'
2021-01-28 22:53:32,358 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/three-eyes.cow' to 'cows/three-eyes.cow'
2021-01-28 22:53:32,358 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/elephant-in-snake.cow' to 'cows/elephant-in-snake.cow'
2021-01-28 22:53:32,358 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/udder.cow' to 'cows/udder.cow'
2021-01-28 22:53:32,359 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/stimpy.cow' to 'cows/stimpy.cow'
2021-01-28 22:53:32,360 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/cower.cow' to 'cows/cower.cow'
2021-01-28 22:53:32,360 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/cheese.cow' to 'cows/cheese.cow'
2021-01-28 22:53:32,361 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/satanic.cow' to 'cows/satanic.cow'
2021-01-28 22:53:32,361 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/squirrel.cow' to 'cows/squirrel.cow'
2021-01-28 22:53:32,361 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/ghostbusters.cow' to 'cows/ghostbusters.cow'
2021-01-28 22:53:32,362 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/moose.cow' to 'cows/moose.cow'
2021-01-28 22:53:32,363 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/sodomized.cow' to 'cows/sodomized.cow'
2021-01-28 22:53:32,365 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/kiss.cow' to 'cows/kiss.cow'
2021-01-28 22:53:32,367 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/kosh.cow' to 'cows/kosh.cow'
2021-01-28 22:53:32,368 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/luke-koala.cow' to 'cows/luke-koala.cow'
2021-01-28 22:53:32,368 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/bunny.cow' to 'cows/bunny.cow'
2021-01-28 22:53:32,369 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/vader-koala.cow' to 'cows/vader-koala.cow'
2021-01-28 22:53:32,371 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/beavis.zen.cow' to 'cows/beavis.zen.cow'
2021-01-28 22:53:32,373 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/bud-frogs.cow' to 'cows/bud-frogs.cow'
2021-01-28 22:53:32,374 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/hellokitty.cow' to 'cows/hellokitty.cow'
2021-01-28 22:53:32,375 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/meow.cow' to 'cows/meow.cow'
2021-01-28 22:53:32,376 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/eyes.cow' to 'cows/eyes.cow'
2021-01-28 22:53:32,377 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/small.cow' to 'cows/small.cow'
2021-01-28 22:53:32,379 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/sheep.cow' to 'cows/sheep.cow'
2021-01-28 22:53:32,381 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/ren.cow' to 'cows/ren.cow'
2021-01-28 22:53:32,382 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/daemon.cow' to 'cows/daemon.cow'
2021-01-28 22:53:32,382 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/kitty.cow' to 'cows/kitty.cow'
2021-01-28 22:53:32,384 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/milk.cow' to 'cows/milk.cow'
2021-01-28 22:53:32,385 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/mech-and-cow' to 'cows/mech-and-cow'
2021-01-28 22:53:32,386 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/mutilated.cow' to 'cows/mutilated.cow'
2021-01-28 22:53:32,388 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/www.cow' to 'cows/www.cow'
2021-01-28 22:53:32,388 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/dragon.cow' to 'cows/dragon.cow'
2021-01-28 22:53:32,390 DEBUG: Downloading '../../../../../tmp/tmpyq3s66fcdvc-clone/cows/turkey.cow' to 'cows/turkey.cow'
2021-01-28 22:53:32,407 DEBUG: Removing '/home/saugat/repos/iterative/dvc/.YPg2bmqbXPvwReTicionjs'
2021-01-28 22:53:32,416 ERROR: unexpected error - Error -3 while decompressing data: incorrect header check
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/saugat/repos/iterative/dvc/dvc/main.py", line 50, in main
    ret = cmd.run()
  File "/home/saugat/repos/iterative/dvc/dvc/command/get.py", line 31, in run
    return self._get_file_from_repo()
  File "/home/saugat/repos/iterative/dvc/dvc/command/get.py", line 37, in _get_file_from_repo
    Repo.get(
  File "/home/saugat/repos/iterative/dvc/dvc/repo/get.py", line 55, in get
    repo.repo_tree.download(from_info, to_info, follow_subrepos=False)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 393, in download
    return self._download_dir(
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 435, in _download_dir
    raise exc
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/saugat/repos/iterative/dvc/dvc/progress.py", line 129, in wrapped
    res = fn(*args, **kwargs)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 447, in _download_file
    self._download(  # noqa, pylint: disable=no-member
  File "/home/saugat/repos/iterative/dvc/dvc/tree/repo.py", line 415, in _download
    with self.open(from_info, "rb", **kwargs) as from_fobj:
  File "/home/saugat/repos/iterative/dvc/dvc/tree/repo.py", line 137, in open
    return tree.open(path_info, mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/git.py", line 59, in open
    return self.trie.open(key, mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/objects.py", line 74, in open
    return obj.open(mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/backend/dulwich.py", line 35, in open
    obj = self.repo[self.sha]
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/repo.py", line 737, in __getitem__
    return self.object_store[name]
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/object_store.py", line 121, in __getitem__
    type_num, uncomp = self.get_raw(sha)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/object_store.py", line 486, in get_raw
    return pack.get_raw(sha)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/pack.py", line 2057, in get_raw
    obj_type, obj = self.data.get_object_at(offset)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/pack.py", line 1304, in get_object_at
    unpacked, _ = unpack_object(self._file.read)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/pack.py", line 769, in unpack_object
    unused = read_zlib_chunks(read_some, unpacked, buffer_size=zlib_bufsize,
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/dulwich/pack.py", line 235, in read_zlib_chunks
    decomp = decomp_obj.decompress(add)
zlib.error: Error -3 while decompressing data: incorrect header check
------------------------------------------------------------
2021-01-28 22:53:33,152 DEBUG: Version info for developers:
DVC version: 2.0.0a0+bb4604.mod
---------------------------------
Platform: Python 3.9.1 on Linux-5.10.7-arch1-1-x86_64-with-glibc2.32
Supports: azure, gdrive, gs, http, https, s3, ssh, oss, webdav, webdavs, webhdfs
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sda9
Caches: local
Remotes: https
Workspace directory: ext4 on /dev/sda9
Repo: dvc, git

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2021-01-28 22:53:33,154 DEBUG: Analytics is disabled.

@efiop
Contributor

efiop commented Jan 28, 2021

whoa, a new type of error. 😱 CC @iterative/dvc guys, could you give that command a shot? Just as a sanity check.

@efiop
Contributor

efiop commented Jan 28, 2021

@skshetry Let's keep it open for now.

@efiop efiop reopened this Jan 28, 2021
@skshetry
Member Author

skshetry commented Jan 28, 2021

> whoa, a new type of error

Sometimes it's this one, other times it's the one above. But it fails consistently. The order in which the files are downloaded varies between runs, though. Each time, the directory is only partially downloaded before the failure.

@skshetry
Member Author

The results with other backends are even stranger:

  1. GitPython
$ dvc get https://github.com/schacon/cowsay cows -v
2021-01-29 12:49:15,309 DEBUG: Creating external repo https://github.com/schacon/cowsay@None
2021-01-29 12:49:15,310 DEBUG: erepo: git clone 'https://github.com/schacon/cowsay' to a temporary dir
2021-01-29 12:49:18,917 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/udder.cow' to 'cows/udder.cow'
2021-01-29 12:49:18,917 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/sheep.cow' to 'cows/sheep.cow'
2021-01-29 12:49:18,918 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/bunny.cow' to 'cows/bunny.cow'
2021-01-29 12:49:18,918 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/satanic.cow' to 'cows/satanic.cow'
2021-01-29 12:49:18,919 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/supermilker.cow' to 'cows/supermilker.cow'
2021-01-29 12:49:18,919 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/daemon.cow' to 'cows/daemon.cow'
2021-01-29 12:49:18,920 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/hellokitty.cow' to 'cows/hellokitty.cow'
2021-01-29 12:49:18,920 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/turkey.cow' to 'cows/turkey.cow'
2021-01-29 12:49:18,921 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/koala.cow' to 'cows/koala.cow'
2021-01-29 12:49:18,921 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/sodomized.cow' to 'cows/sodomized.cow'
2021-01-29 12:49:18,923 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/small.cow' to 'cows/small.cow'
2021-01-29 12:49:18,924 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/bong.cow' to 'cows/bong.cow'
2021-01-29 12:49:18,924 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/three-eyes.cow' to 'cows/three-eyes.cow'
2021-01-29 12:49:18,924 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/kitty.cow' to 'cows/kitty.cow'
2021-01-29 12:49:18,926 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/elephant-in-snake.cow' to 'cows/elephant-in-snake.cow'
2021-01-29 12:49:18,927 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/eyes.cow' to 'cows/eyes.cow'
2021-01-29 12:49:18,930 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/default.cow' to 'cows/default.cow'
2021-01-29 12:49:18,931 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/milk.cow' to 'cows/milk.cow'
2021-01-29 12:49:18,932 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/moofasa.cow' to 'cows/moofasa.cow'
2021-01-29 12:49:18,934 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/cheese.cow' to 'cows/cheese.cow'
2021-01-29 12:49:18,935 DEBUG: Downloading '../../../../../tmp/tmpfs9qbgikdvc-clone/cows/bud-frogs.cow' to 'cows/bud-frogs.cow'
2021-01-29 12:49:18,951 DEBUG: Removing '/home/saugat/repos/iterative/dvc/.NGMhxvMD6hfJNhXUgwC6nX'
2021-01-29 12:49:18,962 ERROR: unexpected error - SHA b'##' could not be resolved, git returned: b'##'
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/saugat/repos/iterative/dvc/dvc/main.py", line 50, in main
    ret = cmd.run()
  File "/home/saugat/repos/iterative/dvc/dvc/command/get.py", line 31, in run
    return self._get_file_from_repo()
  File "/home/saugat/repos/iterative/dvc/dvc/command/get.py", line 37, in _get_file_from_repo
    Repo.get(
  File "/home/saugat/repos/iterative/dvc/dvc/repo/get.py", line 55, in get
    repo.repo_tree.download(from_info, to_info, follow_subrepos=False)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 393, in download
    return self._download_dir(
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 435, in _download_dir
    raise exc
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/saugat/repos/iterative/dvc/dvc/progress.py", line 129, in wrapped
    res = fn(*args, **kwargs)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 447, in _download_file
    self._download(  # noqa, pylint: disable=no-member
  File "/home/saugat/repos/iterative/dvc/dvc/tree/repo.py", line 415, in _download
    with self.open(from_info, "rb", **kwargs) as from_fobj:
  File "/home/saugat/repos/iterative/dvc/dvc/tree/repo.py", line 137, in open
    return tree.open(path_info, mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/git.py", line 59, in open
    return self.trie.open(key, mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/objects.py", line 74, in open
    return obj.open(mode=mode, encoding=encoding)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/backend/gitpython.py", line 63, in open
    data = self.obj.data_stream.read()
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/git/objects/base.py", line 112, in data_stream
    return self.repo.odb.stream(self.binsha)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/git/db.py", line 42, in stream
    hexsha, typename, size, stream = self._git.stream_object_data(bin_to_hex(sha))
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/git/cmd.py", line 1086, in stream_object_data
    hexsha, typename, size = self.__get_object_header(cmd, ref)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/git/cmd.py", line 1058, in __get_object_header
    return self._parse_object_header(cmd.stdout.readline())
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/git/cmd.py", line 1022, in _parse_object_header
    raise ValueError("SHA %s could not be resolved, git returned: %r" % (tokens[0], header_line.strip()))
ValueError: SHA b'##' could not be resolved, git returned: b'##'
------------------------------------------------------------
2021-01-29 12:49:19,737 DEBUG: Version info for developers:
DVC version: 2.0.0a0+bb4604.mod 
---------------------------------
Platform: Python 3.9.1 on Linux-5.10.11-arch1-1-x86_64-with-glibc2.32
Supports: azure, gdrive, gs, http, https, s3, ssh, oss, webdav, webdavs, webhdfs
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sda9
Caches: local
Remotes: https
Workspace directory: ext4 on /dev/sda9
Repo: dvc, git

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
  2. pygit2
$ dvc get https://github.com/schacon/cowsay cows -v
2021-01-29 12:50:06,146 DEBUG: Creating external repo https://github.com/schacon/cowsay@None
2021-01-29 12:50:06,147 DEBUG: erepo: git clone 'https://github.com/schacon/cowsay' to a temporary dir
2021-01-29 12:50:09,697 DEBUG: Removing '/home/saugat/repos/iterative/dvc/.UjjDH7jJPD7FsqFSDag2ck'                                      
2021-01-29 12:50:09,708 ERROR: unexpected error - an integer is required
------------------------------------------------------------
Traceback (most recent call last):
  File "/home/saugat/repos/iterative/dvc/dvc/main.py", line 50, in main
    ret = cmd.run()
  File "/home/saugat/repos/iterative/dvc/dvc/command/get.py", line 31, in run
    return self._get_file_from_repo()
  File "/home/saugat/repos/iterative/dvc/dvc/command/get.py", line 37, in _get_file_from_repo
    Repo.get(
  File "/home/saugat/repos/iterative/dvc/dvc/repo/get.py", line 55, in get
    repo.repo_tree.download(from_info, to_info, follow_subrepos=False)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 393, in download
    return self._download_dir(
  File "/home/saugat/repos/iterative/dvc/dvc/tree/base.py", line 401, in _download_dir
    from_infos = list(self.walk_files(from_info, **kwargs))
  File "/home/saugat/repos/iterative/dvc/dvc/tree/repo.py", line 369, in walk_files
    for root, _, files in self.walk(top, **kwargs):
  File "/home/saugat/repos/iterative/dvc/dvc/tree/repo.py", line 355, in walk
    if not dvc_tree or (repo_exists and dvc_tree.isdvc(top)):
  File "/home/saugat/repos/iterative/dvc/dvc/tree/dvc.py", line 207, in isdvc
    meta = self.metadata(path)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/dvc.py", line 243, in metadata
    outs = self._find_outs(path_info, strict=False, recursive=True)
  File "/home/saugat/repos/iterative/dvc/dvc/tree/dvc.py", line 33, in _find_outs
    outs = self.repo.find_outs_by_path(path, *args, **kwargs)
  File "/home/saugat/repos/iterative/dvc/dvc/repo/__init__.py", line 433, in find_outs_by_path
    outs = outs or self.outs_graph
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/home/saugat/repos/iterative/dvc/dvc/repo/__init__.py", line 411, in outs_graph
    return build_outs_graph(self.graph, self.outs_trie)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/home/saugat/repos/iterative/dvc/dvc/repo/__init__.py", line 407, in graph
    return build_graph(self.stages, self.outs_trie)
  File "/home/saugat/venvs/dvc/env39/lib/python3.9/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/home/saugat/repos/iterative/dvc/dvc/repo/__init__.py", line 429, in stages
    return self.stage.collect_repo(onerror=error_handler)
  File "/home/saugat/repos/iterative/dvc/dvc/repo/stage.py", line 397, in collect_repo
    for root, dirs, files in self.tree.walk(self.repo.root_dir):
  File "/home/saugat/repos/iterative/dvc/dvc/tree/git.py", line 109, in walk
    if not self.isdir(top, use_dvcignore=use_dvcignore):
  File "/home/saugat/repos/iterative/dvc/dvc/tree/git.py", line 88, in isdir
    return self.trie.isdir(key)
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/objects.py", line 84, in isdir
    return obj.isdir
  File "/home/saugat/repos/iterative/dvc/dvc/scm/git/objects.py", line 40, in isdir
    return stat.S_ISDIR(self.mode)
TypeError: an integer is required
------------------------------------------------------------
2021-01-29 12:50:10,419 DEBUG: Version info for developers:
DVC version: 2.0.0a0+bb4604.mod 
---------------------------------
Platform: Python 3.9.1 on Linux-5.10.11-arch1-1-x86_64-with-glibc2.32
Supports: azure, gdrive, gs, http, https, s3, ssh, oss, webdav, webdavs, webhdfs
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sda9
Caches: local
Remotes: https
Workspace directory: ext4 on /dev/sda9
Repo: dvc, git

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

I guess pygit2 is not completely functional right now (this could be fixed easily, as a wrong mode is being returned somewhere), but what's up with GitPython? A race condition between backends?
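On the pygit2 failure: the traceback ends in `stat.S_ISDIR(self.mode)` raising `TypeError`, which suggests `mode` was not an integer at that point (what it actually was is not shown in this thread). Below is a minimal sketch of a guard that makes such a problem easier to diagnose. `is_dir_mode` is a hypothetical helper, not part of DVC; the two constants are the standard modes Git stores for tree and regular-file entries.

```python
import stat

GIT_TREE_MODE = 0o040000  # mode Git stores for a directory (tree) entry
GIT_BLOB_MODE = 0o100644  # mode Git stores for a regular, non-executable file


def is_dir_mode(mode):
    """Like stat.S_ISDIR, but reject non-integer modes with a clearer error."""
    if not isinstance(mode, int):
        raise TypeError(f"expected an integer file mode, got {mode!r}")
    return stat.S_ISDIR(mode)
```

A check like this would not fix the backend, but it would turn the bare "an integer is required" message into one that shows which value was actually returned as the mode.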

@dberenbaum dberenbaum mentioned this issue Jan 29, 2021
31 tasks
@efiop efiop added research p1-important Important, aka current backlog of things to do and removed p1-important Important, aka current backlog of things to do labels Feb 1, 2021
@skshetry
Member Author

I can no longer reproduce this. Closing ...
