DVC add fails when there are broken symlinks in the dataset #3717
efiop added the bug (Did we break something?) and p2-medium (Medium priority, should be done, but less important) labels on May 3, 2020
I get the following error now. I think this should error out, and should not suppress the error or add the file anyway.
$ dvc add data
Adding...
ERROR: output 'data' does not exist: [Errno 2] No such file or directory: '/Users/saugat/Projects/iterative/dvc/data/bar'
Verbose logging
$ dvc add data -v
2024-03-26 15:50:01,042 DEBUG: v3.48.5.dev11+ge223f51e7, CPython 3.12.2 on macOS-14.4-x86_64-i386-64bit
2024-03-26 15:50:01,043 DEBUG: command: /Users/saugat/Projects/iterative/dvc/.venv/bin/dvc add data -v
Adding...
2024-03-26 15:50:01,625 ERROR: output 'data' does not exist: [Errno 2] No such file or directory: '/Users/saugat/Projects/iterative/dvc/data/bar'
Traceback (most recent call last):
File "/Users/saugat/Projects/iterative/dvc/dvc/output.py", line 1359, in add
staging, meta, obj = self._build(
^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/dvc/output.py", line 545, in _build
return build(*args, callback=pb.as_callback(), **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/dvc_data/hashfile/build.py", line 257, in build
meta, obj = _build_tree(
^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/dvc_data/hashfile/build.py", line 145, in _build_tree
meta, obj = _build_file(
^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/dvc_data/hashfile/build.py", line 74, in _build_file
meta, hash_info = hash_file(path, fs, name, state=state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/dvc_data/hashfile/hash.py", line 199, in hash_file
hash_value, meta = _hash_file(path, fs, name, callback=cb, info=info)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/dvc_data/hashfile/hash.py", line 139, in _hash_file
info = info or fs.info(path)
^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/dvc_objects/fs/base.py", line 592, in info
return self.fs.info(path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/dvc_objects/fs/local.py", line 39, in info
return self.fs.info(path)
^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/fsspec/implementations/local.py", line 90, in info
out = os.stat(path, follow_symlinks=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/Users/saugat/Projects/iterative/dvc/data/bar'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/saugat/Projects/iterative/dvc/dvc/commands/add.py", line 45, in run
self.repo.add(
File "/Users/saugat/Projects/iterative/dvc/dvc/repo/__init__.py", line 58, in wrapper
return f(repo, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/dvc/repo/scm_context.py", line 143, in run
return method(repo, *args, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/dvc/repo/add.py", line 227, in add
_add(stage, source if output_exists else None, no_commit=no_commit)
File "/Users/saugat/Projects/iterative/dvc/dvc/repo/add.py", line 178, in _add
stage.add_outs(path, no_commit=no_commit)
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/funcy/decorators.py", line 47, in wrapper
return deco(call, *dargs, **dkwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/dvc/stage/decorators.py", line 44, in rwlocked
return call()
^^^^^^
File "/Users/saugat/Projects/iterative/dvc/.venv/lib/python3.12/site-packages/funcy/decorators.py", line 68, in __call__
return self._func(*self._args, **self._kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/saugat/Projects/iterative/dvc/dvc/stage/__init__.py", line 574, in add_outs
out.add(filter_info, **kwargs)
File "/Users/saugat/Projects/iterative/dvc/dvc/output.py", line 1369, in add
raise self.DoesNotExistError(self) from exc
dvc.output.OutputDoesNotExistError: output 'data' does not exist
2024-03-26 15:50:01,635 DEBUG: Analytics is disabled.
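The innermost frame of the traceback is `os.stat(path, follow_symlinks=True)`, which fails because `data/bar` is a symlink whose target is missing. The following is a minimal sketch of that mechanism using only the Python standard library, not DVC code. The helper name `describe` and the target name `missing-target` are made up for illustration, and the sketch assumes a platform where creating symlinks is permitted.

```python
import os
import tempfile


def describe(path):
    """Report how a path looks with and without following symlinks."""
    result = {
        "islink": os.path.islink(path),  # inspects the link itself
        "exists": os.path.exists(path),  # follows the link to its target
    }
    try:
        # Same call as the innermost frame of the traceback.
        os.stat(path, follow_symlinks=True)
        result["stat_error"] = None
    except FileNotFoundError as exc:
        result["stat_error"] = exc.errno
    return result


with tempfile.TemporaryDirectory() as tmp:
    data = os.path.join(tmp, "data")
    os.mkdir(data)
    # "bar" points at a target that does not exist, so it is a broken symlink.
    os.symlink(os.path.join(data, "missing-target"), os.path.join(data, "bar"))
    report = describe(os.path.join(data, "bar"))
    print(report)
```

The link is reported as a symlink that does not exist, and `os.stat` raises `FileNotFoundError` for the link path rather than for the `data` directory, which matches the `[Errno 2]` message naming `data/bar` above.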
skshetry added a commit to skshetry/dvc that referenced this issue on Mar 26, 2024:
`dvc add` command incorrectly raises a `DoesNotExistError` when a broken symlink exists in an output directory and the target name is the same as the directory's name, e.g. if `data` is an output and the command is invoked as `dvc add data` (i.e. no virtual directory operations to perform). The expected behavior is to raise a `FileNotFoundError`. `DoesNotExistError` should only be raised if the output itself does not exist. Related: iterative#3717
skshetry added a commit that referenced this issue on Mar 26, 2024:
…10373) `dvc add` command incorrectly raises a `DoesNotExistError` when a broken symlink exists in an output directory and the target name is the same as the directory's name, e.g. if `data` is an output and the command is invoked as `dvc add data` (i.e. no virtual directory operations to perform). The expected behavior is to raise a `FileNotFoundError`. `DoesNotExistError` should only be raised if the output itself does not exist. Related: #3717
When trying to add TED-LIUM Release 3 (https://www.openslr.org/51/) to a data registry, `dvc add` failed. The problem was that this dataset has a few broken symlinks. Removing these symlinks fixed the problem. Here is the stack trace from `dvc add -v`
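Since removing the broken symlinks was the workaround, it can help to list them before running `dvc add`. The sketch below is one way to do that with the Python standard library. It is not part of DVC, and the function name `find_broken_symlinks` and the demo file names are hypothetical.

```python
import os
import tempfile


def find_broken_symlinks(root):
    """Return sorted paths under root that are symlinks with a missing target."""
    broken = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() checks the link itself; exists() follows it.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)


with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "foo")
    with open(real, "w") as f:
        f.write("foo")
    os.symlink(real, os.path.join(tmp, "good"))  # valid link
    os.symlink(os.path.join(tmp, "missing"), os.path.join(tmp, "bar"))  # broken
    found = [os.path.relpath(p, tmp) for p in find_broken_symlinks(tmp)]
    print(found)
```

Calling `find_broken_symlinks("data")` on a real dataset directory only reports the offending paths. Whether to delete them or repair their targets is left to the user.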