dvc data status reports imported directories as "not in remote" #9346
Comments
@efiop looks like Is there a plan to include info about an entry being part of an import stage in the data index or should this "filter" happen at the UI level (i.e. post-process
@dberenbaum I am putting p1 here as it creates quite some noise in repos using
Imports are recorded in the index, but as a source for the outputs, which they really are. The real issue here is that data status checks against one particular remote, which, as expected, doesn't have the imports. We should instead check against all corresponding remotes (e.g. if there are per-output remotes), which for imports might also mean that we should skip them.
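A minimal sketch of that idea, for illustration only: each output is checked against its own remote (falling back to the default one), and outputs that come from import stages are skipped. The `Entry` class, the `not_in_remote` function, and the dict-of-sets model of a remote are all hypothetical stand-ins, not DVC's actual index or remote API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Entry:
    """Hypothetical stand-in for a tracked output (not DVC's real index entry)."""
    path: str
    md5: str
    remote: Optional[str] = None  # per-output remote name, if one is configured
    is_import: bool = False       # True when the output comes from an import stage


def not_in_remote(
    entries: List[Entry],
    remotes: Dict[str, Set[str]],
    default_remote: str,
) -> List[str]:
    """Return paths whose hash is missing from the remote that should hold them.

    `remotes` maps a remote name to the set of hashes it contains
    (an assumption made to keep the sketch self-contained).
    """
    missing = []
    for entry in entries:
        if entry.is_import:
            # Imported data lives in the source repo's remote, so skip it here.
            continue
        name = entry.remote or default_remote
        if entry.md5 not in remotes.get(name, set()):
            missing.append(entry.path)
    return sorted(missing)


entries = [
    Entry("data/imported", "aaa", is_import=True),
    Entry("model.pkl", "bbb"),
    Entry("big.bin", "ccc", remote="s3"),
]
remotes = {"default": {"bbb"}, "s3": set()}
print(not_in_remote(entries, remotes, "default"))  # ['big.bin']
```

With this approach the imported directory is never reported, while an output missing from its own per-output remote still is.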
We've updated to v2.58.2 and we are no longer seeing "not in remote". Instead, the status of imports is "deleted", which again feels misleading.
@johnyaku when does it show as "deleted"? When they are missing locally, or even when they exist locally?
The only common denominator in the spurious "deleted" reports is that all of these files have been imported from a dvc data registry. The files reported as "deleted" all exist in the local workspace, as links to a shared external cache. So this looks to me like another twist on the "not in remote" message, which has been fixed by no longer using this as the default message, but I suspect that the basic problem is the same. Namely,
@johnyaku Are you still able to reproduce the issues? These days
@johnyaku Or, if you are still able to reproduce, I'm happy to jump on a quick call to figure it out on the spot.
Apologies for the slow response on this one. Thanks to your help the other day we have our index mirroring sorted out now and I can confirm that I can reproduce what my colleague was seeing.
I tried to create a reprex using DVC v3.15.2 to check if this had been fixed since v2.58.2. I made a toy registry here: https://github.com/johnyaku/test_reg This contains one file ( I then create a toy dataset here: [email protected]:johnyaku/imp_test.git Nothing to see there yet, because
This seems reminiscent of this issue: iterative/dvc-gdrive#29 I can paste a full stack trace if you like, but the main take-aways are as follows:
I know we have veered off into a new issue here, but I'll need to work thru this in order to create a reprex.
@johnyaku Can't reproduce original issue anymore with newest dvc. Regarding Feel free to create a new issue if you run into anything still not working.
Bug Report
Description
dvc data status reports imported directories as "not in remote".
Technically, this is correct, as the data is in the remote for the source repo, not the current repo. But this is a bit confusing.
Reproduce
Expected
Either no message about not being in the remote.
Or, perhaps more helpfully, dvc data status could look up the remote for the source repo and check if the data is there, and only report a problem if it is not found.
Environment information
dvc 2.53