get: ERROR: unexpected error - : config file error: no remote specified. #7168

Closed
adhadse opened this issue Dec 19, 2021 · 1 comment
Labels
A: data-sync Related to dvc get/fetch/import/pull/push awaiting response we are waiting for your reply, please respond! :)

adhadse commented Dec 19, 2021

Bug Report

get: ERROR: unexpected error - : config file error: no remote specified.

Description

I just wanted to download a DVC-tracked directory into a local Google Colab environment. I was able to list all the subdirectories, as well as the individual files tracked by DVC, with the list command:

# ✅list all files
dvc list https://adhadse:<my_token>@dagshub.com/adhadse/<repo>.git data/raw/Training_consumption_data
>interim
>preprocessed
>raw

# ✅list sub directories correctly
!dvc list https://adhadse:<my_token>@dagshub.com/adhadse/<repo>.git data
>.gitignore
>consumption_training_data_apr21.csv
>consumption_training_data_apr21.csv.dvc
>consumption_training_data_aug21.csv
>...

But when I try to download the data directory, which contains subdirectories, with the get command, I get the following error:

!dvc get https://adhadse:<my_token>@dagshub.com/<my_token>/<repo> data
>ERROR: unexpected error - : config file error: no remote specified. Setup default remote with
>   dvc remote default <remote name>
>or use:
>   dvc <command> -r <remote name>

The problem is that the get command does not list any -r option for a remote, nor is it intended to. I don't want to clone the whole Git repo and pull it. What could the possible issues be? I also noticed that the md5 checksums for the files are None. I had originally moved the files with the move command into the Training_consumption_data directory, which is currently inside the raw directory (see the output of the first dvc list command). But I then moved the whole Training_consumption_data directory by dragging it into the raw directory (which previously did not exist).

I checked the .gitignore files and confirmed that the files were not tracked by Git, then made the Git commit and ran dvc push. Everything looked fine until I tried pulling the files.

Is my moving a directory without using the move command the cause of all this? I also noticed that when I cloned the Git repo and then pulled, I got a warning and the files weren't copied:

Warning: Some of the cache files do not exist neither locally nor on remote. Missing files are

image

But on HEAD of the master branch, dvc status -v shows no md5 checksums, even though the individual .dvc file for each CSV lists an md5 checksum.
image

DVC did not raise any issue at all when I moved the directory using the OS GUI. I still have no idea what went wrong.

Also, is it possible to use DVC in Google Colab without copying files at all, somewhat like a Google Drive mount?

Reproduce

I don't know if it's reproducible, but I moved a directory containing DVC-tracked files using the OS GUI and NOT a DVC command.

Expected

DVC should raise an issue when tracked files are moved using the OS GUI, and the get command should give a more meaningful error about what went wrong.

Environment information

Google Colab standard environment.
Python 3.7.12
DVC 2.9.2

Output of dvc doctor:

$ dvc doctor
DVC version: 2.9.2 (pip)
---------------------------------
Platform: Python 3.7.12 on Linux-5.4.104+-x86_64-with-Ubuntu-18.04-bionic
Supports:
	hdfs (fsspec = 2021.11.1, pyarrow = 3.0.0),
	webhdfs (fsspec = 2021.11.1),
	http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
	https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)

Additional Information (if any):

@karajan1001
Contributor

  1. For dvc get, DVC clones the remote repo into a temporary local directory. But because the data tracked by DVC is not stored in the Git repo, DVC still needs to download it from the remote cache. This is why DVC is asking for a default remote.
  2. As for the None md5, it looks like your repo is in a corrupted state, but I can't tell how it got that way. On my own computer I tried using mv to move DVC-tracked files and directories in and out of directories, but I couldn't reproduce it.
  3. As for "Also is it possible to use DVC without copying files at all in Google Colab somewhat like Google Drive mount?": this is external data in DVC, but it only supports s3, ssh and hdfs for now (doc).
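
To check point 1 without running dvc get, you can look at whether the source repo's committed .dvc/config names a default remote. Below is a minimal sketch, not a DVC feature: the helper name dvc_default_remote is made up for illustration, and it assumes the usual config layout where the default remote appears as a "remote = <name>" entry under the [core] section.

```shell
# Hypothetical helper (not part of DVC): print the default remote named in a
# DVC config file, or return non-zero if the [core] section has no "remote" entry.
# Assumes the usual layout:
#   [core]
#       remote = <name>
#   ['remote "<name>"']
#       url = ...
dvc_default_remote() {
    config_file="${1:-.dvc/config}"
    awk '
        # Track whether we are inside the [core] section.
        /^\[/ { in_core = ($0 == "[core]"); next }
        # Inside [core], pick up the value of the "remote" key.
        in_core && /^[[:space:]]*remote[[:space:]]*=/ {
            sub(/^[^=]*=[[:space:]]*/, "")
            sub(/[[:space:]]+$/, "")
            print
            found = 1
            exit
        }
        END { exit (found ? 0 : 1) }
    ' "$config_file"
}

# Usage, from the root of a clone of the source repo:
#   dvc_default_remote .dvc/config || echo "no default remote configured"
```

If the helper reports nothing, the command from the error message itself, dvc remote default <remote name>, run inside a clone of the source repo and followed by committing and pushing .dvc/config, should give dvc get a remote to download from.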

@karajan1001 karajan1001 added the awaiting response we are waiting for your reply, please respond! :) label Dec 22, 2021
@daavoo daavoo added A: get Related to dvc get A: data-sync Related to dvc get/fetch/import/pull/push and removed A: get Related to dvc get labels Feb 22, 2022
@adhadse adhadse closed this as completed Mar 19, 2022