Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream progress #2875

Merged
merged 5 commits into from
Dec 3, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 22 additions & 17 deletions dvc/remote/gdrive/__init__.py → dvc/remote/gdrive.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
from funcy import retry, compose, decorator, wrap_with
from funcy.py3 import cat

from dvc.remote.gdrive.utils import TrackFileReadProgress, FOLDER_MIME_TYPE
from dvc.progress import Tqdm
from dvc.scheme import Schemes
from dvc.path_info import CloudURLInfo
from dvc.remote.base import RemoteBASE
Expand All @@ -17,6 +17,7 @@
from dvc.utils import tmp_fname

logger = logging.getLogger(__name__)
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


class GDriveRetriableError(DvcException):
Expand Down Expand Up @@ -92,30 +93,34 @@ def gdrive_upload_file(
item = self.drive.CreateFile(
{"title": args["title"], "parents": [{"id": args["parent_id"]}]}
)
self.upload_file(item, no_progress_bar, from_file, progress_name)
return item

def upload_file(self, item, no_progress_bar, from_file, progress_name):
with open(from_file, "rb") as opened_file:
if not no_progress_bar:
opened_file = TrackFileReadProgress(progress_name, opened_file)
# PyDrive doesn't like content property setting for empty files
# https://github.com/gsuitedevs/PyDrive/issues/121
if os.stat(from_file).st_size:
item.content = opened_file
item.Upload()
with open(from_file, "rb") as fobj:
total = os.path.getsize(from_file)
with Tqdm.wrapattr(
fobj,
"read",
desc=progress_name,
total=total,
disable=no_progress_bar,
) as wrapped:
# PyDrive doesn't like content property setting for empty files
# https://github.com/gsuitedevs/PyDrive/issues/121
if total:
item.content = wrapped
item.Upload()
return item

@gdrive_retry
def gdrive_download_file(
self, file_id, to_file, progress_name, no_progress_bar
):
from dvc.progress import Tqdm

gdrive_file = self.drive.CreateFile({"id": file_id})
bar_format = (
"Donwloading {desc:{ncols_desc}.{ncols_desc}}... "
+ Tqdm.format_sizeof(int(gdrive_file["fileSize"]), "B", 1024)
)
with Tqdm(
Copy link
Contributor Author

@casperdcl casperdcl Dec 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should remove this completely for now... Right now, a user will see something like "0/10GB" for 10mins before suddenly the operation completes and the bar disappears. It's probably better if instead we say "Downloading...".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@casperdcl Let's keep it as is, at least there are some notifications (binary, but still) 🙂

Copy link
Contributor Author

@casperdcl casperdcl Dec 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes but if I saw a "0/10GB" bar doing nothing for a minute I'd assume something was wrong and nothing was happening & kill the process

Copy link
Contributor

@efiop efiop Dec 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@casperdcl But what are our options at this point? I suppose:

  1. remove it (which will just leave an empty screen, which is worse)
  2. add some kind of temporary spinner to keep the user entertained
  3. dive into pydrive and hack a proper solution for us and submit it to pydrive

am I missing something? If not, which one do you suggest? 🙂

Copy link
Contributor Author

@casperdcl casperdcl Dec 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely option 2 (well, not a spinner, but just the word "Downloading...")

  1. remove it: agreed bad
  2. change to something else: technically correct
  3. properly fix: would take a while and better to do in other PR(s)
  4. leave as-is (0/10GB hanging): I think worse than (1) because I'd be more likely to incorrectly think the program was hanging

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@casperdcl The word will be static? If so, it doesn't seem very different from the current approach. Maybe Downloading... (no progress available) to explain? 🙂 Though, we've just merged gdrive support and haven't even documented it yet, so perhaps we are overthinking the importance of this at this particular moment and we probably could simply wait for that stuff to get properly fixed before publically announcing the support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shcheklein my 2cs - I would vote for spinner in these cases. And I think it's good to have them in our toolkit anyways - will give us more flexibility in terms of building UI. @casperdcl do you have a spinner/super-compact mode for TQDM ? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah idk I still prefer a correct static word to a misleading and incorrect 0/10GB. Of course we could try to improve the static word but atm tqdm doesn't support threaded spinners (i.e. a background thread updates the screen even though no calls to update or __iter__ were performed)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always happy to accept a PR :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guys, we are overthinking this one, it will probably get fixed properly very soon, so any placeholder will do (even the previous one, we used to have it for gs for the longest time and didn't receive that much complains, so I'm not really worried about the current temp solution). I'm fine with the current static "Downloading" for now.

desc=progress_name,
total=int(gdrive_file["fileSize"]),
disable=no_progress_bar,
bar_format=bar_format, desc=progress_name, disable=no_progress_bar
):
gdrive_file.GetContentFile(to_file)

Expand Down
25 changes: 0 additions & 25 deletions dvc/remote/gdrive/utils.py

This file was deleted.

50 changes: 18 additions & 32 deletions dvc/remote/gs.py
Original file line number Diff line number Diff line change
Expand Up @@ -58,23 +58,15 @@ def _upload_to_bucket(
no_progress_bar=True,
):
blob = bucket.blob(to_info.path, chunk_size=chunk_size)
with Tqdm(
desc=name or to_info.path,
total=os.path.getsize(from_file),
bytes=True,
disable=no_progress_bar,
) as pbar:
with io.open(from_file, mode="rb") as fobj:
raw_read = fobj.read

def read(size=chunk_size):
res = raw_read(size)
if res:
pbar.update(len(res))
return res

fobj.read = read
blob.upload_from_file(fobj)
with io.open(from_file, mode="rb") as fobj:
with Tqdm.wrapattr(
fobj,
"read",
desc=name or to_info.path,
total=os.path.getsize(from_file),
disable=no_progress_bar,
) as wrapped:
blob.upload_from_file(wrapped)


class RemoteGS(RemoteBASE):
Expand Down Expand Up @@ -162,21 +154,15 @@ def _upload(self, from_file, to_info, name=None, no_progress_bar=True):
def _download(self, from_info, to_file, name=None, no_progress_bar=True):
bucket = self.gs.bucket(from_info.bucket)
blob = bucket.get_blob(from_info.path)
with Tqdm(
desc=name or from_info.path,
total=blob.size,
bytes=True,
disable=no_progress_bar,
) as pbar:
with io.open(to_file, mode="wb") as fobj:
raw_write = fobj.write

def write(byte_string):
raw_write(byte_string)
pbar.update(len(byte_string))

fobj.write = write
blob.download_to_file(fobj)
with io.open(to_file, mode="wb") as fobj:
with Tqdm.wrapattr(
fobj,
"write",
desc=name or from_info.path,
total=blob.size,
disable=no_progress_bar,
) as wrapped:
blob.download_to_file(wrapped)

def _generate_download_url(self, path_info, expires=3600):
expiration = timedelta(seconds=int(expires))
Expand Down
2 changes: 1 addition & 1 deletion setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,7 @@ def run(self):
"funcy>=1.14",
"pathspec>=0.6.0",
"shortuuid>=0.5.0",
"tqdm>=4.38.0,<5",
"tqdm>=4.40.0,<5",
"packaging>=19.0",
"win-unicode-console>=0.5; sys_platform == 'win32'",
"pywin32>=225; sys_platform == 'win32'",
Expand Down