fix for local file uploads without scheme #1326

Open
wants to merge 6 commits into
base: master
4 changes: 3 additions & 1 deletion clearml/datasets/dataset.py
@@ -3350,6 +3350,8 @@ def _add_external_files(
# noinspection PyBroadException
try:
if StorageManager.exists_file(source_url):
+# handle local path provided without scheme
+source_url = StorageHelper.sanitize_url(source_url)
remote_objects = [StorageManager.get_metadata(source_url, return_full_path=True)]
elif not source_url.startswith(("http://", "https://")):
if source_url[-1] != "/":
@@ -3368,7 +3370,7 @@ def _add_external_files(
link = remote_object.get("name")
relative_path = link[len(source_url):]
if not relative_path:
-relative_path = source_url.split("/")[-1]
+relative_path = os.path.basename(source_url)
if not matches_any_wildcard(relative_path, wildcard, recursive=recursive):
continue
try:
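The PR does not state why split("/")[-1] was replaced with os.path.basename. One plausible difference, shown in the sketch below, is Windows-style paths: os.path.basename on Windows (ntpath) splits on both separators, while splitting on "/" alone does not. The function names and the sample URL are hypothetical, and ntpath is called directly so the example behaves the same on any platform.

```python
import ntpath


def last_component_split(url):
    # Behavior of the removed line: split on forward slashes only
    return url.split("/")[-1]


def last_component_basename(url, pathmod=ntpath):
    # Behavior of the added line when os.path is ntpath (i.e. on Windows)
    return pathmod.basename(url)


# Hypothetical URL built from a Windows absolute path
windows_url = "file://C:\\data\\images\\cat.png"

split_name = last_component_split(windows_url)
basename_name = last_component_basename(windows_url)
```

Here split_name keeps the whole Windows path after the scheme, whereas basename_name is just the file name. On POSIX paths the two approaches give the same result.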
8 changes: 8 additions & 0 deletions clearml/storage/helper.py
@@ -3067,6 +3067,14 @@ def exists_file(self, remote_url):
return self._driver.exists_file(
container_name=self._container.name if self._container else "", object_name=object_name
)

+@classmethod
+def sanitize_url(cls, remote_url):
+base_url = cls._resolve_base_url(remote_url)
+if base_url != "file://" or remote_url.startswith("file://"):
+return remote_url
+absolute_path = os.path.abspath(remote_url)
+return base_url + absolute_path
Collaborator

Note that if a user adds a file that is already prefixed with file://, os.path.abspath(remote_url) returns an unexpected value, because the URL is treated as a relative path and joined to the current working directory. For example:

>>> os.path.abspath("file:///home/ajecc/work")
'/home/ajecc/file:/home/ajecc/work'

I suggest checking whether remote_url already starts with file://, in which case you don't need to add the base_url.
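A minimal sketch of the suggested check, written as a standalone function rather than the PR's classmethod. The name sanitize_local_url and the "://" test are assumptions for illustration only; the actual change relies on cls._resolve_base_url, whose behavior is not reproduced here.

```python
import os

FILE_SCHEME = "file://"


def sanitize_local_url(url):
    """Return url unchanged if it already has a scheme, else build a file:// URL.

    Hypothetical helper, not part of clearml. The '"://" in url' test is a
    simplified stand-in for the library's own base-URL resolution.
    """
    if "://" in url:
        # Covers file:// as well as remote schemes such as s3:// or https://
        return url
    # Scheme-less input is treated as a local path and made absolute
    return FILE_SCHEME + os.path.abspath(url)
```

With this guard, "file:///home/ajecc/work" passes through untouched, while a relative path such as "data/train.csv" becomes "file://" followed by its absolute path.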

Contributor Author

Thanks for catching it, @eugen-ajechiloae-clearml. Sorry, I missed this case. I have updated the code and tested it.

Contributor Author

Hi @eugen-ajechiloae-clearml, please review this change when you get the time.



def normalize_local_path(local_path):
Binary file modified docs/experiment_manager.gif
Binary file modified docs/orchestration.gif