Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for complex file_id #156

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 8 additions & 4 deletions sqlalchemy_file/storage.py
Original file line number Diff line number Diff line change
Expand Up @@ -127,7 +127,7 @@ def get_file(cls, path: str) -> StoredFile:
The path is expected to be `storage_name/file_id`.
"""
upload_storage, file_id = cls._get_storage_and_file_id(path)
return StoredFile(StorageManager.get(upload_storage).get_object(file_id))
return StoredFile(cls.get(upload_storage).get_object(file_id))

@classmethod
def delete_file(cls, path: str) -> bool:
Expand All @@ -136,7 +136,7 @@ def delete_file(cls, path: str) -> bool:
The path is expected to be `storage_name/file_id`.
"""
upload_storage, file_id = cls._get_storage_and_file_id(path)
obj = StorageManager.get(upload_storage).get_object(file_id)
obj = cls.get(upload_storage).get_object(file_id)
if obj.driver.name == LOCAL_STORAGE_DRIVER_NAME:
"""Try deleting associated metadata file"""
with contextlib.suppress(ObjectDoesNotExistError):
Expand All @@ -156,5 +156,9 @@ def _get_storage_and_file_id(cls, path: str) -> Tuple[str, str]:

The path is expected to be `storage_name/file_id`.
"""
path_parts = path.split("/")
return "/".join(path_parts[:-1]), path_parts[-1]
# first trying complex file_id
storage_name, file_id = path.split("/", 1)
if storage_name in cls._storages:
return storage_name, file_id
# the trying complex storage name
return tuple(path.rsplit("/", 1))
Comment on lines +160 to +164
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only supports two cases:

  • complex file_id (e.g folder/file) and simple storage name
  • complex storage name (e.g storage/folder) and simple file_id

We could add support for both complex storage_name (storage/folder) and file_id (subfolder/file)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for replying. Initially, I tried adding support for complex storage_name and file_id at the same time, but the challenge I ran into was that using / as a separator was insufficient to easily identify where the storage part ended and the file part began. At the time, I could not come up with a simple solution that would also be straightforward to the users. So I settled upon the simplest or approach that covers the case I'm interested in.

To give more context on the challenge of supporting complex storage_name and complex file_id in the same path, consider this example storage/folder_a/folder_b/folder_c/file. The parts [folder_a, folder_b, folder_c] can belong to either the storage, the file, or both (in different combinations, where complexity increases with the number of sub-folders).
One way to solve it is to introduce a special path separator that indicates that the left part is a storage and the right part is a file, something like // so users would use storage/folder_a // folder_b/folder_c/file. It's a simple approach, but it breaks backward compatibility.
Another approach is similar to the current solution: test the combination of sub-paths against storage and return the first matching. However, this will require adding complex logic to rule out possible conflicts, not to mention getting the time complexity into at least O(N) space.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The O(N) complexity is not bad assuming that N<= 10 for most cases. What I am worried about is whether this is supported by Apache libcloud.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I am worried about is whether this is supported by Apache libcloud.

I don't see why libcloud would not support it, as far as the storage name is detected correctly. The tests are passing, and, in my fork, I have it running successfully against S3 as a storage.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, in that case feel free to update and include some tests

12 changes: 12 additions & 0 deletions tests/test_storage_manager.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,18 @@ def test_get_storage_and_file_id(self) -> None:
"file",
)

def test_get_storage_and_complex_file_id(self) -> None:
StorageManager.add_storage("storage", get_dummy_container("storage"))

assert StorageManager._get_storage_and_file_id("storage/file") == (
"storage",
"file",
)
assert StorageManager._get_storage_and_file_id("storage/folder/file") == (
"storage",
"folder/file",
)

def test_first_configured_is_default(self) -> None:
StorageManager.add_storage("first", get_dummy_container("first"))
StorageManager.add_storage("second", get_dummy_container("second"))
Expand Down
Loading