Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smarter logic when copying data links #9522

Closed
radeusgd opened this issue Mar 22, 2024 · 3 comments · Fixed by #9750
Closed

Smarter logic when copying data links #9522

radeusgd opened this issue Mar 22, 2024 · 3 comments · Fixed by #9750
Assignees
Labels
-libs Libraries: New libraries to be implemented l-cloud-integration Enso Cloud integration work

Comments

@radeusgd
Copy link
Member

When working on #9485 I implemented the 'symlinks' metaphor for datalinks and tested the ability to copy them.

Currently, the copy/move logic does not follow the data link but instead copies/moves its raw config. This is sometimes desired but sometimes not. If one of the (source, destination) pair is a datalink but the other is not it seems more intuitive to follow the datalink and perform regular copying.

This ticket introduces exactly that, as explained in: #9485 (comment)

However, it feels like sometimes it may be useful to be able to copy actual data the datalink points to.

I was thinking if we should actually differentiate the behaviour based on what the source and destination are. We can tell if a File is a datalink without reading it (and it doesn't even have to exist yet) - it's based on the file metadata, mostly just file.extension == ".datalink" check, alternatively content-type for HTTP and asset_type==Data_Link for Enso_File - so we could rely on that.

I'd suggest to modify the behaviour (in a followup ticket, not here) in the following way:

  • if both source and destination are a data link, the data link config is copied from one place to another - this is the currently implemented behaviour - sticking to the symlink metaphor.

  • if both source and destination are not data links, the data is copied/moved as data - that is of course the current behaviour and always has been.

  • NOW the new idea is:

    • if the source is a data link but the destination is a regular file, we'd copy the data from the data link's target into the destination;
    • and if the source is a regular file and the destination is a data link, we perform a write to the data link's target of the source data (not yet possible, Writing to DataLinks #9292).

Currently the behaviour is weird, as can be seen in these tests:

  • if the source is a datalink but the target is not, the config is copied to the target and the target file exposes the raw datalink config;
  • if the source is a regular file but the target is a datalink, that creates a new datalink that may e.g. be malformed.

I think the proposed behaviour would be better and more intuitive. What do you think? If you think this makes sense, I will create a ticket to implement it.

@radeusgd radeusgd added -libs Libraries: New libraries to be implemented l-cloud-integration Enso Cloud integration work labels Mar 22, 2024
@radeusgd radeusgd self-assigned this Mar 22, 2024
@jdunkerley jdunkerley moved this from ❓New to 📤 Backlog in Issues Board Apr 2, 2024
@radeusgd radeusgd moved this from 📤 Backlog to 🔧 Implementation in Issues Board Apr 22, 2024
@enso-bot enso-bot bot mentioned this issue Apr 22, 2024
5 tasks
@radeusgd
Copy link
Member Author

Summary of Libs Team discussion:

  • File.copy_to is 'dumb' (and it should be) - not aware of underlying data format - csv.copy_to json will break the data and that's expected - it's just copying files.
  • Data Links abstract away the data format - I don't know if it's CSV or JSON under the hood - the data link just gives me back data (e.g. a Table) and can be written to
  • thus we have a bit of mismatched levels of abstraction - data links are higher level and copy_to is low level
  • thus we should not allow copying to/from data links as it's confusing. The right way is to read the datalink (materializing whatever format it was from into pure data in Enso) and then write it
  • it is useful to copy a data link to another data link but the meaning is also unclear
  • thus we disallow that in File.copy_to, but add an 'advanced' method Data_Link.copy that allows to do that. If datalinks are present in File.copy_to, an Illegal_Argument error shall be thrown clearly telling the user that they should use Data_Link.copy instead.

Analogously for move_to - add a method Data_Link.move and disallow data links in File.move_to

@enso-bot
Copy link

enso-bot bot commented Apr 23, 2024

Radosław Waśko reports a new STANDUP for yesterday (2024-04-22):

Progress: Caught up on reviews. Refined specification for the semantics of copying/moving datalinks. Updated tests to reflect the expected semantics, and increased coverage. Refactored Data_Link methods as they were are in a single module: now split into user-facing API, implementor-facing helpers and type classes. Implemented basic copy logic for data links, but not yet finished. It should be finished by 2024-04-24.

Next Day: Next day I will be working on the same task. Add informative errors when datalinks are copied with wrong method. Make the tests pass, prepare PR.

@radeusgd radeusgd mentioned this issue Apr 23, 2024
5 tasks
@radeusgd radeusgd moved this from 🔧 Implementation to 👁️ Code review in Issues Board Apr 23, 2024
@enso-bot
Copy link

enso-bot bot commented Apr 24, 2024

Radosław Waśko reports a new STANDUP for yesterday (2024-04-23):

Progress: Got the PR ready (added error messages, updated some tests, fixed problems that were causing failures). Reviews. Found an edge case with copy/move operation - added tests for it and working on fixes. It should be finished by 2024-04-24.

Next Day: Next day I will be working on the #9534 task. Fix the copy/move edge case in separate PR. Work on Test framework ticket.

@mergify mergify bot closed this as completed in #9750 Apr 25, 2024
@mergify mergify bot closed this as completed in 0c989c6 Apr 25, 2024
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-libs Libraries: New libraries to be implemented l-cloud-integration Enso Cloud integration work
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant