
Writing to Data Links #9750

Merged
53 commits merged on Apr 25, 2024
Changes from all commits (53 commits)
d9019dc
adding tests 1
radeusgd Apr 18, 2024
ca4e509
tests data link in the Cloud
radeusgd Apr 18, 2024
17f583b
S3 test, append spec
radeusgd Apr 18, 2024
1573604
Postgres test
radeusgd Apr 18, 2024
177714a
Data_Link_With_Output_Stream stub
radeusgd Apr 18, 2024
53fce80
write
radeusgd Apr 18, 2024
9013f73
workaround for https://github.com/enso-org/enso/issues/8937
radeusgd Apr 18, 2024
07e698b
fixes in test file
radeusgd Apr 18, 2024
f4f6fff
fixes in test file
radeusgd Apr 18, 2024
2c186d5
writing to Enso_File links
radeusgd Apr 18, 2024
5faaa31
do not expose AWS keys in to_text
radeusgd Apr 18, 2024
f89a3c5
writing S3 links
radeusgd Apr 18, 2024
d1611e3
add `Writable_Data_Link`
radeusgd Apr 19, 2024
271fce4
update test
radeusgd Apr 19, 2024
416d52a
fixes
radeusgd Apr 19, 2024
d54b37d
fix unexpected dataflow error missing decorations because it was not …
radeusgd Apr 19, 2024
5c9ef7e
add stack trace to failure report
radeusgd Apr 19, 2024
4aa6ff7
do not follow datalink when writing raw config
radeusgd Apr 19, 2024
b7a2246
comment
radeusgd Apr 19, 2024
5fd4792
test
radeusgd Apr 19, 2024
ad5b9c8
ensuring the original datalink is returned when writing to a file
radeusgd Apr 19, 2024
f6e71d6
fixes
radeusgd Apr 19, 2024
a1f2aaa
update test
radeusgd Apr 19, 2024
6a91184
replace returned file with proxy
radeusgd Apr 19, 2024
90e1a4d
Revert "replace returned file with proxy"
radeusgd Apr 19, 2024
5e127fc
fix S3 test
radeusgd Apr 19, 2024
1c990cd
sigs
radeusgd Apr 19, 2024
7cb28a9
Reapply "replace returned file with proxy"
radeusgd Apr 19, 2024
7313124
comment
radeusgd Apr 19, 2024
352e3ad
changelog
radeusgd Apr 19, 2024
454f7f5
missing imports
radeusgd Apr 19, 2024
7e7e26c
test for target datalink not existing
radeusgd Apr 22, 2024
e047289
workaround for https://github.com/enso-org/enso/issues/9669
radeusgd Apr 22, 2024
f949e64
Merge branch 'refs/heads/develop' into wip/radeusgd/9292-write-to-dat…
radeusgd Apr 22, 2024
765da44
add copy and move for datalinks
radeusgd Apr 22, 2024
1372a80
refactor: move things around
radeusgd Apr 22, 2024
405b31b
fixes after refactor
radeusgd Apr 22, 2024
872b3c7
fix
radeusgd Apr 22, 2024
ed27b95
update tests to new semantics
radeusgd Apr 22, 2024
780949a
adding tests
radeusgd Apr 22, 2024
d4ef32d
fix test
radeusgd Apr 23, 2024
4dbaa39
Merge branch 'refs/heads/develop' into wip/radeusgd/9292-write-to-dat…
radeusgd Apr 23, 2024
ea3ce7d
error on incompatible combinations
radeusgd Apr 23, 2024
e751170
Merge branch 'refs/heads/develop' into wip/radeusgd/9292-write-to-dat…
radeusgd Apr 23, 2024
0a43982
update messages, tests
radeusgd Apr 23, 2024
6faf09d
fix - defer conversion
radeusgd Apr 23, 2024
9173165
update other FSs
radeusgd Apr 23, 2024
c2b82c8
Merge branch 'refs/heads/develop' into wip/radeusgd/9292-write-to-dat…
radeusgd Apr 24, 2024
1ddfea9
fix package.yaml
radeusgd Apr 24, 2024
95d68a5
fix a typo
radeusgd Apr 24, 2024
9288705
Merge branch 'refs/heads/develop' into wip/radeusgd/9292-write-to-dat…
radeusgd Apr 25, 2024
76a4033
fixing writing Images to datalinks
radeusgd Apr 25, 2024
ae1271e
Merge branch 'refs/heads/develop' into wip/radeusgd/9292-write-to-dat…
radeusgd Apr 25, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -657,6 +657,7 @@
- [Added `Table.running` method][9577]
- [Added `Excel_Workbook.read_many` allowing reading more than one sheet at a
time.][9759]
- [Added ability to write to Data Links.][9750]

[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -963,6 +964,7 @@
[9725]: https://github.com/enso-org/enso/pull/9725
[9759]: https://github.com/enso-org/enso/pull/9759
[9577]: https://github.com/enso-org/enso/pull/9577
[9750]: https://github.com/enso-org/enso/pull/9750

#### Enso Compiler

11 changes: 11 additions & 0 deletions distribution/lib/Standard/AWS/0.0.0-dev/src/AWS_Credential.enso
@@ -92,3 +92,14 @@ type AWS_Credential
previous = Illegal_Argument.handle_java_exception <|
ClientBuilder.setDefaultCredentialOverride override.as_java
Panic.with_finalizer (ClientBuilder.setDefaultCredentialOverride previous) action

## PRIVATE
to_text self -> Text = case self of
AWS_Credential.Default -> "AWS_Credential.Default"
AWS_Credential.Profile profile -> "AWS_Credential.Profile " + profile
AWS_Credential.Key _ _ -> "AWS_Credential.Key"
AWS_Credential.With_Configuration base_credential region ->
base_credential.to_text + "(" + region.to_text + ")"

## PRIVATE
to_display_text self -> Text = self.to_text.to_display_text
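The to_text addition above deliberately redacts key material so credentials cannot leak through printing. As a hedged analogy only (Python, with hypothetical class names; the real type is Enso's AWS_Credential), a repr that never exposes secrets might look like:

```python
from dataclasses import dataclass

@dataclass
class Key:
    """Analogue of AWS_Credential.Key: holds secret material."""
    access_key_id: str
    secret_access_key: str

    def __repr__(self) -> str:
        # Mirror the Enso change: never include the raw key material,
        # only the constructor name.
        return "AWS_Credential.Key"

@dataclass
class Profile:
    """Analogue of AWS_Credential.Profile: the profile name is not secret."""
    name: str

    def __repr__(self) -> str:
        return f"AWS_Credential.Profile {self.name}"
```

Since @dataclass skips generating __repr__ when the class defines its own, the redacting version always wins.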
@@ -2,7 +2,7 @@ private

from Standard.Base import all
import Standard.Base.Errors.Illegal_State.Illegal_State
from Standard.Base.Enso_Cloud.Data_Link import parse_secure_value
from Standard.Base.Enso_Cloud.Data_Link_Helpers import parse_secure_value
from Standard.Base.Enso_Cloud.Public_Utils import get_required_field

import project.AWS_Credential.AWS_Credential
@@ -2,8 +2,11 @@ private

from Standard.Base import all
import Standard.Base.Errors.Illegal_State.Illegal_State
import Standard.Base.System.File.Generic.Writable_File.Writable_File
import Standard.Base.System.Input_Stream.Input_Stream
from Standard.Base.Enso_Cloud.Data_Link import Data_Link_With_Input_Stream, parse_format
import Standard.Base.System.Output_Stream.Output_Stream
from Standard.Base.Enso_Cloud.Data_Link_Capabilities import Data_Link_With_Input_Stream, Data_Link_With_Output_Stream, Writable_Data_Link
from Standard.Base.Enso_Cloud.Data_Link_Helpers import parse_format
from Standard.Base.Enso_Cloud.Public_Utils import get_optional_field, get_required_field

import project.AWS_Credential.AWS_Credential
@@ -38,5 +41,18 @@ type S3_Data_Link
with_input_stream self (open_options : Vector) (action : Input_Stream -> Any) -> Any =
self.as_file.with_input_stream open_options action

## PRIVATE
with_output_stream self (open_options : Vector) (action : Output_Stream -> Any) -> Any =
self.as_file.with_output_stream open_options action

## PRIVATE
as_writable_file self -> Writable_File = Writable_File.from self.as_file

## PRIVATE
Data_Link_With_Input_Stream.from (that:S3_Data_Link) = Data_Link_With_Input_Stream.Value that

## PRIVATE
Data_Link_With_Output_Stream.from (that:S3_Data_Link) = Data_Link_With_Output_Stream.Value that

## PRIVATE
Writable_Data_Link.from (that:S3_Data_Link) = Writable_Data_Link.Value that
27 changes: 15 additions & 12 deletions distribution/lib/Standard/AWS/0.0.0-dev/src/S3/S3_File.enso
@@ -1,5 +1,6 @@
from Standard.Base import all
import Standard.Base.Enso_Cloud.Data_Link
import Standard.Base.Enso_Cloud.Data_Link.Data_Link
import Standard.Base.Enso_Cloud.Data_Link_Helpers
import Standard.Base.Errors.Common.Syntax_Error
import Standard.Base.Errors.File_Error.File_Error
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
@@ -108,7 +109,7 @@ type S3_File
with_output_stream self (open_options : Vector) action = if self.is_directory then Error.throw (S3_Error.Error "S3 directory cannot be opened as a stream." self.uri) else
Context.Output.if_enabled disabled_message="Writing to an S3_File is forbidden as the Output context is disabled." panic=False <|
open_as_data_link = (open_options.contains Data_Link_Access.No_Follow . not) && (Data_Link.is_data_link self)
if open_as_data_link then Data_Link.write_data_link_as_stream self open_options action else
if open_as_data_link then Data_Link_Helpers.write_data_link_as_stream self open_options action else
if open_options.contains File_Access.Append then Error.throw (S3_Error.Error "S3 does not support appending to a file. Instead you may read it, modify and then write the new contents." self.uri) else
File_Access.ensure_only_allowed_options "with_output_stream" [File_Access.Write, File_Access.Create_New, File_Access.Truncate_Existing, File_Access.Create, Data_Link_Access.No_Follow] open_options <|
# The exists check is not atomic, but it is the best we can do with S3
@@ -140,7 +141,7 @@ type S3_File
with_input_stream : Vector File_Access -> (Input_Stream -> Any ! File_Error) -> Any ! S3_Error | Illegal_Argument
with_input_stream self (open_options : Vector) action = if self.is_directory then Error.throw (Illegal_Argument.Error "S3 folders cannot be opened as a stream." self.uri) else
open_as_data_link = (open_options.contains Data_Link_Access.No_Follow . not) && (Data_Link.is_data_link self)
if open_as_data_link then Data_Link.read_data_link_as_stream self open_options action else
if open_as_data_link then Data_Link_Helpers.read_data_link_as_stream self open_options action else
File_Access.ensure_only_allowed_options "with_input_stream" [File_Access.Read, Data_Link_Access.No_Follow] open_options <|
response_body = translate_file_errors self <| S3.get_object self.s3_path.bucket self.s3_path.key self.credentials delimiter=S3_Path.delimiter
response_body.with_stream action
@@ -163,7 +164,7 @@
@format File_Format.default_widget
read : File_Format -> Problem_Behavior -> Any ! S3_Error
read self format=Auto_Detect (on_problems=Problem_Behavior.Report_Warning) =
if Data_Link.is_data_link self then Data_Link.read_data_link self format on_problems else
if Data_Link.is_data_link self then Data_Link_Helpers.read_data_link self format on_problems else
File_Format.handle_format_missing_arguments format <| case format of
Auto_Detect -> if self.is_directory then format.read self on_problems else
response = translate_file_errors self <| S3.get_object self.s3_path.bucket self.s3_path.key self.credentials delimiter=S3_Path.delimiter
@@ -236,17 +237,19 @@ type S3_File
- destination: the destination to move the file to.
- replace_existing: specifies if the operation should proceed if the
destination file already exists. Defaults to `False`.
copy_to : Writable_File -> Boolean -> Any ! File_Error
copy_to self (destination : Writable_File) (replace_existing : Boolean = False) =
copy_to : File_Like -> Boolean -> Any ! File_Error
copy_to self (destination : File_Like) (replace_existing : Boolean = False) = Data_Link_Helpers.disallow_links_in_copy self destination <|
if self.is_directory then Error.throw (S3_Error.Error "Copying S3 folders is currently not implemented." self.uri) else
Context.Output.if_enabled disabled_message="Copying an S3_File is forbidden as the Output context is disabled." panic=False <|
case destination.file of
destination_writable = Writable_File.from destination
case destination_writable.file of
Comment from radeusgd (Member, Author) on lines +240 to +245:

I changed Writable_File to File_Like to avoid eagerly triggering the conversion.

The problem was that converting a data link to Writable_File 'follows' it; if the target did not exist, the conversion would fail with File_Error.Not_Found too early, never giving us a chance to abandon the operation because of disallow_links_in_copy.

Briefly I thought this indicated that the Writable_File 'class' is too loaded, and that there should be a separation of concerns between expressing the capability of 'a file that can be written to' and resolving the data link.

But actually, almost all code does want this to happen in one step. Our users and library devs will most of the time want to write a function that takes a Writable_File and use it to write their fancy data formats, without caring how the underlying structures resolve data links to their targets: implementors of new formats should not care about data links, only about taking a Writable_File and writing to it.

Only implementors of new file systems need to be more careful around data links; for operations that may not want to follow the link (e.g. copy_to and move_to), they will need to avoid the eager conversion, as done here, and use some other type (I decided to use File_Like, as it is our 'most generic' type class that every File type should implement). I think that is fine: a bit more complexity in the filesystem logic sounds better than complicating the file format logic, since we will likely end up with more formats than filesystems, and external contributors will more likely need to write a new format than a whole new filesystem. It is not hard anyway, as one can copy the 'best practices' from the S3_File implementation.
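The validate-before-resolve ordering described in this comment can be sketched as a language-agnostic analogy (Python, entirely hypothetical names; the real logic lives in Enso's Data_Link_Helpers and the S3_File conversions):

```python
class NotFound(Exception):
    pass

class DataLink:
    """Hypothetical stand-in for a data link whose target may not exist."""
    def __init__(self, target_exists: bool):
        self.target_exists = target_exists

    def resolve_writable(self) -> str:
        # Following the link fails when the target is missing. This is the
        # step that must NOT run before the copy rules have been checked,
        # which is why copy_to accepts the unresolved destination.
        if not self.target_exists:
            raise NotFound("data link target does not exist")
        return "writable-target"

def copy_to(source, destination, allow_links: bool = False) -> str:
    # Validate first, on the unresolved destination (the File_Like analogue)...
    if isinstance(destination, DataLink) and not allow_links:
        return "refused: copying into a data link is disallowed"
    # ...and only then perform the Writable_File-style conversion.
    if isinstance(destination, DataLink):
        target = destination.resolve_writable()
    else:
        target = destination
    return f"copied {source} to {target}"
```

With the eager design, constructing the Writable_File argument would have raised NotFound before the "refused" branch could run; deferring the resolution lets the policy check win.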

# Special shortcut for more efficient handling of S3 file copying (no need to move the data to our machine)
s3_destination : S3_File ->
if replace_existing.not && s3_destination.exists then Error.throw (File_Error.Already_Exists destination) else
destination_path = s3_destination.s3_path
translate_file_errors self <| S3.copy_object self.s3_path.bucket self.s3_path.key destination_path.bucket destination_path.key self.credentials . if_not_error <| s3_destination
_ -> generic_copy self destination.file replace_existing
translate_file_errors self <| S3.copy_object self.s3_path.bucket self.s3_path.key destination_path.bucket destination_path.key self.credentials
. if_not_error destination_writable.file_for_return
_ -> generic_copy self destination_writable replace_existing

## ICON data_output
Moves the file to the specified destination.
@@ -262,8 +265,8 @@
- destination: the destination to move the file to.
- replace_existing: specifies if the operation should proceed if the
destination file already exists. Defaults to `False`.
move_to : Writable_File -> Boolean -> Nothing ! File_Error
move_to self (destination : Writable_File) (replace_existing : Boolean = False) =
move_to : File_Like -> Boolean -> Nothing ! File_Error
move_to self (destination : File_Like) (replace_existing : Boolean = False) = Data_Link_Helpers.disallow_links_in_move self destination <|
Context.Output.if_enabled disabled_message="File moving is forbidden as the Output context is disabled." panic=False <|
r = self.copy_to destination replace_existing=replace_existing
r.if_not_error <|
@@ -399,7 +402,7 @@ File_Format_Metadata.from (that : S3_File) = File_Format_Metadata.Value that.uri
File_Like.from (that : S3_File) = File_Like.Value that

## PRIVATE
Writable_File.from (that : S3_File) =
Writable_File.from (that : S3_File) = if Data_Link.is_data_link that then Data_Link_Helpers.interpret_data_link_as_writable_file that else
Writable_File.Value that S3_File_Write_Strategy.instance

## PRIVATE
7 changes: 4 additions & 3 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data.enso
@@ -3,7 +3,8 @@ import project.Data.Pair.Pair
import project.Data.Text.Encoding.Encoding
import project.Data.Text.Text
import project.Data.Vector.Vector
import project.Enso_Cloud.Data_Link
import project.Enso_Cloud.Data_Link.Data_Link
import project.Enso_Cloud.Data_Link_Helpers
import project.Error.Error
import project.Errors.File_Error.File_Error
import project.Errors.Illegal_Argument.Illegal_Argument
@@ -341,8 +342,8 @@ download (uri:(URI | Text)) file:Writable_File (method:HTTP_Method=HTTP_Method.G
case Data_Link.is_data_link response.body.metadata of
True ->
# If the resource was a data link, we follow it, download the target data and try to write it to a file.
data_link = Data_Link.interpret_json_as_data_link response.decode_as_json
Data_Link.save_data_link_to_file data_link file
data_link = Data_Link_Helpers.interpret_json_as_data_link response.decode_as_json
Data_Link_Helpers.save_data_link_to_file data_link file
False ->
response.write file
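The download branch above follows a data link instead of saving the link file itself. A minimal sketch of that branching, assuming hypothetical fetch/write helpers and a made-up content type (the real check is Data_Link.is_data_link on the response metadata):

```python
import json

def is_data_link(metadata: dict) -> bool:
    # Hypothetical predicate mirroring Data_Link.is_data_link.
    return metadata.get("content_type") == "application/x-enso-datalink"

def download(fetch, uri: str, write_file):
    """fetch(uri) -> (metadata, body_bytes); write_file(data) persists it."""
    metadata, body = fetch(uri)
    if is_data_link(metadata):
        # The resource is a data link: parse it, fetch the target it points
        # at, and write the target's data instead of the link itself.
        link = json.loads(body)
        _, target_body = fetch(link["target_uri"])
        return write_file(target_body)
    # Plain resource: write the response body directly.
    return write_file(body)
```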
