Excel_Workbook.read_many (#9759)
- Some minor linting fixes.
- Adjust `headers` parameter to use a dedicated type.
![image](https://github.com/enso-org/enso/assets/4699705/989f464d-df95-410e-a03b-36661f1c4a37)
- Fix bug with `read` on an `Excel_Workbook` so the error is handled more gracefully and does not panic to the UI.
![image](https://github.com/enso-org/enso/assets/4699705/23b4575f-daad-4719-a5cc-30d064bd7f7a)
- Fix bug when writing to a file with an `Excel_Format` with an invalid extension which was causing a panic.
![image](https://github.com/enso-org/enso/assets/4699705/dc0e055c-c1b6-482f-b129-eb69f6554d72)
- Add `read_many` to `Excel_Workbook` allowing reading more than one sheet at a time.
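
As a rough illustration of the `headers` change, the sketch below builds formats using the three `Headers` constructors that appear in this diff (`Detect_Headers`, `Has_Headers`, `No_Headers`). The import paths are assumptions inferred from the `project.Headers.Headers` import in the changed files, and that `Delimited_Format` and `Excel_Format` are re-exported from `Standard.Table` is likewise assumed rather than shown here.

```enso
from Standard.Base import all
from Standard.Table import all
# Assumed module path, mirroring `import project.Headers.Headers` in the diff.
import Standard.Table.Headers.Headers

main =
    # Treat the first row as column names.
    csv_with_headers = Delimited_Format.Delimited headers=Headers.Has_Headers
    # Treat the first row as data.
    csv_without_headers = Delimited_Format.Delimited headers=Headers.No_Headers
    # Let the reader try to infer whether headers are present (the new default).
    sheet_format = Excel_Format.Sheet sheet=1 headers=Headers.Detect_Headers
    [csv_with_headers, csv_without_headers, sheet_format]
```

Where `True`, `False` or `Infer` was previously passed for `headers`, the diff suggests the equivalents are `Headers.Has_Headers`, `Headers.No_Headers` and `Headers.Detect_Headers` respectively.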
jdunkerley authored Apr 24, 2024
1 parent 717f6bb commit fb9cf38
Showing 20 changed files with 262 additions and 105 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -655,6 +655,8 @@
- [Added `recursive` option to `File.delete`.][9719]
- [Added `Vector.build`.][9725]
- [Added `Table.running` method][9577]
- [Added `Excel_Workbook.read_many` allowing reading more than one sheet at a
time.][9759]

[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -959,6 +961,7 @@
[9716]: https://github.com/enso-org/enso/pull/9716
[9719]: https://github.com/enso-org/enso/pull/9719
[9725]: https://github.com/enso-org/enso/pull/9725
[9759]: https://github.com/enso-org/enso/pull/9759
[9577]: https://github.com/enso-org/enso/pull/9577

#### Enso Compiler
10 changes: 5 additions & 5 deletions distribution/lib/Standard/AWS/0.0.0-dev/src/S3/S3_File.enso
@@ -76,19 +76,19 @@ type S3_File
content_length = translate_file_errors self <| S3.raw_head self.s3_path.bucket self.s3_path.key self.credentials . contentLength
if content_length.is_nothing then Error.throw (S3_Error.Error "ContentLength header is missing." self.uri) else content_length

## ICON folder_add
GROUP Output
## GROUP Output
ICON folder_add
Creates the directory represented by this file if it did not exist.

It also creates parent directories if they did not exist.

? S3 Handling of Directories

S3 does not have a native concept of directories.
TODO: Add more information about how S3 handles directories.
https://github.com/enso-org/enso/issues/9704
S3 does not have a native concept of directories.
create_directory : File
create_directory self =
## TODO Add more information about how S3 handles directories.
https://github.com/enso-org/enso/issues/9704
Unimplemented.throw "Creating S3 folders is currently not implemented."

## PRIVATE
@@ -40,10 +40,10 @@ import project.System.Output_Stream.Output_Stream
from project.Data.Boolean import Boolean, False, True
from project.Data.Index_Sub_Range.Index_Sub_Range import Last
from project.Data.Text.Extensions import all
from project.Enso_Cloud.Public_Utils import get_required_field
from project.Enso_Cloud.Internal.Enso_File_Helpers import all
from project.System.File_Format import Auto_Detect, Bytes, File_Format, Plain_Text_Format
from project.Enso_Cloud.Public_Utils import get_required_field
from project.System.File import find_extension_from_name
from project.System.File_Format import Auto_Detect, Bytes, File_Format, Plain_Text_Format
from project.System.File.Generic.File_Write_Strategy import generic_copy

type Enso_File
@@ -310,8 +310,8 @@ type Enso_File
Asset_Cache.update file asset
file

## ICON folder_add
GROUP Output
## GROUP Output
ICON folder_add
Creates the directory represented by this file if it did not exist.

It also creates parent directories if they did not exist.
@@ -4,8 +4,8 @@ import project.Data.Json.JS_Object
import project.Enso_Cloud.Enso_File.Enso_Asset_Type
import project.Enso_Cloud.Enso_File.Enso_File
import project.Enso_Cloud.Errors.Enso_Cloud_Error
import project.Enso_Cloud.Internal.Existing_Enso_Asset.Existing_Enso_Asset
import project.Enso_Cloud.Internal.Existing_Enso_Asset.Asset_Cache
import project.Enso_Cloud.Internal.Existing_Enso_Asset.Existing_Enso_Asset
import project.Enso_Cloud.Internal.Utils
import project.Error.Error
import project.Errors.File_Error.File_Error
4 changes: 2 additions & 2 deletions distribution/lib/Standard/Base/0.0.0-dev/src/System/File.enso
@@ -495,8 +495,8 @@ type File
is_directory : Boolean
is_directory self = @Builtin_Method "File.is_directory"

## ICON folder_add
GROUP Output
## GROUP Output
ICON folder_add
Creates the directory represented by this file if it did not exist.

It also creates parent directories if they did not exist.
@@ -10,6 +10,7 @@ from Standard.Base.Widget_Helpers import make_file_read_delimiter_selector

import project.Data_Formatter.Data_Formatter
import project.Delimited.Quote_Style.Quote_Style
import project.Headers.Headers
import project.Internal.Delimited_Reader
import project.Internal.Delimited_Writer
import project.Match_Columns.Match_Columns
@@ -35,12 +36,10 @@ type Delimited_Format
does not include the header row (if applicable).
- quote_style: Specifies the style of quotes used for reading and
writing.
- headers: If set to `True`, the first row is used as column names. If
set to `False`, the column names are generated by adding increasing
numeric suffixes to the base name `Column` (i.e. `Column_1`,
`Column_2` etc.). If set to `Infer`, the process tries to infer if
headers are present on the first row. If the column names are not
unique, numeric suffixes will be appended to disambiguate them.
- headers: Specifies if the first row contains the column names. If set
to `Detect_Headers`, the process tries to infer if headers are
present. If the column names are not unique, numeric suffixes will be
appended to disambiguate them.
- value_formatter: Formatter to parse text values into numbers, dates,
times, etc. If `Nothing` values are left as Text.
- keep_invalid_rows: Specifies whether rows that contain less or more
@@ -58,7 +57,7 @@ type Delimited_Format
defaults to `Nothing` which means that comments are disabled.
@delimiter make_file_read_delimiter_selector
@encoding Encoding.default_widget
Delimited (delimiter:Text=',') (encoding:Encoding=Encoding.utf_8) (skip_rows:Integer=0) (row_limit:Integer|Nothing=Nothing) (quote_style:Quote_Style=Quote_Style.With_Quotes) (headers:Boolean|Infer=Infer) (value_formatter:Data_Formatter|Nothing=Data_Formatter.Value) (keep_invalid_rows:Boolean=True) (line_endings:Line_Ending_Style|Infer=Infer) (comment_character:Text|Nothing=Nothing)
Delimited (delimiter:Text=',') (encoding:Encoding=Encoding.utf_8) (skip_rows:Integer=0) (row_limit:Integer|Nothing=Nothing) (quote_style:Quote_Style=Quote_Style.With_Quotes) (headers:Headers=Headers.Detect_Headers) (value_formatter:Data_Formatter|Nothing=Data_Formatter.Value) (keep_invalid_rows:Boolean=True) (line_endings:Line_Ending_Style|Infer=Infer) (comment_character:Text|Nothing=Nothing)

## PRIVATE
ADVANCED
@@ -112,8 +111,8 @@ type Delimited_Format
## PRIVATE
Clone the instance with some properties overridden.
Note: This function is internal until such time as Atom cloning with modification is built into Enso.
clone : Quote_Style -> (Boolean|Infer) -> (Data_Formatter|Nothing) -> Boolean -> (Text|Nothing) -> (Text|Nothing) -> Delimited_Format
clone self (quote_style=self.quote_style) (headers=self.headers) (value_formatter=self.value_formatter) (keep_invalid_rows=self.keep_invalid_rows) (line_endings=self.line_endings) (comment_character=self.comment_character) =
clone : Quote_Style -> Headers -> (Data_Formatter|Nothing) -> Boolean -> (Text|Nothing) -> (Text|Nothing) -> Delimited_Format
clone self (quote_style=self.quote_style) (headers:Headers=self.headers) (value_formatter=self.value_formatter) (keep_invalid_rows=self.keep_invalid_rows) (line_endings=self.line_endings) (comment_character=self.comment_character) =
Delimited_Format.Delimited self.delimiter self.encoding self.skip_rows self.row_limit quote_style headers value_formatter keep_invalid_rows line_endings comment_character

## ICON data_input
@@ -131,13 +130,13 @@ type Delimited_Format
## ICON data_input
Create a clone of this with first row treated as header.
with_headers : Delimited_Format
with_headers self = self.clone headers=True
with_headers self = self.clone headers=Headers.Has_Headers

## ICON data_input
Create a clone of this where the first row is treated as data, not a
header.
without_headers : Delimited_Format
without_headers self = self.clone headers=False
without_headers self = self.clone headers=Headers.No_Headers

## ICON data_input
Create a clone of this with value parsing.
19 changes: 13 additions & 6 deletions distribution/lib/Standard/Table/0.0.0-dev/src/Errors.enso
@@ -247,13 +247,13 @@ type Unquoted_Characters_In_Output
## Indicates that a specified location was not valid.
type Invalid_Location
## PRIVATE
Error (location : Text | Any)
Error (location : Text | Any) (message:Text|Nothing=Nothing)

## PRIVATE
Pretty print the invalid location error.
to_display_text : Text
to_display_text self =
"The location '"+self.location.to_text+"' is not valid."
self.message.if_nothing ("The location '"+self.location.to_text+"' is not valid.")

## Indicates that some values did not match the expected datatype format.

@@ -278,22 +278,29 @@ type Empty_File_Error
## PRIVATE
Pretty print the empty file error.
to_display_text : Text
to_display_text = "It is not allowed to create a Table with no columns, so an empty file could not have been loaded."
to_display_text self =
_ = self
"The file is empty so it cannot be loaded."

## PRIVATE
handle_java_exception =
Panic.catch EmptyFileException handler=(_ -> Error.throw Empty_File_Error)

## Indicates that an empty sheet was encountered, so no data could be loaded.
type Empty_Sheet_Error
type Empty_Sheet
## PRIVATE
Error

## PRIVATE
Pretty print the empty sheet error.
to_display_text : Text
to_display_text = "It is not allowed to create a Table with no columns, so an empty sheet could not have been loaded."
to_display_text self =
_ = self
"There is no data in the sheet."

## PRIVATE
handle_java_exception =
Panic.catch EmptySheetException handler=(_ -> Error.throw Empty_Sheet_Error)
Panic.catch EmptySheetException handler=(_ -> Error.throw Empty_Sheet.Error)

## Indicates that the column was already present in the table.
type Existing_Column
@@ -10,6 +10,7 @@ from Standard.Base.System.File_Format import parse_boolean_with_infer

import project.Excel.Excel_Range.Excel_Range
import project.Excel.Excel_Workbook.Excel_Workbook
import project.Headers.Headers
import project.Internal.Excel_Reader
import project.Internal.Excel_Section.Excel_Section
import project.Internal.Excel_Writer
@@ -18,12 +19,12 @@ import project.Table.Table

## PRIVATE
Resolve the xls_format setting to a boolean.
should_treat_as_xls_format : (Boolean|Infer) -> File -> Boolean ! Illegal_Argument
should_treat_as_xls_format xls_format file =
should_treat_as_xls_format : (Boolean|Infer) -> File_Format_Metadata -> Boolean ! Illegal_Argument
should_treat_as_xls_format xls_format file:File_Format_Metadata =
if xls_format != Infer then xls_format else
inferred_xls_format = xls_format_from_metadata file
inferred_xls_format.if_nothing <|
Error.throw (Illegal_Argument.Error ("File not recognized as Excel file (" + file.name + ")"))
Error.throw (Illegal_Argument.Error ("File extension not recognized as Excel (" + file.name + "). Specify xls_format explicitly."))

## Read the file to a `Table` from an Excel file
type Excel_Format
@@ -43,11 +44,10 @@ type Excel_Format

Arguments:
- sheet: The sheet number or name.
- headers: If set to `True`, the first row is used as column names. If
set to `False`, the column names are Excel column names. If set to
`Infer`, the process tries to infer if headers are present on the first
row. If the column names are not unique, numeric suffixes will be
appended to disambiguate them.
- headers: Specifies if the first row contains the column names. If set
to `Detect_Headers`, the process tries to infer if headers are
present. If the column names are not unique, numeric suffixes will be
appended to disambiguate them.
- skip_rows: The number of rows to skip before reading the data.
- row_limit: The maximum number of rows to read. If set to `Nothing`, all
rows are read.
@@ -56,17 +56,16 @@ type Excel_Format
If set to `False`, the file is read as an Excel 2007+ format.
`Infer` will attempt to deduce this from the extension of the filename.
@sheet (Text_Input display=Display.Always)
Sheet (sheet:(Integer|Text)=1) (headers:(Boolean|Infer)=Infer) (skip_rows:Integer=0) (row_limit:(Integer|Nothing)=Nothing) (xls_format:Boolean|Infer=Infer)
Sheet (sheet:(Integer|Text)=1) (headers:Headers=Headers.Detect_Headers) (skip_rows:Integer=0) (row_limit:(Integer|Nothing)=Nothing) (xls_format:Boolean|Infer=Infer)

## Reads a range from an Excel file as a `Table`.

Arguments:
- address: A name of a range or an Excel-style address (e.g. Sheet1!A1:B2).
- headers: If set to `True`, the first row is used as column names. If
set to `False`, the column names are Excel column names. If set to
`Infer`, the process tries to infer if headers are present on the first
row. If the column names are not unique, numeric suffixes will be
appended to disambiguate them.
- headers: Specifies if the first row contains the column names. If set
to `Detect_Headers`, the process tries to infer if headers are
present. If the column names are not unique, numeric suffixes will be
appended to disambiguate them.
- skip_rows: The number of rows to skip before reading the data.
- row_limit: The maximum number of rows to read. If set to `Nothing`, all
rows are read.
@@ -75,7 +74,7 @@ type Excel_Format
If set to `False`, the file is read as an Excel 2007+ format.
`Infer` will attempt to deduce this from the extension of the filename.
@address Text_Input
Range (address:(Text|Excel_Range)) (headers:(Boolean|Infer)=Infer) (skip_rows:Integer=0) (row_limit:(Integer|Nothing)=Nothing) (xls_format : Boolean | Infer = Infer)
Range (address:(Text|Excel_Range)) (headers:Headers=Headers.Detect_Headers) (skip_rows:Integer=0) (row_limit:(Integer|Nothing)=Nothing) (xls_format : Boolean | Infer = Infer)

## PRIVATE
ADVANCED