Commit
Parsing values with known types (#3455)
Showing 40 changed files with 955 additions and 65 deletions.
25 changes: 25 additions & 0 deletions
distribution/lib/Standard/Table/0.0.0-dev/src/Data/Column_Type_Selection.enso
@@ -0,0 +1,25 @@
from Standard.Base import all
import Standard.Base.Data.Time

## The type representing inferring the column type automatically based on values
   present in the column.

   The most specific type which is valid for all values in a column is chosen:
   - if all values are integers, `Integer` is chosen,
   - if all values are decimals or integers, `Decimal` is chosen,
   - if all values are booleans, `Boolean` is chosen,
   - if the values are all the same time type (a date, a time or a date-time),
     the corresponding type is chosen, `Date`, `Time_Of_Day` or `Time`,
     respectively,
   - otherwise, `Text` is chosen as a fallback and the column is kept as-is
     without parsing.
type Auto

## Specifies the desired datatype for parsing a particular column.

   Arguments:
   - column: the column selector which can either be the column name or the
     index.
   - datatype: The desired datatype for the column or `Auto` to infer the type
     from the data.
type Column_Type_Selection (column:Text|Integer) datatype:(Auto|Integer|Decimal|Date|Time|Time_Of_Day|Boolean)=Auto
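
The inference order documented for `Auto` can be illustrated with a small Python sketch. This is not the code from this commit: the function name `infer_column_type`, the helper predicates, and the regular expressions are hypothetical. The date and time formats are assumed to be Python equivalents of the `Data_Formatter` defaults, and returning `Text` for an empty column is also an assumption.

```python
import re
from datetime import datetime


def _matches_format(value, fmt):
    """True if the whole string parses with the given strptime format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def _is_integer(value):
    return re.fullmatch(r"[+-]?[0-9]+", value) is not None


def _is_decimal(value):
    # Integers also count as decimals, as in the documented rule.
    return re.fullmatch(r"[+-]?[0-9]+(\.[0-9]+)?", value) is not None


def _is_boolean(value):
    return value in {"True", "true", "TRUE", "False", "false", "FALSE"}


# Candidates are tried from most specific to least specific (assumed order).
CANDIDATES = [
    ("Integer", _is_integer),
    ("Decimal", _is_decimal),
    ("Boolean", _is_boolean),
    ("Date", lambda v: _matches_format(v, "%Y-%m-%d")),
    ("Time_Of_Day", lambda v: _matches_format(v, "%H:%M:%S")),
    ("Time", lambda v: _matches_format(v, "%Y-%m-%d %H:%M:%S")),
]


def infer_column_type(values):
    """Return the name of the first candidate type that accepts every value."""
    if not values:
        # Assumption: an empty column is left as text.
        return "Text"
    for name, accepts in CANDIDATES:
        if all(accepts(v) for v in values):
            return name
    # Fallback: keep the column as-is, without parsing.
    return "Text"
```

For example, `infer_column_type(["1", "2.5"])` picks `Decimal` because not every value is an integer, while a column mixing dates and times falls back to `Text`.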
29 changes: 29 additions & 0 deletions
distribution/lib/Standard/Table/0.0.0-dev/src/Data/Data_Formatter.enso
@@ -0,0 +1,29 @@
from Standard.Base import all

## Specifies options for reading text data in a table to more specific types and
   serializing them back.

   Arguments:
   - trim_values: Trim whitespace before parsing.
   - allow_leading_zeros: Specifies how to treat numeric values starting with
     leading zeroes. Defaults to `False`, because converting such
     values to numbers is a lossy operation - after converting such a number
     back to text the leading zeroes will get lost. If leading zeroes are not
     allowed and the column contains any values with leading zeroes, it will not
     get automatically converted to numbers, remaining as text. However, if the
     column is specifically requested to be converted to a numeric column, only
     a warning will be issued indicating that some leading zeroes were present,
     but the conversion will proceed.
   - decimal_point: The character used to separate the integer part from the
     fractional part of a number. Defaults to '.'. Can be changed for example to
     ',' to allow for European format.
   - thousand_separator: A separator that can be used to separate groups of
     digits in numbers. For example, it can be set to ',' to allow for notation
     like '1,000,000.0'.
   - datetime_formats: Expected datetime formats.
   - date_formats: Expected date formats.
   - time_formats: Expected time formats.
   - locale: The locale to use when parsing dates and times.
   - true_values: Values representing True.
   - false_values: Values representing False.
type Data_Formatter trim_values:Boolean=True allow_leading_zeros:Boolean=False decimal_point:Text='.' thousand_separator:Text='' datetime_formats:[Text]=["yyyy-MM-dd HH:mm:ss"] date_formats:[Text]=["yyyy-MM-dd"] time_formats:[Text]=["HH:mm:ss"] locale:Locale=Locale.default true_values:[Text]=["True","true","TRUE"] false_values:[Text]=["False","false","FALSE"]
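
To make the interaction of `trim_values`, `allow_leading_zeros`, `decimal_point` and `thousand_separator` concrete, here is a minimal Python sketch. The function `parse_decimal` is hypothetical and only reuses the option names from the doc comment above. It models the automatic-inference case, where a value with leading zeroes is rejected. It does not model the warning issued when a numeric type is requested explicitly, and it does not check that separators fall on three-digit boundaries.

```python
import re


def parse_decimal(text, decimal_point=".", thousand_separator="",
                  allow_leading_zeros=False, trim_values=True):
    """Return the parsed float, or None when the text is rejected."""
    if trim_values:
        text = text.strip()
    if thousand_separator:
        # Simplification: separators are dropped wherever they appear.
        text = text.replace(thousand_separator, "")
    if decimal_point != ".":
        if "." in text:
            # A stray '.' is not valid when another decimal point is configured.
            return None
        text = text.replace(decimal_point, ".")
    if re.fullmatch(r"[+-]?[0-9]+(\.[0-9]+)?", text) is None:
        return None
    digits = text.lstrip("+-")
    has_leading_zero = len(digits) > 1 and digits[0] == "0" and digits[1].isdigit()
    if has_leading_zero and not allow_leading_zeros:
        # Converting "007" to 7 is lossy: the zeroes cannot be restored.
        return None
    return float(text)
```

With the defaults, `parse_decimal("007")` returns `None`, so a column containing it would stay as text. Passing `decimal_point=","` and `thousand_separator="."` accepts the European form `"1.000.000,5"`.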
19 changes: 19 additions & 0 deletions
distribution/lib/Standard/Table/0.0.0-dev/src/Internal/Parse_Values_Helper.enso
@@ -0,0 +1,19 @@
from Standard.Base import all

from Standard.Table.Error as Table_Errors import Invalid_Format, Leading_Zeros

polyglot java import org.enso.table.parsing.problems.InvalidRow
polyglot java import org.enso.table.parsing.problems.InvalidFormat
polyglot java import org.enso.table.parsing.problems.LeadingZeros
polyglot java import org.enso.table.parsing.problems.MismatchedQuote
polyglot java import org.enso.table.parsing.problems.AdditionalInvalidRows

translate_parsing_problem column_name expected_datatype problem =
    invalid_format = [InvalidFormat, (java_problem-> Invalid_Format column_name expected_datatype (Vector.Vector java_problem.cells))]
    leading_zeros = [LeadingZeros, (java_problem-> Leading_Zeros column_name expected_datatype (Vector.Vector java_problem.cells))]
    translations = [invalid_format, leading_zeros]
    found = translations.find t->
        Java.is_instance problem t.first
    translation = found.catch _->
        Error.throw (Illegal_State_Error "Reported an unknown problem type: "+problem.to_text)
    translation.second problem
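
The lookup in `translate_parsing_problem` pairs each Java problem class with a function that builds the matching Enso error, picks the first pair whose class the problem is an instance of, and fails on anything unrecognized. A rough Python analogue is sketched below. All classes here are hypothetical stand-ins: `InvalidFormat` and `LeadingZeros` imitate the Java problem classes, the two dataclasses imitate the Enso errors, and `RuntimeError` takes the place of `Illegal_State_Error`.

```python
from dataclasses import dataclass


# Stand-ins for the Java problem classes (hypothetical).
class InvalidFormat:
    def __init__(self, cells):
        self.cells = cells


class LeadingZeros:
    def __init__(self, cells):
        self.cells = cells


class MismatchedQuote:
    """A problem type with no translation registered below."""


# Stand-ins for the Enso-side error values (hypothetical).
@dataclass
class InvalidFormatError:
    column_name: str
    expected_datatype: str
    cells: list


@dataclass
class LeadingZerosError:
    column_name: str
    expected_datatype: str
    cells: list


def translate_parsing_problem(column_name, expected_datatype, problem):
    # Each entry pairs a problem class with a function building its translation.
    translations = [
        (InvalidFormat,
         lambda p: InvalidFormatError(column_name, expected_datatype, list(p.cells))),
        (LeadingZeros,
         lambda p: LeadingZerosError(column_name, expected_datatype, list(p.cells))),
    ]
    for problem_type, translate in translations:
        if isinstance(problem, problem_type):
            return translate(problem)
    # No entry matched: report the unknown problem type.
    raise RuntimeError("Reported an unknown problem type: " + repr(problem))
```

Keeping the pairs in a list means that supporting another problem type only requires one more entry, which mirrors how `translations` is built in the Enso helper.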
15 changes: 0 additions & 15 deletions
...e/runtime/src/main/java/org/enso/interpreter/node/expression/builtin/bool/ToTextNode.java
This file was deleted.