Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tokenize_to_columns or parse_to_columns results in a single column we shouldn't add the 1 #6607

Merged
merged 5 commits into from
May 9, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,7 @@ fan_out_to_columns table input_column_id function column_count=Nothing on_proble
input_column = table.get input_column_id
problem_builder = Problem_Builder.new
new_columns_unrenamed = map_columns_to_multiple input_column function column_count problem_builder
new_columns = rename_new_columns table new_columns_unrenamed problem_builder
new_columns = rename_new_columns table input_column.name new_columns_unrenamed problem_builder
new_table = replace_column_with_columns table input_column new_columns
problem_builder.attach_problems_after on_problems new_table

Expand Down Expand Up @@ -315,17 +315,22 @@ map_columns_to_multiple input_column function column_count problem_builder =

builders

# Name columns. If there's only one, use the original column name.
new_column_names = case builders.length of
1 -> [input_column.name]
_ -> 0.up_to builders.length . map i-> default_column_namer input_column.name i

# Build Columns.
builders.map .seal . map_with_index i-> storage->
name = default_column_namer input_column.name i
Column.from_storage name storage
sealed = builders.map .seal
GregoryTravis marked this conversation as resolved.
Show resolved Hide resolved
new_column_names.zip sealed Column.from_storage

## PRIVATE
Rename a vector of columns to be unique when added to a table.
rename_new_columns : Table -> Vector Column -> Problem_Builder -> Vector Column
rename_new_columns table columns problem_builder =
rename_new_columns : Table -> Text -> Vector Column -> Problem_Builder -> Vector Column
rename_new_columns table removed_column_name columns problem_builder =
unique = Unique_Name_Strategy.new
unique.mark_used <| table.columns.map .name
remaining_columns = table.columns . filter (c-> c.name != removed_column_name) . map .name
unique.mark_used remaining_columns
new_columns = columns.map column->
new_name = unique.make_unique column.name
column.rename new_name
Expand Down
16 changes: 16 additions & 0 deletions test/Table_Tests/src/In_Memory/Split_Tokenize_Spec.enso
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,14 @@ spec =
t2 = t.split_to_rows "bar" "b"
t2.should_equal expected

Test.specify "can do split_to_columns with one output column, no column suffix added" <|
cols = [["foo", [0, 1, 2]], ["bar", ["abc", "cbdbef", "ghbijbu"]]]
t = Table.new cols
expected_rows = [[0, "abc"], [1, "cbdbef"], [2, "ghbijbu"]]
expected = Table.from_rows ["foo", "bar"] expected_rows
t2 = t.split_to_columns "bar" "q"
t2.should_equal expected

Test.group "Table.tokenize" <|
Test.specify "can do tokenize_to_columns" <|
cols = [["foo", [0, 1, 2]], ["bar", ["a12b34r5", "23", "2r4r55"]]]
Expand Down Expand Up @@ -75,6 +83,14 @@ spec =
t2 = t.tokenize_to_rows "bar" "\d+"
t2.should_equal expected

Test.specify "can do tokenize_to_columns with one output column, no column suffix needed" <|
cols = [["foo", [0, 1, 2]], ["bar", ["a12b", "23", "2r"]]]
t = Table.new cols
expected_rows = [[0, "12"], [1, "23"], [2, "2"]]
expected = Table.from_rows ["foo", "bar"] expected_rows
t2 = t.tokenize_to_columns "bar" "\d+"
t2.should_equal expected

Test.specify "can do tokenize_to_rows with some rows that have no matches" <|
cols = [["foo", [0, 1, 2, 3]], ["bar", ["a12b34r5", "23", "q", "2r4r55"]]]
t = Table.new cols
Expand Down