Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

throw error on dimension drop in schema evolution. #4958

Merged
merged 1 commit into from
May 16, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 44 additions & 0 deletions test/src/unit-cppapi-schema-evolution.cc
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,50 @@ TEST_CASE(
}
}

TEST_CASE(
"C++ API: SchemaEvolution, check error when dropping dimension",
"[cppapi][schema][evolution][drop]") {
using namespace tiledb;
Context ctx;
VFS vfs(ctx);

std::string array_uri = "test_schema_evolution_array";

Domain domain(ctx);
auto id1 = Dimension::create<int>(ctx, "d1", {{-100, 100}}, 10);
auto id2 = Dimension::create<int>(ctx, "d2", {{0, 100}}, 5);
domain.add_dimension(id1).add_dimension(id2);

auto a1 = Attribute::create<int>(ctx, "a1");
auto a2 = Attribute::create<int>(ctx, "a2");

ArraySchema schema(ctx, TILEDB_DENSE);
schema.set_domain(domain);
schema.add_attribute(a1);
schema.add_attribute(a2);
schema.set_cell_order(TILEDB_ROW_MAJOR);
schema.set_tile_order(TILEDB_COL_MAJOR);

if (vfs.is_dir(array_uri)) {
vfs.remove_dir(array_uri);
}

Array::create(array_uri, schema);

auto evolution = ArraySchemaEvolution(ctx);

// try to drop dimension d1
evolution.drop_attribute("d1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the statement that should fail. Avoiding a failure here only defers detection of the problem until later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments Eric. I agree that the check should happen further up in drop_attribute() however the ArraySchema object (which is required for the use of has_attribute()) is not available in drop_attribute(). It is only passed as a parameter in ArraySchemaEvolution::array_evolve() in the form of the array_uri which is then used to create the schema object

Here is the current order:

Array::create(array_uri, schema);
auto evolution = ArraySchemaEvolution(ctx);
evolution.drop_attribute("d1");
evolution.array_evolve(array_uri));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ArraySchema object [...] is not available in drop_attribute()

Yes, I see that now. I didn't realize the whole design was broken in this particular way.


// check that an exception is thrown
CHECK_THROWS(evolution.array_evolve(array_uri));

// Clean up
if (vfs.is_dir(array_uri)) {
vfs.remove_dir(array_uri);
}
}

TEST_CASE(
"C++ API: SchemaEvolution, add attributes and read",
"[cppapi][schema][evolution][add]") {
Expand Down
7 changes: 5 additions & 2 deletions tiledb/sm/array_schema/array_schema_evolution.cc
Original file line number Diff line number Diff line change
Expand Up @@ -152,9 +152,12 @@ shared_ptr<ArraySchema> ArraySchemaEvolution::evolve_schema(
for (auto& attr_name : attributes_to_drop_) {
bool has_attr = false;
throw_if_not_ok(schema->has_attribute(attr_name, &has_attr));
if (has_attr) {
throw_if_not_ok(schema->drop_attribute(attr_name));
if (!has_attr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is not quite right. The issue is with capnp deserialization. As with all input that comes in from the outside, here from the network, we need to validate inputs. We may not assume that a capnp message was generated by libtiledb necessarily; it might have been formed by other, likely malicious, software. We don't need to care why it was created, only that it's detected and rejected.

There's a check in drop_attribute to see if the attribute is the name of an attribute that was previously added. In this case, there's no check to see if the dropped attribute here also appears in the list of added attributes.

throw ArraySchemaEvolutionException(
"Cannot drop attribute; Input attribute name refers to a dimension "
"or does not exist");
}
throw_if_not_ok(schema->drop_attribute(attr_name));
}

// Drop enumerations
Expand Down
Loading