Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#4867] feat(core): Add storage support for columns in Gravitino #5078

Merged
merged 9 commits into from
Oct 15, 2024

Conversation

jerryshao
Copy link
Contributor

What changes were proposed in this pull request?

This PR aims to add storage support for columns in Gravitino.

Why are the changes needed?

With this, we can also support managing columns in Gravitno, like set tags, do column level privileges.

Fix: #4867

Does this PR introduce any user-facing change?

No.

How was this patch tested?

UTs added.

@jerryshao jerryshao self-assigned this Oct 9, 2024
POConverters.updateTablePOWithVersion(oldTablePO, newEntity), oldTablePO));
SessionUtils.doMultipleWithCommit(
() ->
updateResult.set(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the table updateResult pass to the update column operation so that if it equals 0 can throw an IOException immediately?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Fileset's update / delete operation can improve like this too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe Fileset's update / delete operation can improve like this too.

I can do this in a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed for table update and delete.

`column_nullable` TINYINT(1) NOT NULL DEFAULT 1 COMMENT 'column nullable, 0 is not nullable, 1 is nullable',
`column_auto_increment` TINYINT(1) NOT NULL DEFAULT 0 COMMENT 'column auto increment, 0 is not auto increment, 1 is auto increment',
`column_default_value` VARCHAR(256) DEFAULT NULL COMMENT 'column default value',
`column_op_type` TINYINT(1) NOT NULL COMMENT 'column operation type, 1 is create, 2 is update, 3 is delete',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this field save the Enum type name string? Maybe it's more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it can be. but I think this might be more efficient and I clearly define it in Enum, so from my side the current implementation is acceptable.

-- under the License.
--
CREATE TABLE IF NOT EXISTS `table_column_version_info` (
`id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'auto increment id',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the id is unnecessary, because we have column_id field here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

column_id will be duplicated when we update the column multiple times, so it cannot be a primary key. You can check the current column table design, it uses MVCC to manage the column changes, so for one column, it will have multiple same column_id records with different fields in this table.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

@jerryshao jerryshao requested review from xloya and yuqi1129 October 12, 2024 03:43
@xloya
Copy link
Contributor

xloya commented Oct 12, 2024

Overall LGTM, maybe also need add the recursive delete columns cases when Metalake/Catalog/Schema were deleted in the TestEntityStorage.

@jerryshao
Copy link
Contributor Author

Overall LGTM, maybe also need add the recursive delete columns cases when Metalake/Catalog/Schema were deleted in the TestEntityStorage.

Let me see how to add a UT about this.

@jerryshao
Copy link
Contributor Author

I feel like it is not that easy to validate if the column is deleted or not through entity store interface, because we columns cannot managed directly through this interface, it is handled when we do the table operations.

@xloya
Copy link
Contributor

xloya commented Oct 12, 2024

I feel like it is not that easy to validate if the column is deleted or not through entity store interface, because we columns cannot managed directly through this interface, it is handled when we do the table operations.

I see. Maybe we can refer to the TestJDBCBackend.listFilesetVersions() method, query the data through JDBC connection directly.

@jerryshao
Copy link
Contributor Author

I feel like it is not that easy to validate if the column is deleted or not through entity store interface, because we columns cannot managed directly through this interface, it is handled when we do the table operations.

I see. Maybe we can refer to the TestJDBCBackend.listFilesetVersions() method, query the data through JDBC connection directly.

OK, let me check it.

@jerryshao
Copy link
Contributor Author

@xloya @yuqi1129 @mchades can you please help to review again.

Copy link
Contributor

@yuqi1129 yuqi1129 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor one, others LGTM

.collect(Collectors.toList());
}

public void insertColumnPOs(TablePO tablePO, List<ColumnEntity> columnEntities) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this method package accessible, by the way, since the value will not do commit operation, I believe it's just for some special cases, could you add some description here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just add some comments and change the scope of the methods.

return;
}

SessionUtils.doWithoutCommit(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor

@yuqi1129 yuqi1129 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @mchades @xloya Would you like to take another look?

@jerryshao jerryshao merged commit 5f73e62 into apache:main Oct 15, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask] Design and implement the column storage layout for columns.
4 participants