[DAE-63] Handling exception when adding duplicate partitions #37
…e in the clients user
    self.add_partitions(partition_list_with_correct_location)
    return True
except AlreadyExistsException:
    return False
no worries about returning a boolean, but I think it might get confusing 🤔
I mean, a "False" return implies, in my opinion, that the partition addition was unsuccessful.
does that make sense?
Our reasoning is this: the method "adds partitions" to the table. If it returns True, the addition was successful; if it returns False, the operation was not complete. That is the case here. If a partition does not yet exist for that table, the method adds it and returns True. If the partition already exists, the method returns False, indicating that the operation was not complete because the partition had already been added. As you said, the partition addition was unsuccessful, since a partition cannot be added twice.
Also, as a third case, if any other exception occurs, it will be raised.
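The three cases described above can be sketched roughly as follows. This is only an illustration, not the client code in this PR: `AlreadyExistsException` is a local stand-in for the metastore's exception, and `FakeMetastoreClient` with its in-memory set is hypothetical.

```python
class AlreadyExistsException(Exception):
    """Local stand-in for the exception the metastore server raises."""


class FakeMetastoreClient:
    """Hypothetical in-memory client, used only to illustrate the return values."""

    def __init__(self):
        self._partitions = set()

    def add_partitions(self, partition_list):
        # Mimic the server: adding a partition that already exists raises.
        for partition in partition_list:
            if partition in self._partitions:
                raise AlreadyExistsException(partition)
        self._partitions.update(partition_list)

    def add_partitions_if_not_exists(self, partition_list):
        try:
            self.add_partitions(partition_list)
            return True
        except AlreadyExistsException:
            # Duplicate partition: nothing was added.
            return False


client = FakeMetastoreClient()
first = client.add_partitions_if_not_exists(["dt=2020-01-01"])
second = client.add_partitions_if_not_exists(["dt=2020-01-01"])
print(first, second)  # True False
```

Under these assumptions, the first call adds the partition and the second call reports that nothing was added, without raising.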
yeah, got it! but let me bring another point of view: if the "operation was not complete", as you said, what blocks me to add another exception returning False? do you agree it can get confusing over time?
moreover, when you call this method the expected behavior is that a partition will be available for use, right?
so, re-adding it wouldn't bring any errors or misbehavior, since the operation is exactly the same.
anyways, maybe separating methods like add_partitions_if_not_exists (that throws an exception in case it does) and another add_or_replace_partitions could make it more explicit
yeah, got it! but let me bring another point of view: if the "operation was not complete", as you said, what blocks me to add another exception returning False? do you agree it can get confusing over time?
Yes, if more exceptions were added to this method (each returning False), it would get messy, and I think that shouldn't be done. We want to guarantee that if we call it twice, it adds the partition the first time and does nothing the second time, instead of throwing an error. That is why I added this except: to keep the behavior of adding a partition (duplicate or not) "clean" and error-free.
Any other errors should be raised rather than caught by the try block; catching them would silently mask them and go against the method's purpose.
anyways, maybe separating methods like add_partitions_if_not_exists (that throws an exception in case it does) and another add_or_replace_partitions could make it more explicit
I'm not sure I fully understood your suggestion about these two methods.
If we throw an exception in add_partitions_if_not_exists, wouldn't the code that uses the client need to handle it?
As for add_or_replace_partitions, I'm confused about what you mean by replacing, because I see only two operations for partitions: add or remove. I didn't understand what replacing a partition would mean.
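The point in this comment about not catching other errors can be illustrated with a small sketch. The names below are assumptions for illustration only: `AlreadyExistsException` is a local stand-in, the wrapper is written as a free function taking the underlying call as an argument, and `failing_add` simulates an unrelated failure.

```python
class AlreadyExistsException(Exception):
    """Local stand-in for the metastore's duplicate-partition exception."""


def add_partitions_if_not_exists(add_partitions, partition_list):
    """Hypothetical free-function version of the wrapper, for illustration."""
    try:
        add_partitions(partition_list)
        return True
    except AlreadyExistsException:
        # Only the duplicate case is swallowed; anything else propagates.
        return False


def failing_add(partition_list):
    # Simulates an unrelated failure, e.g. a lost connection.
    raise ConnectionError("metastore unreachable")


try:
    add_partitions_if_not_exists(failing_add, ["dt=2020-01-01"])
    propagated = False
except ConnectionError:
    propagated = True

print(propagated)  # True
```

Because the except clause names only the duplicate-partition exception, the unrelated error reaches the caller instead of being turned into a False return.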
Another option would be to call the native add_partitions and handle the AlreadyExistsException on the user side. That was one of my first options, but I decided to keep things cleaner for library users by encapsulating this in the client.
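For comparison, the user-side alternative mentioned above might look something like this. It is a sketch under assumptions: `native_add_partitions` and the `existing` set are hypothetical stand-ins for the native call and the table's current partitions, and `AlreadyExistsException` is defined locally.

```python
class AlreadyExistsException(Exception):
    """Local stand-in for the metastore's duplicate-partition exception."""


def native_add_partitions(existing, partition_list):
    """Hypothetical stand-in for the native add_partitions call."""
    duplicates = existing.intersection(partition_list)
    if duplicates:
        raise AlreadyExistsException(sorted(duplicates))
    existing.update(partition_list)


existing = {"dt=2020-01-01"}

# Without the encapsulation, every caller has to repeat this try/except.
try:
    native_add_partitions(existing, ["dt=2020-01-01"])
    added = True
except AlreadyExistsException:
    added = False

print(added)  # False
```

Encapsulating the try/except in the client avoids repeating this block wherever partitions are added.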
Done: 90f7b99
7e24910 to 90f7b99
:param db_name: database name where the table is at
:param table_name: table name which the partitions belong to
:param partition_list: list of partitions to be added to the table
"""
if not partition_list:
❤️
def test_add_partitions_to_table_with_invalid_partitions(
    self,
    mocked_add_partitions,
    mocked__format_partitions,
    mocked_get_table,
    hive_metastore_client,
):
    # assert
    with raises(ValueError):
        # act
        hive_metastore_client.add_partitions_if_not_exists(
            db_name=ANY, table_name=ANY, partition_list=[]
        )
    mocked_get_table.assert_not_called()
    mocked__format_partitions.assert_not_called()
    mocked_add_partitions.assert_not_called()
niceee
🏅
lgtm
74f975a
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
Why? 📖
The server raises an AlreadyExistsException when a partition is added twice.
We don't want the client to raise an error in this case: if a partition already exists, nothing should be done.
This makes the code cleaner for library users, who won't need to handle this error on their side.
What? 🔧
Type of change 🗄️
How was everything tested? 📏
Unit tests
Checklist 📝