[DAE-63] Handling exception when adding duplicate partitions #37

Merged
merged 8 commits into from
Jan 27, 2021

Conversation

LucasMMota
Contributor

@LucasMMota LucasMMota commented Jan 20, 2021

Why? 📖

The server raises an AlreadyExistsException when a partition is added twice.
We don't want the client to raise an error in this case: if a partition already exists, nothing should be done.
This keeps the code cleaner for library users, who no longer need to handle this error on their side.

What? 🔧

  • add a try block for add_partitions_to_table
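
A minimal sketch of the try block listed above, mirroring the first revision reviewed in this PR, where the wrapper returned a boolean. The `AlreadyExistsException` class and `FakeMetastoreClient` below are stand-ins written for illustration, not the library's real classes, and the fake's duplicate check is an assumption about how the server behaves.

```python
class AlreadyExistsException(Exception):
    """Stand-in for the exception the metastore raises on duplicates."""


class FakeMetastoreClient:
    """Hypothetical in-memory client; it only mimics the duplicate check."""

    def __init__(self):
        self._partitions = set()

    def add_partitions(self, partitions):
        # Assumed behavior: reject the batch if any partition already exists.
        if any(p in self._partitions for p in partitions):
            raise AlreadyExistsException("partition already exists")
        self._partitions.update(partitions)

    def add_partitions_to_table(self, partitions):
        # The try block: a duplicate is reported to the caller, not raised.
        try:
            self.add_partitions(partitions)
            return True
        except AlreadyExistsException:
            return False


client = FakeMetastoreClient()
first = client.add_partitions_to_table(["year=2021/month=1"])
second = client.add_partitions_to_table(["year=2021/month=1"])
print(first, second)  # True False
```

Only the duplicate case is caught, so any other failure still reaches the caller.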

Type of change 🗄️

  • enhancement (non-breaking change)
  • This change requires a documentation update

How was everything tested? 📏

Unit tests

Checklist 📝

  • I have added labels to distinguish the type of pull request.
  • My code follows the style guidelines of this project (docstrings, type hinting and linter compliance);
  • I have performed a self-review of my own code;
  • I have made corresponding changes to the documentation;
  • I have added tests that prove my fix is effective or that my feature works;
  • I have made sure that new and existing unit tests pass locally with my changes;

@LucasMMota LucasMMota added the enhancement New feature or request label Jan 20, 2021
@LucasMMota LucasMMota requested a review from a team as a code owner January 20, 2021 19:55
@LucasMMota LucasMMota self-assigned this Jan 20, 2021
@LucasMMota LucasMMota mentioned this pull request Jan 20, 2021
1 task
jufreire
jufreire previously approved these changes Jan 20, 2021
    self.add_partitions(partition_list_with_correct_location)
    return True
except AlreadyExistsException:
    return False

no worries on returning boolean, but I think that it might get confusing 🤔
I mean, a "False" return implies, in my opinion, that the partition addition was unsuccessful.

makes sense?

Contributor Author

@LucasMMota LucasMMota Jan 21, 2021

The way we see it: the method "adds partitions" to the table. If it returns True, the operation succeeded; if it returns False, the operation was not completed. That is the case here. If a partition does not yet exist for that table, the method adds it and returns True. If the partition already exists, the method returns False, indicating that the operation was not completed because the partition had already been added. As you said, the partition addition was unsuccessful, since a partition cannot be added twice.
As a third case, any other exception is raised to the caller.

yeah, got it! but let me bring another point of view: if the "operation was not complete", as you said, what stops me from adding another exception that returns False? do you agree it can get confusing over time?

moreover, when you call this method the expected behavior is that a partition will be available for use, right?
so, re-adding it wouldn't bring any errors or misbehavior, since the operation is exactly the same.

anyways, maybe separating this into two methods, add_partitions_if_not_exists (which throws an exception if the partition already exists) and add_or_replace_partitions, could make it more explicit

Contributor Author

> yeah, got it! but let me bring another point of view: if the "operation was not complete", as you said, what stops me from adding another exception that returns False? do you agree it can get confusing over time?

Yes, if more exceptions were caught in this method (each returning False), it would get messy, and I don't think that should be done. We want to guarantee that calling it twice adds the partition the first time and does nothing the second time, without throwing an error. That is why I added this except clause: to keep adding a partition (duplicate or not) "clean" and error-free.
Any other error should be raised, not caught by the try block; catching it would silently mask it and go against the method's purpose.
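
To make the point above concrete, here is a rough sketch of a narrow except clause: calling the wrapper twice is a no-op the second time, while an unrelated failure still propagates. Both exception classes and the two `add_partitions` callables are hypothetical stand-ins; this is not the code merged in this PR.

```python
class AlreadyExistsException(Exception):
    """Stand-in for the metastore's duplicate-partition error."""


class OtherServerError(Exception):
    """Hypothetical stand-in for any unrelated failure."""


def add_partitions_if_not_exists(add_partitions, partitions):
    """Call add_partitions, treating a duplicate as a no-op."""
    try:
        add_partitions(partitions)
    except AlreadyExistsException:
        # Only the duplicate case is swallowed: the partition is already there.
        pass


stored = set()


def add_partitions(partitions):
    # Fake backend that rejects partitions it has already stored.
    if stored & set(partitions):
        raise AlreadyExistsException("partition already exists")
    stored.update(partitions)


def broken_add_partitions(partitions):
    # Fake backend that always fails for a reason other than a duplicate.
    raise OtherServerError("connection lost")


add_partitions_if_not_exists(add_partitions, ["dt=2021-01-20"])
add_partitions_if_not_exists(add_partitions, ["dt=2021-01-20"])  # no error

try:
    add_partitions_if_not_exists(broken_add_partitions, ["dt=2021-01-21"])
    propagated = False
except OtherServerError:
    propagated = True
```

Catching `Exception` here instead would hide the second failure, which is the masking problem described above.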

> anyways, maybe separating this into two methods, add_partitions_if_not_exists (which throws an exception if the partition already exists) and add_or_replace_partitions, could make it more explicit

I'm not sure I fully understood your suggestion about these two methods.
If add_partitions_if_not_exists throws an exception, wouldn't the user of the client need to handle it?
As for add_or_replace_partitions, I'm confused about what you mean by replacing: I see only two operations on partitions, add or remove.

Contributor Author

Another option would be to call the native add_partitions and handle the AlreadyExistsException on the user side. That was one of my first options, but I decided to keep things cleaner for library users by encapsulating this on the client side.
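
For comparison, the user-side alternative might look roughly like the sketch below, where each caller suppresses the exception around the native call. `FakeClient` and the local `AlreadyExistsException` are stand-ins written for illustration; `contextlib.suppress` is from the Python standard library.

```python
from contextlib import suppress


class AlreadyExistsException(Exception):
    """Stand-in for the metastore's duplicate-partition error."""


class FakeClient:
    """Hypothetical client exposing only a native-style add_partitions."""

    def __init__(self):
        self.partitions = []

    def add_partitions(self, partitions):
        # Assumed behavior: reject the batch if any partition already exists.
        if any(p in self.partitions for p in partitions):
            raise AlreadyExistsException("partition already exists")
        self.partitions.extend(partitions)


client = FakeClient()

# Every call site has to remember this wrapper to get idempotent behavior.
for _ in range(2):
    with suppress(AlreadyExistsException):
        client.add_partitions(["dt=2021-01-22"])
```

This works, but the handling is repeated wherever partitions are added, which is the duplication the client-side encapsulation avoids.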

Contributor Author

@LucasMMota LucasMMota Jan 22, 2021

Done: 90f7b99

@LucasMMota LucasMMota force-pushed the lucasmmota/dae-63-handle-exception branch from 7e24910 to 90f7b99 Compare January 22, 2021 19:25
docs/source/getstarted.md (outdated, resolved)
hive_metastore_client/hive_metastore_client.py (outdated, resolved)

:param db_name: database name where the table is at
:param table_name: table name which the partitions belong to
:param partition_list: list of partitions to be added to the table
"""
if not partition_list:

❤️

Comment on lines +230 to +245
def test_add_partitions_to_table_with_invalid_partitions(
    self,
    mocked_add_partitions,
    mocked__format_partitions,
    mocked_get_table,
    hive_metastore_client,
):
    # assert
    with raises(ValueError):
        # act
        hive_metastore_client.add_partitions_if_not_exists(
            db_name=ANY, table_name=ANY, partition_list=[]
        )
    mocked_get_table.assert_not_called()
    mocked__format_partitions.assert_not_called()
    mocked_add_partitions.assert_not_called()

niceee
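
A standalone sketch of what the test above checks, using only the standard library: an empty partition_list raises ValueError before any metastore call is made. The free function and its `client` argument are simplifications for illustration, and the error message and `get_table` call shape are assumptions, not the library's actual code.

```python
from unittest.mock import MagicMock


def add_partitions_if_not_exists(client, db_name, table_name, partition_list):
    """Simplified stand-in for the client method, written as a free function."""
    # Guard first: fail fast before touching the metastore.
    if not partition_list:
        raise ValueError("partition_list is empty")
    client.get_table(db_name, table_name)  # hypothetical call shape
    client.add_partitions(partition_list)


client = MagicMock()

try:
    add_partitions_if_not_exists(client, "db", "tbl", [])
    raised = False
except ValueError:
    raised = True

# These checks run after the failing call returns control, so they execute.
client.get_table.assert_not_called()
client.add_partitions.assert_not_called()
```

The not-called assertions sit after the failing call rather than alongside it, because nothing placed after the raising statement in the same block would run.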

ribaldorafael
ribaldorafael previously approved these changes Jan 26, 2021
@ribaldorafael ribaldorafael left a comment

🏅

kenjihiraoka
kenjihiraoka previously approved these changes Jan 26, 2021
Contributor

@kenjihiraoka kenjihiraoka left a comment

lgtm

@LucasMMota LucasMMota dismissed stale reviews from kenjihiraoka and ribaldorafael via 74f975a January 27, 2021 13:55
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 0 (rating A)

Coverage: no information
Duplication: 0.0%

@LucasMMota LucasMMota merged commit ca775b8 into main Jan 27, 2021
@LucasMMota LucasMMota deleted the lucasmmota/dae-63-handle-exception branch January 27, 2021 15:18
Labels
enhancement New feature or request
5 participants