[SPARK-8138] [SQL] Improves error message when conflicting partition columns are found #6610
cc @rxin
Test build #34081 has finished for PR 6610 at commit
Let's create a JIRA for this.
Filed SPARK-8138 for this and updated the PR title.
@yhuai As discussed offline, we now give a more descriptive and helpful message with a list of all suspicious non-leaf partition directories.
Test build #34389 has finished for PR 6610 at commit
})
val distinctPartColNames = pathsWithPartitionValues.map(_._2.columnNames).distinct

def listConflictingPartitionColumns: String = {
Move this out, make it accept arguments in the form of collections, and write a unit test for this function.
This function is way too complicated to not have unit tests.
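A minimal sketch of what the requested refactoring could look like, not the code that was actually merged. It assumes the nested function is lifted into a standalone object (`PartitionConflicts` is a hypothetical name) and that each directory is passed in as a plain string paired with the partition column names inferred from it, so the helper can be unit tested without a file system. The message wording is also illustrative.

```scala
object PartitionConflicts {
  /**
   * Hypothetical standalone helper: takes (directory, inferred partition
   * column names) pairs and renders one section per distinct column list,
   * each followed by the directories that produced it.
   */
  def listConflictingPartitionColumns(
      pathsWithColumnNames: Seq[(String, Seq[String])]): String = {
    // Distinct column-name lists, in order of first appearance.
    val distinctColumnLists = pathsWithColumnNames.map(_._2).distinct

    // Column-name list -> directories from which that list was inferred.
    val pathsByColumns: Map[Seq[String], Seq[String]] =
      pathsWithColumnNames.groupBy(_._2).map {
        case (cols, pairs) => cols -> pairs.map(_._1)
      }

    // One block per distinct column list: a header line, then its directories.
    val sections = distinctColumnLists.map { cols =>
      val header = s"Partition column names: ${cols.mkString(", ")}"
      val dirs = pathsByColumns(cols).map(p => s"\t$p")
      (header +: dirs).mkString("\n")
    }

    ("Conflicting partition column names detected:" +: sections).mkString("\n\n")
  }
}
```

Because the helper only depends on collections, a unit test can feed it a small hand-written list of directories and assert on the resulting string.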
retest this please
Test build #35651 has finished for PR 6610 at commit
Test build #35656 has finished for PR 6610 at commit
Merging to master.
This PR improves the error message shown when conflicting partition column names are detected. Such conflicts can be particularly annoying and confusing when there are a large number of partitions and only a handful of them happen to contain unexpected temporary file(s). Now all suspicious directories are listed as below: