[SPARK-8379][SQL]avoid speculative tasks write to the same file #6833
@@ -197,7 +197,6 @@ case class InsertIntoHiveTable(
table.hiveQlTable.getPartCols().foreach { entry =>
orderedPartitionSpec.put(entry.getName, partitionSpec.get(entry.getName).getOrElse(""))
}
val partVals = MetaStoreUtils.getPvals(table.hiveQlTable.getPartCols, partitionSpec)
This code seems to be unused, so I removed it.
I think https://github.com/apache/spark/pull/5876/files#diff-d579db9a8f27e0bbef37720ab14ec3f6L203 should have removed this code. @marmbrus, right?
Yes, I think you are right.
@jeanlyn Yeah, this should be removed.
It seems the bug only exists for dynamic partitions in HiveContext. @jeanlyn, can you confirm that?
I also hit this issue when using dynamic partitions in HiveContext.
@chenghao-intel, I think it only affects dynamic partitions. Because
ok to test
Test build #35053 has finished for PR 6833 at commit
ok to test. Is this issue the same as the one reported in #6864? @liancheng
Test build #35171 has finished for PR 6833 at commit
@andrewor14 They are not the same. #6864 affects dynamic partitioning feature of external data sources, while this one is about dynamic partitions of Hive.
LGTM, thanks for fixing this! Merging to master and branch-1.4.
The issue link: [SPARK-8379](https://issues.apache.org/jira/browse/SPARK-8379)

Currently, when we insert data into a dynamic partition with speculative tasks, we get the following exception:

```
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): Lease mismatch on /tmp/hive-jeanlyn/hive_2015-06-15_15-20-44_734_8801220787219172413-1/-ext-10000/ds=2015-06-15/type=2/part-00301.lzo owned by DFSClient_attempt_201506031520_0011_m_000189_0_-1513487243_53 but is accessed by DFSClient_attempt_201506031520_0011_m_000042_0_-1275047721_57
```

This PR writes the data to a temporary directory when using dynamic partitions, so that speculative tasks do not write to the same file.

Author: jeanlyn <[email protected]>

Closes #6833 from jeanlyn/speculation and squashes the following commits:

64bbfab [jeanlyn] use FileOutputFormat.getTaskOutputPath to get the path
8860af0 [jeanlyn] remove the never using code
e19a3bd [jeanlyn] avoid speculative tasks write same file

(cherry picked from commit a1e3649)
Signed-off-by: Cheng Lian <[email protected]>
Issue link: SPARK-8379
Currently, when we insert data into a dynamic partition with speculative tasks, we get a LeaseExpiredException (lease mismatch) from HDFS, because two task attempts access the same output file.
This PR writes the data to a temporary directory when using dynamic partitions, so that speculative tasks do not write to the same file.
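The idea behind the fix can be illustrated with a minimal sketch. This is not the code from this PR: the commits mention `FileOutputFormat.getTaskOutputPath`, whereas the sketch below uses plain `java.nio.file` so it stays self-contained. The object name `SpeculativeWriteSketch`, its methods, the `_temporary` directory name, and the attempt IDs are all hypothetical and chosen only for illustration. Each task attempt writes under its own attempt-specific directory, and only the attempt that commits moves its file into the final partition directory.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical sketch: names and directory layout are assumptions, not Spark's API.
object SpeculativeWriteSketch {

  // Attempt-specific location: <output>/_temporary/<attemptId>/<partitionDir>/<fileName>
  def attemptPath(outputDir: Path, attemptId: String, partitionDir: String, fileName: String): Path =
    outputDir.resolve("_temporary").resolve(attemptId).resolve(partitionDir).resolve(fileName)

  // Final location shared by all attempts: <output>/<partitionDir>/<fileName>
  def finalPath(outputDir: Path, partitionDir: String, fileName: String): Path =
    outputDir.resolve(partitionDir).resolve(fileName)

  // Each attempt writes only to its own temporary file, so two attempts never share a file.
  def writeAttempt(outputDir: Path, attemptId: String, partitionDir: String,
                   fileName: String, data: String): Path = {
    val target = attemptPath(outputDir, attemptId, partitionDir, fileName)
    Files.createDirectories(target.getParent)
    Files.write(target, data.getBytes(StandardCharsets.UTF_8))
    target
  }

  // The committing attempt moves its file into the final partition directory.
  def commit(outputDir: Path, attemptId: String, partitionDir: String, fileName: String): Path = {
    val source = attemptPath(outputDir, attemptId, partitionDir, fileName)
    val destination = finalPath(outputDir, partitionDir, fileName)
    Files.createDirectories(destination.getParent)
    Files.move(source, destination, StandardCopyOption.REPLACE_EXISTING)
  }
}
```

With this layout, an original attempt and its speculative duplicate resolve to different paths while they are running, so neither can take over the other's file handle; the duplicate's leftover temporary file can simply be discarded after the commit.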