This is about the HDFS directory layout used when saving files to disk for a Hive table mapping, not about the data itself.
such as:
/user/hive/warehouse/model/db/tableA/a=1/b=3/part-0000
/user/hive/warehouse/model/db/tableA/a=2/b=3/part-0000
....
create external table tableA (
....
) partitioned by (a string, b string)
stored as textfile;
I wonder whether there is any plan for a dynamic partition write function, as it is a very common use case.
In Cascalog, I use the templatefields and sink-template keywords to control dynamic partitioning.
Currently, I work around it by converting the RDD to a DataFrame and using partitionBy on the writer.
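For reference, a minimal sketch of that DataFrame workaround, assuming the tableA layout above (the records, column names, and app name are made up for illustration):

```scala
import org.apache.spark.sql.SparkSession

object DynamicPartitionWrite {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dynamic-partition-write")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical RDD of (a, b, value) records.
    val rdd = spark.sparkContext.parallelize(Seq(
      ("1", "3", "row-one"),
      ("2", "3", "row-two")))

    rdd.toDF("a", "b", "value")
      .write
      .partitionBy("a", "b") // creates .../a=1/b=3/part-* style directories
      .mode("overwrite")
      .text("/user/hive/warehouse/model/db/tableA")

    spark.stop()
  }
}
```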
The RDD API does offer saveAsHadoopFile together with the MultipleTextOutputFormat class, but that class has to be extended by hand, which is inconvenient:
http://stackoverflow.com/questions/23995040/write-to-multiple-outputs-by-key-spark-one-spark-job
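A sketch of that RDD-level approach, along the lines of the linked Stack Overflow answer; the subclass name and the (a, b, value) records are illustrative:

```scala
import org.apache.hadoop.io.NullWritable
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

// Route each record into a subdirectory derived from its key; the key itself
// is dropped from the file contents so only the value is written.
class PartitionedTextOutputFormat extends MultipleTextOutputFormat[Any, Any] {
  override def generateFileNameForKeyValue(key: Any, value: Any, name: String): String =
    s"${key.toString}/$name" // e.g. key "a=1/b=3" -> a=1/b=3/part-00000

  override def generateActualKey(key: Any, value: Any): Any =
    NullWritable.get()
}

object MultipleOutputsExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("multiple-outputs"))

    // Hypothetical (a, b, value) records; the key encodes the partition path.
    sc.parallelize(Seq(("1", "3", "row-one"), ("2", "3", "row-two")))
      .map { case (a, b, v) => (s"a=$a/b=$b", v) }
      .saveAsHadoopFile(
        "/user/hive/warehouse/model/db/tableA",
        classOf[String], classOf[String],
        classOf[PartitionedTextOutputFormat])

    sc.stop()
  }
}
```

This works, but every project ends up writing the same boilerplate subclass, which is why built-in dynamic partition write support would help.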