My environment: Databricks platform: runtime 7.0 ML
Koalas: 1.0.1
I'm trying to write Parquet files from a Koalas DataFrame to S3 with partitions, but the partitions are not created (I tried with both a single partition column and multiple).
If I use the PySpark API, the partitions are created.
Code:
No partitions created:
df.to_parquet(path='s3://{bucket}/{Path_to_data}', mode='overwrite', compression='gzip', partition_cols=['year','month','day'])
Partitions created:
df.to_spark().write.mode('overwrite').partitionBy('year', 'month', 'day').parquet('s3://{bucket}/{Path_to_data}')
I would like to be able to use partitions in Koalas the same way I do in Spark.
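In the meantime, the PySpark call above can be wrapped in a small helper. This is only a minimal sketch of the workaround, assuming a Koalas DataFrame `df` with 'year', 'month' and 'day' columns, a placeholder bucket/path, and S3 access already configured on the cluster; the helper name is made up for illustration.

```python
import databricks.koalas as ks

def write_partitioned_parquet(kdf: ks.DataFrame, path: str, partition_cols):
    """Write a Koalas DataFrame as partitioned Parquet via the underlying Spark writer."""
    (kdf.to_spark()                     # drop down to the Spark DataFrame
        .write
        .mode('overwrite')
        .partitionBy(*partition_cols)   # one directory level per partition column
        .option('compression', 'gzip')
        .parquet(path))

# Placeholder path; this produces year=.../month=.../day=... directories under it.
write_partitioned_parquet(df, 's3://{bucket}/{Path_to_data}', ['year', 'month', 'day'])
```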
Refine Spark I/O to:
- Set `partitionBy` explicitly in `to_parquet`.
- Add `mode` and `partition_cols` to `to_csv` and `to_json`.
- Fix type hints to use `Optional`.
Resolves #1666.
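For illustration, a minimal sketch of how `to_parquet` could forward `partition_cols` to the Spark writer's `partitionBy`, as the first bullet describes. This is not the actual Koalas source: the method body, defaults, and the use of `to_spark()` here are assumptions made only to keep the example self-contained.

```python
from typing import List, Optional, Union

def to_parquet(
    self,
    path: str,
    mode: str = 'overwrite',
    partition_cols: Optional[Union[str, List[str]]] = None,
    compression: Optional[str] = None,
) -> None:
    """Write the DataFrame as Parquet, one directory level per partition column."""
    writer = self.to_spark().write.mode(mode)
    if partition_cols is not None:
        # Pass partition_cols explicitly to Spark's partitionBy so directories
        # like year=2020/month=7/day=1 are created under `path`.
        writer = writer.partitionBy(partition_cols)
    if compression is not None:
        writer = writer.option('compression', compression)
    writer.parquet(path)
```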