Releases: databricks/koalas
Version 0.5.0
We refined the packaging and published Koalas to conda-forge as well as PyPI. Koalas can now be installed with the conda package manager:
conda install koalas -c conda-forge
We also added the following features:
koalas:
- concat (#348)
koalas.DataFrame:
koalas.Series:
- to_json (#358)
- to_csv (#358)
- dtypes (#355)
- size (#356)
- to_excel (#361)
- iloc (#364)
- all (#359)
- any (#359)
- dt (#295, #372)
- describe (#375)
Along with the following improvements:
Version 0.4.0
Over the past week we improved the Koalas documentation and added new functionality. As of this release, all functions are documented. We also added the following features:
koalas:
koalas.DataFrame:
- merge (#264)
- to_json (#238)
- to_csv (#239)
- to_excel (#288)
- to_clipboard (#257)
- clip (#297)
- to_latex (#297)
koalas.Series:
- unique (#249)
- to_clipboard (#257)
- to_latex (#297)
- clip (#297)
- fillna (#317)
- is_unique (#325)
- sample (#327)
Along with the following improvements:
Version 0.3.0
We fixed a critical bug for Python 3.5 introduced in v0.2.0 (#241).
We have also added the following features:
koalas.DataFrame:
- isin
- to_dict
koalas.Series:
- isin
- to_dict
and improvements:
koalas.Series:
- __add__ and __radd__ now support string concatenation
koalas.groupby.GroupBy:
- agg() now preserves the group keys as indices
and a lot of code and documentation cleanups.
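The two improvements above can be illustrated with a small sketch. It uses plain pandas, whose API Koalas aims to mirror; the equivalent Koalas calls would go through `databricks.koalas` and need a running Spark session, which is not shown here. The column names and values are made up for illustration.

```python
import pandas as pd

# String concatenation through Series.__add__ and Series.__radd__.
names = pd.Series(["a", "b", "c"])
suffixed = names + "_x"   # dispatches to __add__
prefixed = "x_" + names   # dispatches to __radd__

# groupby(...).agg(...) keeps the group keys as the index of the result.
df = pd.DataFrame({"key": ["u", "u", "v"], "value": [1, 2, 3]})
totals = df.groupby("key").agg({"value": "sum"})
```

Here `totals` is indexed by the distinct values of `key` rather than by a fresh integer range, which is the behaviour the `agg()` change is meant to match.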
Version 0.2.0
We implemented a lot of major functionality in the past week. Here's a summary of what's new in release v0.2.0.
spark.DataFrame:
- to_koalas is monkey patched into Spark's DataFrame API when the koalas package is imported
koalas.DataFrame:
- count
- corr
- dtypes
- groupby
- sort_values now supports ascending, na_position, and inplace parameters
- to_numpy
- to_pandas (with toPandas as an alias for compatibility with Spark)
- to_string
- Allow direct literal assignment to create a new column
- Various stats functions now work with boolean type
- In notebooks or REPL, automatically display the content of the DataFrame, similar to pandas
koalas.Series:
- alias (as an alias for rename function)
- count
- groupby
- to_numpy
- to_pandas (with toPandas as an alias for compatibility with Spark)
- to_string
- fillna
- Various stats functions now work with boolean type
- In notebooks or REPL, automatically display the content of the Series, similar to pandas
We also significantly improved the project documentation.
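As a rough illustration of the new sort_values parameters and of direct literal assignment listed for koalas.DataFrame, here is a sketch in plain pandas, whose semantics Koalas aims to follow. It is not Koalas code, and the column names and values are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"score": [2.0, None, 1.0]})

# Descending order, with missing values placed first instead of last.
desc = df.sort_values("score", ascending=False, na_position="first")

# inplace=True reorders df itself (ascending, missing values last by default).
df.sort_values("score", inplace=True)

# Assigning a literal creates a new column holding that value in every row.
df["source"] = "batch-1"
```

Row labels travel with the rows, so the index of each result shows the order the rows ended up in.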
Last but not least, we have done some major refactoring of the codebase and its infrastructure to make it more amenable to future changes, for example:
- Now koalas.DataFrame wraps a Spark DataFrame, rather than directly monkey patching all methods.
- Doctests are enabled and can be run directly in PyCharm
- Mypy type hint linter is added
- Switched from nose to pytest for test infrastructure.
- Introduced utility methods to support older versions of pandas (#210).
- Code coverage report
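The design change from monkey patching to wrapping can be sketched in a few lines of plain Python. All names below (`EngineFrame`, `WrappedFrame`, `to_wrapped`) are hypothetical stand-ins and not the actual Spark or Koalas classes. The sketch only shows the general pattern: one conversion method is patched onto the engine class, in the spirit of to_koalas, while every other method lives on a separate wrapper.

```python
class EngineFrame:
    """Hypothetical stand-in for an engine-level DataFrame class."""

    def __init__(self, rows):
        self.rows = list(rows)

    def count(self):
        return len(self.rows)


class WrappedFrame:
    """Wrapper that holds an engine frame and delegates to it."""

    def __init__(self, engine_frame):
        self._engine = engine_frame

    def head(self, n=5):
        # pandas-style method implemented on the wrapper, not the engine class.
        return self._engine.rows[:n]

    def __len__(self):
        return self._engine.count()


def _to_wrapped(self):
    # The only method patched onto the engine class: a conversion entry point.
    return WrappedFrame(self)


EngineFrame.to_wrapped = _to_wrapped

wrapped = EngineFrame([1, 2, 3]).to_wrapped()
```

With this layout the engine class gains only the conversion method, so new pandas-style methods can be added to the wrapper without touching the engine class or risking name clashes with it.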
Version 0.1.0
We rewrote the internals of Koalas to make it more extensible for upcoming features. We also laid down the foundation for API reference docs in this release.
Version 0.0.6
This version significantly expands the number of functions available. It is still meant to be a technology preview, and users are encouraged to report any issues they encounter with their current pandas code.
Noteworthy features:
- indexing is now supported
- slicing and accessing columns is much improved
- most of the methods are accessible as stubs
- support for N/A (fillna, dropna, etc.) has been added
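For readers unfamiliar with the N/A helpers named above, this is what fillna and dropna do in plain pandas, the behaviour Koalas aims to reproduce. The example is pandas code, not Koalas code, and the values are invented.

```python
import pandas as pd

s = pd.Series([1.0, None, 3.0])

filled = s.fillna(0.0)   # replace missing values with a constant
dropped = s.dropna()     # discard missing values, keeping the original labels
```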
We thank all the contributors who have contributed to this release.
Version 0.0.5
This is the initial release outside Databricks.
This release is meant to be a technology preview. See the README.md file for more information.