Release Version 0.2.0 · databricks/koalas

We have implemented a lot of major functionality in the past week. Here's a summary of what's new in release v0.2.0.

spark.DataFrame:

to_koalas is monkey patched into Spark's DataFrame API when the koalas package is imported
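
For readers unfamiliar with the term, monkey patching here means attaching a new method to an existing class at import time. The sketch below illustrates the idea only: SparkDataFrame and KoalasFrame are hypothetical stand-ins, not the real Spark or koalas classes, and the actual koalas code may differ.

```python
# Hypothetical stand-ins: neither class is the real Spark or koalas class.
class SparkDataFrame:
    """Stand-in for a third-party class that we do not control."""

    def __init__(self, rows):
        self.rows = rows


class KoalasFrame:
    """Stand-in for the pandas-like frame that to_koalas returns."""

    def __init__(self, sdf):
        self._sdf = sdf


def _to_koalas(self):
    # `self` is the SparkDataFrame instance the method is called on.
    return KoalasFrame(self)


# The patch itself: this runs once, when the module is imported, and adds
# the method to the existing class without subclassing or editing it.
SparkDataFrame.to_koalas = _to_koalas

sdf = SparkDataFrame([(1, "a"), (2, "b")])
kdf = sdf.to_koalas()
```

After the patch, every instance of the patched class gains the method, including instances created before the import.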

koalas.DataFrame:

count
corr
dtypes
groupby
sort_values now supports ascending, na_position, and inplace parameters
to_numpy
to_pandas (with toPandas as an alias for compatibility with Spark)
to_string
Allow direct literal assignment to create a new column
Various stats functions now work with boolean type
In notebooks or REPL, automatically display the content of the DataFrame, similar to pandas
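
As a reference for the new sort_values parameters, the snippet below shows how they behave in pandas itself. It assumes that koalas follows the same semantics; this release note does not spell them out, so treat the koalas behavior as an assumption and check the project documentation.

```python
import pandas as pd

pdf = pd.DataFrame({"x": [2.0, None, 1.0]})

# Descending order, with missing values placed before all other rows.
nulls_first = pdf.sort_values(by="x", ascending=False, na_position="first")

# inplace=True reorders pdf itself and returns None.
result = pdf.sort_values(by="x", inplace=True)
```

With the default na_position="last", the in-place call leaves the row holding the missing value at the end of pdf.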

koalas.Series:

alias (as an alias for rename function)
count
groupby
to_numpy
to_pandas (with toPandas as an alias for compatibility with Spark)
to_string
fillna
Various stats functions now work with boolean type
In notebooks or REPL, automatically display the content of the Series, similar to pandas

Significantly improved documentation of the project.

Last but not least, we have done some major refactoring of the codebase and its infrastructure to make it more amenable to future changes, for example:

koalas.DataFrame now wraps around a Spark DataFrame, rather than directly monkey patching all methods.
Doctests are enabled and can be run directly in PyCharm.
Added the Mypy type hint linter.
Switched from nose to pytest for test infrastructure.
Introduced utility methods to support older versions of pandas. #210
Added a code coverage report.
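
The move from monkey patching to wrapping can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual koalas implementation: FakeSparkFrame and WrappedFrame are hypothetical names, and the real class exposes far more methods.

```python
class FakeSparkFrame:
    """Hypothetical stand-in for a Spark DataFrame."""

    def __init__(self, rows):
        self.rows = list(rows)

    def count(self):
        return len(self.rows)

    def limit(self, n):
        return FakeSparkFrame(self.rows[:n])


class WrappedFrame:
    """Pandas-like facade that holds the underlying frame as an attribute."""

    def __init__(self, sdf):
        self._sdf = sdf  # the wrapped object is left unmodified

    def __len__(self):
        # Delegate to the wrapped frame rather than patching its class.
        return self._sdf.count()

    def head(self, n=5):
        # pandas-style method implemented on top of the wrapped API.
        return WrappedFrame(self._sdf.limit(n))

    def __repr__(self):
        # A REPL or notebook calls this to display the object.
        return "WrappedFrame(%d rows)" % len(self)


frame = WrappedFrame(FakeSparkFrame([1, 2, 3]))
first_two = frame.head(2)
```

Because the pandas-style methods live on the wrapper, the wrapped class keeps its original API, and the two sets of method names cannot collide.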
