Releases · databricks/koalas

22 May 08:39

ueshin

v0.5.0

ac175a3

Version 0.5.0

We refined the packaging and now publish Koalas to conda-forge as well as PyPI, so it can be installed with the conda package manager:

conda install koalas -c conda-forge

We also added the following features:

koalas:

concat (#348)

koalas.DataFrame:

astype (#349)
to_records (#298)
size (#356)
iloc (#364)
describe (#375)

koalas.Series:

to_json (#358)
to_csv (#358)
dtypes (#355)
size (#356)
to_excel (#361)
iloc (#364)
all (#359)
any (#359)
dt (#295, #372)
describe (#375)

Along with the following improvements:

Explicitly marked functions that are deprecated in pandas; we won't support them unless there is a special reason. (#342)
Introduced Index and MultiIndex classes corresponding to their pandas counterparts, instead of reusing Series. (#341)
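
As a rough illustration of what "explicitly marking" a pandas-deprecated function can look like, here is a minimal sketch in plain Python. It is not taken from the Koalas source: `DeprecatedInPandasError`, `deprecated_stub`, and `FrameLike` are hypothetical names, and `as_matrix` is used only as an example of a method that pandas has deprecated.

```python
from typing import Callable


class DeprecatedInPandasError(NotImplementedError):
    """Hypothetical error type for methods that pandas has deprecated."""


def deprecated_stub(class_name: str, method_name: str) -> Callable[..., None]:
    """Build a method that fails loudly with an explanatory message."""

    def stub(self, *args, **kwargs) -> None:
        raise DeprecatedInPandasError(
            f"{class_name}.{method_name}() is deprecated in pandas and is not supported."
        )

    stub.__name__ = method_name
    return stub


class FrameLike:
    """Hypothetical DataFrame-like class used only for this illustration."""

    # `as_matrix` stands in for any method that pandas has deprecated.
    as_matrix = deprecated_stub("FrameLike", "as_matrix")


# Calling the stub raises immediately instead of silently misbehaving.
try:
    FrameLike().as_matrix()
    message = ""
except DeprecatedInPandasError as exc:
    message = str(exc)
```

The point of a stub like this is that users porting pandas code get a clear message at the call site, rather than an `AttributeError` that looks like an oversight.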

15 May 07:17

rxin

v0.4.0

acdadfb

Version 0.4.0

Over the past week we rapidly improved the documentation and added new functionality. As of this release, all functions are documented. We also added the following features:

koalas:

range (#254) - for generating a distributed sequence of data
sql (#256) - for running SQL queries

koalas.DataFrame:

merge (#264)
to_json (#238)
to_csv (#239)
to_excel (#288)
to_clipboard (#257)
clip (#297)
to_latex (#297)

koalas.Series:

unique (#249)
to_clipboard (#257)
to_latex (#297)
clip (#297)
fillna (#317)
is_unique (#325)
sample (#327)

Along with the following improvements:

Design Principles and Contribution Guide (#246, #255)
DataFrame.drop now supports columns parameter (#253)
repr and repr_html improvements (#258) - only the top 1000 are shown when the number of values or rows in a DataFrame or Series exceeds 1000.
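
The display cap described in the last item can be sketched in plain Python. This is only an illustration of the idea, not the Koalas implementation: `TruncatingColumn` is a hypothetical class, and the wording of the footer line is an assumption.

```python
from typing import List, Sequence

MAX_DISPLAY_ROWS = 1000  # cap taken from the release note above


class TruncatingColumn:
    """Hypothetical stand-in for a Series-like object; not a Koalas class."""

    def __init__(self, values: Sequence[object]) -> None:
        self._values = list(values)

    def __len__(self) -> int:
        return len(self._values)

    def __repr__(self) -> str:
        # Render at most MAX_DISPLAY_ROWS values, one "index    value" pair per line.
        shown = self._values[:MAX_DISPLAY_ROWS]
        lines: List[str] = [f"{i}    {v!r}" for i, v in enumerate(shown)]
        if len(self._values) > MAX_DISPLAY_ROWS:
            # Footer text is an assumption; it only signals that output was cut off.
            lines.append(f"Showing only the first {MAX_DISPLAY_ROWS}")
        return "\n".join(lines)


short_text = repr(TruncatingColumn(range(3)))
long_text = repr(TruncatingColumn(range(5000)))
```

For a distributed dataset, a cap like this keeps `repr` from collecting an unbounded number of rows just to print them in a notebook or REPL.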

07 May 04:27

ueshin

v0.3.0

8cac02b

Version 0.3.0

We fixed a critical bug affecting Python 3.5 that was introduced in v0.2.0 (#241).

We also added the following features:

koalas.DataFrame:

isin
to_dict

koalas.Series:

isin
to_dict

Along with the following improvements:

koalas.Series:

__add__ and __radd__ now support string concatenation

koalas.groupby.GroupBy:

agg() now preserves the group keys as indices

This release also includes many code and documentation cleanups.
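
To show why both `__add__` and `__radd__` matter for string concatenation on a Series, here is a small sketch in plain Python. `ElementwiseSeries` is a hypothetical class written for this illustration and does not reflect how Koalas implements the operators on top of Spark.

```python
from typing import List


class ElementwiseSeries:
    """Hypothetical Series-like container of strings; not the Koalas implementation."""

    def __init__(self, values: List[str]) -> None:
        self.values = list(values)

    def __add__(self, other: str) -> "ElementwiseSeries":
        # series + "text": append the string to every element.
        return ElementwiseSeries([v + other for v in self.values])

    def __radd__(self, other: str) -> "ElementwiseSeries":
        # "text" + series: str cannot add this type, so Python dispatches here.
        return ElementwiseSeries([other + v for v in self.values])


names = ElementwiseSeries(["ada", "bob"])
suffixed = names + "_x"    # handled by __add__
prefixed = "id_" + names   # handled by __radd__
```

Without `__radd__`, the second expression would raise a `TypeError`, because the left operand is a plain `str`.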

02 May 00:56

rxin

v0.2.0

720759f

Version 0.2.0

We implemented a lot of major functionality in the past week. Here's a summary of what's new in release v0.2.0.

spark.DataFrame:

to_koalas is monkey patched into Spark's DataFrame API when the koalas package is imported

koalas.DataFrame:

count
corr
dtypes
groupby
sort_values now supports ascending, na_position, and inplace parameters
to_numpy
to_pandas (with toPandas as an alias for compatibility with Spark)
to_string
Allow direct literal assignment to create a new column
Various stats functions now work with boolean type
In notebooks or REPL, automatically display the content of the DataFrame, similar to pandas

koalas.Series:

alias (as an alias for rename function)
count
groupby
to_numpy
to_pandas (with toPandas as an alias for compatibility with Spark)
to_string
fillna
Various stats functions now work with boolean type
In notebooks or REPL, automatically display the content of the Series, similar to pandas

Significantly improved documentation of the project.

Last but not least, we did some major refactoring of the codebase and its infrastructure to make it more amenable to future changes, for example:

Now koalas.DataFrame wraps around a Spark DataFrame, rather than directly monkey patching all methods.
Doctests are enabled and can be run directly in PyCharm
Mypy type hint linter is added
Switched from nose to pytest for test infrastructure.
Introduced utility methods to support older versions of pandas. (#210)
Added a code coverage report
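
The difference between wrapping a Spark DataFrame and monkey patching every method onto it can be sketched in plain Python. Everything below is hypothetical and runs without Spark: `SparkLikeFrame` stands in for a Spark DataFrame, `WrappedFrame` for a pandas-style wrapper, and `to_wrapped` for a single patched-in conversion method in the spirit of `to_koalas`.

```python
from typing import Dict, List, Tuple


class SparkLikeFrame:
    """Hypothetical stand-in for a Spark DataFrame (not the pyspark class)."""

    def __init__(self, columns: Dict[str, List[float]]) -> None:
        self.columns = columns

    def select(self, *names: str) -> "SparkLikeFrame":
        return SparkLikeFrame({n: self.columns[n] for n in names})


class WrappedFrame:
    """Hypothetical pandas-style facade that holds the engine frame instead of patching it."""

    def __init__(self, sdf: SparkLikeFrame) -> None:
        self._sdf = sdf  # all work is delegated to the wrapped object

    def __getitem__(self, names: List[str]) -> "WrappedFrame":
        # pandas-style column selection translated into an engine call.
        return WrappedFrame(self._sdf.select(*names))

    @property
    def shape(self) -> Tuple[int, int]:
        cols = self._sdf.columns
        n_rows = len(next(iter(cols.values()))) if cols else 0
        return (n_rows, len(cols))


def _to_wrapped(self: SparkLikeFrame) -> WrappedFrame:
    return WrappedFrame(self)


# The only monkey patch: one conversion method added to the engine class.
SparkLikeFrame.to_wrapped = _to_wrapped

sdf = SparkLikeFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]})
wdf = sdf.to_wrapped()
subset = wdf[["a", "b"]]
```

With this layout the pandas-style surface lives entirely on the wrapper, so the engine class gains only the conversion method and its own API stays untouched.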

23 Apr 17:03

rxin

v0.1.0

d2bb9ed

Version 0.1.0

We rewrote the internals of Koalas to make it more extensible for upcoming features. We also laid down the foundation for API reference docs in this release.

19 Apr 17:55

thunterdb

v0.0.6

178bc0c

Version 0.0.6 Pre-release

This version significantly expands the number of functions available. It is still meant to be a technology preview, and users are encouraged to report any issues they encounter with their current pandas code.

Noteworthy features:

indexing is now supported
slicing and accessing columns are much improved
most of the methods are accessible as stubs
support for N/A (fillna, dropna, etc.) has been added

We thank everyone who contributed to this release.

26 Mar 13:50

thunterdb

v0.0.5

ecb2040

Version 0.0.5 Pre-release

This is the initial release outside Databricks.

This release is meant to be a technology preview. See the README.md file for more information.
