Fix median to support multi-index columns. #995

Merged: 3 commits, Nov 5, 2019

@ueshin (Collaborator) commented Nov 1, 2019

This PR fixes _Frame.median() to:

  • follow _scol for Series.
  • support multi-index columns.
  • avoid attaching default index for DataFrame.
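
For reference, the multi-index case can be illustrated with plain pandas, whose behavior Koalas presumably aims to mirror here. This is only a sketch: the column labels and values are made up, and it uses pandas rather than Koalas so that it runs without a Spark session.

```python
import pandas as pd

# Hypothetical two-level column labels, chosen only for illustration.
columns = pd.MultiIndex.from_tuples([("x", "a"), ("x", "b"), ("y", "c")])
pdf = pd.DataFrame(
    [[1, 10, 100], [2, 20, 200], [3, 30, 300]],
    columns=columns,
)

# median() reduces each column; the result is a Series whose index
# keeps the full column tuples rather than flattened names.
medians = pdf.median()
print(medians)
```

The point of interest is the index of the result: each entry stays keyed by its full column tuple, such as ("x", "a").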

@codecov-io commented Nov 1, 2019

Codecov Report

Merging #995 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master     #995   +/-   ##
=======================================
  Coverage   94.84%   94.84%           
=======================================
  Files          34       34           
  Lines        6519     6519           
=======================================
  Hits         6183     6183           
  Misses        336      336

Impacted Files                 Coverage Δ
databricks/koalas/generic.py   96.17% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b161428...04d8618.

@softagram-bot

Softagram Impact Report for pull/995 (head commit: e1d0b07)

⭐ Change Overview

[Figure: changed files, dependency changes, and their impact]

💡 Insights

  • Co-change Alert: You modified test_stats.py. Often frame.py (databricks/koalas) is modified at the same time.

sdf = sdf.select([median(col).alias(col) for col in kdf._internal.data_columns])

# Attach a dummy column for index to avoid default index.
sdf = sdf.withColumn('__DUMMY__', F.monotonically_increasing_id())
Member

Hmm, maybe we should just document that __...__ columns are reserved in Koalas. In a separate PR, of course.
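
To make the suggested convention concrete, here is a minimal sketch of what "reserved" could mean in practice. Everything in it is hypothetical: the pattern, `is_reserved_column_name`, and `check_user_columns` are illustrative names only, not part of Koalas.

```python
import re

# Hypothetical pattern: any name wrapped in double underscores,
# such as the '__DUMMY__' column used in the diff above.
_RESERVED_PATTERN = re.compile(r"^__.+__$")


def is_reserved_column_name(name: str) -> bool:
    """Return True if the name follows the __...__ convention."""
    return bool(_RESERVED_PATTERN.match(name))


def check_user_columns(names):
    """Raise ValueError if any user-supplied column name looks reserved."""
    clashes = [n for n in names if is_reserved_column_name(n)]
    if clashes:
        raise ValueError("Reserved column names: {}".format(clashes))
    return list(names)
```

Documenting the convention, as proposed, would work without any such check; the sketch only shows which names the rule would cover.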

@HyukjinKwon HyukjinKwon merged commit b3add38 into databricks:master Nov 5, 2019
@ueshin ueshin deleted the median branch November 5, 2019 19:25