Implement Series.combine_first #1290

Merged: 14 commits, Mar 25, 2020

itholic
Contributor

@itholic itholic commented Feb 18, 2020

Implement Series.combine_first

  • Basic example
>>> import numpy as np
>>> import databricks.koalas as ks
>>> s1 = ks.Series([1, np.nan])
>>> s2 = ks.Series([3, 4])
>>> s1.combine_first(s2)
0    1.0
1    4.0
Name: 0, dtype: float64
  • MultiIndex
>>> import pandas as pd
>>> midx1 = pd.MultiIndex([['lama', 'cow', 'falcon', 'koala'],
...                        ['speed', 'weight', 'length', 'power']],
...                       [[0, 3, 1, 1, 1, 2, 2, 2],
...                        [0, 2, 0, 3, 2, 0, 1, 3]])
>>> midx2 = pd.MultiIndex([['lama', 'cow', 'falcon'],
...                        ['speed', 'weight', 'length']],
...                       [[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                        [0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> kser1 = ks.Series([45, 200, 1.2, 30, 250, 1.5, 320, 1], index=midx1)
>>> kser2 = ks.Series([-45, 200, -1.2, 30, -250, 1.5, 320, 1, -0.3], index=midx2)
>>> kser1
lama    speed      45.0
koala   length    200.0
cow     speed       1.2
        power      30.0
        length    250.0
falcon  speed       1.5
        weight    320.0
        power       1.0
Name: 0, dtype: float64
>>> kser2
lama    speed     -45.0
        weight    200.0
        length     -1.2
cow     speed      30.0
        weight   -250.0
        length      1.5
falcon  speed     320.0
        weight      1.0
        length     -0.3
Name: 0, dtype: float64

>>> kser1.combine_first(kser2)
cow     length    250.0
        power      30.0
        speed       1.2
        weight   -250.0
falcon  length     -0.3
        power       1.0
        speed       1.5
        weight    320.0
koala   length    200.0
lama    length     -1.2
        speed      45.0
        weight    200.0
Name: 0, dtype: float64
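
For anyone who wants the semantics of the MultiIndex example spelled out, here is a minimal pure-Python sketch. It models each series as a dict keyed by index tuples and is only an illustration of the expected behavior, not the Koalas implementation; `combine_first_model` is a hypothetical helper introduced for this example.

```python
import math


def combine_first_model(left: dict, right: dict) -> dict:
    """Hypothetical pure-Python model of combine_first semantics (not the Koalas code)."""

    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    result = {}
    # The result index is the sorted union of both indexes.
    for key in sorted(set(left) | set(right)):
        value = left.get(key, math.nan)
        # Prefer the calling series; fall back to the other one when it has no value.
        result[key] = right.get(key, math.nan) if is_missing(value) else value
    return result


kser1 = {
    ("lama", "speed"): 45.0, ("koala", "length"): 200.0,
    ("cow", "speed"): 1.2, ("cow", "power"): 30.0, ("cow", "length"): 250.0,
    ("falcon", "speed"): 1.5, ("falcon", "weight"): 320.0, ("falcon", "power"): 1.0,
}
kser2 = {
    ("lama", "speed"): -45.0, ("lama", "weight"): 200.0, ("lama", "length"): -1.2,
    ("cow", "speed"): 30.0, ("cow", "weight"): -250.0, ("cow", "length"): 1.5,
    ("falcon", "speed"): 320.0, ("falcon", "weight"): 1.0, ("falcon", "length"): -0.3,
}

combined = combine_first_model(kser1, kser2)
for key, value in combined.items():
    print(key, value)
```

The printed rows should come out in the same order and with the same values as the `kser1.combine_first(kser2)` output above: values present in `kser1` win, and `kser2` only contributes the index entries that `kser1` lacks.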

@itholic
Contributor Author

itholic commented Feb 18, 2020

I considered putting Series.combine and Series.combine_first in a single PR, but their implementations turned out to be more different than I expected, so I separated them.
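
To make the difference at the API level concrete: `combine` takes a user-supplied function that decides every output value, while `combine_first` takes no function and only fills in missing values. The sketch below models both in plain Python with dicts standing in for series. The helpers `combine_model` and `combine_first_model` are hypothetical and say nothing about how either method is implemented in Koalas.

```python
import math


def combine_model(left, right, func, fill_value=math.nan):
    """Hypothetical model of Series.combine: func decides every output value."""
    keys = sorted(set(left) | set(right))
    # A key missing from one side is replaced by fill_value before func is called.
    return {k: func(left.get(k, fill_value), right.get(k, fill_value)) for k in keys}


def combine_first_model(left, right):
    """Hypothetical model of Series.combine_first: no func, only fill the gaps."""
    result = {}
    for k in sorted(set(left) | set(right)):
        value = left.get(k, math.nan)
        missing = isinstance(value, float) and math.isnan(value)
        result[k] = right.get(k, math.nan) if missing else value
    return result


s1 = {"falcon": 330.0, "eagle": 160.0}
s2 = {"falcon": 345.0, "eagle": 200.0, "duck": 30.0}

highest = combine_model(s1, s2, max, fill_value=0)
patched = combine_first_model(s1, s2)
print(highest)
print(patched)
```

With `max` as the function, `combine_model` replaces both shared values with the larger one, whereas `combine_first_model` leaves the values from `s1` untouched and only adds the `duck` entry that `s1` lacks.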

@codecov-io

codecov-io commented Mar 1, 2020

Codecov Report

Merging #1290 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1290      +/-   ##
==========================================
+ Coverage   95.25%   95.26%   +0.01%     
==========================================
  Files          34       34              
  Lines        7541     7559      +18     
==========================================
+ Hits         7183     7201      +18     
  Misses        358      358              
Impacted Files                         Coverage Δ
databricks/koalas/missing/series.py    100.00% <ø> (ø)
databricks/koalas/series.py            96.86% <100.00%> (+0.07%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 45c325b...c4fb5d0.

@HyukjinKwon
Member

I think you should rebase and sync with the current master, @itholic.

@HyukjinKwon
Member

Looks fine otherwise.

@HyukjinKwon HyukjinKwon merged commit d4012b6 into databricks:master Mar 25, 2020
@itholic itholic deleted the s_combine_first branch March 25, 2020 13:57
3 participants