Fix reset_index with the default index is "distributed-sequence". #1193

ueshin · 2020-01-14T22:29:12Z

No description provided.

ueshin · 2020-01-14T22:30:12Z

databricks/koalas/frame.py

+            index_map=index_map,
+            column_index=column_index,
+            column_scols=([scol_for(sdf, name_like_string(name)) for _, name in new_index_map]
+                          + [scol_for(sdf, col) for col in self._internal.data_columns]))


Essentially this line is the fix.
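
To illustrate the ordering that the hunk above produces, here is a minimal pure-Python sketch. It assumes `new_index_map` is a list of `(column_name, name)` pairs where `name` is a tuple of labels, as the loop `for _, name in new_index_map` suggests. `label_to_string` and `reset_index_column_names` are hypothetical stand-ins written for this illustration, not Koalas functions, and the real `name_like_string` and `scol_for` may behave differently.

```python
from typing import List, Sequence, Tuple

IndexMapEntry = Tuple[str, Tuple[str, ...]]


def label_to_string(name: Tuple[str, ...]) -> str:
    """Hypothetical stand-in for name_like_string: flatten a label tuple to one string."""
    if len(name) == 1:
        return str(name[0])
    return "(" + ", ".join(str(part) for part in name) + ")"


def reset_index_column_names(
    new_index_map: Sequence[IndexMapEntry],
    data_columns: Sequence[str],
) -> List[str]:
    """Mirror the list concatenation in the diff: former index levels first, then data columns."""
    index_names = [label_to_string(name) for _, name in new_index_map]
    return index_names + list(data_columns)


# Assumed example values; the internal column name is illustrative only.
example_index_map = [("__index_level_0__", ("index",))]
print(reset_index_column_names(example_index_map, ["a", "b"]))
```

In the actual change each name is resolved against `sdf` through `scol_for`, so the resulting columns are looked up by name on the new Spark DataFrame rather than reused from an earlier one.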

softagram-bot · 2020-01-14T22:30:48Z

Softagram Impact Report for pull/1193 (head commit: `ae203c8`)

⚠️ Copy paste found

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 1312, 1332:

                       pdf.replace([0, 1, 2, 3, 5, 6], 4))

        self.assert_eq(kdf.replace([0, 1, 2, 3, 5, 6], [6, 5, 4, 3, 2, 1]),
     ...(truncated 253 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 1537, 1606:

                                  "bar", "bar", "bar", "bar"],
                            "B": ["one", "one", "one", "two", "two",...(truncated 486 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 2149, 2198:


        self.assert_eq(kdf.filter(like='b', axis='index'), pdf.filter(like='b', axis='index'))
        self.assert_eq(kdf.filter(like=...(truncated 312 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 606, 653:

        self.assert_eq(pdf.fillna(method='ffill'), kdf.fillna(method='ffill'))
        self.assert_eq(pdf.fillna(method='ffill', limit=2...(truncated 215 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 523, 2116, 2125:

        pdf = pd.DataFrame({'x': [np.nan, 2, 3, 4, np.nan, 6],
                            'y': [1, 2, np.nan, 4, np.nan, np.nan],
                            'z': [1, 2, 3,...(truncated 159 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 523, 596, 2116, 2125:

        pdf = pd.DataFrame({'x': [np.nan, 2, 3, 4, np.nan, 6],
                            'y': [1, 2, np.nan, 4, np.nan, np.nan],
                            'z': [1, 2, 3,...(truncated 159 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 1910, 1939:

        pdf = pd.DataFrame({
            'col1': [False, False, False],
            'col2': [True, False, False],
            'col3': [0, 0, 1],
            'col4': [0, 1, 2],
...(truncated 200 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 80, 147:

        pdf = pd.DataFrame({
            ('x', 'a', '1'): [1, 2, 3],
            ('x', 'b', '2'): [4, 5, 6],
            ('y.z', 'c.d', '3'): [7, 8, 9]...(truncated 114 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 1407, 1568:

        pdf = pd.DataFrame({'a': [4, 2, 3, 4, 8, 6],
                            'b': [1, 2, 2, 4, 2, 4],
                            'e': [1, 2, 2, 4, 2, 4],
  ...(truncated 137 chars)

ℹ️ test_dataframe.py: Copy paste fragment inside the same file on lines 1682, 1700, 1718:

        arrays = [np.array(['A', 'A', 'B', 'B']),
                  np.array(['one', 'two', 'one', 'two'])]
        pdf = pd.DataFrame(np.random.randn(3, ...(truncated 153 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 306, 328:

        kser = ks.from_pandas(pser)

        self.assert_eq(kser.value_counts(normalize=True),
                       pser.value_counts(normaliz...(truncated 1204 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 328, 350:

        kser = ks.from_pandas(pser)

        self.assert_eq(kser.value_counts(normalize=True),
                    ...(truncated 1114 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 328, 350, 374:

        kser = ks.from_pandas(pser)

        self.assert_eq(kser.value_counts(normalize=True),
                    ...(truncated 1114 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 306, 350, 374:

        kser = ks.from_pandas(pser)

        self.assert_eq(kser.value_counts(normalize=True),
                       pser.value_counts(normaliz...(truncated 1085 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 255, 307, 329, 351, 375:


        self.assert_eq(kser.value_counts(normalize=True),
                       pser.value_counts(normalize=True), almost=True)
        self.assert_eq(k...(truncated 1039 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 294, 316, 338:


        self.assert_eq(kser.index.value_counts(normalize=True),
                       pser.index.value_counts(no...(truncated 595 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 264, 294, 360, 384:


        self.assert_eq(kser.index.value_counts(normalize=True),
                       pser.index.value_counts(nor...(truncated 505 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 439, 456:

                     pd.Series([True, False], name='x'),
                     pd.Series([0, 1], name='x'),
                     pd.Series([1, 2,...(truncated 330 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 924, 1065:

        midx = pd.MultiIndex([['lama', 'cow', 'falcon'],
                              ['speed', 'weight', 'length']],
                             [[0, 0, 0, 1, 1, 1, 2, 2, 2]...(truncated 280 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 925, 1066, 1102:

                              ['speed', 'weight', 'length']],
                             [[0, 0, 0, 1, 1, 1, 2, 2, 2],
                      ...(truncated 256 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 896, 1099:


        # For MultiIndex
        midx = pd.MultiIndex([['lama', 'cow', 'falcon'],
                              ['speed', 'weight', 'length']],
                             [[...(truncated 167 chars)

ℹ️ test_series.py: Copy paste fragment inside the same file on lines 899, 925, 1066:

                              ['speed', 'weight', 'length']],
                             [[0, 0, 0, 1, 1, 1, 2, 2, 2],
                      ...(truncated 117 chars)

ℹ️ frame.py: Copy paste fragment on line 5755 shared with ../namespace.py:

              on: Union[str, List[str], Tuple[str, ...], List[Tuple[str, ...]]] = None,
              left_on: Union[str, List[str], Tuple[s...(truncated 273 chars)

ℹ️ frame.py: Copy paste fragment inside the same file on lines 7257, 7340:


        # TODO: there is a similar logic to transpose in, for instance,
        #  DataFrame.any, Series.quantile. Maybe ...(truncated 1065 chars)

ℹ️ frame.py: Copy paste fragment inside the same file on lines 4916, 4938:

            sdf = self._sdf.select(
                self._internal.index_scols +
                [self._internal.scol_for(idx...(truncated 505 chars)

Since you are already editing these files, this may be a good time to pay back some technical debt.

💡 Insights

Co-change Alert: You modified frame.py. Often series.py (databricks/koalas) is modified at the same time.
Co-change Alert: You modified test_series.py. Often series.py (databricks/koalas) is modified at the same time.

codecov-io · 2020-01-14T23:16:53Z

Codecov Report

Merging #1193 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1193      +/-   ##
==========================================
- Coverage   95.22%   95.22%   -0.01%     
==========================================
  Files          35       35              
  Lines        7141     7138       -3     
==========================================
- Hits         6800     6797       -3     
  Misses        341      341
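
The table shows 95.22% on both sides yet reports a decrease. A quick check of the arithmetic, assuming coverage is simply hits divided by lines (an assumption about how the figures relate, not a statement of Codecov's exact method), explains why:

```python
# Hits and lines copied from the coverage diff above.
hits_master, lines_master = 6800, 7141
hits_pr, lines_pr = 6797, 7138

# Assumed formula: coverage percentage = hits / lines * 100.
coverage_master = hits_master / lines_master * 100
coverage_pr = hits_pr / lines_pr * 100
drop = coverage_master - coverage_pr

print(f"master: {coverage_master:.4f}%")
print(f"#1193:  {coverage_pr:.4f}%")
print(f"drop:   {drop:.4f} percentage points")
```

Both values round to 95.22%, and the difference is roughly 0.002 percentage points, which is consistent with the reported "decrease coverage by <.01%".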

| Impacted Files | Coverage Δ | |
|---|---|---|
| databricks/koalas/frame.py | `97% <100%> (-0.01%)` | ⬇️ |
| databricks/koalas/internal.py | `95.72% <0%> (ø)` | ⬆️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 1e1da27...ae203c8.

HyukjinKwon

Nice, LGTM

Fix reset_index with the default index is "distributed-sequence".

ae203c8

ueshin requested a review from HyukjinKwon January 14, 2020 22:29

HyukjinKwon approved these changes Jan 15, 2020

HyukjinKwon merged commit f94ef62 into databricks:master Jan 15, 2020

ueshin deleted the reset_index branch January 15, 2020 23:35
