[Go][Parquet] File writer only tracks the number of rows written in the last row group #38516

Closed
tschaub opened this issue Oct 30, 2023 · 0 comments · Fixed by #38517
tschaub commented Oct 30, 2023

### Describe the bug, including details regarding any error messages, version, and platform.

The `file.Writer` currently increments its `nrows` field in its `Close()` method:

    if fw.rowGroupWriter != nil {
        fw.nrows += fw.rowGroupWriter.nrows
        fw.rowGroupWriter.Close()
    }

This accounts only for rows written by the last row group writer. When multiple row groups are appended, rows from the earlier groups are never added to `nrows`, so the `NumRows()` method does not report the total number of rows written.

### Component(s)

Go, Parquet

zeroshade pushed a commit that referenced this issue Nov 13, 2023
…ending a new row group (#38517)

### Rationale for this change

This makes it so the `NumRows` method on the `file.Writer` reports the total number of rows written across multiple row groups.

### Are these changes tested?

A regression test is added that asserts that the total number of rows written matches expectations.

* Closes: #38516

Authored-by: Tim Schaub
Signed-off-by: Matt Topol
zeroshade added this to the 15.0.0 milestone Nov 13, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…en appending a new row group (apache#38517)

dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
…en appending a new row group (apache#38517)
