GH-37592: [MATLAB] Add NumRows property to arrow.tabular.RecordBatch #38215

Merged: 2 commits, Oct 11, 2023

@kevingurney commented Oct 11, 2023

Rationale for this change

Currently, there is a NumColumns property on arrow.tabular.RecordBatch, but no NumRows property. It would be useful to be able to query the number of rows in a RecordBatch.

This pull request adds a NumRows property to arrow.tabular.RecordBatch to mirror the design of arrow.tabular.Table.

What changes are included in this PR?

  1. Added new NumRows property to arrow.tabular.RecordBatch

Example

>> matlabTable = array2table(rand(10, 5))           

matlabTable =

  10x5 table

      Var1        Var2       Var3       Var4        Var5  
    ________    ________    _______    _______    ________

     0.76062     0.12009    0.98898    0.29974     0.42165
     0.64994     0.85116    0.71768    0.58693     0.31061
     0.33593     0.87823    0.87766    0.38206     0.45742
    0.031364      0.8336    0.71528    0.14987      0.3618
      0.5986     0.81193    0.25784    0.21073     0.76715
     0.46493     0.40281    0.39729    0.16737     0.94521
     0.18738     0.16351    0.46437    0.45545     0.40774
     0.67682      0.3577    0.94882     0.1295    0.022501
     0.29368     0.47122    0.99682    0.46011     0.34275
      0.6849    0.064717    0.89719    0.38302      0.4523

>> arrowRecordBatch = arrow.recordBatch(matlabTable);

>> arrowRecordBatch.NumRows

ans =

  int64

   10

Are these changes tested?

Yes.

  1. Added NumRows test to tRecordBatch test class.
  2. Updated EmptyTable test (renamed to EmptyRecordBatch) in tRecordBatch test class.
  3. Added FromArraysNoInputs test to mirror the FromArraysNoInputs test in tTable test class.
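The checks described above could look roughly like the following. This is a minimal interactive sketch, not the actual contents of `tRecordBatch`: it assumes the Arrow MATLAB interface is on the path, that an empty MATLAB table converts to a RecordBatch with zero rows, and that `arrow.tabular.RecordBatch.fromArrays` accepts zero inputs (inferred from the `FromArraysNoInputs` test name).

```matlab
% Sketch only: the real tests live in the tRecordBatch test class.
% An interactive TestCase lets the verifications run from a script.
testCase = matlab.unittest.TestCase.forInteractiveUse;

% NumRows should match the number of rows in the source MATLAB table.
matlabTable = array2table(rand(10, 5));
arrowRecordBatch = arrow.recordBatch(matlabTable);
testCase.verifyEqual(arrowRecordBatch.NumRows, int64(10));

% Assumption: an empty MATLAB table yields a RecordBatch with zero rows.
emptyRecordBatch = arrow.recordBatch(table.empty(0, 0));
testCase.verifyEqual(emptyRecordBatch.NumRows, int64(0));

% Assumption: fromArrays with no inputs also yields an empty RecordBatch.
noInputsRecordBatch = arrow.tabular.RecordBatch.fromArrays();
testCase.verifyEqual(noInputsRecordBatch.NumRows, int64(0));
testCase.verifyEqual(double(noInputsRecordBatch.NumColumns), 0);
```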

Are there any user-facing changes?

Yes.

This pull request adds a new public NumRows property to the arrow.tabular.RecordBatch class. Users can query the number of rows in an arrow.tabular.RecordBatch by accessing the NumRows property.
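As a short usage sketch (assuming the Arrow MATLAB interface is installed and on the path), the new property can be read alongside the existing `NumColumns` property. The variable names here are illustrative only. The example in this description shows `NumRows` as an `int64`, so converting to `double` may be convenient before floating-point arithmetic.

```matlab
% Build a RecordBatch from a 10x5 MATLAB table, as in the example above.
matlabTable = array2table(rand(10, 5));
arrowRecordBatch = arrow.recordBatch(matlabTable);

% Query the dimensions of the RecordBatch.
numRows = arrowRecordBatch.NumRows;        % int64, per the example output
numColumns = arrowRecordBatch.NumColumns;  % existing property

fprintf("RecordBatch has %d rows and %d columns\n", numRows, numColumns);

% The row count should agree with the height of the source table.
assert(numRows == height(matlabTable));

% Convert to double before mixing with floating-point computations.
rowsAsDouble = double(numRows);
```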

Future Directions

  1. [MATLAB] Add a common arrow.tabular.Tabular MATLAB interface #38214
  2. [MATLAB] Create a superclass for tabular type MATLAB tests (i.e. for Table and RecordBatch) #38213

@kevingurney
Member Author

+1

@kevingurney merged commit ef02417 into apache:main Oct 11, 2023
10 checks passed
@kevingurney deleted the GH-37592 branch October 11, 2023 19:07
@kevingurney removed the awaiting committer review label Oct 11, 2023
llama90 pushed a commit to llama90/arrow that referenced this pull request Oct 12, 2023
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit ef02417.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request Oct 23, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024