GH-38015: [MATLAB] Add `arrow.buffer.Buffer` class to the MATLAB Interface #38020
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format?
+1
Sorry, I had some pending comments last week - but I accidentally never published them.
No worries! I was out myself.
+1
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit c37059a. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.
…B Interface (apache#38020)

### Rationale for this change

To unblock use cases that are not satisfied by the default Arrow -> MATLAB conversions (i.e. the `toMATLAB()` method on `arrow.array.Array`), we would like to expose the underlying Arrow data representation as a property on `arrow.array.Array`. One possible name for this property would be `DataLayout`, which would be an `arrow.array.DataLayout` object. Note, this class does not yet exist, so we would have to add it.

For example, the `DataLayout` property for temporal array types would return an object of the following class type:

```matlab
classdef TemporalDataLayout < arrow.array.DataLayout
    properties
        Values % an arrow.array.Int32Array or an arrow.array.Int64Array
        Valid  % an arrow.buffer.Buffer
    end
end
```

However, the `Valid` property on this class would need to be an `arrow.buffer.Buffer` object, which does not yet exist in the MATLAB interface. Therefore, it would be helpful to first add the `arrow.buffer.Buffer` class before adding the `DataLayout` property/class hierarchy.

It's worth mentioning that adding `arrow.buffer.Buffer` will open up additional advanced use cases in the future.

### What changes are included in this PR?

Added `arrow.buffer.Buffer` MATLAB class.

*Properties of `arrow.buffer.Buffer`*

1. `NumBytes` - a scalar `int64` value representing the size of the buffer in bytes.

*Methods of `arrow.buffer.Buffer`*

1. `toMATLAB` - returns the data in the buffer as an `Nx1` `uint8` vector, where `N` is the number of bytes.
2. `fromMATLAB(data)` - Static method that creates an `arrow.buffer.Buffer` from a numeric array.

**Example:**

```matlab
>> dataIn = [1 2];
>> buffer = arrow.buffer.Buffer.fromMATLAB(dataIn)

buffer =

  Buffer with properties:

    NumBytes: 16

>> dataOut = toMATLAB(buffer)

dataOut =

  16×1 uint8 column vector

     0
     0
     0
     0
     0
     0
   240
    63
     0
     0
     0
     0
     0
     0
     0
    64

% Reinterpret bit pattern as a double array
>> toDouble = typecast(dataOut, "double")

toDouble =

     1
     2
```

### Are these changes tested?

Yes. Added a new test class called `tBuffer.m`.

### Are there any user-facing changes?

Yes. Users can now create `arrow.buffer.Buffer` objects via the `fromMATLAB` static method. However, there's not much users can do with this object as of now. We implemented this class to facilitate adding the `DataLayout` property to `arrow.array.Array`, as described in the **Rationale for this change** section.

* Closes: apache#38015

Authored-by: Sarah Gilmore <[email protected]>
Signed-off-by: Kevin Gurney <[email protected]>
### Rationale for this change

To unblock use cases that are not satisfied by the default Arrow -> MATLAB conversions (i.e. the `toMATLAB()` method on `arrow.array.Array`), we would like to expose the underlying Arrow data representation as a property on `arrow.array.Array`. One possible name for this property would be `DataLayout`, which would be an `arrow.array.DataLayout` object. Note, this class does not yet exist, so we would have to add it.

For example, the `DataLayout` property for temporal array types would return a `TemporalDataLayout` object, a subclass of `arrow.array.DataLayout` with two properties: `Values` (an `arrow.array.Int32Array` or an `arrow.array.Int64Array`) and `Valid` (an `arrow.buffer.Buffer`).

However, the `Valid` property on this class would need to be an `arrow.buffer.Buffer` object, which does not yet exist in the MATLAB interface. Therefore, it would be helpful to first add the `arrow.buffer.Buffer` class before adding the `DataLayout` property/class hierarchy.

It's worth mentioning that adding `arrow.buffer.Buffer` will open up additional advanced use cases in the future.

### What changes are included in this PR?
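Added `arrow.buffer.Buffer` MATLAB class.

*Properties of `arrow.buffer.Buffer`*

1. `NumBytes` - a scalar `int64` value representing the size of the buffer in bytes.

*Methods of `arrow.buffer.Buffer`*

1. `toMATLAB` - returns the data in the buffer as an `Nx1` `uint8` vector, where `N` is the number of bytes.
2. `fromMATLAB(data)` - Static method that creates an `arrow.buffer.Buffer` from a numeric array.

**Example:** `arrow.buffer.Buffer.fromMATLAB([1 2])` creates a buffer whose `NumBytes` is 16. Calling `toMATLAB` on it returns a 16×1 `uint8` column vector (`0 0 0 0 0 0 240 63 0 0 0 0 0 0 0 64`), and `typecast(dataOut, "double")` reinterprets that bit pattern as the doubles `1` and `2`.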
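
The numbers in the PR's `[1 2]` example can be cross-checked without the Arrow interface. The following is a minimal sketch in plain MATLAB, not the implementation of `arrow.buffer.Buffer`: it assumes the buffer holds the raw bytes of the input array unchanged, so that the size in bytes equals the element count times the element size. The variable names `bytes`, `numBytes`, and `roundTrip` are illustrative only.

```matlab
% Plain-MATLAB sketch (no Arrow dependency): reproduce the byte count and
% round trip described for dataIn = [1 2].
dataIn = [1 2];                        % 1x2 double, 8 bytes per element

% Reinterpret the doubles as raw bytes. typecast keeps the orientation of
% its input, so transpose to get the Nx1 shape that toMATLAB is described
% as returning.
bytes = typecast(dataIn, "uint8").';   % 16x1 uint8 column vector

% Assumption: the buffer size is element count times element size.
numBytes = int64(numel(bytes));        % 2 elements * 8 bytes each

% Reinterpreting the bytes as double recovers the original values.
roundTrip = typecast(bytes, "double").';
```

On a little-endian machine, `bytes` matches the sequence reported in the PR: 1.0 is stored as `0 0 0 0 0 0 240 63` and 2.0 as `0 0 0 0 0 0 0 64`.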
### Are these changes tested?

Yes. Added a new test class called `tBuffer.m`.
### Are there any user-facing changes?

Yes. Users can now create `arrow.buffer.Buffer` objects via the `fromMATLAB` static method. However, there's not much users can do with this object as of now. We implemented this class to facilitate adding the `DataLayout` property to `arrow.array.Array`, as described in the **Rationale for this change** section.

* Closes: #38015