Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-38211: [MATLAB] Add support for creating an empty arrow.tabular.RecordBatch by calling arrow.recordBatch with no input arguments #38274

Conversation

Divyansh200102
Copy link
Contributor

@Divyansh200102 Divyansh200102 commented Oct 15, 2023

@github-actions
Copy link

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

See also:

@Divyansh200102 Divyansh200102 changed the title GH:38211 [MATLAB] Add support for creating an empty arrow.tabular.RecordBatch by calling arrow.recordBatch with no input arguments MINOR: [MATLAB] Set the input argument T to default to table.empty(0,0) Oct 16, 2023
@kevingurney
Copy link
Member

kevingurney commented Oct 16, 2023

@Divyansh200102 - thanks very much for the pull request!

Just a few small comments:

  1. Could you please add a MATLAB test which verifies that passing in no arguments to the arrow.recordBatch function results in an empty arrow.tabular.RecordBatch being returned? You could add this to tRecordBatch.m. In general, we try our best to always include tests whenever we add/change behavior. In case you haven't already seen it, it may be helpful to review the New Contributor's Guide.

  2. Although this is a small, one-line change - it isn't a documentation update. Therefore, I don't think this would normally be classified as a minor PR. Could you please update the title of this pull request to the following:

GH-38211: [MATLAB] Add support for creating an empty arrow.tabular.RecordBatch by calling arrow.recordBatch with no input arguments?


Please don't hesitate to let me know if you have any questions.

Thank you!

@@ -16,7 +16,7 @@
% permissions and limitations under the License.
function rb = recordBatch(T)
arguments
T table
T table= table.empty(0, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Could you add {istable} here?

Example:

matlabTable {istable} = table.empty(0, 0)

This will ensure an error is thrown if non-table inputs are provided to the arrow.recordBatch function.

  1. For clarity (and consistency with the arrow.table function), perhaps we should rename T to matlabTable?

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting review Awaiting review and removed awaiting review Awaiting review labels Oct 16, 2023
@Divyansh200102 Divyansh200102 changed the title MINOR: [MATLAB] Set the input argument T to default to table.empty(0,0) GH-38211: [MATLAB] Add support for creating an empty arrow.tabular.RecordBatch by calling arrow.recordBatch with no input arguments Oct 17, 2023
@Divyansh200102 Divyansh200102 deleted the creating-an-empty-arrow.tabular.RecordBatch branch October 18, 2023 05:34
@Divyansh200102 Divyansh200102 restored the creating-an-empty-arrow.tabular.RecordBatch branch October 18, 2023 08:10
@Divyansh200102
Copy link
Contributor Author

Divyansh200102 commented Oct 18, 2023

hi, @kevingurney i am trying to setup the project locally using the docker-compose.yml file by using the command "docker compose up -d" but this is error is showing up (Error response from daemon: manifest for amd64/maven:3.5.4-eclipse-temurin-8 not found: manifest unknown: manifest unknown) i think this version of amd64/maven is not in the registry .i tried pulling manually as well but still nothing can you help me out to solve this?i checked docker hub as well but this version of amd64/maven is not there.

@kevingurney
Copy link
Member

kevingurney commented Oct 18, 2023

@Divyansh200102 - generally speaking, adding code to the MATLAB interface shouldn't require using docker-compose. Is there any specific reason you are trying to use this?


Looking through the eclipse-java part of the docker-compose.yml file I see the following:

  eclipse-java:
    # Usage:
    #   docker-compose build eclipse-java
    #   docker-compose run eclipse-java
    # Parameters:
    #   MAVEN: 3.9.4
    #   JDK: 17, 21
    image: ${ARCH}/maven:${MAVEN}-eclipse-temurin-${JDK}
    shm_size: *shm-size
    volumes: *java-volumes
    command: *java-command

It looks like the values of the MAVEN and JDK environment variables that are used in computing the Docker image name are being pulled from the default .env file. The default values in the .env file are MAVEN=3.5.4 and JDK=8, which would explain why you are seeing amd64/maven:3.5.4-eclipse-temurin-8 in the error message.

Perhaps you could try modifying these values to 3.9.4 and 21 based on the comments in the eclipse-java code block above?

You could also try overriding these variables directly on the command line, like:

$ MAVEN=3.9.4 JDK=21 docker-compose build eclipse-java

Someone who contributes to the Docker / Java parts of the project like @danepitkin might be able to offer more guidance here. Maybe the default values in the .env file should be bumped to 3.9.4 and 21?

@kevingurney
Copy link
Member

@Divyansh200102 - do you need any further assistance with updating this pull request?

@github-actions github-actions bot added awaiting review Awaiting review and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Oct 24, 2023
@Divyansh200102
Copy link
Contributor Author

@Divyansh200102 - do you need any further assistance with updating this pull request?

Please check if it's fine now @kevingurney

@@ -16,7 +16,7 @@
% permissions and limitations under the License.
function rb = recordBatch(T)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you renamed the first input argument to matlabTable in the arguments block, you will also need to rename T to matlabTable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the rest of the code in this function needs to be updated to use matlabTable instead of T. For example:

columnNames = string(T.Properties.VariableNames);

would become:

columnNames = string(matlabTable.Properties.VariableNames);


% Create an empty expected RecordBatch
schema = arrow.schema();
expected = arrow.tabular.RecordBatch(schema, {});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arrow.tabular.RecordBatch constructor does not support any syntax that accepts an arrow.tabular.Schema and an empty cell array (i.e. {}).



function testArrowRecordBatchNoArgs()
% Call arrow.recordBatch with no arguments
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation of the code in this test if off.

@@ -494,9 +494,23 @@ function TestIsEqualFalse(testCase)
% Call isequal with more than two arguments
testCase.verifyFalse(isequal(rb1, rb2, rb3, rb4));
end


function testArrowRecordBatchNoArgs()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please use UpperCamelCase for the test name to be consistent with the rest of the test cases in this file? i.e. TestArrowRecordBatchNoArgs instead of testArrowRecordBatchNoArgs.

expected = arrow.tabular.RecordBatch(schema, {});

% Use isequal to check if the actual and expected are equal
if isequal(expected, actual)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to just use testCase.verifyEqual(actual, expected) here.

schema = arrow.schema();
expected = arrow.tabular.RecordBatch(schema, {});

% Use isequal to check if the actual and expected are equal
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment could be removed.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Oct 24, 2023
@kevingurney
Copy link
Member

@Divyansh200102 - I added some more code review feedback.

Currently, the MATLAB CI workflow is failing. We need to ensure that all of the appropriate test cases are passing as expected.

@danepitkin
Copy link
Member

Hey, thanks for reporting the docker compose issue. I created #38439.

@Divyansh200102
Copy link
Contributor Author

Divyansh200102 commented Oct 24, 2023

@Divyansh200102 - I added some more code review feedback.

Currently, the MATLAB CI workflow is failing. We need to ensure that all of the appropriate test cases are passing as expected.

Can you tell me how do I do this? @kevingurney .I am a beginner.Any documents I need to refer to ?

@danepitkin
Copy link
Member

danepitkin commented Oct 24, 2023

Can you tell me how do I do this?

Hey @Divyansh200102, you can click on Details in the CI jobs with red X's (screenshot as an example)
image

Then, look through the logs and expand the output to view the errors that are reported. You will need to make sure your code passes the test cases successfully.

It looks like a reviewer has already left feedback that you should review as well. Maybe they pointed out some of the bugs in the code that are causing issues?

@kevingurney
Copy link
Member

Thank you @danepitkin for explaining how to investigate CI failures.

@Divyansh200102 - as @danepitkin pointed out, I did provide code review feedback that may help provide guidance on how to address some of the CI failures. If you have specific questions about any of the comments, please let me know.

@Divyansh200102
Copy link
Contributor Author

Thank you @danepitkin for explaining how to investigate CI failures.

@Divyansh200102 - as @danepitkin pointed out, I did provide code review feedback that may help provide guidance on how to address some of the CI failures. If you have specific questions about any of the comments, please let me know.

Thanks for the response @kevingurney and @danepitkin .I will let you know @kevingurney if I need any help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants