Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Swift][FlightSQL] Processing valid RecordBatch leads to invalid offset from body's data. #37726

Closed
mgrazianoc opened this issue Sep 14, 2023 · 11 comments · Fixed by #37764
Closed

Comments

@mgrazianoc
Copy link

Describe the bug, including details regarding any error messages, version, and platform.

Bug

During FlightClient.doGet call, ArrowReader process an empty dataBody on a valid flight data, raising Fatal error: UnsafeRawBufferPointer.load with negative offset. However, dataHeader is present.


Details

A valid flight call tested both in python and rust was also tested in the not yet released Swift package. The ticket is a simple TEST raw byte, in order for the server to start a simple stream of RecordBatch.


Error Messages

  • Fatal error: UnsafeRawBufferPointer.load with negative offset

System details

  • Using Xcode Version 15.0 beta 8 (15A5229m)
  • Building for Apple Vision Pro Simulator.
  • Swift-Driver version: 1.87.1 Apple Swift version 5.9 (swiftlang-5.9.0.128.106 clang-1500.0.40.1)

Screenshots

s1
s2

Component(s)

Swift

@mgrazianoc mgrazianoc changed the title [Swift][Arrow-Flight] [Swift][Arrow-Flight] Processing valid RecordBatch leads to invalid offset from body's data. Sep 14, 2023
@kou kou changed the title [Swift][Arrow-Flight] Processing valid RecordBatch leads to invalid offset from body's data. [Swift][FlightSQL] Processing valid RecordBatch leads to invalid offset from body's data. Sep 15, 2023
@kou
Copy link
Member

kou commented Sep 15, 2023

@abandy Could you take a look at this?

@abandy
Copy link
Contributor

abandy commented Sep 15, 2023

Yes, will do.

@abandy
Copy link
Contributor

abandy commented Sep 15, 2023

@mgrazianoc Can you share the test case that hits this error?

@mgrazianoc
Copy link
Author

mgrazianoc commented Sep 15, 2023

Hi, I think the issue was due to the flat buffer data. I'm still learning about the weeds, so here's what I've tried that worked:

  • Instead of using readerResultClosure which expects fileData format (maybe from IPC File Format extension, I'm not sure), we call flightDataClosure, which read the bytes successfully.

Now, I'm trying to figure it out how to convert from an array of FlightData into an ArrowTable.

@abandy
Copy link
Contributor

abandy commented Sep 15, 2023

Hi @mgrazianoc, it looks like a test for calling doGet with a flightDataClosure is missing. I will submit a PR to add the missing test case. In the meantime, here is the test code for getting the data from the flightDataClosure into a RecordBatch:

func doGetTestFlightData() async throws {
    let ticket = FlightTicket("flight_ticket test".data(using: .utf8)!)
    var num_call = 0
    try await client?.doGet(ticket, flightDataClosure: { flightData in
        let reader = ArrowReader();
        let result = reader.fromStream(flightData.dataBody)
        switch result {
        case .success(let rb):
            XCTAssertEqual(rb.schema?.fields.count, 3)
            XCTAssertEqual(rb.batches[0].length, 4)
            num_call += 1
        case .failure(let error):
            throw error
        }
    })
    
    XCTAssertEqual(num_call, 1)
}

Once you have the record batches you should be able to call ArrowTable.from(recordBatchs: [RecordBatch]) to build the ArrowTable.

@mgrazianoc
Copy link
Author

mgrazianoc commented Sep 15, 2023

Now I'm getting back the Fatal error: UnsafeRawBufferPointer.load with negative offset, raised inside the ArrowReader, with the code:

func doGet() async throws {
    let ticket = FlightTicket("TEST".data(using: .utf8)!)
    var recordBatches = [RecordBatch]()
    try await client?.doGet(ticket, flightDataClosure: { flightData in
        let reader = ArrowReader();
        let result = reader.fromStream(flightData.dataBody)
        switch result {
        case .success(let readerResult):
            for rb in readerResult.batches {
                recordBatches.append(rb)
            }
        case .failure(let error):
            throw error
        }
    })
    let table = ArrowTable.from(recordBatches: recordBatches)
    print(table)
}

Arrow table being sent (it also raises with more complex ones), through https://arrow.apache.org/docs/python/generated/pyarrow.flight.RecordBatchStream.html:

pyarrow.Table
TEST_STRING: string
----
TEST_STRING: [["testString"]]

s3

@abandy
Copy link
Contributor

abandy commented Sep 17, 2023

@mgrazianoc Seems the behavior of the swift impl and existing flight implementations were different. I have added a PR to update the swift behavior to match existing implementations. Please test against it when you get a chance: #37764.

To run the flight test against "python/examples/flight/server.py" you will need to change the data sent to the doGetTest* methods to (as the python test server uses an encoding of the ticketdescriptor to build the lookup keys):

        try await clientImpl.doGetTest(Data("(2, b'flight_ticket', ())".utf8))
        try await clientImpl.doGetTestFlightData(Data("(2, b'flight_another', ())".utf8))

abandy added a commit to abandy/arrow that referenced this issue Sep 18, 2023
@mgrazianoc
Copy link
Author

I think just before we can close the issue, after the great work made by @abandy, there's one more bug relating to the arrow-flight doGet.

Sending utf-8 string seems to be crashing in some scenarios. I'm running the rust implementation, which according with the documentation is considered stable.

Sending the testString will result in a Swift/UnsafeRawPointer.swift:1203: Fatal error: load from misaligned raw pointer

@kou
Copy link
Member

kou commented Sep 23, 2023

Could you open a new issue for the another bug?
If we have one issue for one problem, we can fix one problem by one pull request. It's easy to review.

@kou
Copy link
Member

kou commented Sep 25, 2023

I reopen this because #37764 isn't merged yet.

@kou kou reopened this Sep 25, 2023
@mgrazianoc
Copy link
Author

I've opened the other error in the #37884 issue.

abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
abandy added a commit to abandy/arrow that referenced this issue Oct 25, 2023
kou pushed a commit that referenced this issue Oct 26, 2023
…g impls (#37764)

Swift Flight implementation behavior didn't match existing flight implementations.  The updates in this PR have been tested against python/examples/flight/server.py except for getSchemaTest and doExchangeTest which are not implemented in server.py.  I will follow up this PR with a cleanup refactoring PR.

This difference in behavior was found due to issue #37726.  
* Closes: #37726

Authored-by: Alva Bandy <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
@kou kou added this to the 15.0.0 milestone Oct 26, 2023
loicalleyne pushed a commit to loicalleyne/arrow that referenced this issue Nov 13, 2023
…xisting impls (apache#37764)

Swift Flight implementation behavior didn't match existing flight implementations.  The updates in this PR have been tested against python/examples/flight/server.py except for getSchemaTest and doExchangeTest which are not implemented in server.py.  I will follow up this PR with a cleanup refactoring PR.

This difference in behavior was found due to issue apache#37726.  
* Closes: apache#37726

Authored-by: Alva Bandy <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this issue Feb 19, 2024
…xisting impls (apache#37764)

Swift Flight implementation behavior didn't match existing flight implementations.  The updates in this PR have been tested against python/examples/flight/server.py except for getSchemaTest and doExchangeTest which are not implemented in server.py.  I will follow up this PR with a cleanup refactoring PR.

This difference in behavior was found due to issue apache#37726.  
* Closes: apache#37726

Authored-by: Alva Bandy <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants