IPC Writer Fails to Account for Sliced ListArray #3748

Closed
emcake opened this issue Feb 22, 2023 · 2 comments · Fixed by #4186
Labels: arrow (Changes to the arrow crate), bug

Comments

emcake commented Feb 22, 2023

Describe the bug
Same idea as #3496: when a record batch containing a list array is sliced, the IPC writer doesn't correctly take the slice offset into account.

To Reproduce
This test will reproduce:

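    // Import paths assume the test is built against the `arrow` facade crate;
    // inside the IPC crate itself the paths will differ.
    use arrow::array::{ListBuilder, UInt32Builder};
    use arrow::datatypes::{DataType, Field, Schema};
    use arrow::ipc::reader::FileReader;
    use arrow::ipc::writer::FileWriter;
    use arrow::record_batch::RecordBatch;
    use std::io::Cursor;
    use std::sync::Arc;

    #[test]
    fn encode_lists() {
        let val_inner = Field::new("item", DataType::UInt32, true);
        let val_list_field = Field::new("val", DataType::List(Box::new(val_inner)), false);

        let schema = Arc::new(Schema::new(vec![val_list_field]));

        // Build a list column with three rows: [1, 2, 3], [4, 5, 6], [7, 8, 9, 10].
        let values = {
            let u32 = UInt32Builder::new();
            let mut ls = ListBuilder::new(u32);

            for list in vec![vec![1u32, 2, 3], vec![4, 5, 6], vec![7, 8, 9, 10]] {
                for value in list {
                    ls.values().append_value(value);
                }
                ls.append(true);
            }

            ls.finish()
        };

        let batch = RecordBatch::try_new(Arc::clone(&schema), vec![Arc::new(values)]).unwrap();
        // Keep only the middle row, so the list column now has a non-zero offset.
        let batch = batch.slice(1, 1);

        // Write the sliced batch to an in-memory IPC file.
        let mut writer = FileWriter::try_new(Vec::<u8>::new(), &schema).unwrap();
        writer.write(&batch).unwrap();
        writer.finish().unwrap();
        let data = writer.into_inner().unwrap();

        // Read it back; the roundtrip should equal the sliced batch.
        let mut reader = FileReader::try_new(Cursor::new(data), None).unwrap();
        let batch2 = reader.next().unwrap().unwrap();
        assert_eq!(batch, batch2);
    }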

Expected behavior
The sliced record batch should match its roundtrip.
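
To illustrate why the offset matters, here is a minimal std-only sketch of a list column as an offsets buffer over a flat values buffer. `ToyListArray`, `slice`, and `compact` are hypothetical names invented for this illustration; this is a simplified model, not the arrow-rs types and not necessarily how the writer was fixed. It shows that a slice keeps offsets pointing into the full values buffer, so anything serializing it has to rebase the offsets and trim the values (or otherwise carry the offset along).

```rust
/// Toy stand-in for a list column (an assumption for illustration, not an
/// arrow-rs type): `offsets[i]..offsets[i + 1]` is the range of `values`
/// that belongs to list `i`.
#[derive(Debug, Clone, PartialEq)]
struct ToyListArray {
    offsets: Vec<i32>,
    values: Vec<u32>,
}

impl ToyListArray {
    /// Flatten nested lists into an offsets buffer plus a values buffer.
    fn from_lists(lists: &[Vec<u32>]) -> Self {
        let mut offsets = vec![0i32];
        let mut values = Vec::new();
        for list in lists {
            values.extend_from_slice(list);
            offsets.push(values.len() as i32);
        }
        Self { offsets, values }
    }

    /// Slice `len` lists starting at `start`. Only the offsets are narrowed;
    /// the values buffer is kept whole (cloned here for simplicity), so the
    /// first offset is generally no longer zero.
    fn slice(&self, start: usize, len: usize) -> Self {
        Self {
            offsets: self.offsets[start..=start + len].to_vec(),
            values: self.values.clone(),
        }
    }

    /// One way to prepare a slice for serialization: rebase the offsets so
    /// they start at zero and keep only the values they reference.
    fn compact(&self) -> Self {
        let first = self.offsets[0];
        let last = *self.offsets.last().unwrap();
        Self {
            offsets: self.offsets.iter().map(|o| o - first).collect(),
            values: self.values[first as usize..last as usize].to_vec(),
        }
    }

    /// Rebuild the nested lists from offsets and values.
    fn to_lists(&self) -> Vec<Vec<u32>> {
        self.offsets
            .windows(2)
            .map(|w| self.values[w[0] as usize..w[1] as usize].to_vec())
            .collect()
    }
}
```

With the three lists from the test above, `slice(1, 1)` gives offsets `[3, 6]` over all ten values, and `compact()` turns that into offsets `[0, 3]` over `[4, 5, 6]`. Both describe the same logical list `[4, 5, 6]`, but only the compacted form is correct if the offsets and values are written out as-is.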

emcake added the bug label Feb 22, 2023

tustvold commented

This sounds like #2080

tustvold changed the title from "Slicing list arrays doesn't respect the list contents" to "IPC Writer Fails to Account for Sliced ListArray" Feb 23, 2023
tustvold added a commit to tustvold/arrow-rs that referenced this issue May 9, 2023
alamb pushed a commit that referenced this issue May 9, 2023
tustvold self-assigned this May 11, 2023

tustvold commented

label_issue.py automatically added labels {'arrow'} from #4186

tustvold added the arrow (Changes to the arrow crate) label May 18, 2023