Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Union types not handled when building AnalysisContext::try_from_column #8499

Closed
thinkharderdev opened this issue Dec 11, 2023 · 5 comments · Fixed by #9683
Closed

Union types not handled when building AnalysisContext::try_from_column #8499

thinkharderdev opened this issue Dec 11, 2023 · 5 comments · Fixed by #9683
Labels
bug Something isn't working

Comments

@thinkharderdev
Copy link
Contributor

Describe the bug

AnalysisContext will try and create a ScalarValue when there are no statistics for a given column. Since there is no ScalarValue representation for union types this fails.

The constructor is fallible so this isn't a bug per se, but in other contexts this value is unwrapped and can cause panics during query planning

To Reproduce

Minimized example of what we have observed causing panics during query planning:

    #[test]
    fn test_equivalence_properties_union_type() -> Result<()> {
        let union_type = DataType::Union(
            UnionFields::new(
                vec![0,1],
                vec![
                    Field::new("f1", DataType::Int32, true),
                    Field::new("f2", DataType::Utf8, true),
                ],
            ),
            UnionMode::Sparse,
        );

        let schema = Arc::new(Schema::new(vec![
            Field::new("c1", DataType::Int32, true),
            Field::new("c2", union_type, true),
        ]));

        let exec = FilterExec::try_new(
            binary(
                binary(col("c1", &schema)?, Operator::GtEq, lit(1u32), &schema)?,
                Operator::And,
                binary(col("c1", &schema)?, Operator::LtEq, lit(4u32), &schema)?,
                &schema,
            )?,
            Arc::new(EmptyExec::new(true, schema.clone()))
        )?;

        exec.statistics().unwrap();

        Ok(())
    }

Since FilterExec::equivalence_properties unwraps the result of FilterExec::statistics this just panics.

Expected behavior

FilterExec::equuivalence_properties is infallible so it shouldn't panic :)

Additional context

There is probably a way to patch this but seems like we should just add a ScalarValue::Union variant.

@thinkharderdev thinkharderdev added the bug Something isn't working label Dec 11, 2023
@thinkharderdev
Copy link
Contributor Author

cc @Dandandan

@alamb
Copy link
Contributor

alamb commented Dec 12, 2023

There is a similar report in #8262 (related to Map type)

@alamb
Copy link
Contributor

alamb commented Dec 12, 2023

cc @mustafasrepo

@berkaysynnada
Copy link
Contributor

With the new design of statistics underway, this problem may be resolved by simply using an absent interval. Unless it's an emergency, I don't think we should apply a temporary patch to the analysis context. It would be better to add missing scalar value variants.

@thinkharderdev
Copy link
Contributor Author

With the new design of statistics underway, this problem may be resolved by simply using an absent interval. Unless it's an emergency, I don't think we should apply a temporary patch to the analysis context. It would be better to add missing scalar value variants.

Yeah, I agree. I think adding the missing scalar value variants is probably the better approach either way as there are probably other contexts where it would be useful to have them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants