Unexpected "missing-label" error with option header_case = False #1635

Closed
amelie-rondot opened this issue Feb 5, 2024 · 0 comments · Fixed by #1641

Overview

While migrating validata.fr from v4 to v5 of frictionless-py, we encountered an unexpected missing-label error when validating tabular data with the header_case=False dialect option, where a column label in the data differs only in case from the corresponding field name in the schema (for example, lower case in the data and upper case in the schema).

For example:

data = [["aa", "BB"], ["a", "b"]]
schema = {
    "$schema": "https://frictionlessdata.io/schemas/table-schema.json",
    "fields": [
        {"name": "AA", "constraints": {"required": True}},
        {"name": "bb", "constraints": {"required": True}},
    ],
}

Using Python, the validation report is invalid and contains two missing-label errors:

import frictionless

if __name__ == "__main__":
    report = frictionless.validate(
        frictionless.Resource(
            source=data,  # the rows defined above
            schema=frictionless.Schema.from_descriptor(schema),
            dialect=frictionless.Dialect(header_case=False),
            detector=frictionless.Detector(schema_sync=True),
        )
    )

    # Expect valid report
    print(report)

Output:

{'valid': False,
 'stats': {'tasks': 1, 'errors': 2, 'warnings': 0, 'seconds': 0.004},
 'warnings': [],
 'errors': [],
 'tasks': [{'name': 'memory',
            'type': 'table',
            'valid': False,
            'place': '<memory>',
            'labels': ['aa', 'BB'],
            'stats': {'errors': 2,
                      'warnings': 0,
                      'seconds': 0.004,
                      'fields': 4,
                      'rows': 1},
            'warnings': [],
            'errors': [{'type': 'missing-label',
                        'title': 'Missing Label',
                        'description': 'Based on the schema there should be a '
                                       "label that is missing in the data's "
                                       'header.',
                        'message': "There is a missing label in the header's "
                                   'field "AA" at position "3"',
                        'tags': ['#table', '#header', '#label'],
                        'note': '',
                        'labels': ['aa', 'BB'],
                        'rowNumbers': [1],
                        'label': '',
                        'fieldName': 'AA',
                        'fieldNumber': 3},
                       {'type': 'missing-label',
                        'title': 'Missing Label',
                        'description': 'Based on the schema there should be a '
                                       "label that is missing in the data's "
                                       'header.',
                        'message': "There is a missing label in the header's "
                                   'field "bb" at position "4"',
                        'tags': ['#table', '#header', '#label'],
                        'note': '',
                        'labels': ['aa', 'BB'],
                        'rowNumbers': [1],
                        'label': '',
                        'fieldName': 'bb',
                        'fieldNumber': 4}]}]}
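
To make the report above easier to scan, here is a small sketch that pulls out only the error type, field name and position from each task. It works on a plain dict trimmed down from the output above, so it does not depend on frictionless itself; the helper name `summarize_errors` is hypothetical and only used for illustration.

```python
# Trimmed copy of the report above, keeping only the keys used here.
report = {
    "valid": False,
    "tasks": [
        {
            "labels": ["aa", "BB"],
            "errors": [
                {"type": "missing-label", "fieldName": "AA", "fieldNumber": 3},
                {"type": "missing-label", "fieldName": "bb", "fieldNumber": 4},
            ],
        }
    ],
}


def summarize_errors(report):
    """Return (type, fieldName, fieldNumber) for every task-level error."""
    return [
        (error["type"], error.get("fieldName"), error.get("fieldNumber"))
        for task in report["tasks"]
        for error in task["errors"]
    ]


summary = summarize_errors(report)
for row in summary:
    print(row)
```

Note that both reported positions (3 and 4) lie beyond the two labels actually present in the header.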

Expected behaviour

According to the documentation of the HeaderCase Dialect parameter, I expected a valid report.
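
The expectation can be sketched in plain Python. This is only an illustration of case-insensitive label matching as I read the option, not the frictionless implementation; the function name `missing_fields` and its signature are assumptions made for this example.

```python
def missing_fields(labels, field_names, header_case=True):
    """Return the schema field names that have no matching label.

    With header_case=False, labels and field names are compared
    case-insensitively (illustrative assumption, not library code).
    """
    normalize = (lambda text: text) if header_case else str.lower
    present = {normalize(label) for label in labels}
    return [name for name in field_names if normalize(name) not in present]


labels = ["aa", "BB"]
field_names = ["AA", "bb"]

# Case-sensitive comparison: neither field name matches a label exactly.
print(missing_fields(labels, field_names, header_case=True))   # ['AA', 'bb']

# Case-insensitive comparison: both fields are found, nothing is missing.
print(missing_fields(labels, field_names, header_case=False))  # []
```

Under this reading, the data and schema from the example should produce no missing-label error when header_case is False.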

Other details and experimentations

Used Frictionless version 5.16.1, last commit on main branch

The command-line validation gives the same result.
I enabled "schema-sync" to reproduce our use case more closely, but it does not seem to be related to the actual issue.

@pierrecamilleri pierrecamilleri changed the title Unexpected missing-label error with false header_case Unexpected missing-label error with option header_case = False Aug 30, 2024
@pierrecamilleri pierrecamilleri changed the title Unexpected missing-label error with option header_case = False Unexpected "missing-label" error with option header_case = False Aug 30, 2024
pierrecamilleri added a commit that referenced this issue Sep 2, 2024
- fix: deprecated dependencies ([PR 1674](#1674))
- fix: unexpected "missing-label" error with option `header_case = False` ([#1635](#1635))
- fix: KeyError when a "primaryKey" is missing ([#1633](#1633))
- fix: unexpected field-error for a boolean "example" with "trueValues" or "falseValues" properties ([#1610](#1610))