Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-3316] [Bug] Invalid JSON schema for manifest v11 #8991

Closed
2 tasks done
mars-lan opened this issue Nov 2, 2023 · 15 comments · Fixed by #9155
Closed
2 tasks done

[CT-3316] [Bug] Invalid JSON schema for manifest v11 #8991

mars-lan opened this issue Nov 2, 2023 · 15 comments · Fixed by #9155
Assignees
Labels
backport 1.7.latest bug Something isn't working High Severity bug with significant impact that should be resolved in a reasonable timeframe regression

Comments

@mars-lan
Copy link

mars-lan commented Nov 2, 2023

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

Manifest v11 JSON schema has been released with dbt 1.7.0. However, it's not a valid JSON schema:

{
  "$ref": "#/$defs/WritableManifest",
  "$defs": {
    "ManifestMetadata": {
      "type": "object",
      "title": "ManifestMetadata",
      ...
  }
}

Seems like $defs.ManifestMetadata contains the actual JSON schema that's supposed to be published.

Expected Behavior

A valid JSON schema to be published.

Steps To Reproduce

  1. Copy https://schemas.getdbt.com/dbt/manifest/v11.json
  2. Paste it into a JSON schema validator, e.g. https://www.jsonschemavalidator.net/

Relevant log output

No response

Environment

- OS:
- Python:
- dbt:

Which database adapter are you using with dbt?

No response

Additional Context

No response

@mars-lan mars-lan added bug Something isn't working triage labels Nov 2, 2023
@github-actions github-actions bot changed the title [Bug] Invalid JSON schema for manifest v11 [CT-3316] [Bug] Invalid JSON schema for manifest v11 Nov 2, 2023
@dbeatty10 dbeatty10 self-assigned this Nov 2, 2023
@dbeatty10
Copy link
Contributor

@mars-lan I see what you are saying about the $defs.ManifestMetadata addition in v11 versus v10.

I'll check if that change was intentional or accidental.

@dbeatty10 dbeatty10 removed their assignment Nov 2, 2023
@dbeatty10 dbeatty10 removed the triage label Nov 2, 2023
@martynydbt
Copy link

martynydbt commented Nov 3, 2023

@mars-lan thank you for surfacing this and thank you to Metaphor Data for sponsoring Coalesce! I hope the conference went well for you all. Wanted to update you and let you know that engineers are looking into this and will update shortly.

@peterallenwebb
Copy link
Contributor

@mars-lan Thanks for opening this issue. We have reproduced the behavior you discovered using https://www.jsonschemavalidator.net/, and we find it concerning, but we are curious whether you have encountered problems with the schema with any other tools, or other parts of your workflow. In other validators we have tried, and with the libraries we use internally for testing, the schema performed as we expected.

We'll follow this up regardless, but knowing the above would help us better understand the scope of the issue.

@mars-lan
Copy link
Author

mars-lan commented Nov 3, 2023

@mars-lan Thanks for opening this issue. We have reproduced the behavior you discovered using https://www.jsonschemavalidator.net/, and we find it concerning, but we are curious whether you have encountered problems with the schema with any other tools, or other parts of your workflow. In other validators we have tried, and with the libraries we use internally for testing, the schema performed as we expected.

We'll follow this up regardless, but knowing the above would help us better understand the scope of the issue.

We used https://github.com/koxudaxi/datamodel-code-generator to generate Python classes from the JSON schema and it also had issues parsing the schema.

@graciegoheen graciegoheen added High Severity bug with significant impact that should be resolved in a reasonable timeframe and removed High Severity bug with significant impact that should be resolved in a reasonable timeframe labels Nov 7, 2023
@martynydbt
Copy link

@emmyoop can you throw some point on this one? thank you!

@mars-lan
Copy link
Author

Ping @emmyoop?

@QMalcolm
Copy link
Contributor

@mars-lan I am currently working on this

@QMalcolm
Copy link
Contributor

@mars-lan I'm currently unable to reproduce the issue 🤔 Last week I thought I had reproduced it, but I think I was doing so incorrectly. I'm not sure if the manifest v11.json has changed since this issue was opened, perhaps it has, but how we produce it has not. I'm gonna see what issues I run into with datamodel-code-generator y'all use. Are you still experiencing issues? Below is a recording of me trying to reproduce the issue.

@QMalcolm
Copy link
Contributor

As a followup, I used the currently supplied v11.json manifest artifact with the datamodel-code-generator tool, and it seems to be working as expected. Below is a loom of what I did

@dbeatty10
Copy link
Contributor

@QMalcolm I was able to reproduce with catalog/v1.json. The original poster reported issues with the schema for manifest.json, but I had a small catalog.json handy, and it exhibits the same underlying issue with https://www.jsonschemavalidator.net/.

Reprex

  1. Copy the contents of https://schemas.getdbt.com/dbt/catalog/v1.json (or here) into the Select schema text area on the left side of https://www.jsonschemavalidator.net/
    • Select "Custom" from the drop-down
  2. Copy the following contents into the Input JSON text area on the right side
    {
      "metadata": {
        "dbt_schema_version": "https://schemas.getdbt.com/dbt/catalog/v1.json",
        "dbt_version": "1.7.0b2",
        "generated_at": "2023-10-07T13:34:04.627535Z",
        "invocation_id": "c09b4da4-35c5-418d-8a4b-447d2925f075",
        "env": {}
      },
      "nodes": {},
      "sources": {},
      "errors": null
    }
  3. You should see an error like the following:
    image

But if you replace the Select schema with the contents from here, it should validate:

image

@QMalcolm
Copy link
Contributor

I synced with @dbeatty10. Although the v11.json manifest artifact can be used to produce models with the datamodel-code-generator. What is currently broken is using the artifact as custom schema to validate input json (which it used to do). In my demo, I put the schema in the right hand box on https://www.jsonschemavalidator.net/ which verified that it was valid jsonschema. What doesn't work is putting it in the left hand box of https://www.jsonschemavalidator.net/ to show that it can be used as a custom schema to validate json.

@mars-lan
Copy link
Author

mars-lan commented Dec 5, 2023

Confirmed that the latest JSON schema is passing. Feel free to close this now.

@QMalcolm
Copy link
Contributor

QMalcolm commented Dec 5, 2023

@mars-lan Just to confirm, are you saying that v11 manifest currently hosted is working for you or that you pulled the v12 manifest from the PR and that is working for you?

@mars-lan
Copy link
Author

mars-lan commented Dec 5, 2023

@mars-lan Just to confirm, are you saying that v11 manifest currently hosted is working for you or that you pulled the v12 manifest from the PR and that is working for you?

The latest published v11 is working. I haven't tested v12 yet, and doesn't seem like there's one published either: https://schemas.getdbt.com/dbt/manifest/v12/index.html

@Fatal1ty
Copy link

Fatal1ty commented Dec 5, 2023

What doesn't work is putting it in the left hand box of https://www.jsonschemavalidator.net/ to show that it can be used as a custom schema to validate json.

Just wanted to break in and say that https://www.jsonschemavalidator.net/ doesn't support JSON Schema 2020-12 and seems not to be the best validator. The v11.json schema looks good in the eyes of these validators:

All of them support 2022-12 dialect and are listed on the official list of the online JSON Schema validators: https://json-schema.org/implementations#web-(online).

Also I'd like to note that using "$ref" and "$defs" keywords together is valid, as can be seen from this example:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 1.7.latest bug Something isn't working High Severity bug with significant impact that should be resolved in a reasonable timeframe regression
Projects
None yet
8 participants