Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validator for 3D Tiles Next #563

Closed
pjcozzi opened this issue Nov 14, 2021 · 12 comments
Closed

Validator for 3D Tiles Next #563

pjcozzi opened this issue Nov 14, 2021 · 12 comments

Comments

@pjcozzi
Copy link
Contributor

pjcozzi commented Nov 14, 2021

Just like the glTF Validator, we should plan a validator for 3D Tiles Next.

I believe the glTF Validator is a key reason that the glTF ecosystem is robust. It is a handy tool to use ad hoc and some applications even embed it, e.g., validate after uploading a model, validate if import fails, integrate into unit tests, etc. Also, developing the glTF validator significantly helped harden the glTF specification and fix probably dozens of my own mistakes. 😟

Seems like we should evaluate how we would go about this, e.g.,

It would be great to brainstorm a game plan. I don't think it would be complete for #556.

@donmccurdy
Copy link
Contributor

Extend what is already in 3d-tiles-validator...

This feels like the most natural starting point – were there any reasons that continuing to develop and extend the existing 3D Tiles validator might not be ideal?

The glTF Validator doesn't currently have a clear process for third-party extensions (other than to fork the repository) that is probably a limitation to be aware of and/or request changes if needed.

@pjcozzi
Copy link
Contributor Author

pjcozzi commented Nov 15, 2021

Extend what is already in 3d-tiles-validator...

This feels like the most natural starting point – were there any reasons that continuing to develop and extend the existing 3D Tiles validator might not be ideal?

The current validator may or may not be a solid foundation. I am not aware as to how complete it is or how well written it is or how well it would integrate into other tools. @lilleyse may know best.

@donmccurdy
Copy link
Contributor

donmccurdy commented Nov 15, 2021

@ptrgags
Copy link
Contributor

ptrgags commented Nov 15, 2021

Some other notes from past discussions about this:

  • Where do we want to use this? a command line tool would be useful. But being able to use it even in the browser is nice too (for example https://github.khronos.org/glTF-Validator/)
  • TypeScript would be nice, or at least NodeJS. This would allow use in the browser and from Node.js projects
  • if we made this a C++ executable it could be integrated into tiling pipelines

@donmccurdy
Copy link
Contributor

I'd feel pretty strongly that a typed language is valuable here. E.g. TypeScript has exhaustiveness checks that can help ensure edge cases are not missed. From there it probably comes down to tradeoffs between having code that runs in any environment (hard to beat C/C++ here) and something that is practical to maintain.

Targeting WASM builds could be a good compromise. Languages with good WASM tooling include:

But certain tasks, like disk read/write, don't happen in WASM and would require environment-specific wrappers.

@lilleyse
Copy link
Contributor

Since the 3D Tiles validator calls the glTF validator it makes the C++/Rust route a little less attractive knowing we won't have a full native solution. For that reason I'm leaning towards TypeScript.

The current validator doesn't do a whole lot, and much of what it does do - validating batch tables, feature tables, etc - will no longer be relevant. So I think starting fresh is the right course of action here.

@javagl
Copy link
Contributor

javagl commented Aug 10, 2022

There are a bunch questions regarding a 'revival' of a 3D Tiles validator, on different levels.

Referring to the current 3d-tiles-validator repository, there is the question of whether the validator, tools and samples-generator will be separated (and if so: how). I think that the tools will be crucial for some future developments. This refers to the difficulties that people may encounter when glTF 1.0 is deprecated, or when they want to convert some of the "legacy" tile formats to glTF. (As such, the tools are a topic on their own - for now, this refers only to the repo structure).

The question about the programming language and environment is not answered yet. The glTF validator is written in Dart. The language is not sooo widely used, but it apparently has some desirable capabilities for such a project: It is possible to publish the validator as an npm package, for node and the browser (to create the drag-and-drop validator), or to compile it into an executable (yes, really ... an .EXE file). I don't know how well it plays with non-Dart dependencies, but it seems to have some standard libraries for things like vector math (Vector3, Quaternion, Matrix4...), so we might be reasonably covered here.

TypeScript was mentioned a few times. I have not yet used it extensively for actual projects, and don't know some things that one would have to know in order to make an informed decision. This refers to everything that is related to deployment, testing, but also to IO. (Is it possible to build something, with reasonable effort, that can either run at the command line and receive local file paths or receive data from a drag-and-drop into a browser? I'll have to read more about that...).

I started a few experiments. These are currently drafted in TypeScript. But some of the following points are technical ones that are just so above the level of the language.


The validator should have some sort of standardized, structured format for reporting issues.

The current validator only prints a single error message, returned as a string. When there are multiple errors, this would imply some sort of "Fix-Check-Fix-Check...." cycle. An alternative would be to have something like this (from my current drafts - yes, it's certainly similar to the glTF validator output, but ... when it makes sense, then it does make sense...) :

export class ValidationIssue {
  private _type: number; // A type (more on that below)
  private _path: string; // Something like a JSON path, "tileset/root/children[2]/example"
  private _message: string; // A human-readable message
  private _severity: number; // To differentiate between WARNING and ERROR

The type could be an ID for each form of validation issue (similar to what the glTF validator is doing). But one could also consider something that indicates a "category" - like SCHEMA_VALIDATION or INCONSISTENT_BOUNDING_VOLUMES or whatnot. The latter could also be a separate _category field. A list of these issues should be collected, and offered in a machine-processable form for downstream pipelines (yeah, it's gonna be JSON...).

Note: The validator currently has a function validateGlb that uses the glTF validator, but only returns the result as a single error. One could even consider to add something like a

  private _contentIssues: ValidationIssue[];

to the ValidatationIssue to eventually have a report like

    {
      "_type": CONTENT_ERROR,
      "_path": "tileset/root/children[2]/content/uri/",
      "_message": "There's something wrong with the content",
      "_severity": 0,
      "_contentIssues": {
          // Issues from ONE content (GLB, B3DM, ...) go here
      }
    }

But whether or not (or how) the list of issues from the glTF validator should or could be integrated into such a report has to be discussed.


The JSON schema validation could leverage an existing validator.

Validating the JSON against the JSON schema is tedious. Manually implementing a validation that covers what could be covered by the schema is nearly impossible. (For example: The current validator checks whether the geometricError is present, but ... I just inserted a geometric error of -240, and this is not detected...).

We have put a lot of effort into the JSON schemas, and we could now benefit from that (and maybe even improve them if we encounter any issues). In my local experiments, I just wrapped ajv into a class (to not let it appear in an interface, and to have the option to switch to another validator later), and translated its error messages into the aforementioned ValidationIssue. Something like a negative geometricError is then detected....

Validating tileset:
{
  asset: { version: '1.0' },
  geometricError: -2,
  root: { boundingVolume: { box: [Array] }, geometricError: 1 }
}

Validation result:
{
  "_issues": [
    {
      "_type": 0,
      "_path": "/geometricError",
      "_message": "must be >= 0",
      "_severity": 0
    }
  ]
}

and this essentially comes for free. (Except for some quirks to feed the schema into ajv, but nothing big). It would fix issues like CesiumGS/3d-tiles-validator#126 , CesiumGS/3d-tiles-validator#208 , and probably many others that are prefixed with Validator ..., and could also be applied to the tile format JSON.

If we did this, we could focus the remaining validation work on things that can not be covered with the JSON schema. It may be hard to carve out what that is, exactly, but some of that is already known...:


Some parts of the current validator should be salvaged.

Even if the validator is supposed to be "started fresh", there are things like the binary header checks or bounding volume in bounding volume checks that either should be used directly, or be ported to the target language.


The validator should play nicely with extensions

This is important, but a very general achitectural question. (It is also not yet fully answered for the glTF validator. I consider this as a hint that it is not only 'not trivial', but potentially really difficult...). It might be necessary to think about some sort of a 'plugin concept' here, to easily add validation steps or passes for new extensions in a modular, non-invasive form.

@lilleyse
Copy link
Contributor

Referring to the current 3d-tiles-validator repository, there is the question of whether the validator, tools and samples-generator will be separated (and if so: how).

Ideally they would be separate repos, though I'm not sure how to do that without losing the commit history and issues. It would also be nice if the new validator started as a clean slate. I wonder if we could rename 3d-tiles-validator to 3d-tiles-validator-legacy and create a new repo called 3d-tiles-validator for the new validator. I wonder if github would allow that and what the redirect behavior would be like for old links.

The validator should have some sort of standardized, structured format for reporting issues.

Agreed, and the proposal here seems like a good start, and similar to the glTF validator.

The JSON schema validation could leverage an existing validator.

That's what we originally did but it becomes easy to lose track of what the schema validation checks vs. what the code checks. It also makes it hard to have personalized error messages for schema validation failures. It can be a good place to start though.

Some parts of the current validator should be salvaged.

Agreed - any utility functions can be ported over.

The validator should play nicely with extensions

👍

@javagl
Copy link
Contributor

javagl commented Aug 21, 2022

That's what we originally did but it becomes easy to lose track of what the schema validation checks vs. what the code checks. It also makes it hard to have personalized error messages for schema validation failures. It can be a good place to start though.

That's generally true. There is an option for addressing this, to some extent: With ajv, it is possible to configure the error messages. I tried this out with one example: A tile may have either a content or a contents. The way how this is established in the schema causes this ajv message:

  {
    instancePath: '/root',
    schemaPath: '#/not',
    keyword: 'not',
    params: {},
    message: 'must NOT be valid'
  }

The message is ... confusing, to say the least. (It makes sense, due to the keyword:not, but is still confusing)

But it is possible to sneak a custom error message into ajv (using the ajv-errors library), like this:

    const tileCustomErrorMessage = {
      not: "may either have a content or contents, but not both",
    };
    ...
    schema.errorMessage = tileCustomErrorMessage;

which turns the error into this:

  {
    instancePath: '/root',
    schemaPath: '#/errorMessage',
    keyword: 'errorMessage',
    params: { errors: [Array] },
    message: 'may either have a content or contents, but not both'
  }

But there are still some valid concerns:

  • The case of handling the not here is still relatively simple. But if we had to do this for "many" properties or validation aspects, then configuring these custom error messages may become tricky.

  • The remaining validation code still has to do checks for things that are already covered by the schema validation. The only advantage is that they don't have to generate error messages any more. Some pseudocode:

    validateTileContent(tileContent) {
        if (!defined(tileContent.uri)) {
            // The `content.uri` is required, but the error message already generated
            // by the schema validation, so there is nothing left to do here
        } else {
           validateData(tileContent.uri);
        }
    }
    
  • Using the schema validation loses the specific issue information. I'd really like to have specific constants for issues, like CONTENT_URI_MISSING or TILE_GEOMETRIC_ERROR_NEGATIVE. But when using the schema validation, these can (at best) be mashed together into a JSON_SCHEMA_ERROR with the (specific but generic) "message": "/root/content/uri missing".

  • Using a generic schema validation removes the type information

The latter refers to the fact that in typescript, the method signatures are still looking like

validateAsset(asset: any, context: ValidationContext): void {

with any as the asset type.

There are several options now:

  1. Anticipate and ignore these shortcomings
  2. Define a full-fledged implementation of 3D Tiles, defining dedicated types like Tileset, Tile, Asset. This is what the glTF validator does in https://github.com/KhronosGroup/glTF-Validator/tree/main/lib/src/base (and yes, things like accessor.dart then have a whopping 37KB with ~1200 lines of basically comment-free code...).
    (An aside: One could now be tempted to say: "No problem: Google json schema to typescript, hit 'I'm feeling lucky', and just use https://github.com/bcherny/json-schema-to-typescript". But this does not work: Any auto-generated class will already anticipate that the input matches the schema. We're trying to exactly detect the inputs where this is not the case. So we'd have to parse the JSON into a plain object, and then manually walk through all properties, as in tileset.root.content.uri = parsedJson?.root?.content?.uri;...)
  3. One could do the validation on plain JSON objects, including a manual validation against the schema

I'm not sure whether introducing the specific types is really worth the effort, considering the manual overhead for building the actual instances. On the other hand, having a (typed) implementation of 3D Tiles could be nice, and offer a sensible context for types like ImplicitTiling, where the binary data comes into play, and that could serve as a stub implementation of the functionality that is required for the traversal of implicit tilesets (which, for itself, is still to be addressed...). In any case: Both options 2. and 3. will involve a considerable amount of manual work. Of course, it's doable, but something to keep in mind.

@javagl
Copy link
Contributor

javagl commented Aug 23, 2022

A short update, to summarize the current direction that I've taken from the previous comment:

  • I started creating basic "plain old data objects" for the elements of a tileset: Tileset, Tile, BoundingVolume. While these are classes and seem to contain type information like Asset { version: string }, this is not enforced on the level of deserialization: It must still be possible to deserialize an "asset": { "version": 1234 } into this structure, and later report an asset.version PROPERTY_TYPE_MISMATCH error.
  • The generic ajv-based JSON schema validation is still active, but will probably be dropped, while being gradually replaced with the manual checks. This is tedious, but for now (!) the trade-off in view of the points mentioned in the previous comment seems to be acceptable.

@lilleyse
Copy link
Contributor

Draft PR of the new validator here: CesiumGS/3d-tiles-validator#222

Looking for any early feedback!

@javagl
Copy link
Contributor

javagl commented Jan 18, 2023

The validator for 3D Tiles 1.1 now lives at https://github.com/CesiumGS/3d-tiles-validator/ - further discussion can happen there.

@javagl javagl closed this as completed Jan 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants