Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(repository-json-schema): resolve the circular reference #2690

Merged
merged 1 commit into from
Apr 22, 2019

Conversation

jannyHou
Copy link
Contributor

@jannyHou jannyHou commented Apr 4, 2019

Fixes #2628

Description

The solution here is a little bit different than the proposal in the original issue:

  • Ignore this item. The definitions contains both Category and Product.(This is not true) So that
...
properties: {
    category: {
        $ref: '#/definitions/Category',
    },
},
...

refers to a valid definition entry.

  • The visited field is an array a set tracks the title of visited items instead of caching the schema. This is because getJsonSchema is called during each iteration of the property schema generation, not after the completed model schema(without definitions) get built.

Checklist

👉 Read and sign the CLA (Contributor License Agreement) 👈

  • npm test passes on your machine
  • New tests added or existing tests modified to cover all changes
  • Code conforms with the style guide
  • API Documentation in code was updated
  • Documentation in /docs/site was updated
  • Affected artifact templates in packages/cli were updated
  • Affected example projects in examples/* were updated

👉 Check out how to submit a PR 👈

@jannyHou jannyHou force-pushed the fix/circular-reference branch from 74edbba to 3b350c4 Compare April 4, 2019 19:40
@jannyHou jannyHou changed the title fix(repository-json-schema): resolve the circular reference [WIP]fix(repository-json-schema): resolve the circular reference Apr 4, 2019
@jannyHou jannyHou force-pushed the fix/circular-reference branch from 3b350c4 to 59447bd Compare April 4, 2019 20:15
@jannyHou jannyHou changed the title [WIP]fix(repository-json-schema): resolve the circular reference fix(repository-json-schema): resolve the circular reference Apr 4, 2019
@@ -14,18 +14,27 @@ import {
import {JSONSchema6 as JSONSchema} from 'json-schema';
import {JSON_SCHEMA_KEY} from './keys';

export interface JsonSchemaOptions {
// Track the models/titles that have been visited.
visited?: string[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use Set instead to include visited TypeScript classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to use a Set instead of an array.

FYI: In a very near feature, we will be emitting multiple schemas for a single model class:

  • Vanilla model schema as today
  • Model schema including relations
  • Model schema excluding certain properties (id, _rev, etc.)

Using TypeScript classes as the key will not work for that. I think it's better to use Schema title because that has to be unique. See #2653, #2652 and https://github.com/strongloop/loopback-next/pull/2646/files#diff-432476dd2d984e441d8b4e0af39a39be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bajtos The set tracks title instead of the class name, see code

Or do you want to store the whole generated schema in the options.visited?
That would not be possible IMO, since generating the referenced definitions happen in the iteration, before all the properties are visited.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set tracks title instead of the class name

Cool, that should work well 👍

do you want to store the whole generated schema in the options.visited?

IIRC, in my PoC, I was storing the whole generated schema in options.visited to allow me to load it from options.visited and put it to result.definitions.

@raymondfeng
Copy link
Contributor

Please run npm run update-package-locks or npm run update-packages to fix package-lock.json files.

Copy link
Member

@bajtos bajtos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start!

@@ -14,18 +14,27 @@ import {
import {JSONSchema6 as JSONSchema} from 'json-schema';
import {JSON_SCHEMA_KEY} from './keys';

export interface JsonSchemaOptions {
// Track the models/titles that have been visited.
visited?: string[];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to use a Set instead of an array.

FYI: In a very near feature, we will be emitting multiple schemas for a single model class:

  • Vanilla model schema as today
  • Model schema including relations
  • Model schema excluding certain properties (id, _rev, etc.)

Using TypeScript classes as the key will not work for that. I think it's better to use Schema title because that has to be unique. See #2653, #2652 and https://github.com/strongloop/loopback-next/pull/2646/files#diff-432476dd2d984e441d8b4e0af39a39be.

@@ -148,6 +160,8 @@ export function modelToJsonSchema(ctor: Function): JSONSchema {
}

result.title = meta.title || ctor.name;
const isVisited =
options && options.visited && options.visited.includes(result.title!);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's set options to {} by default so that we don't have to check for its existence.

export function modelToJsonSchema(
  ctor: Function,
  options: JsonSchemaOptions = {}
): JSONSchema {
  // ...

  const isVisited =
    options.visited && options.visited.includes(result.title!);

  // ...
}

@@ -148,6 +160,8 @@ export function modelToJsonSchema(ctor: Function): JSONSchema {
}

result.title = meta.title || ctor.name;
const isVisited =
options && options.visited && options.visited.includes(result.title!);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid using ! operator to silence TypeScript checks.

const title = meta.title || ctor.name;
const isVisited = options.visited && options.visited.includes(title);

result.title = title;

// If the model is already visited in the call stack, it implies one or more
// of its referenced properties refer back to it.

if (isVisited) break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused. Why are we checking isVisited deep inside the loop iterating over model properties?

In the spike 2256435, I was able to return early in the schema generation. Once we know the schema title, we can check whether we have already seen such schema and immediately return.

What am I missing?

The visited field is an array tracks the title of visited items instead of caching the schema. This is because getJsonSchema is called during each iteration of the property schema generation, not after the completed model schema(without definitions) get built.

This should not matter. In our cache (both visited and metadata-based cache used by getJsonSchema), we are storing a schema object by reference. Even if we initially store any empty schema object first with no properties, the code executed by modelToJsonSchema will eventually fill in those properties directly in the cached object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. My expected JSON schema was wrong, I thought the definitions field from modelToJsonSchema(Category) should contain the schema of Category itself.

I refactored the code to return an empty array when the model is visited, which is still slightly different from your PoC.

In our cache (both visited and metadata-based cache used by getJsonSchema), we are storing a schema object by reference. Even if we initially store any empty schema object first with no properties, the code executed by modelToJsonSchema will eventually fill in those properties directly in the cached object.

Sounds reasonable to me...I am trying the code in PoC while run into the circular error again. I might miss something, need more time to figure it out.


if (isVisited) break;

// Use object assign to avoid polluting the original `options`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Please add a test to verify.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WIP, will add it in the next commit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

};

it('handles circular references', () => {
const schema = modelToJsonSchema(Category);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a test to verify getJsonSchema too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize getJsonSchema is the entry point of the schema generation, thought it starts with modelToJsonSchema.

Good catch! Tests added.

@@ -1,5 +1,7 @@
{
"editor.rulers": [80],
"editor.rulers": [
80
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change, please revert.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bajtos The format is changed when auto save. And if you check the other settings, breaking it into multiple the lines is the expected format.

I can try to move the change into a separate commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The format is changed when auto save. And if you check the other settings, breaking it into multiple the lines is the expected format.

Fair enough 👍

I can try to move the change into a separate commit.

Yes please. A new PR would be even better, that way we can land this small fix sooner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turns out it's just my vscode's weird settings.
The other settings are broken down into multiple lines due to the max characters per line.
Reverting.

getJsonSchemaOptions.visited = getJsonSchemaOptions.visited || [];
getJsonSchemaOptions.visited.push(result.title!);

const propSchema = getJsonSchema(referenceType, getJsonSchemaOptions);

if (propSchema && Object.keys(propSchema).length > 0) {
result.definitions = result.definitions || {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea: since we are effectively storing the schema of visited models in result.definitions; if we need to access schema of a model that has been already visited, we can perhaps obtain it from those definitions object? Or maybe we should remove options.visited and pass the definitions key-value map in options instead?

Copy link
Contributor Author

@jannyHou jannyHou Apr 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea: since we are effectively storing the schema of visited models in result.definitions

Hmm, do you mean storing the schema of the current model in the definitions?

Like

{
  title: 'Category',
  properties: {
      name: {
        type: 'string'
      },
  },
  definitions: {
       Category: {
          title: 'Category',
          properties: {
             name: {
               type: 'string'
             },
          },
        },
  },
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, do you mean storing the schema of the current model in the definitions?

No, only schema of models used by properties (i.e. Product when emitting schema for Category).

@jannyHou jannyHou force-pushed the fix/circular-reference branch 5 times, most recently from 6bedae4 to fdd1cc4 Compare April 11, 2019 04:30
@jannyHou
Copy link
Contributor Author

@bajtos @raymondfeng Thanks for the review, feedback applied.

@bajtos My implementation is still slightly different than your PoC, see my comment

I may need some time to explore you original approach then compare.

@bajtos
Copy link
Member

bajtos commented Apr 11, 2019

My implementation is still slightly different than your PoC, see my comment

I may need some time to explore you original approach then compare.

Sure, take your time. My spike code is just a prototype, it's fine to end up with a different implementation. As long as it supports the use cases we need to support.

@jannyHou
Copy link
Contributor Author

@bajtos I figured out the difference. Your PoC skipped prompting visited definition to the top level with this line of code, which simplifies everything!

I still didn't get the idea of tracking the visited model with its generated schema like {CategoryModelOnly: ...CategoryModelOnlySchema}. IMO using a set containing all the visited models(titles to be more accurate) is enough, it's light than storing the entire schema in the option. Thought?

Copy link
Member

@bajtos bajtos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good at high level, let's improve few implementation details now.

result.title = meta.title || ctor.name;
const title = meta.title || ctor.name;

if (options.visited.has(title)) return {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, by returning an empty object, you are trying to satisfy the return value type JSONSchema while returning an invalid (or at least useless) value. That's effectively bypassing type checks :(

I think this was the reason why I was keeping the visited schemas in the cache?

Anyhow. Please change this line to return undefined to make it clear that when a model has been already visited, then we are not returning it's schema.

To preserve backwards compatibility, I am proposing the following signature:

export function modelToJsonSchema(ctor: Function): JSONSchema;
export function modelToJsonSchema(
  ctor: Function,
  options: JsonSchemaOptions = {},
): JSONSchema | undefined;

export function modelToJsonSchema(
  ctor: Function,
  options: JsonSchemaOptions = {},
): JSONSchema | undefined {
  // the implementation
}

Alternatively, and I think this is a better option when it comes to type safety, promote visited to a top-level argument and keep the JsonSchemaOptions as an empty interface for now.

export function modelToJsonSchema(
  ctor: Function,
  options: JsonSchemaOptions = {}
): JSONSchema;

export function modelToJsonSchema(
  ctor: Function,
  options: JsonSchemaOptions,
  visited: Set<string>;
): JSONSchema | undefined;

export function modelToJsonSchema(
  ctor: Function,
  options: JsonSchemaOptions = {},
  visited?: Set<string>;
): JSONSchema | undefined {
  // the implementation
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's effectively bypassing type checks :(
I think this was the reason why I was keeping the visited schemas in the cache?

Oh...that's true!

by returning an empty object, you are trying to satisfy the return value type JSONSchema while returning an invalid (or at least useless) value.

I agree.
And undefined and {} both return useless value which logically isn't as perfect as returning the cache value.

And since the cached schema is just a reference, tracking with a Set won't save space compared with tracking with cache. I switched to the approach in your PoC.

ctor: Function,
options: JsonSchemaOptions = {},
): JSONSchema {
options.visited = options.visited || new Set<string>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't modify original options.

Suggested change
options.visited = options.visited || new Set<string>();
if (!options.visited) options = {...options, visited: new Set<string>()};

Please add a test to verify.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't modify original options.

Hmm, any reason?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing for callers when you modify the options object. The caller may use the same options instance for multiple calls.

For example:

const SCHEMA_OPTIONS = {includeRelations: true};

const productSchema = getJsonSchema(Product, SCHEMA_OPTIONS);
const categorySchema = getJsonSchema(Category, SCHEMA_OPTIONS);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 got it thank you.

// error if itemType/type is not a string or a function
resolveType(metaProperty.itemType as string | Function)
// error if itemType/type is not a string or a function
resolveType(metaProperty.itemType as string | Function)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an unrelated whitespace change to me, PTAL

ctor: Function,
options: JsonSchemaOptions = {},
): JSONSchema {
options.visited = options.visited || {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't modify original options, see the discussion above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applied and new test case added.

@jannyHou jannyHou force-pushed the fix/circular-reference branch 3 times, most recently from d007850 to 20bb8a2 Compare April 16, 2019 19:32
Copy link
Contributor

@b-admike b-admike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@@ -641,5 +680,55 @@ describe('build-schema', () => {
},
});
});
it('does not pollute the JSON schema options', () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can treat this as a unit test instead. Feel free to ignore.

Copy link
Contributor

@nabdelgadir nabdelgadir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just have one minor question, but LGTM 👍

}

const JSON_SCHEMA_OPTIONS = {};
// tslint:disable-next-line:no-unused
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question, why do you need the const categorySchema = part instead of just calling getJsonSchema(Category, JSON_SCHEMA_OPTIONS);?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

@raymondfeng
Copy link
Contributor

@jannyHou Do we have a test to make sure such json schemas are converted to OpenAPI spec correctly? For example, using:

components
  - schemas
     - Product

@jannyHou
Copy link
Contributor Author

@raymondfeng Yeah I believe we have it in the openapi-v3 package, see test

@jannyHou jannyHou force-pushed the fix/circular-reference branch from 20bb8a2 to acf6da6 Compare April 22, 2019 15:00
@jannyHou jannyHou merged commit 9b49773 into master Apr 22, 2019
@jannyHou jannyHou deleted the fix/circular-reference branch April 25, 2019 14:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix repository-json-schema to handle circular references
5 participants