Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternative GraphQL schema abstraction #823

Closed
wants to merge 22 commits into from

Conversation

pcmanus
Copy link
Contributor

@pcmanus pcmanus commented Jun 17, 2021

WIP: open to early comments, but not to be considered ready for a full review.

This adds a new abstraction that is meant to be alternative to GrapQLSchema for federation code with the following main differences:

  • Directives (particularly thinking directive applications) are first class citizen.
  • It is mutable, has a notion of SchemaElement and has links between elements that reference each others. With the goal of making transformation of schemas and schema navigation a lot easier/safer.
  • It allows to work with federation specific "built-in" a bit more conveniently.

Copy link
Contributor

@martijnwalraven martijnwalraven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall! I'll have to finish the review later, but wanted to submit the comments so far.

core-js/src/buildSchema.ts Outdated Show resolved Hide resolved
core-js/src/definitions.ts Outdated Show resolved Hide resolved
core-js/src/definitions.ts Show resolved Hide resolved
core-js/src/definitions.ts Outdated Show resolved Hide resolved
}

applyDirective(directive: Directive): Directive;
applyDirective(definition: DirectiveDefinition, args?: Map<string, any>): Directive;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although Map is preferable for storage nowadays, I think an object (Record<string, any>) makes for a more natural API for args because you can pass in an object literal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think we'll want DirectiveDefinition<TArgs> so we can have (optional) strong typing for directive arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an object (Record<string, any>) makes for a more natural API for args

Fair point.

I also think we'll want DirectiveDefinition<TArgs> so we can have (optional) strong typing for directive arguments.

I'll check how that could work in practice (I can see how strong typing for known directives would be nice; I just wonder how carrying this between DirectiveDefinition and Directive will work in practice).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I though, I can't really find a way to handle DirectiveDefinition<TArgs> in a safe way. I mean, the args of a definition are complex objects that can have directives and whatnot, and so I don't see a way to have DirectiveDefinition take a generic that reflect those properly and could be transformed to type-check the applications of that directives.

What I did do is add a DirectiveDefinition<TApplicationArgs>, where TApplicationArgs is directly the type of application arguments. But this does mean there is no way to ensure (statically or not really) that this generic argument is valid. However, assuming it is valid, it does provide nice type-checking for the related directives and I think that's still sufficiently useful to be worth the slight risk of getting this wrong.

Please have a look at a969ad5 for what I did do, see if this look anywhere near what you had in mind.

core-js/src/definitions.ts Outdated Show resolved Hide resolved
core-js/src/definitions.ts Outdated Show resolved Hide resolved
/**
* Removes this type definition from its parent schema.
*
* After calling this method, this type will be "detached": it wil have no parent, schema, fields,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand a detached type won't have a parent, but why would fields, values, directives, etc. be removed as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a comment in SchemaElement.checkUpdate() that goes to try to explain the reasoning a bit. But long story short, having to deal with detached elements can be a bit messy.

If we only detach the parent, we may end up with a detached field whose type is still attached, and that sound messy/error prone. But more generally, the problem is around behaviour when you "attach" (add) an detached element that itself has a deep hierarchy of elements. Think: I'm adding a field to a type, but that field link to another detached Object Type that itself may have fields linking to other detached types, implements detached interfaces and sprinkled in there there is some directe applications that indirectly link to "some" definition. First, we'd have to recursively explore that hierarchy to ensure nothing is attached somewhere else. But even then, we'd have to properly attach everything, which concretely means that the user might have call addField on a schema, and as a result 10 new types and whatnot are added to the schema on top of the field. Of course, maybe that's desired if you know what you're doing, but to me, this sounds like complex behaviour that is more likely to lead to surprising results.

Overall, the idea of the current patch is that while detached element are something we unfortunately have to deal, it tries to make it as innocuous/predictable as possible by ensure that a detached element is always "just an empty shell". Which is done effectively by:

  1. no allowing any modification methods on a detached element (SchemaElement.checkUpdate).
  2. cleaning up elements when they are removed.

Copy link
Contributor

@martijnwalraven martijnwalraven Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, dealing with detached elements is messy unfortunately. I agree we should avoid bringing in a whole hierarchy when re-attaching an element. (If only for the difficulty of dealing with duplicate elements, we don't want to have to merge definitions or handle conflicts.) I'm not sure the 'empty shell' semantics make sense to me though. My idea was that references to other elements would always be by name, so if you remove a field for example, you can attach it to a type in another schema but it would be your responsibility to ensure referenced elements are also added.

Comment on lines +351 to +458
private readonly _builtInTypes: Map<string, NamedType> = new Map();
private readonly _types: Map<string, NamedType> = new Map();
private readonly _builtInDirectives: Map<string, DirectiveDefinition> = new Map();
private readonly _directives: Map<string, DirectiveDefinition> = new Map();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the distinction between builtInTypes/types and builtInDirectives/directives? As built-in elements live in the same namespace as user-defined elements, can't they share a Map?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat opinionated but I have a strong hunch that much more code that will end up using this library will primarily not want to iterate about builtins (it may care they exists, but will not iterate/manipulate them). For instance, printing doesn't print those, composition will not merge those, any inaccessible removal won't care about those, etc... I'm sure at some point some code will want to iterate over all types/directives include builtins, but those example don't come easily to mind to me.

That's why allChildElements don't return builtins for instance (though I suppose maybe an optional argument to include them could make sense) and why Schema.types and Schema.directives don't include them as well.

So that's the justification for the external API, and the argument can be shorten as "I think it's more useful this way and that if we only always return all types/directives, everyone will have to filter built-ins out all the time".

Of course, internally, we can do whatever we want. We could have all in the same map, yet have the same API and filter built-in out within the accessor methods. But that just feel less efficient for no good reason.

/**
* Removes this type definition from its parent schema.
*
* After calling this method, this type will be "detached": it wil have no parent, schema, fields,
Copy link
Contributor

@martijnwalraven martijnwalraven Jun 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, dealing with detached elements is messy unfortunately. I agree we should avoid bringing in a whole hierarchy when re-attaching an element. (If only for the difficulty of dealing with duplicate elements, we don't want to have to merge definitions or handle conflicts.) I'm not sure the 'empty shell' semantics make sense to me though. My idea was that references to other elements would always be by name, so if you remove a field for example, you can attach it to a type in another schema but it would be your responsibility to ensure referenced elements are also added.

core-js/src/definitions.ts Outdated Show resolved Hide resolved
core-js/src/definitions.ts Outdated Show resolved Hide resolved
core-js/src/definitions.ts Outdated Show resolved Hide resolved
core-js/src/definitions.ts Outdated Show resolved Hide resolved
core-js/src/definitions.ts Outdated Show resolved Hide resolved
}
}

// TODO: How do we deal with default values? It feels like it would make some sense to have `argument('x')` return the default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. I would say it makes sense for arguments (or perhaps argumentValues) to return the resolved argument values that take default values into account (that's also consistent with getArgumentValues()/getDirectiveValues() in graphql-js. But I don't think default values should be included in the internal representation, because you want to differentiate between explicitly providing a value and relying on the default value. That also means changing the default value in the directive definition should affect argumentValues, making that a function that depends on the directive definition (again, like getArgumentValues).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also see graphql/graphql-js#3049 for some background on default value semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitively agree about not storing the default values internally, that wasn't my intention.

I'll note thought that you're suggestion about having optional TArgs for directives make this a tad less obvious. Because with that, it made sense to remove the Directive.argument(name: string): any method, and always rely on get Directive.arguments(): TArgs (I mean, what's the point of doing directive.argument('foo') and potentially losing type checking, when you can do directive.arguments.foo and not lose type checking).

But it's a tad less clear how to handle defaults in that case (API wise I mean).

I can think of a few options:

  1. stop having Directive.arguments be a getter, and be Directive.arguments(withDefaultValues: boolean = false): TArgs. If you pass withDefaultValues = true, we'd copy the underlying args object and set undefined keys to their defaults if necessary.
  2. keep Directive.arguments a getter, have it always include default values but have an additional isDefaultValue(name: string) for if you want to know if a particular value was provided.
  3. keep Directive.arguments a getter, have it not include default values and have a separate method argumentOrDefault(na me: string): any that can alternative give you values with defaults. I like this less because you don't get type-checked access if you also want defaults.

Overall, leaning towards option 1.

import { BuiltIns, Schema, DirectiveDefinition, NonNullType } from "./definitions";

// TODO: Need a way to deal with the fact that the _Entity type is built after validation.
export class FederationBuiltIns extends BuiltIns {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should consider the Federation directives + types as built-in. I suspect this is where we'll want to introduce some notion of Feature to take care of the namespace mapping in core schemas. We still need to support the current composition behavior when dealing with subgraphs that do not use core schema, but perhaps we can handle that through a more general mechanism as well (optionally marking a feature as built-in?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need both Feature and built-ins in some fashion (reminder that this PR doesn't pretend to be ready). And for good or bad, we do need to support existing federation directives as "built-in" for backward compatibility (I mean built-in in the very practical sense that they exists "magically" without the user having defined them in any way).

So I think we need this, at least for now. Hopefully, in some future version, we'll remove FederationBuiltIns and all federation types and directives will comes through Features, but we're not there yet.

I will note that I include @inacessible in there and we may not want to expose that one as "built-in" and instead only support it as a feature. So please don't read anything into that one other than "this is work in progress".

but perhaps we can handle that through a more general mechanism as well

Sure, maybe once we add Feature support in this, we've want to refactor a bit, that's ok. But I don't think the way it's currently done is too bad either, because currently, when you use federation, those types and directives are essentially built-in.

Sylvain Lebresne added 17 commits June 22, 2021 11:17
A few reasons (in rough order of importance):
1. Schema is more precise. A document references to any graphQL document
   which may comprise any kind of definition, including executable ones.
   Here we're truly represent a "schema", that is a coherent set of type
   system definitions.
2. It was not great that a SchemaElement was not really really an
   element of a Schema, but rather an element of a Document. It fixes
   that (in a imo better way than renaming SchemaElement to
   DocumentElement).
3. Schema maps a bit better to CQLSchema, so make it clearer what this
   essentially replaces.
4. Document is actually a builtin type name in javascript so that was
   problematic, which is why it was actually named `GraphQLDocument`
   but the `GraphQL` prefixing was inconcistent with the rest of the
   naming (and fwiw, I don't think prefixing everything with
   `GraphQL` offers good value; you know you are dealing with graphQL).

This does mean the existing `Schema` type was renamed to
`SchemaDefinition`.
This ended up confusing the type system in a few too many places and
added a fair amount of complexity/hackery.

Instead, only a mutable version is preserved.
Since `Schema` exposes `Schema.type(name)`, `Schema.types` was only ever
use to iterate on the type values and the fact it was a map was more
getting in the way than anything. Having it return an iterable/generator
avoid the cruft of always calling `values()` on the return.

The patch also make those functions actual methods of `Schema` instead
of getters, as having getters returning generators feels weird. This
also also have an arg to those method to optionally include the
built-ins (even if the default is still not too).
The "design" of extensions tries to optimize for extensions being
supported, but not getting in the way. The assumption is that the
most code are likely to not want to care if something is an extension or
a proper definition. So a type is still just one object, regardless of
whether it's components comes from both a definition, an extension or
a mix of those. However, each type lists how many extensions it's
been built from, and each direct component of the type lists if it
comes from an extension.

In other wordds, it's easy to ignore whether something comes from
extensions or not, but it's easy to know if a particular type element
was originally defined in an extension or not. Rebuilding a particular
extension however do require to iterate on the elements of the type
and filter those that belong to the extension we care about.

Alternative designs have been considered:
- having extensions be completely distinct objects from definitions.
  However, it feels painful that for a given type name you'll always
  have to potentially deal with multiple objects (a definition and a
  list of potential extensions).
- Have a hierachy where we have an object for each type, but also
  separate object for defintiion and extensions. The "type" object
  would point to it's constituent (definition and/or extensions)
  but would also have methods to conveniently access all the
  elements of the type. In other world, we'd have `ObjectTypeDefinition`
  and `ObjecTypeExtensions`, both of which my have `fields`, and
  then we have `ObjectType` that points to (at most) an
  `ObjectTypeDefinition` and some `ObjectTypeExtensions` and also
  expose `fields` which is just a merged iterator over the fields
  in the definition and extensions.
  In term of exposing extension, this would be fairly clean.
  However, this require a lot of reorganisation of the library
  and quite a bit of new absctractions. Concept like `SchemaElement`
  gets a bit more complicated (you don't want `schema.allSchemaElement`).
  to return both `ObjectType` _and_ `ObjectTypeDefinition`/`ObjectTypeExtension`
  because that would duplicate things, but you still want to think of
  all of those as "schema elements" in other context, and it's unclear
  how to not make all of this overly complex. The 'parent' of elements
  like fields also become a more complex thing, etc...
- A mix of previous points where you only have `ObjectTypeDefinition`
  and `ObjectTypeExtension`, but `ObjectTypeDefinition` has methods
  like `ownFields` and `fields` (the later return both `ownFields` and
  the extensions fields).
  While this simplify some of the question of the previous point
  but not all of them (this still complicate the hiearchy quite a bit)
  and this introduces other awkwardness (for instance, having
  `directives` on an `ObjectTypeDefinition` also including the
  directives of the extensions can be a tad suprising at first, but if
  you call it `allDirectives` to avoid the confusion, now you get some
  inconsistencies with other elements so it's not perfect either.

Tl;dr, while the current design has minor downside, the alternative have
other ones and feels more complex overall.
@pcmanus
Copy link
Contributor Author

pcmanus commented Dec 16, 2021

Closing as this is drastically outdated (and a more up-to-date version of this is now on main).

@pcmanus pcmanus closed this Dec 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants