[RFC] Implementation of June 2023 incremental delivery format with @defer #1074

Draft - wants to merge 13 commits into base: benjie/incremental-common
Conversation

@benjie (Member) commented Feb 6, 2024

This RFC introduces an alternative solution to incremental delivery,
implementing the
June 2023 response format.

This solution aims to minimize changes to the existing execution algorithm; when
reviewing, you should compare against benjie/incremental-common
(#1039) to make the diff easier to
understand. I've raised this PR against that branch to make this clearer.

The RFC aims to avoid mutations and side effects across algorithms, so as to fit
with the existing patterns in the GraphQL spec. It also aims to leverage the
features we already have in the spec to minimize the introduction of new
concepts.

WORK IN PROGRESS: there are likely mistakes all over this currently, and a lot
will need to be done to maintain consistency of the prose and algorithms.

This RFC works by adjusting the execution algorithms in a few small ways:

  1. It introduces the concept of "delivery groups".
    Previously, GraphQL could be thought of as having just a single delivery group
    (called the "root delivery group" in this RFC) - everything was delivered at
    once. With "incremental delivery", we're delivering the data in multiple
    phases, or groups. A "delivery group" keeps track of which fields belong to
    which @defer, such that we can complete one delivery group before moving on
    to its children.
  2. CollectFields() now returns a map of "field digests" rather than just
    fields.

    CollectFields() used to generate a map between response key and field
    selection (Record<string, FieldNode>), but now it creates a map between
    response key and a "field digest", an object which contains both the field
    selection and the delivery group to which it belongs
    (Record<string, { field: FieldNode, deliveryGroup: DeliveryGroup }>). As
    such, CollectFields() is now passed the current path and delivery group as
    arguments (see the sketch after this list).
  3. ExecuteRootSelectionSet() may return an "incremental event stream".
    If there's no @defer then ExecuteRootSelectionSet() will return data/errors
    as before. However, if there are active @defers then it will instead return
    an event stream which will consist of multiple incremental delivery payloads.
  4. ExecuteGroupedFieldSet() runs against a set of "current delivery
    groups".

    If multiple sibling delivery groups overlap, the algorithm will first run the
    fields common to all the overlapping delivery groups, and only when these are
    complete will it execute the remaining fields in each delivery group (in
    parallel). This might happen over multiple layers. This is tracked via a set
    of "current delivery groups", and only fields which exist in all of these
    current delivery groups will be executed by ExecuteGroupedFieldSet().
  5. ExecuteGroupedFieldSet() returns the currently executed data, as before,
    plus details of incremental fields yet to be delivered.

    When there exist fields not executed in ExecuteGroupedFieldSet()
    (because they aren't in every one of the "current delivery groups"), we store
    "incremental details" of the current grouped field set (by its path) for
    later execution. The incremental details consist of:
    • objectType - the type of the concrete object the field exists on (i.e.
      the object type passed to ExecuteGroupedFieldSet())
    • objectValue - the value of this object (as would be passed as the first
      argument to the resolver for the field)
    • groupedFieldSet - similar to the result of CollectFields(), but only
      containing the response keys that have not yet been executed
  6. CompleteValue() continues execution in the "current delivery groups".
    We must pass the path and current delivery groups so that we can execute the
    current delivery groups recursively.
  7. CompleteValue() returns the field data, as before, plus details of
    incremental subfields yet to be delivered.

    As with ExecuteGroupedFieldSet(), CompleteValue() must pass down details
    of any incremental subfields that need to be executed later.
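
A minimal, non-normative sketch of the change to the return shape of
CollectFields() (the type names follow the declarations later in this
description):

// Before this RFC (roughly): response key -> field selections.
type CollectedFields = { [responseKey: string]: FieldNode[] };

// With this RFC: response key -> field digests, each recording the delivery
// group (i.e. which @defer, if any) the selection belongs to.
type CollectedFieldDigests = { [responseKey: string]: FieldDigest[] };
// where FieldDigest is { selection: FieldNode; deliveryGroup: DeliveryGroup }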

At a @defer boundary, a new DeliveryGroup is created, and field collection
then happens within this new delivery group. This can happen multiple times at
the same level, for example:

{
  # root delivery group
  currentUser {
    name
  }
  ... @defer {
    # Child delivery group
    expensiveField {
      id
    }
    ... @defer {
      # Grandchild delivery group
      veryExpensiveField {
        title
      }
    }
  }
}
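
For illustration, and using the DeliveryGroup shape declared further below, the
query above would produce roughly this delivery-group tree (the ids here are
arbitrary placeholders):

const root: DeliveryGroup = { id: 0, path: [], parent: null };
const child: DeliveryGroup = { id: 1, path: [], parent: root };       // outer @defer
const grandchild: DeliveryGroup = { id: 2, path: [], parent: child }; // nested @defer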

If no @defer exists then no new delivery groups are created, and thus the
request executes as it would have done previously. However, if there is at least
one active @defer then the client will be sent the initial response along with
a list of pending delivery groups. We will then commence executing the
delivery groups, delivering them as they are ready.
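
As a rough, non-normative illustration of the June 2023 response format for the
query above (the ids, payload grouping, and field values are invented; the
actual shapes are given by the interfaces further below):

Initial response - root delivery group data plus the pending outer @defer:

  { "data": { "currentUser": { "name": "Ada" } },
    "pending": [{ "id": 0, "path": [] }],
    "hasNext": true }

Subsequent response - the outer @defer completes and the nested @defer becomes pending:

  { "pending": [{ "id": 1, "path": [] }],
    "incremental": [{ "id": 0, "data": { "expensiveField": { "id": "x1" } } }],
    "completed": [{ "id": 0 }],
    "hasNext": true }

Final response - the nested @defer completes and the stream ends:

  { "incremental": [{ "id": 1, "data": { "veryExpensiveField": { "title": "Example" } } }],
    "completed": [{ "id": 1 }],
    "hasNext": false }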

Note: when an error occurs in a non-null field, the incremental details gathered
in that selection set are discarded alongside the sibling fields - we use the
existing error handling mechanisms for this.
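
To make this note concrete, here is a hypothetical query (the field names are
invented for illustration): if nonNullField is declared non-null and raises a
field error, parent is nulled by the existing propagation rules, and the
incremental details recorded for the sibling @defer are discarded with it.

{
  parent {
    nonNullField # non-null; suppose resolving it raises a field error
    ... @defer {
      expensiveField # this group's incremental details are discarded when parent is nulled
    }
  }
}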

This PR is nowhere near complete. I've spent 2 days on this latest iteration
(coming up with the new stream and partition approach as the major breakthrough)
but I've had to stop and I'm not sure if I've left gaps. Further, I need to
integrate Rob's hard work in #742 into it.

To make life a bit easier on myself, I've written some TypeScript-style
declarations of the various algorithms used in execute, according to this RFC.
These may not be correct and are definitely non-normative, but they might be
useful to ease understanding.

type RawVariables = { [variableName: string]: any };
type CoercedVariables = { [variableName: string]: any };

function ExecuteRequest(
  schema: GraphQLSchema,
  document: Document,
  operationName: string | null,
  variableValues: RawVariables,
  initialValue: any
):
  | ReturnType<typeof ExecuteQuery>
  | ReturnType<typeof ExecuteMutation>
  | ReturnType<typeof Subscribe>;

function GetOperation(
  document: Document,
  operationName: string | null
): Operation;

function CoerceVariableValues(
  schema: GraphQLSchema,
  operation: Operation,
  variableValues: RawVariables
): CoercedVariables;

function ExecuteQuery(
  query: Document,
  schema: GraphQLSchema,
  variableValues: CoercedVariables,
  initialValue: any
): ReturnType<typeof ExecuteRootSelectionSet>;

function ExecuteMutation(
  mutation: Document,
  schema: GraphQLSchema,
  variableValues: CoercedVariables,
  initialValue: any
): ReturnType<typeof ExecuteRootSelectionSet>;

function Subscribe(
  subscription: Document,
  schema: GraphQLSchema,
  variableValues: CoercedVariables,
  initialValue: any
): ReturnType<typeof MapSourceToResponseEvent>;

function CreateSourceEventStream(
  subscription: Document,
  schema: GraphQLSchema,
  variableValues: CoercedVariables,
  initialValue: any
): ReturnType<typeof ResolveFieldEventStream>;

function ResolveFieldEventStream(
  subscriptionType: GraphQLObjectType,
  rootValue: any,
  fieldName: string,
  argumentValues: { [argumentName: string]: any }
): EventStream<any>;

interface ExecutionResult {
  data?: any;
  errors?: GraphQLError[];
}

type Path = Array<string | number>;

interface Pending {
  id: number;
  path: Path;
  label?: string;
}

interface InitialIncrementalResult {
  data: Record<string, any>;
  errors?: GraphQLError[];
  hasNext: true;
  pending: Pending[];
}

interface Completed {
  id: number;
  // Errors that bubbled to the root of the defer/stream
  errors?: GraphQLError[];
}

type IncrementalPayload = DeferredPayload | StreamedPayload;

interface DeferredPayload {
  id: number;
  subpath?: Path;
  data: Record<string, any>;
  // Errors that occurred but did not bubble up to the @defer boundary
  errors?: GraphQLError[];
}

interface StreamedPayload {
  id: number;
  items: Array<any | null>;
  // Errors that occurred but did not invalidate the stream
  errors?: GraphQLError[];
}

interface SubsequentIncrementalResult {
  hasNext: boolean;
  pending?: Pending[];
  completed?: Completed[];
  incremental?: IncrementalPayload[];
}

function MapSourceToResponseEvent(
  sourceStream: EventStream<any>,
  subscription: Document,
  schema: GraphQLSchema,
  variableValues: CoercedVariables
): EventStream<
  ExecutionResult | InitialIncrementalResult | SubsequentIncrementalResult
>;

function ExecuteSubscriptionEvent(
  subscription: Document,
  schema: GraphQLSchema,
  variableValues: CoercedVariables,
  initialValue: any
): ExecutionResult | ReturnType<typeof IncrementalEventStream>;

function Unsubscribe(responseStream: EventStream<any>): void;

function ExecuteRootSelectionSet(
  variableValues: CoercedVariables,
  initialValue: any,
  objectType: GraphQLObjectType,
  selectionSet: SelectionSet,
  serial: boolean = false
): ExecutionResult | ReturnType<typeof IncrementalEventStream>;

interface DeliveryGroup {
  id: number;
  path: Path;
  parent: DeliveryGroup | null;
  label?: string;
}

interface IncrementalDetails {
  groupedFieldSet: FieldSet;
  objectType: GraphQLObjectType;
  objectValue: any;
}

type IncrementalDetailsByPath = { [path: Path]: IncrementalDetails };

function ExecuteGroupedFieldSet(
  groupedFieldSet: GroupedFieldSet,
  objectType: GraphQLObjectType,
  objectValue: any,
  variableValues: CoercedVariables,
  path: Path,
  currentDeliveryGroups: DeliveryGroup[]
): [
  resultMap: Record<string, any>,
  incrementalDetailsByPath: IncrementalDetailsByPath
];

interface FieldDigest {
  selection: FieldNode;
  deliveryGroup: DeliveryGroup;
}

function CollectFields(
  objectType: GraphQLObjectType,
  selectionSet: SelectionSet,
  variableValues: CoercedVariables,
  path: Path,
  deliveryGroup: DeliveryGroup,
  visitedFragments: Set<string> = new Set()
): { [responseKey: string]: FieldDigest[] };

function DoesFragmentTypeApply(
  objectType: GraphQLObjectType,
  fragmentType: GraphQLObjectType | GraphQLInterfaceType | GraphQLUnionType
): boolean;

function ExecuteField(
  objectType: GraphQLObjectType,
  objectValue: any,
  fieldType: GraphQLType,
  fieldDigests: FieldDigest[], // All for this field
  variableValues: CoercedVariables,
  path: Path,
  currentDeliveryGroups: DeliveryGroup[]
): ReturnType<typeof CompleteValue>;

function CompleteValue(
  fieldType: GraphQLType,
  fieldDigests: FieldDigest[], // All for this field
  result: any,
  variableValues: CoercedVariables,
  path: Path,
  currentDeliveryGroups: DeliveryGroup[]
): [
  result:
    | null
    | ScalarValue
    | EnumValue
    | Record<string, any>
    | Array<null | ScalarValue | EnumValue | Record<string, any>>,
  incrementalDetailsByPath: IncrementalDetailsByPath
];

function CoerceResult(
  leafType: GraphQLScalarType | GraphQLEnumType,
  value: any
): any;

function ResolveAbstractType(
  abstractType: GraphQLInterfaceType | GraphQLUnionType,
  objectValue: any
): GraphQLObjectType;

function CollectSubfields(
  objectType: GraphQLObjectType,
  fieldDigests: FieldDigest[],
  variableValues: CoercedVariables,
  path: Path
): { [responseKey: string]: FieldDigest[] };

function IncrementalEventStream(
  data: Record<string, any>,
  errors: GraphQLError[] | undefined,
  initialIncrementalDetailsByPath: IncrementalDetailsByPath,
  variableValues: CoercedVariables
): EventStream<InitialIncrementalResult | SubsequentIncrementalResult>;

function CollectDeliveryGroups(
  incrementalDetailsByPath: IncrementalDetailsByPath,
  excludingDeliveryGroups: Set<DeliveryGroup> = new Set()
): DeliveryGroup[];

function MakePending(deliveryGroups: DeliveryGroup[]): Pending[];

function IncrementalStreams(
  incrementalDetailsByPath: IncrementalDetailsByPath
): EventStream<SubsequentIncrementalResult>;

function PartitionDeliveryGroupsSets(
  incrementalDetailsByPath: IncrementalDetailsByPath
): Array<Set<DeliveryGroup>>;

function IncrementalStream(
  incrementalDetailsByPath: IncrementalDetailsByPath,
  deliveryGroupsSet: Set<DeliveryGroup>
): EventStream<SubsequentIncrementalResult>;

function SplitRunnable(
  incrementalDetailsByPath: IncrementalDetailsByPath,
  runnableDeliveryGroupsSet: Set<DeliveryGroup>
): [
  remainingIncrementalDetailsByPath: IncrementalDetailsByPath,
  runnable: IncrementalDetailsByPath
];

function MergeIncrementalDetailsByPath(
  incrementalDetailsByPath1: IncrementalDetailsByPath,
  incrementalDetailsByPath2: IncrementalDetailsByPath
): IncrementalDetailsByPath;

@yaacovCR (Contributor) commented:

@benjie

Just checking: does this algorithm handle the test case in graphql/graphql-js#3997 correctly?

Can inclusion of a field in a nested deferred fragment — where that field is present in a parent result and so will never be delivered with the child — muck with how the delivery groups are created?

@benjie (Member, Author) commented May 14, 2024

Can inclusion of a field in a nested deferred fragment — where that field is present in a parent result and so will never be delivered with the child — muck with how the delivery groups are created?

It shouldn't cause an issue because it's based on field collection: both of the shouldBeWithNameDespiteAdditionalDefer selections will be grouped together at the same time (with different "defer paths") - they aren't treated as separate fields. We visit deferred and non-deferred fields alike in the same selection set, and then partition their execution based on the defers.

(Note this may not actually be the case in the current algorithm because it may have bugs, but this is the intent.)

query HeroNameQuery {
  ... @defer {
    hero {
      id
    }
  }
  ... @defer {
    hero {
      name
      shouldBeWithNameDespiteAdditionalDefer: name
      ... @defer {
        shouldBeWithNameDespiteAdditionalDefer: name
      }
    }
  }
}

The first group does nothing, but notes that hero exists and is deferred (twice).

Next it creates two new groups for the defers, and a "shared" group. The shared group executes the hero field, and then the subfields are executed in the two separate groups afterwards.

When grouping the subfields of the second of these groups, it's noted that shouldBeWithNameDespiteAdditionalDefer exists twice (but the two usages are collected together) and the shallower @defer wins, such that the field is evaluated in the parent and the nested @defer evaporates, since it doesn't contain any new field selections.
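
For illustration only (using the FieldDigest shape from the PR description; the
variable names here are placeholders), collecting fields on the root selection
set of HeroNameQuery would yield a single hero entry whose digests come from
both defers, which is why the field itself executes once in the shared group:

const rootCollectedFields = {
  hero: [
    { selection: heroWithIdSelection, deliveryGroup: firstDefer },
    { selection: heroWithNameSelection, deliveryGroup: secondDefer },
  ],
};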

@yaacovCR (Contributor) commented:

Note this may not actually be the case in the current algorithm because it may have bugs, but this is the intent.

In my spec and TS implementation, we handle this by having each DeferUsage save its parent DeferUsage if it exists, and then performing some filtering downstream.

I have the sense that your current algorithm does not correctly handle this case — but I am hoping that it does, because if it does, it manages to do so without that tracking, which I would want to emulate if possible.
