GraphQL plugin generates invalid operations against large schemas (missing required nested args, composite fields without selections) #1146

Description

@dmorn

Summary

The GraphQL plugin auto-generates a query/mutation string for every root field
at connection-sync time. For non-trivial schemas the generated operations are
not valid GraphQL and are rejected by the server, so the resulting tools fail on
every call.

Two distinct, independently fatal problems were observed:

  1. Nested fields that require arguments are selected with no arguments.
  2. Composite (object/connection) fields are emitted without a sub-selection
    when the generator hits its depth limit or its cycle guard.

Both are produced by the selection-set builder regardless of the connection's
configuration (endpoint and auth were correct, introspection completed, tools
synced).

Environment

  • GraphQL endpoint: a GitLab GraphQL API, server version 19.1.1.
    (The plugin ships a bundled "GitLab" preset pointing at the public
    https://gitlab.com/api/graphql, which exposes the same schema family.)
  • Plugin: packages/plugins/graphql.

Steps to reproduce

  1. Create a GraphQL connection against a GitLab GraphQL endpoint (the bundled
    GitLab preset works).
  2. Let the plugin introspect and sync tools.
  3. Call any generated tool whose return type is a rich object, e.g.
    query.metadata or query.currentUser.

Expected

The generated operation is valid GraphQL and the call returns data (or at least
a usable subset of fields).

Actual

The call fails with GraphQL validation errors from the server, because the
generated operation string is invalid.

Example 1: required nested argument omitted (query.metadata)

Generated operation string:

query Metadata {
  metadata {
    enterprise
    featureFlags { enabled name }
    kas { enabled externalK8sProxyUrl externalUrl version }
    revision
    version
  }
}

Server response:

Field 'featureFlags' is missing required arguments: names

metadata.featureFlags requires a names argument; the generator selected it
with none.
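
A minimal sketch of how a generator could detect this case from introspection data, assuming the standard introspection shape in which a required argument has a NON_NULL type and no default value. The type aliases, the requiredArgNames helper, and the sample featureFlags entry (including its exact argument type) are illustrative assumptions, not code from the plugin:

```typescript
// Minimal shapes mirroring the standard introspection result; only the
// properties used here are declared.
type TypeRef = { kind: string; name?: string | null; ofType?: TypeRef | null };
type InputValue = { name: string; type: TypeRef; defaultValue?: string | null };
type Field = { name: string; args: InputValue[]; type: TypeRef };

// An argument is required when its type is NON_NULL and it has no default.
const requiredArgNames = (field: Field): string[] =>
  field.args
    .filter((a) => a.type.kind === "NON_NULL" && a.defaultValue == null)
    .map((a) => a.name);

// Hypothetical introspection entry modelled on the server error above;
// the exact argument type is an assumption.
const featureFlags: Field = {
  name: "featureFlags",
  args: [
    {
      name: "names",
      type: {
        kind: "NON_NULL",
        ofType: { kind: "LIST", ofType: { kind: "SCALAR", name: "String" } },
      },
    },
  ],
  type: { kind: "LIST", ofType: { kind: "OBJECT", name: "FeatureFlag" } },
};

// A non-empty result means the field cannot be selected without arguments.
console.log(requiredArgNames(featureFlags));
```

A field for which this returns a non-empty list either needs its arguments supplied or has to be left out of the generated selection.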

Example 2: composite fields emitted bare + invalid leaf selections (query.currentUser)

Generated operation string (excerpt):

query CurrentUser {
  currentUser {
    active
    admin
    assignedMergeRequests {
      count
      edges { cursor node }        # `node` (MergeRequest) has no sub-selection
      nodes {
        allowCollaboration
        allowsMultipleAssignees
        allowsMultipleReviewers
        approvalState              # rejected: field does not exist on MergeRequest
        approvalsLeft              # rejected: field does not exist on MergeRequest
        approvalsRequired          # rejected: field does not exist on MergeRequest
        approved
        approvedBy                 # connection type, emitted with no sub-selection
        assignees
        author
        ...
      }
      ...
    }
    ...
  }
}

Server responses include:

Field must have selections (field 'node' returns MergeRequest but has no selections)
Field 'approvalState' doesn't exist on type 'MergeRequest'
Field 'approvalsLeft' doesn't exist on type 'MergeRequest'
Field 'approvalsRequired' doesn't exist on type 'MergeRequest'
Field must have selections (field 'approvedBy' returns UserCoreConnection but has no selections)

node and approvedBy return composite types (MergeRequest and
UserCoreConnection) but are emitted with no sub-selection, which is invalid
GraphQL. (Several leaf fields are also reported as nonexistent on the live
type; see the note on "field does not exist" errors below.)

Where this comes from in the codebase

packages/plugins/graphql/src/sdk/plugin.ts, buildSelectionSet
(lines ~220-249):

const buildSelectionSet = (
  ref: IntrospectionTypeRef,
  types: ReadonlyMap<string, IntrospectionType>,
  depth: number,
  seen: Set<string>,
): string => {
  if (depth > 2) return "";

  const leafName = unwrapTypeName(ref);
  if (seen.has(leafName)) return "";

  const objectType = types.get(leafName);
  if (!objectType?.fields) return "";

  const kind = objectType.kind;
  if (kind === "SCALAR" || kind === "ENUM") return "";

  seen.add(leafName);

  const subFields = objectType.fields
    .filter((f: IntrospectionField) => !f.name.startsWith("__"))
    .slice(0, 12)
    .map((f: IntrospectionField) => {
      const sub = buildSelectionSet(f.type, types, depth + 1, seen);
      return sub ? `${f.name} ${sub}` : f.name;
    });

  seen.delete(leafName);

  return subFields.length > 0 ? `{ ${subFields.join(" ")} }` : "";
};

Two mechanisms in this function map directly to the failures above:

  1. No handling of nested field arguments. Each selected sub-field is emitted
    as just f.name (or f.name { ... }). The field's args are never
    inspected, so any nested field with a required argument (like
    metadata.featureFlags(names:)) produces an invalid selection. Required
    arguments are only threaded for the root field elsewhere
    (buildOperationStringForField builds varDefs/argPasses from
    field.args), not for nested fields.

  2. Composite fields can be emitted without a selection set. When the
    recursive call returns "", the field is emitted bare via
    return sub ? `${f.name} ${sub}` : f.name. The recursive call returns ""
    in several cases that include composite types:

    • if (depth > 2) return ""; (depth cutoff), and
    • if (seen.has(leafName)) return ""; (cycle guard).

    In both cases the parent still prints the field name with no { ... }, which
    is invalid for object/connection types (e.g. node, approvedBy).

Additionally, .slice(0, 12) silently truncates each level to the first 12
fields, so generated selections are also arbitrarily partial.
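
One possible direction, sketched under stated assumptions rather than proposed as the fix: have the builder return null when it cannot produce a valid selection, omit composite fields whose sub-selection is null, and skip nested fields that have required arguments. The name buildSafeSelectionSet, the simplified types, the maxDepth parameter, and the sample schema are all hypothetical; the real function uses IntrospectionTypeRef and unwrapTypeName.

```typescript
type TypeRef = { kind: string; name?: string | null; ofType?: TypeRef | null };
type InputValue = { name: string; type: TypeRef; defaultValue?: string | null };
type Field = { name: string; args: InputValue[]; type: TypeRef };
type NamedType = { kind: string; name: string; fields?: Field[] | null };

// Strip NON_NULL / LIST wrappers to reach the named type.
const unwrap = (ref: TypeRef): TypeRef => (ref.ofType ? unwrap(ref.ofType) : ref);

const hasRequiredArgs = (f: Field): boolean =>
  f.args.some((a) => a.type.kind === "NON_NULL" && a.defaultValue == null);

const isLeaf = (kind: string): boolean => kind === "SCALAR" || kind === "ENUM";

// Returns "{ ... }" for a composite type, or null when no valid selection
// can be produced (depth limit, cycle, or no selectable sub-fields).
const buildSafeSelectionSet = (
  ref: TypeRef,
  types: ReadonlyMap<string, NamedType>,
  depth: number,
  seen: Set<string>,
  maxDepth = 2,
): string | null => {
  const name = unwrap(ref).name ?? "";
  const fields = types.get(name)?.fields;
  if (!fields || depth > maxDepth || seen.has(name)) return null;

  seen.add(name);
  const parts: string[] = [];
  for (const f of fields) {
    // Nested fields with required arguments are skipped, not emitted bare.
    if (f.name.startsWith("__") || hasRequiredArgs(f)) continue;
    if (isLeaf(unwrap(f.type).kind)) {
      parts.push(f.name);
      continue;
    }
    const sub = buildSafeSelectionSet(f.type, types, depth + 1, seen, maxDepth);
    if (sub) parts.push(`${f.name} ${sub}`); // composites without a selection are omitted
  }
  seen.delete(name);

  return parts.length > 0 ? `{ ${parts.join(" ")} }` : null;
};

// Tiny hypothetical schema exercising all three cases.
const scalar = (name: string): TypeRef => ({ kind: "SCALAR", name });
const obj = (name: string): TypeRef => ({ kind: "OBJECT", name });
const nonNull = (ofType: TypeRef): TypeRef => ({ kind: "NON_NULL", ofType });

const types = new Map<string, NamedType>([
  ["Metadata", { kind: "OBJECT", name: "Metadata", fields: [
    { name: "version", args: [], type: nonNull(scalar("String")) },
    { name: "featureFlags", args: [{ name: "names", type: nonNull(scalar("String")) }], type: obj("FeatureFlag") },
    { name: "kas", args: [], type: nonNull(obj("Kas")) },
  ] }],
  ["Kas", { kind: "OBJECT", name: "Kas", fields: [
    { name: "enabled", args: [], type: scalar("Boolean") },
    // Invented back-reference, present only to trigger the cycle guard.
    { name: "metadata", args: [], type: obj("Metadata") },
  ] }],
]);

console.log(buildSafeSelectionSet(obj("Metadata"), types, 0, new Set()));
// prints: { version kas { enabled } }
```

Skipping fields with required arguments trades completeness for validity. Threading them as variables, as buildOperationStringForField already does for the root field, would keep more data but needs a way to collect values from the caller. The caller must also handle a null result for the root field. This sketch does not address the .slice(0, 12) truncation.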

Note on the "field does not exist" errors

The approvalState / approvalsLeft / approvalsRequired errors indicate that the
generated selection references fields the live schema rejects. Whatever the
cause (possibly a mismatch between the introspected schema snapshot used to
generate the selection and the live schema at call time), it is a further
consequence of freezing a single machine-generated selection at sync time and
reusing it verbatim on every call.

Impact

On a large real-world schema (GitLab being a representative example), the
generated GraphQL tools are unusable: every call against a rich object type
fails validation before any data is returned.
