MCP connection liveness health checks#1112

Open
RhysSullivan wants to merge 1 commit into claude/health-checks-identity from claude/health-checks-mcp

@RhysSullivan RhysSullivan commented Jun 23, 2026

Owner

Stacked on #1111. Gives MCP connections a liveness probe (dial the server, list tools, classify 401/403 as expired). MCP is liveness-only: no usable identity source, so no identity is derived and the operation/identity editor stays hidden; the generic status dot + "Check now" still render. Factors HTTP-status extraction into a shared helper so the probe can classify auth failures.

Stack

  1. Fix combobox inside the health-check editor sheet #1110
  2. Connection health checks (liveness) with OpenAPI backing #1108
  3. Connection account info: derive identity from the health-check probe #1111
  4. MCP connection liveness health checks #1112 👈 current
  5. Health checks for Microsoft Graph and Google #1109

cloudflare-workers-and-pages Bot commented Jun 23, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! View logs | executor-marketing | 1327086 | Commit Preview URL, Branch Preview URL | Jun 26 2026, 02:29 AM |

github-actions Bot commented Jun 23, 2026

Contributor

Cloudflare preview

Console https://executor-preview-pr-1112.executor-e2e.workers.dev
MCP https://executor-preview-pr-1112.executor-e2e.workers.dev/mcp
Deployed commit 1327086

Sign-in is Cloudflare Access (one-time PIN to an allowed email). The preview has its own database and encryption key; it is destroyed when this PR closes.

cloudflare-workers-and-pages Bot commented Jun 23, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ✅ Deployment successful! View logs | executor-cloud | 1327086 | Jun 26 2026, 02:30 AM |

pkg-pr-new Bot commented Jun 23, 2026

@executor-js/cli

npm i https://pkg.pr.new/@executor-js/cli@1112

@executor-js/config

npm i https://pkg.pr.new/@executor-js/config@1112

@executor-js/execution

npm i https://pkg.pr.new/@executor-js/execution@1112

@executor-js/sdk

npm i https://pkg.pr.new/@executor-js/sdk@1112

@executor-js/codemode-core

npm i https://pkg.pr.new/@executor-js/codemode-core@1112

@executor-js/runtime-quickjs

npm i https://pkg.pr.new/@executor-js/runtime-quickjs@1112

@executor-js/plugin-file-secrets

npm i https://pkg.pr.new/@executor-js/plugin-file-secrets@1112

@executor-js/plugin-graphql

npm i https://pkg.pr.new/@executor-js/plugin-graphql@1112

@executor-js/plugin-keychain

npm i https://pkg.pr.new/@executor-js/plugin-keychain@1112

@executor-js/plugin-mcp

npm i https://pkg.pr.new/@executor-js/plugin-mcp@1112

@executor-js/plugin-onepassword

npm i https://pkg.pr.new/@executor-js/plugin-onepassword@1112

@executor-js/plugin-openapi

npm i https://pkg.pr.new/@executor-js/plugin-openapi@1112

executor

npm i https://pkg.pr.new/executor@1112

commit: 1327086

greptile-apps Bot commented Jun 23, 2026

Greptile Summary

This PR adds a liveness-only health check for MCP connections: it dials the server and lists tools (reusing the discoverTools path), classifies HTTP 401/403 responses as "expired" credentials, and factors the HTTP-status extraction logic into a new shared http-status.ts module consumed by both the invoke and connect paths.

  • http-status.ts is extracted cleanly from invoke.ts and extended to connection.ts so the handshake error message now carries the HTTP status suffix needed for classification.
  • checkHealth in plugin.ts wires buildConnectorInput → createMcpConnector → discoverTools and maps errors through mcpLivenessFailureStatus; MCP has no identity source, so identity is not derived and only the status dot and "Check now" surface.
  • The e2e scenario validates the full healthy → expired transition using an in-process gated server, but pins remoteTransport: "streamable-http" to avoid the auto-fallback masking issue that affects default-configured connections.

Confidence Score: 4/5

Safe to merge for the refactor and shared-helper extraction; the health check will misclassify revoked credentials as degraded for the majority of saved connections that use the default auto transport.

The auto-transport fallback in connection.ts catches all non-OAuth connection errors — including ones that now carry "(HTTP 401)" — and retries via SSE before propagating. When SSE also fails (as it will on any streamable-http-only server), the error reaching mcpLivenessFailureStatus is the SSE transport failure, without the HTTP status suffix. The classification returns "degraded" instead of "expired" for revoked credentials on default-configured connections. The e2e test acknowledges this by pinning remoteTransport: "streamable-http", but that pin is not present in user connections.

packages/plugins/mcp/src/sdk/plugin.ts and packages/plugins/mcp/src/sdk/connection.ts — the auto-fallback logic needs to propagate auth-failure errors rather than retrying SSE when streamable-http returns a definitive 401 or 403.
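
One way the fallback could be narrowed, as a minimal sketch rather than the code in connection.ts: treat a 401 or 403 from streamable-http as a definitive answer about the credential and rethrow it instead of retrying over SSE. `connectFailure`, `isAuthFailure`, and `connectAuto` are hypothetical names, and the error shape only imitates the "(HTTP 401)" suffix described above.

```typescript
// Hypothetical sketch: these names are illustrative, not the PR's code.
type Transport = "streamable-http" | "sse";
type ConnectFailure = Error & {
  transport: Transport;
  httpStatus: number | undefined;
};

// Build an error whose message imitates the "(HTTP nnn)" suffix.
function connectFailure(transport: Transport, httpStatus?: number): ConnectFailure {
  const suffix = httpStatus === undefined ? "" : ` (HTTP ${httpStatus})`;
  return Object.assign(new Error(`Failed connecting via ${transport}${suffix}`), {
    transport,
    httpStatus,
  });
}

// A 401 or 403 says the server was reached and rejected the credential,
// so another transport cannot help and would only hide the status.
function isAuthFailure(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const status = (error as { httpStatus?: unknown }).httpStatus;
  return status === 401 || status === 403;
}

// Try streamable-http first; fall back to SSE only for non-auth failures.
async function connectAuto<T>(
  connect: (transport: Transport) => Promise<T>,
): Promise<T> {
  try {
    return await connect("streamable-http");
  } catch (error) {
    if (isAuthFailure(error)) throw error; // propagate, do not fall back
    return connect("sse"); // may be a transport mismatch, so retry
  }
}
```

With this shape, the error that reaches the classifier for a revoked credential still carries the streamable-http status, while a server that only speaks SSE is still reached through the fallback.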

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/plugins/mcp/src/sdk/plugin.ts | New checkHealth implementation classifies 401/403 as expired vs. degraded, but the auto-transport fallback discards the auth-failure signal for default-configured connections, causing revoked credentials to report "degraded" instead of "expired". |
| packages/plugins/mcp/src/sdk/connection.ts | Connect error handler now embeds the HTTP status suffix so the liveness classifier can distinguish auth failures; the auto-SSE-fallback path still catches all non-OAuth errors and retries, discarding the embedded status when SSE fails differently. |
| packages/plugins/mcp/src/sdk/http-status.ts | New shared helper that extracts HTTP status from both StreamableHTTPError and SSE POST-failure messages; gracefully returns undefined on format drift, well-commented with re-verification note on SDK bumps. |
| packages/plugins/mcp/src/sdk/invoke.ts | Straightforward refactor: local duplicates of statusFromSsePostError, statusFromStreamableHttpError, and httpStatusFromCause removed in favour of the new shared http-status.ts import; no behaviour change. |
| e2e/scenarios/health-checks-mcp.test.ts | New e2e scenario covering healthy/expired/validated states with an in-process gated MCP server; correctly uses Effect.ensuring for cleanup. Pins remoteTransport: "streamable-http" to avoid the auto-fallback masking issue that would be present for default-configured connections. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant HC as checkHealth
    participant BCI as buildConnectorInput
    participant CMC as createMcpConnector
    participant DT as discoverTools
    participant MLS as mcpLivenessFailureStatus

    HC->>BCI: config, values, template, allowStdio
    BCI-->>HC: ConnectorInput (with remoteTransport)

    alt "remoteTransport = streamable-http or sse"
        HC->>CMC: ConnectorInput
        CMC->>DT: connector Effect
        DT->>DT: connect + listTools()
        alt success
            DT-->>HC: healthy
        else McpToolDiscoveryError
            DT-->>MLS: error.message
            MLS-->>HC: expired (HTTP 401/403) or degraded
        end
    else "remoteTransport = auto (default)"
        HC->>CMC: ConnectorInput
        CMC->>DT: try streamable-http, fallback to SSE on any non-OAuth error
        Note over CMC: 401 from streamable-http triggers SSE fallback
        DT->>DT: SSE fails with transport error (no HTTP status)
        DT-->>MLS: Failed connecting via sse
        MLS-->>HC: degraded (misclassified - should be expired)
    end

Reviews (4): Last reviewed commit: "feat: MCP connection liveness health che..."

Comment on lines +1204 to +1216
    return yield* discoverTools(connector).pipe(
      Effect.map(
        () =>
          ({ status: "healthy" as const, checkedAt: Date.now() }) satisfies HealthCheckResult,
      ),
      Effect.catchTag("McpToolDiscoveryError", (error) =>
        Effect.succeed({
          status: mcpLivenessFailureStatus(error.message),
          checkedAt: Date.now(),
          detail: error.message,
        } satisfies HealthCheckResult),
      ),
    );

P1 listTools() HTTP status is silently discarded

discoverTools catches listTools() failures with a fixed string "Failed listing MCP tools", discarding the original exception entirely (catch: () => new McpToolDiscoveryError({...})). For SSE transports that validate auth on each request (i.e. the 401 is returned on the POST for listTools() rather than the GET that establishes the stream), the HTTP status never reaches mcpLivenessFailureStatus, so the credential is classified as "degraded" instead of "expired". The test comment acknowledges this by pinning streamable-http ("no auto SSE fallback to muddy the classification"), but an SSE-configured saved connection with a revoked token will silently report the wrong status.
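
A minimal sketch of keeping the status when a failure is wrapped in a fixed message, so the classifier still sees it. This is not the code in http-status.ts: `httpStatusOf` and `describeFailure` are hypothetical names, and both the numeric `code` property and the "(HTTP nnn)" message fragment are assumptions about how the underlying error exposes its status.

```typescript
// Hypothetical sketch: helper names and error shapes are assumptions.

// Pull an HTTP status out of an unknown failure: first from a numeric
// `code` property, then from an "(HTTP nnn)" fragment in the message.
function httpStatusOf(cause: unknown): number | undefined {
  if (typeof cause !== "object" || cause === null) return undefined;
  const code = (cause as { code?: unknown }).code;
  if (typeof code === "number" && code >= 100 && code <= 599) return code;
  const message = (cause as { message?: unknown }).message;
  if (typeof message !== "string") return undefined;
  const match = /\(HTTP (\d{3})\)/.exec(message);
  return match ? Number(match[1]) : undefined;
}

// Wrap a failure in a fixed prefix but keep the status as a suffix.
function describeFailure(prefix: string, cause: unknown): string {
  const status = httpStatusOf(cause);
  return status === undefined ? prefix : `${prefix} (HTTP ${status})`;
}

// A POST rejected with 401 keeps its status in the wrapped message.
const wrapped = describeFailure(
  "Failed listing MCP tools",
  new Error("Error POSTing to endpoint (HTTP 401): token revoked"),
);
console.log(wrapped); // "Failed listing MCP tools (HTTP 401)"
```

Returning `undefined` when neither source yields a status keeps the fixed message unchanged, so a failure with no recognizable status still falls through to "degraded".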

Comment on lines +74 to +80
    const authWalled =
      lower.includes("oauth re-authorization") ||
      lower.includes("(http 401)") ||
      lower.includes("(http 403)") ||
      lower.includes("unauthorized") ||
      lower.includes("forbidden");
    return authWalled ? "expired" : "degraded";

P2 "unauthorized" / "forbidden" fallbacks can produce false positives

The broad lower.includes("unauthorized") and lower.includes("forbidden") checks can match non-auth error text (e.g. an upstream tool description, a server error body, or a configuration message that uses these words). A misconfigured-but-alive server that emits "operation forbidden" in its error body would be mis-reported as "expired" instead of "degraded". Scoping these to known patterns (or removing them in favor of only HTTP-status matching) reduces false positives without losing coverage, since the HTTP-status suffix from connection.ts and the OAuth re-authorization string already cover the primary auth-failure paths.

Suggested change
    - const authWalled =
    -   lower.includes("oauth re-authorization") ||
    -   lower.includes("(http 401)") ||
    -   lower.includes("(http 403)") ||
    -   lower.includes("unauthorized") ||
    -   lower.includes("forbidden");
    - return authWalled ? "expired" : "degraded";
    + const authWalled =
    +   lower.includes("oauth re-authorization") ||
    +   lower.includes("(http 401)") ||
    +   lower.includes("(http 403)");
    + return authWalled ? "expired" : "degraded";
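
To make the difference concrete, here is a self-contained sketch that wraps the narrowed condition in a function and runs it on a few messages. `classifyFailure` and the sample messages are hypothetical; only the three matched substrings come from the suggestion above.

```typescript
type FailureStatus = "expired" | "degraded";

// Narrowed classifier: only the OAuth re-authorization marker and the
// "(HTTP 401)" / "(HTTP 403)" suffixes count as auth failures.
// The function name is illustrative, not the PR's.
function classifyFailure(message: string): FailureStatus {
  const lower = message.toLowerCase();
  const authWalled =
    lower.includes("oauth re-authorization") ||
    lower.includes("(http 401)") ||
    lower.includes("(http 403)");
  return authWalled ? "expired" : "degraded";
}

// Status suffix present: classified as an auth failure.
console.log(classifyFailure("Failed connecting via streamable-http (HTTP 401)")); // "expired"

// The word "forbidden" in an unrelated error body no longer matches.
console.log(classifyFailure("Upstream error: operation forbidden by policy")); // "degraded"
```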

@RhysSullivan RhysSullivan force-pushed the claude/health-checks-identity branch from d58fd6e to b90461f Compare June 25, 2026 20:30
@RhysSullivan RhysSullivan force-pushed the claude/health-checks-mcp branch from f87af0f to 0fdd8fe Compare June 25, 2026 20:30
@RhysSullivan RhysSullivan force-pushed the claude/health-checks-identity branch from b90461f to aa2f8f0 Compare June 25, 2026 20:45
@RhysSullivan RhysSullivan force-pushed the claude/health-checks-mcp branch from 0fdd8fe to 169e9b7 Compare June 25, 2026 20:45
Comment on lines +1197 to +1202
    const connector = yield* buildConnectorInput(
      parsed,
      credential.values,
      credential.template === null ? null : String(credential.template),
      allowStdio,
    ).pipe(Effect.map((ci) => createMcpConnector(ci)));

P1 Health check bypasses the configured httpClientLayer

buildConnectorInput is called without the options?.httpClientLayer that all other callers pass. When the plugin is initialised with a custom HTTP client layer (e.g. a proxy, custom TLS configuration, or an in-process test double), createMcpConnector receives httpClientLayer: undefined, so fetchFromHttpClientLayer is skipped and the raw global fetch is used instead (line const fetch = input.httpClientLayer ? fetchFromHttpClientLayer(...) : undefined). A health check against a server that is only reachable through the configured layer would return "degraded" for a perfectly valid credential.

Suggested change
    - const connector = yield* buildConnectorInput(
    -   parsed,
    -   credential.values,
    -   credential.template === null ? null : String(credential.template),
    -   allowStdio,
    - ).pipe(Effect.map((ci) => createMcpConnector(ci)));
    + const connector = yield* buildConnectorInput(
    +   parsed,
    +   credential.values,
    +   credential.template === null ? null : String(credential.template),
    +   allowStdio,
    +   options?.httpClientLayer,
    + ).pipe(Effect.map((ci) => createMcpConnector(ci)));
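
The failure mode is easy to miss because nothing errors when the argument is omitted. A reduced sketch of the "use the configured client if present, else the default" pattern, with hypothetical names (`FetchLike`, `SketchInput`, `resolveFetch`) standing in for the plugin's real types:

```typescript
// Hypothetical sketch: these names are illustrative, not the plugin's API.
type FetchLike = (url: string) => Promise<{ status: number }>;

interface SketchInput {
  endpoint: string;
  // Stand-in for a fetch derived from the configured HTTP client layer.
  fetch?: FetchLike;
}

// Use the injected fetch when present, otherwise fall back to the default.
function resolveFetch(input: SketchInput, defaultFetch: FetchLike): FetchLike {
  return input.fetch ?? defaultFetch;
}

const viaConfiguredLayer: FetchLike = async () => ({ status: 200 });
const direct: FetchLike = async () => ({ status: 503 });

// A caller that forwards the configured fetch goes through it.
const forwarded = resolveFetch(
  { endpoint: "https://mcp.example.test", fetch: viaConfiguredLayer },
  direct,
);

// A caller that omits it silently gets the default path instead.
const omitted = resolveFetch({ endpoint: "https://mcp.example.test" }, direct);
```

Both calls type-check, which is why a missing optional argument like this tends to surface only as a wrong health status at runtime.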

Give MCP connections a liveness probe. `checkHealth` dials the server and lists
its tools (the same connect path tool discovery uses): a credential that
authenticates and gets a tool list reads healthy; a 401/403 (or an OAuth
re-authorization signal) reads expired; any other connection/discovery failure
reads degraded. The connect-modal "Validate key" path runs the same probe on an
unsaved credential.

MCP gets liveness ONLY: there is no usable identity source (no id_token, no
userinfo, no whoami convention across servers), so no identity is derived and no
operation/identity editor is shown - the connection's name stays the user's
label. The plugin implements only `checkHealth`, so the editor self-hides while
the generic status dot + "Check now" still render.

Factors the HTTP-status extraction out of invoke into a shared http-status helper
and surfaces the upstream status in connect errors so the probe can classify a
401/403 as expired rather than a generic degraded.

Covered by e2e: a saved MCP connection reads healthy, then expired once the
upstream revokes the token; validate reports healthy for a live key and expired
for a rejected one; no identity is ever derived.
@RhysSullivan RhysSullivan force-pushed the claude/health-checks-identity branch from aa2f8f0 to 1952ca8 Compare June 26, 2026 02:26
@RhysSullivan RhysSullivan force-pushed the claude/health-checks-mcp branch from 169e9b7 to 1327086 Compare June 26, 2026 02:26