[FIX] Skip un-clonable connectors gracefully; cascade to dependent workflows #22

Merged
chandrasekharan-zipstack merged 2 commits into main from clone-skip-oauth-connectors on Jun 19, 2026

@chandrasekharan-zipstack

Contributor

Problem

OAuth-backed connectors (e.g. Google Drive) and redacted-metadata connectors (auto-provisioned, e.g. Unstract Cloud Storage) cannot be cloned — the Platform API never exposes OAuth refresh tokens, and a valid token can only be minted by completing the OAuth flow as the target user (UI-only). The clone already skipped them, but still cloned their workflows and pipelines connector-less → every scheduled ETL/TASK run failed (the "failing every half hour" symptom).

Changes

  1. Adopt-before-recreate (connector.py): a same-name target connector is adopted before any recreate attempt. This is the recovery path — the operator provisions the connector on the target (where OAuth completes), re-runs, and the clone adopts it and wires dependent endpoints. (Backend match is exact connector_name, org-scoped.)
  2. Record skipped connector ids + surface a report warning for each genuine skip (OAuth / redacted), instead of a log line only.
  3. Cascade-skip (workflow.py): a workflow whose SOURCE/DEST endpoint references a skipped connector is skipped; pipeline / api_deployment / tool_instance cascade off the missing workflow remap (same mechanism as the frictionless-adapter cascade). No more guaranteed-failing pipelines.
  4. Surface frictionless skips (tool + workflow) in the report's Warnings block, not just the log.
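
The adopt-before-recreate ordering in change 1, together with the skip recording in change 2, might look roughly like the sketch below. It is a minimal illustration, not the code in connector.py: `CloneState`, `clone_connector`, the `is_oauth` and `metadata_redacted` keys, and the `new-` id scheme are all hypothetical names chosen for the example, and the real Platform API call is elided.

```python
from dataclasses import dataclass, field


@dataclass
class CloneState:
    """Hypothetical stand-in for the clone context plus the phase result."""
    connector_remap: dict[str, str] = field(default_factory=dict)
    skipped_connector_ids: set[str] = field(default_factory=set)
    warnings: list[str] = field(default_factory=list)


def clone_connector(source: dict, target_by_name: dict[str, str], state: CloneState) -> str:
    """Return 'adopted', 'skipped', or 'recreated' for one source connector."""
    name = source["connector_name"]

    # 1. Adopt first: an exact same-name connector on the target wins,
    #    even if the source connector could never be recreated.
    if name in target_by_name:
        state.connector_remap[source["id"]] = target_by_name[name]
        return "adopted"

    # 2. Only then check whether the connector is un-clonable.
    #    Record the id and surface a warning instead of only logging.
    if source.get("is_oauth") or source.get("metadata_redacted"):
        state.skipped_connector_ids.add(source["id"])
        state.warnings.append(
            f"Connector '{name}' skipped: provision it on the target and re-run."
        )
        return "skipped"

    # 3. Otherwise recreate (placeholder id; the real API call is elided).
    new_id = f"new-{source['id']}"
    target_by_name[name] = new_id
    state.connector_remap[source["id"]] = new_id
    return "recreated"
```

Because the adopt check runs before the skip check, a re-run after the operator provisions the connector takes the first branch and never reaches the skip path.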

The wf → connectors map is built from a single bulk endpoint listing, and only when connectors were actually skipped (zero cost otherwise).
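
A sketch of that lazy map build and the resulting cascade decision, under stated assumptions: the endpoint dict shape (`workflow_id`, `connector_id`) and the function names here are illustrative, and `list_endpoints` stands in for whatever single bulk listing call the client provides.

```python
from collections import defaultdict
from typing import Callable


def build_wf_connector_map(endpoints: list[dict]) -> dict[str, set[str]]:
    """Group connector ids by workflow id from one flat endpoint listing."""
    wf_map: dict[str, set[str]] = defaultdict(set)
    for endpoint in endpoints:
        connector_id = endpoint.get("connector_id")
        if connector_id:  # endpoints without a connector contribute nothing
            wf_map[endpoint["workflow_id"]].add(connector_id)
    return dict(wf_map)


def workflows_to_skip(
    skipped_connector_ids: set[str],
    list_endpoints: Callable[[], list[dict]],
) -> set[str]:
    """Return workflow ids whose endpoints reference a skipped connector."""
    if not skipped_connector_ids:
        return set()  # nothing was skipped, so the listing call is never made
    wf_map = build_wf_connector_map(list_endpoints())
    return {
        wf_id
        for wf_id, connector_ids in wf_map.items()
        if connector_ids & skipped_connector_ids
    }
```

Passing the listing function rather than its result is what keeps the no-skip case free: the call only happens once the guard has confirmed there is something to cascade.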

Tests

tests/clone/: 184 passed. New/strengthened: OAuth/redacted skip records id + warning; OAuth connector adopted when a same-name target exists; workflow cascade-skip on skipped endpoint connector. Live dry-run against a dev org: no regression (identical counts, Completed successfully).

🤖 Generated with Claude Code

…rkflows

OAuth-backed and redacted-metadata connectors can't be recreated via the
Platform API (credentials can't be minted server-side). Previously they were
skipped but their workflows/pipelines were still cloned connector-less, so
every scheduled run failed.

- connector: adopt a same-name target connector before recreating, so an
  operator who provisions one on the target (where OAuth completes) gets it
  adopted on re-run, wiring dependent endpoints.
- connector: record genuinely-skipped connector ids + surface a report warning.
- workflow: cascade-skip workflows whose endpoints use a skipped connector;
  pipeline / api_deployment / tool_instance cascade off the missing remap.
- report: surface frictionless tool/workflow skips as warnings, not just logs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CsGrHbs5SWmQkKqiimg6CF
@greptile-apps

greptile-apps Bot commented Jun 19, 2026


Greptile Summary

This PR fixes a bug where OAuth-backed and redacted-metadata connectors were already being skipped during cloning, but their dependent workflows and pipelines were still cloned without a valid connector — causing every scheduled ETL/TASK run to fail. The fix adopts an adopt-before-recreate pattern for connectors and introduces a cascade-skip mechanism for workflows whose endpoints reference an un-clonable connector.

  • connector.py: Moves skip detection into _recreate_or_skip(), checks for a same-name target connector first (the operator-provisioned recovery path), records skipped IDs in ctx.skipped_connector_ids, and appends a result.warnings entry for each genuine skip — replacing the previous log-only behavior.
  • workflow.py: Builds a workflow_id → connector_ids map (from a single bulk endpoint listing, only when connectors were actually skipped) and collects all blocking reasons (skipped tool + skipped connector) before deciding to cascade-skip, so both reasons appear in one report pass rather than one per re-run.
  • custom_tool.py: Surfaces frictionless-adapter tool skips in result.warnings (previously log-only), closing the last gap in the PR's "surface frictionless skips in the Warnings block" goal.
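
The "collect all blocking reasons" behavior described for workflow.py can be illustrated as below. This is a hedged sketch only: `blocking_reasons`, the `tool_ids` and `connector_ids` keys, and the message wording are assumptions made for the example and are not taken from the PR's code.

```python
def blocking_reasons(
    workflow: dict,
    skipped_tool_ids: set[str],
    skipped_connector_ids: set[str],
) -> list[str]:
    """Gather every reason a workflow must be skipped, not just the first."""
    reasons: list[str] = []

    blocked_tools = sorted(set(workflow.get("tool_ids", [])) & skipped_tool_ids)
    if blocked_tools:
        reasons.append(f"uses skipped tool(s): {', '.join(blocked_tools)}")

    # No early return above: the connector check always runs, so a workflow
    # blocked for both reasons reports both in the same pass.
    blocked_connectors = sorted(
        set(workflow.get("connector_ids", [])) & skipped_connector_ids
    )
    if blocked_connectors:
        reasons.append(f"uses skipped connector(s): {', '.join(blocked_connectors)}")

    return reasons
```

An empty list means the workflow can be cloned; a non-empty list becomes one warning that names every blocker, so the operator does not need a second run to discover the remaining one.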

Confidence Score: 5/5

Safe to merge — all changed paths are well-tested, thread safety is maintained via the shared parallel_map lock, and the cascade-skip logic is built once in the main thread before any fan-out.

The adopt-before-recreate reordering in connector.py is correct: target-lookup happens first, and the OAuth/redacted-metadata skip paths now only fire when no same-name connector exists. The cascade in workflow.py correctly reads skipped_connector_ids after ConnectorPhase has fully completed (no concurrency overlap). The parallel_map base uses a single shared lock for all workers, so mutations to skipped_connector_ids are serialized. Both previous review comments (dual-skip silent swallow, missing warning assertion) have been addressed.
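
The single-shared-lock arrangement mentioned above could be sketched as follows. This is not the project's `parallel_map`; `parallel_map_sketch`, `process_connector`, and the `is_oauth` key are hypothetical, and the example only shows how one lock shared by all workers serializes writes to a common set.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Shared state written by worker threads (stand-in for the context field).
skipped_connector_ids: set[str] = set()


def parallel_map_sketch(fn, items, max_workers=8):
    """Run fn(item, lock) across threads, handing every worker the same lock."""
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda item: fn(item, lock), items))


def process_connector(connector: dict, lock: threading.Lock) -> str:
    """Record un-clonable connectors under the shared lock."""
    if connector["is_oauth"]:
        with lock:  # all mutations of the shared set go through one lock
            skipped_connector_ids.add(connector["id"])
        return "skipped"
    return "cloned"
```

Reading the set only after the whole fan-out has returned, as the review describes for the workflow phase, means the readers need no locking of their own.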

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/unstract/clone/context.py | Adds skipped_connector_ids: set[str] to CloneContext; straightforward field addition consistent with existing skipped_custom_tool_registry_ids pattern. |
| src/unstract/clone/phases/connector.py | Adopts adopt-before-recreate ordering; moves OAuth/redacted-metadata skip logic into _recreate_or_skip(), records skipped IDs in ctx.skipped_connector_ids and surfaces report warnings. Thread-safe (mutations under the shared parallel_map lock). |
| src/unstract/clone/phases/workflow.py | Adds connector cascade-skip via _collect_wf_connector_map (built only when connectors were skipped) and collects all skip reasons before returning to report both in a single pass. Previous review comment about dual-skip silent swallowing is now resolved. |
| src/unstract/clone/phases/custom_tool.py | Adds result.warnings.append(...) when a frictionless-adapter tool is skipped, surfacing it in the report's Warnings block as stated in the PR goals. |
| tests/clone/test_connector_phase.py | Adds assertions for skipped_connector_ids and result.warnings to existing skip tests; adds new test_oauth_connector_adopted_when_target_exists covering the recovery path. |
| tests/clone/test_workflow_phase.py | Adds list_workflow_endpoints to FakeClient; new tests cover cascade-skip on connector, dual-reason skip, and the missing warning assertion from the previous review cycle. |

Reviews (2): Last reviewed commit: "[FIX] Address Greptile: report all workf..."

- workflow: a workflow blocked by both a skipped tool and a skipped connector
  now surfaces both reasons instead of only the first (no extra re-run to learn
  the second blocker).
- tests: assert the tool-skip cascade emits its report warning; cover the
  dual-skip (tool + connector) case.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CsGrHbs5SWmQkKqiimg6CF
chandrasekharan-zipstack merged commit 92738b5 into main on Jun 19, 2026
3 checks passed
chandrasekharan-zipstack deleted the clone-skip-oauth-connectors branch on June 19, 2026 10:40