[v2-rebuild] databricks-uc-migration-pilot: Unity Catalog readiness, IAM/SCIM diagnostics, system-table access tracing #793

@jeremylongshore

What this skill does

databricks-uc-migration-pilot exists because Unity Catalog migration is the highest-priority deadline-driven work in the Databricks ecosystem right now. Databricks has announced that all new workspaces will be UC-only beginning September 30, 2026: no DBFS root, no DBFS mounts, no Hive metastore, and no clusters in No Isolation Shared mode. Existing workspaces either migrate by that date or start losing functionality as features get UC-gated. This skill runs the readiness audit, produces a per-table migration plan, and traces UC permission failures back to the two-layer access model that gates them.

What it catches

  • Silent skips on un-migratable tables: UCX assessment and the Migration Assistant both flag tables backed by wasbs://, adl://, or dbfs:/user/hive as "not eligible" and skip them, so the run shows a green checkmark even though those tables were left behind. (004-RL-RSRC D1)
  • IAM-role-recreation trap on AWS: bucket policies store the role's unique ID (AROAXXX...) internally; deleting and recreating the role with the same name leaves a tombstone, and every UC query starts failing with java.nio.file.AccessDeniedException. The bucket policy looks identical in the AWS console. (004-RL-RSRC D2)
  • Single-metastore-per-region constraint: forces every team designing Dev/Test/Prod isolation to either encode environment in catalog names (bronze_dev, bronze_test, bronze_prod) with substitution everywhere, pay for cross-region egress, or run separate Databricks accounts. (004-RL-RSRC D3)
  • Azure AD nested group sync failures: the SCIM connector enumerates direct members only and does not flatten nested groups, so a group whose members are all nested arrives empty. The "external" flag locks membership to the IdP as source of truth, so modifying the group inside Databricks returns InternalError. (004-RL-RSRC D7)
  • Two-layer access model on system.billing.usage: access is gated behind metastore-admin enablement of the system.billing schema plus explicit grants, so FinOps cannot do its job without involving a metastore admin for each consumer. (004-RL-RSRC D10)
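
To make the D1 check concrete, here is a minimal sketch of how the audit could classify a Hive table by its storage location so that skipped tables show up as explicit rows instead of disappearing behind a green checkmark. The three prefixes come from the bullet above; the function names, the row layout, and the idea of feeding it a table-to-location mapping are my assumptions, not how UCX or the Migration Assistant work internally.

```python
# Hypothetical sketch: names and row layout are illustrative only.
# The blocked prefixes are the ones listed under D1.
BLOCKED_PREFIXES = {
    "wasbs://": "legacy wasbs:// location",
    "adl://": "legacy adl:// location",
    "dbfs:/user/hive": "DBFS root Hive warehouse location",
}


def classify_location(location: str) -> tuple[bool, str]:
    """Return (eligible, blocking_reason) for one table storage location."""
    normalized = location.strip().lower()
    for prefix, reason in BLOCKED_PREFIXES.items():
        if normalized.startswith(prefix):
            return False, reason
    return True, ""


def audit(tables: dict[str, str]) -> list[dict]:
    """Build one row per table, keeping ineligible tables visible."""
    rows = []
    for name in sorted(tables):
        eligible, reason = classify_location(tables[name])
        rows.append({"table": name, "eligible": eligible, "blocking_reason": reason})
    return rows


if __name__ == "__main__":
    sample = {
        "sales.orders": "s3://example-bucket/sales/orders",
        "legacy.events": "wasbs://container@account.blob.core.windows.net/events",
        "default.scratch": "dbfs:/user/hive/warehouse/scratch",
    }
    for row in audit(sample):
        print(row)
```

The point of returning a row for every table, eligible or not, is that the caller can fail loudly when any blocking_reason is non-empty rather than trusting an overall success status.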

Design questions I want pushback on

  1. Readiness audit shape. Plan is a CSV with one row per Hive table, columns for migration eligibility, blocking reason, recommended target catalog, and per-table estimated work. Right granularity, or should it group by schema/owner instead?
  2. Catalog naming convention. For D3, the skill needs to know the org's chosen convention (bronze_dev vs dev.bronze vs separate accounts) before it can plan. Should the skill ask in-flight (interactive) or take it as a config file the user prepares ahead?
  3. IAM role tombstone detection. D2 needs a check that compares the role unique ID in the bucket policy against the live role's current unique ID. Plan is to require AWS SDK access alongside Databricks access. Is the auth complexity worth it, or should the skill just flag "role was recently recreated" patterns and ask the user to verify manually?
  4. Nested-group flattening for D7. The skill could either fix the problem (write a Lambda/Function that flattens nested groups before SCIM) or just diagnose it (here is why your group is empty). Where is the right line between "diagnose" and "fix"?
  5. system.billing.usage permission tracer. When a user runs a query and gets a permission error, the skill needs to traverse the two-layer access graph (metastore admin enablement + grant chain) to identify what is missing. Should this be one slash command, or a subagent the SKILL.md hands off to?
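
For question 3, this is roughly the comparison I have in mind, as a sketch rather than a settled design. It assumes the bucket policy JSON returned by the API shows a deleted role's principal as a bare unique ID starting with AROA (that needs verifying against real tombstoned policies), and that the caller has already fetched the policy string and the live role's unique ID through the AWS SDK. The function name and sample values are hypothetical.

```python
import json


def find_stale_role_principals(policy_json: str, live_role_id: str) -> list[str]:
    """Return AROA-style principals in the policy that do not match the live role.

    Assumption: a role that was deleted (and possibly recreated under the
    same name) appears in the policy as its old unique ID, not as an ARN.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]

    stale = []
    for statement in statements:
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):  # e.g. the wildcard "*"
            continue
        aws_principals = principal.get("AWS", [])
        if isinstance(aws_principals, str):
            aws_principals = [aws_principals]
        for entry in aws_principals:
            if entry.startswith("AROA") and entry != live_role_id:
                stale.append(entry)
    return stale


if __name__ == "__main__":
    # Placeholder IDs and ARN, not real values.
    sample_policy = json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": [
                "arn:aws:iam::123456789012:role/uc-access",
                "AROAEXAMPLEOLDID",
            ]},
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        }],
    })
    print(find_stale_role_principals(sample_policy, "AROAEXAMPLENEWID"))
```

In the full version the inputs would come from boto3's s3 get_bucket_policy (the Policy field) and iam get_role (the RoleId field), which is exactly the extra AWS auth surface the question is about.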

What I am not asking about right now

  • Whether to delay shipping until after September 30, 2026: the deadline is the reason this skill exists, and shipping after it is shipping late.
  • Whether to support GCP UC — AWS + Azure only for v2.
  • Whether to merge with Lakehouse Federation work — out of scope.

How to respond

Comment below with any thoughts, leave thumbs-up / thumbs-down on individual bullets in the design questions, or send a voice memo on WhatsApp and I will transcribe it into the issue with attribution. English is not required for voice memos — Portuguese is fine.

Source bead: claude-h53a in the local beads workspace.


Reference material

Most relevant for this skill:

Doc | What it covers
004-RL-RSRC | Unity Catalog / Asset Bundles / identity / workspace ops / secrets / networking pain catalog

Full reference set + cross-skill context: see umbrella issue #795 § Reference material.

  • Jeremy Longshore
    intentsolutions.io


Labels

  • community-design-review: Practitioner input is the load-bearing input for this issue
  • databricks-pack: Scope label for Databricks plugin pack work
  • feedback-wanted: Issue is soliciting community feedback before code lands
  • saas-packs: SaaS integration packs under plugins/saas-packs/
  • v2-rebuild: Temporal marker for v2 rebuild initiatives (sunsetable)
