[v2-rebuild] databricks-bundle-medic: Asset Bundles deploy diagnostics, CMK rotation, PrivateLink endpoint audit #794

@jeremylongshore

What this skill does

databricks-bundle-medic is the deploy and infrastructure skill of the pack. It covers the Databricks Asset Bundles surface, which is still immature tooling: it replaces the deprecated dbx, and its bug list shifts from one CLI version to the next. It also covers the deploy-time infrastructure operations that bite production teams when they touch encryption keys, networking topology, or workspace-level configuration that compute interacts with at boot.

What it catches

Design questions I want pushback on

  1. PreToolUse hook on databricks bundle deploy. Plan is: download the remote terraform.tfstate, validate it parses as JSON, warn if size has shrunk vs. last known good copy, cache a known-good copy locally as a recovery escape hatch. Right behavior, or too cautious?
  2. PostToolUse auto-retry on D6. On the specific "User does not have CREATE TABLE on Schema" stderr pattern, auto-retry once. Plan is to fire ONLY on this exact known-transient failure, never on others. Right call, or risky?
  3. databricks bundle bind workaround scope. D4 — should the skill ship a terraform.tfstate editor (powerful but dangerous), or only refuse to proceed and direct the user to wait for upstream fix (safe but unhelpful)?
  4. CMK rotation runbook. D8 needs a workspace-drain sequence. Plan is to enumerate all clusters/pools/warehouses, ask for confirmation, terminate in dependency order, run rotation, restart. Should the skill do the termination, or only produce the sequence as a runbook the user runs manually?
  5. PrivateLink audit detection. D9 is the easiest to detect (enumerate VPC endpoints, check for S3/STS/Kinesis coverage) but the report needs cost math. Are the AWS cost-explorer integrations worth the auth complexity, or is "here are the missing endpoints, expected NAT cost is X" enough?
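
To make questions 1 and 2 easier to react to, here is a minimal sketch of the two hook checks as pure functions. It is not the skill's implementation. The names `check_state` and `should_retry`, the `SHRINK_RATIO` threshold of 0.5, and the idea of matching the stderr text with a plain regex are all assumptions for illustration. Downloading the remote state and invoking the CLI are left out on purpose.

```python
import json
import re
from typing import List, Optional

# Hypothetical threshold: warn when the new state is less than half
# the size of the last known-good copy. The real cutoff is undecided.
SHRINK_RATIO = 0.5

# Assumed pattern, copied from the stderr wording quoted in question 2.
TRANSIENT_PATTERN = re.compile(r"User does not have CREATE TABLE on Schema")


def check_state(raw: bytes, last_good_size: Optional[int]) -> List[str]:
    """Return warnings for a downloaded terraform.tfstate payload."""
    try:
        parsed = json.loads(raw)
    except ValueError:
        # Covers both malformed JSON and undecodable bytes.
        return ["state does not parse as JSON"]

    warnings: List[str] = []
    if not isinstance(parsed, dict):
        warnings.append("state is valid JSON but not an object")
    if last_good_size is not None and len(raw) < last_good_size * SHRINK_RATIO:
        warnings.append(
            f"state has shrunk: {len(raw)} bytes vs {last_good_size} last known good"
        )
    return warnings


def should_retry(stderr: str, attempts_so_far: int) -> bool:
    """Retry only once, and only on the exact known-transient message."""
    return attempts_so_far == 0 and TRANSIENT_PATTERN.search(stderr) is not None
```

In this shape the PreToolUse hook would only cache a new known-good copy when `check_state` returns an empty list, and the PostToolUse hook would pass its own attempt counter to `should_retry` so a second failure always surfaces to the user.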

What I am not asking about right now

  • Whether to support dbx migration — out of scope, dbx is deprecated.
  • Whether to handle Azure-specific networking — Azure ExpressRoute and PrivateLink Service work is acknowledged but deferred to v3.
  • Whether to add Terraform state import features — Terraform-side problems live in the Terraform provider's issue tracker.

How to respond

Comment below with any thoughts, leave thumbs-up / thumbs-down on individual bullets in the design questions, or send a voice memo on WhatsApp and I will transcribe it into the issue with attribution. English is not required for voice memos — Portuguese is fine.

Source bead: claude-jhnj in the local beads workspace.


Reference material

Most relevant for this skill:

| Doc | What it covers |
| --- | --- |
| 004-RL-RSRC | Unity Catalog / Asset Bundles / identity / workspace ops / secrets / networking pain catalog |

Full reference set + cross-skill context: see umbrella issue #795 § Reference material.

  • Jeremy Longshore
    intentsolutions.io

Labels: community-design-review, databricks-pack, feedback-wanted, saas-packs, v2-rebuild