Skip to content

acc: Always destroy deployed bundles on test exit in cloud runs#5585

Open
chrisst wants to merge 2 commits into
mainfrom
chris.stephens/fix4-deferred-teardown
Open

acc: Always destroy deployed bundles on test exit in cloud runs#5585
chrisst wants to merge 2 commits into
mainfrom
chris.stephens/fix4-deferred-teardown

Conversation

@chrisst

@chrisst chrisst commented Jun 12, 2026

Copy link
Copy Markdown

What

Adds a harness-level guarantee that bundles deployed by acceptance tests are destroyed when the test exits, for cloud runs (CLOUD_ENV set):

  • runTest registers a t.Cleanup before the script starts (covering failures, require aborts, and script timeouts), capturing a clone of the exact env the script ran with;
  • the cleanup walks the test temp dir for .databricks/bundle/<target> state dirs and runs $CLI bundle destroy --auto-approve --target <target> per bundle root (10-minute cap per destroy);
  • destroy errors are logged via t.Logf only — never fail the test; double-destroy is harmless ("No active deployment found to destroy!" exits 0);
  • local/testserver runs: complete no-op.

Why

Acceptance scripts run under bash -e, and script.cleanup fragments are appended after the main body — they never execute when a script fails or times out between bundle deploy and bundle destroy. Against shared cloud test workspaces this leaks real resources: during the 2026-06-11/12 incident one shared GCP workspace had accumulated 100+ leaked test warehouses and dozens of leaked test-bundle-pipeline-* pipelines, exhausting the project's local-SSD quota and blocking terraform-provider CI for ~2 days (ref ES-1974228).

Cleanup output cannot pollute golden files: it runs after output comparison and goes only to the test log.

Known limitation: destroy is best-effort — a bundle deployed with required --var flags or a config corrupted mid-test may still fail to destroy; this is logged as a leak warning rather than failing the run.

Tests

go build ./..., go vet ./acceptance pass; local deploy+destroy acceptance tests (bundle/resources/sql_warehouses, bundle/resources/pipelines/recreate-keys across all engine variants) pass with no output regressions.

This pull request and its description were written by Isaac.

When acceptance tests run against real cloud workspaces (CLOUD_ENV set),
a test that fails, times out, or exits mid-script never reaches its own
'bundle destroy' step: scripts run under 'bash -e' and the merged
script.cleanup parts are skipped on failure. The deployed resources
(SQL warehouses, pipelines, jobs, ...) then leak in the shared test
workspaces. Leaked started warehouses recently exhausted a GCP quota and
took CI down for two days; we observed 100+ leaked warehouses and dozens
of leaked test pipelines in a single workspace.

This adds a harness-level safety net: on cloud runs, runTest registers a
t.Cleanup (before starting the script, so it also covers timeouts and
mid-test failures) that scans the test's temp dir for bundle state
directories (<bundle_root>/.databricks/bundle/<target>) and runs
'$CLI bundle destroy --auto-approve --target <target>' in each bundle
root, reusing the exact environment the script ran with.

The mechanism is deliberately best effort and invisible to test output:

- It is a no-op for local testserver runs (gated on CLOUD_ENV).
- It runs after output comparison and logs only via t.Logf, so cleanup
  output is never compared against expected out files.
- Double-destroy is harmless: 'bundle destroy' on an already-destroyed
  bundle exits 0 with 'No active deployment found to destroy!'. In the
  common success path the shared script.cleanup already removed
  .databricks, so nothing is even attempted.
- Destroy failures are logged but never fail the test.

Co-authored-by: Isaac
@chrisst chrisst temporarily deployed to test-trigger-is June 12, 2026 19:46 — with GitHub Actions Inactive
@chrisst chrisst temporarily deployed to test-trigger-is June 12, 2026 19:46 — with GitHub Actions Inactive
Use context.WithoutCancel(t.Context()) instead of context.Background()
(t.Context() is already canceled when cleanups run), make the
best-effort nilerr skip explicit, and trim narration comments.

Co-authored-by: Isaac
@chrisst chrisst temporarily deployed to test-trigger-is June 12, 2026 20:08 — with GitHub Actions Inactive
@chrisst chrisst temporarily deployed to test-trigger-is June 12, 2026 20:08 — with GitHub Actions Inactive
@chrisst chrisst requested a review from pietern June 12, 2026 20:11
@chrisst chrisst marked this pull request as ready for review June 12, 2026 20:12
@github-actions

Copy link
Copy Markdown
Contributor

Waiting for approval

Based on git history, these people are best suited to review:

  • @denik -- recent work in acceptance/

Eligible reviewers: @andrewnester, @anton-107, @pietern, @renaudhartert-db, @shreyas-goenka, @simonfaltum

Suggestions based on git history. See OWNERS for ownership rules.

@eng-dev-ecosystem-bot

Copy link
Copy Markdown
Collaborator

Integration test report

Commit: 2aadb70

Run: 27440224794

Env ❌​FAIL 🟨​KNOWN 🔄​flaky 💚​RECOVERED 🙈​SKIP ✅​pass 🙈​skip Time
🟨​ aws linux 7 15 264 977 8:17
🟨​ aws windows 7 15 266 975 16:04
💚​ aws-ucws linux 7 15 360 891 7:54
💚​ aws-ucws windows 7 15 362 889 12:19
💚​ azure linux 1 17 267 975 6:56
💚​ azure windows 1 17 269 973 11:32
🔄​ azure-ucws linux 3 17 363 887 11:11
❌​ azure-ucws windows 38 1 17 329 885 15:12
💚​ gcp linux 1 17 263 978 8:07
❌​ gcp windows 2 1 1 17 262 976 14:47
60 interesting tests: 38 FAIL, 15 SKIP, 7 KNOWN
Test Name aws linux aws windows aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux azure-ucws windows gcp linux gcp windows
🟨​ TestAccept 🟨​K 🟨​K 💚​R 💚​R 💚​R 💚​R 🔄​f 🟨​K 💚​R 💚​R
❌​ TestAccept/bundle/deploy/files/no-snapshot-sync ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deploy/files/no-snapshot-sync/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deploy/files/no-snapshot-sync/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deploy/mlops-stacks ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deploy/mlops-stacks/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deploy/mlops-stacks/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deployment/bind/alert 🙈​s 🙈​s 🙈​s 🙈​s ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deployment/bind/alert/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deployment/bind/alert/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deployment/bind/job/generate-and-bind ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deployment/bind/job/generate-and-bind/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/deployment/bind/job/generate-and-bind/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/destroy/jobs-and-pipeline ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/destroy/jobs-and-pipeline/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/destroy/jobs-and-pipeline/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
🙈​ TestAccept/bundle/invariant/no_drift 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
❌​ TestAccept/bundle/resources/alerts/with_file ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/alerts/with_file/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/apps/inline_config ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/dashboards/generate_inplace ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/dashboards/generate_inplace/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/dashboards/generate_inplace/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/jobs/check-metadata ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/jobs/check-metadata/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/resources/jobs/check-metadata/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
❌​ TestAccept/bundle/resources/permissions/dashboards/create ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F 🙈​s 🙈​s
❌​ TestAccept/bundle/resources/permissions/dashboards/create/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F
❌​ TestAccept/bundle/resources/permissions/dashboards/create/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 🟨​K 🟨​K 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 🟨​K 🟨​K 💚​R 💚​R
🟨​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 🟨​K 🟨​K 💚​R 💚​R
🙈​ TestAccept/bundle/resources/postgres_branches/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/replace_existing 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/update_protected 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_branches/without_branch_id 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_endpoints/recreate 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/postgres_projects/update_display_name 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/synced_database_tables/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/vector_search_endpoints/drift/recreated_same_name 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/vector_search_indexes/basic 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🙈​ TestAccept/bundle/resources/vector_search_indexes/grants/select 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
❌​ TestAccept/bundle/resources/volumes/recreate 🙈​s 🙈​s ✅​p ✅​p 🙈​s 🙈​s ✅​p ❌​F 🙈​s 🙈​s
❌​ TestAccept/bundle/resources/volumes/recreate/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ❌​F
❌​ TestAccept/bundle/resources/volumes/recreate/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ❌​F
❌​ TestAccept/bundle/run_as/job_default ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
❌​ TestAccept/bundle/run_as/job_default/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ✅​p
🙈​ TestAccept/ssh/connection 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
❌​ TestFetchRepositoryInfoAPI_FromRepo ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ❌​F
❌​ TestFetchRepositoryInfoAPI_FromRepo/root ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p 🔄​f
❌​ TestFetchRepositoryInfoAPI_FromRepo/subdir ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ❌​F ✅​p ❌​F
Top 25 slowest tests (at least 2 minutes):
duration env testname
6:23 gcp windows TestAccept
6:09 aws-ucws windows TestAccept
6:01 azure windows TestAccept
4:22 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:18 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
4:11 gcp windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
4:02 gcp linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:24 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:23 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
3:16 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:14 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:05 azure windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
3:02 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:56 azure linux TestAccept
2:55 gcp linux TestAccept
2:52 azure linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:47 aws-ucws linux TestAccept
2:44 aws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:41 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:39 azure-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:31 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:31 aws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:30 aws-ucws windows TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=direct
2:23 aws-ucws linux TestAccept/bundle/resources/apps/inline_config/DATABRICKS_BUNDLE_ENGINE=terraform
2:01 azure-ucws windows TestAccept/bundle/resources/model_serving_endpoints/basic/DATABRICKS_BUNDLE_ENGINE=terraform

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants