Enforce execution limits with a balance gate and per-org rate-limit backstop#1274

Open
RhysSullivan wants to merge 6 commits into main from feat/execution-limit-gate
@RhysSullivan

Owner

Problem

Execution usage is tracked to the billing provider after every execution, but nothing ever checks the balance before running. An org on the free plan (10k executions/month, no overage) can keep executing indefinitely past its quota: the ledger clamps at the cap while executions continue at full speed. Observed in production as one tenant sustaining automated polling around the clock, far past its included usage, for over a week.

What this adds

Two pre-execution guards in CloudMeteringEngineDecorator, the one decorator both cloud planes (MCP session DO and HTTP executor plane) build engines through. They wrap outside usage tracking, so a blocked execution is neither run nor billed. resume is never gated: a paused execution already consumed its slot, and blocking resume would strand approved work.
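
The wrapping order described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `Engine`, `Guard`, and `withGuards` are hypothetical names, the `ExecuteResult` shape is simplified, and the real decorator is built with Effect rather than plain promises.

```typescript
// Hypothetical shapes for illustration; the real engine and ExecuteResult types differ.
type ExecuteResult = { output?: string; error?: string };

interface Engine {
  execute(code: string): Promise<ExecuteResult>;
  resume(executionId: string): Promise<ExecuteResult>;
}

// A guard resolves to null to allow, or to a user-facing message to block.
type Guard = (orgId: string) => Promise<string | null>;

// Wrap an engine so guards run, in order, before execute.
// resume is passed through untouched, so paused work can always complete.
function withGuards(orgId: string, guards: Guard[], inner: Engine): Engine {
  return {
    async execute(code) {
      for (const guard of guards) {
        const blocked = await guard(orgId);
        // A blocked call returns early: inner never runs, so nothing is tracked or billed.
        if (blocked !== null) return { error: blocked };
      }
      return inner.execute(code);
    },
    resume: (executionId) => inner.resume(executionId),
  };
}
```

With this shape, `inner` would be the usage-tracking engine and `guards` would list the rate limiter first, then the balance gate.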

1. Balance gate (execution-gate.ts)

  • Checks the org's executions feature via the billing service before execute / executeWithPause.
  • Fails open on any billing error, timeout (~2s budget), or missing customer — a billing-provider outage can never block executions or add meaningful latency. Failures warn + report to Sentry, mirroring trackExecution.
  • Caches the outcome per org for 60s (allowed and blocked both cached; errors never). Bounds billing-provider traffic to ~1 check per org-session per minute regardless of execution rate.
  • Blocked executions surface to the MCP client through ExecuteResult.error (the same channel compilation errors use, since engine error-channel failures are deliberately rendered opaque by the host): a clean isError tool result telling the user their plan's included executions are used up.
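
A minimal sketch of the gate behaviour listed above (per-org cache, 2s budget, fail-open). `makeBalanceGate`, `BalanceCheck`, and the injected `now` clock are hypothetical names for illustration; the actual implementation in execution-gate.ts uses Effect and reports failures to Sentry.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not the PR's API.
type BalanceCheck = (orgId: string) => Promise<{ allowed: boolean }>;

const CACHE_TTL_MS = 60_000; // outcomes are cached per org for 60s
const TIMEOUT_MS = 2_000; // budget for the billing call

function makeBalanceGate(check: BalanceCheck, now: () => number = Date.now) {
  const cache = new Map<string, { allowed: boolean; expiresAtMs: number }>();

  return async function isAllowed(orgId: string): Promise<boolean> {
    const hit = cache.get(orgId);
    if (hit && hit.expiresAtMs > now()) return hit.allowed;

    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error("billing check timed out")), TIMEOUT_MS);
    });
    try {
      const { allowed } = await Promise.race([check(orgId), timeout]);
      // Both allowed and blocked outcomes are cached.
      cache.set(orgId, { allowed, expiresAtMs: now() + CACHE_TTL_MS });
      return allowed;
    } catch (err) {
      // Fail open: errors and timeouts allow the execution and are never cached.
      console.warn("balance check failed; allowing execution", err);
      return true;
    } finally {
      clearTimeout(timer);
    }
  };
}
```

Because error outcomes are not cached, the next call for the same org retries the billing check instead of staying open for a full TTL.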

2. Rate-limit backstop (execution-rate-limit.ts)

  • 1000 execute calls per org per hour, fixed window — calibration: the heaviest legitimate org runs ~1.1k per month, so this only trips on runaway automation. Independent of billing, so it still holds during a billing outage (where the balance gate fails open).
  • Backed by a minimal counter Durable Object per org (idFromName(orgId), single {windowId, count} record, alarm-purged after two idle windows). New EXECUTION_RATE_LIMITER binding + migration v3.
  • Also fails open (unreachable DO, missing binding => allow + warn); checked before the balance gate since it's cheaper.
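
The fixed-window counting described above can be sketched as a pure function over the stored record. This is an assumption-laden illustration: `hit` and `WindowRecord` are hypothetical names, and the real Durable Object persists the record in storage and schedules an alarm to purge it.

```typescript
// Hypothetical sketch of the per-org counter record; not the PR's implementation.
type WindowRecord = { windowId: number; count: number };

const WINDOW_MS = 60 * 60 * 1000; // one hour
const LIMIT = 1000; // execute calls per org per window

// Increment-then-check against a fixed window.
// Returns the record to store and whether this call is within the limit.
function hit(
  prev: WindowRecord | undefined,
  nowMs: number,
): { record: WindowRecord; allowed: boolean } {
  const windowId = Math.floor(nowMs / WINDOW_MS);
  // A new window discards the old count and starts again at 1.
  const count = prev !== undefined && prev.windowId === windowId ? prev.count + 1 : 1;
  return { record: { windowId, count }, allowed: count <= LIMIT };
}
```

A fixed window allows up to twice the limit across a window boundary in the worst case, which is acceptable here since the limit is a backstop rather than a precise quota.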

Tests

19 unit tests across both guards: allow/block paths, fail-open on error and on timeout, cache TTL and per-org isolation, error outcomes never cached, window reset, and resume never gated. Fail-open tests assert the check was attempted AND the execution ran.

Verification

  • bun run typecheck — 42/42 green
  • vitest run src/engine/execution-gate.test.ts src/engine/execution-rate-limit.test.ts — 19/19
  • bun run lint, bun run format:check — clean

Deploy notes

  • Migration v3 creates the counter DO class; standard single deploy. In workers without the binding (tests, older local setups) the limiter logs a warning and disables itself.
  • The gate takes effect immediately on deploy for any org already past its included usage.

Check the org's Autumn execution balance before execute/executeWithPause
(never resume, so paused executions can always complete). Blocked orgs get
a descriptive ExecuteResult.error instead of running. Fails open on any
billing error or a 2s timeout, and caches per-org outcomes for 60s.
1000 execute calls per org per hour, fixed window, counted in a minimal
per-org Durable Object (EXECUTION_RATE_LIMITER, one instance per org via
idFromName). Independent of billing so runaway automation is caught even
when Autumn is down; fails open if the counter DO errors or is slow. The
DO stores a single {windowId, count} record and purges itself by alarm
after two idle windows.
Order: rate-limit backstop first (cheap counter), then the balance gate,
then usage tracking. Guards wrap outside the tracker so a blocked
execution is neither run nor tracked. Covers both planes (HTTP executor
plane and MCP session DO) since both build engines through this layer.
Covers allow/block, the typed errors, fail-open on billing errors and
timeouts (asserting the check was attempted AND the execution ran), the
60s per-org outcome cache (allowed and blocked cached, errors never),
window rollover, per-org isolation, and that resume is never gated.
@github-actions

github-actions Bot commented Jul 2, 2026

Contributor

Cloudflare preview

Console https://executor-preview-pr-1274.executor-e2e.workers.dev
MCP https://executor-preview-pr-1274.executor-e2e.workers.dev/mcp
Deployed commit 3b4b919

Sign-in is Cloudflare Access (one-time PIN to an allowed email). The preview has its own database and encryption key; it is destroyed when this PR closes.

@pkg-pr-new

pkg-pr-new Bot commented Jul 2, 2026


@executor-js/cli

npm i https://pkg.pr.new/@executor-js/cli@1274

@executor-js/config

npm i https://pkg.pr.new/@executor-js/config@1274

@executor-js/execution

npm i https://pkg.pr.new/@executor-js/execution@1274

@executor-js/sdk

npm i https://pkg.pr.new/@executor-js/sdk@1274

@executor-js/codemode-core

npm i https://pkg.pr.new/@executor-js/codemode-core@1274

@executor-js/runtime-quickjs

npm i https://pkg.pr.new/@executor-js/runtime-quickjs@1274

@executor-js/plugin-file-secrets

npm i https://pkg.pr.new/@executor-js/plugin-file-secrets@1274

@executor-js/plugin-graphql

npm i https://pkg.pr.new/@executor-js/plugin-graphql@1274

@executor-js/plugin-keychain

npm i https://pkg.pr.new/@executor-js/plugin-keychain@1274

@executor-js/plugin-mcp

npm i https://pkg.pr.new/@executor-js/plugin-mcp@1274

@executor-js/plugin-onepassword

npm i https://pkg.pr.new/@executor-js/plugin-onepassword@1274

@executor-js/plugin-openapi

npm i https://pkg.pr.new/@executor-js/plugin-openapi@1274

executor

npm i https://pkg.pr.new/executor@1274

commit: 3b4b919

@greptile-apps

greptile-apps Bot commented Jul 2, 2026


Greptile Summary

This PR adds two pre-execution guards to CloudMeteringEngineDecorator: a per-org fixed-window rate limiter (1000/hour, backed by a counter Durable Object) and a billing balance gate (Autumn executions feature check, cached 60s per org). Both fail open on errors and timeouts, and neither gates resume. Blocked executions short-circuit before the usage tracker, so they are neither run nor billed.

  • execution-gate.ts and execution-rate-limit.ts implement the guards with per-org caching and alarm-purged Durable Object storage; withPreExecutionGate is the shared seam both decorators use.
  • execution-stack-metered.ts wires rate limiter then balance gate then usage tracker in cheapest-first order; one instance per layer build keeps the cache shared across engines in a session.
  • wrangler.jsonc adds the EXECUTION_RATE_LIMITER binding and a v3 migration for the new SQLite-backed counter DO class.

Confidence Score: 3/5

Safe to merge after verifying Autumn's check() response for unlimited/non-metered plan customers; all other changes are well-tested and fail open.

The core concern is checkExecutionBalance returning check.allowed verbatim from Autumn's check() call. The fail-open path only fires when Autumn throws an error. If Autumn returns {allowed: false} for customers on plans where executions are not metered, those orgs would be silently blocked on deploy. The rate limiter and gate logic themselves are clean, tests are solid, and the Durable Object wiring is correct.

apps/cloud/src/extensions/billing/service.ts — the checkExecutionBalance implementation needs verification that Autumn's check() throws (rather than returning {allowed: false}) for customers on unlimited or non-execution-metered plans.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/cloud/src/engine/execution-gate.ts | New balance gate that checks Autumn before execute/executeWithPause; correctly fails open on errors/timeouts, caches outcomes per org for 60s, and never gates resume. Minor: cache TTL is slightly underestimated when billing call is slow. |
| apps/cloud/src/engine/execution-rate-limit.ts | New per-org fixed-window rate limiter backed by a counter Durable Object; correctly fails open, never gates resume. Increments counter even for blocked executions, meaning DO write volume scales with abuse intensity after the limit is hit. |
| apps/cloud/src/extensions/billing/service.ts | Adds checkExecutionBalance wrapping Autumn's check() API. Returns check.allowed verbatim: if Autumn returns {allowed: false} rather than throwing for unlimited/unconfigured-plan customers, those orgs will be silently blocked with no fail-open. |
| apps/cloud/src/engine/execution-stack-metered.ts | Wires rate limiter then balance gate outside the usage tracker; correct ordering ensures blocked executions are neither run nor billed. One instance of each guard per layer build, cache shared across engines as intended. |
| apps/cloud/wrangler.jsonc | Adds EXECUTION_RATE_LIMITER DO binding and v3 migration; migration uses new_sqlite_classes which is compatible with the KV storage API used by ExecutionRateLimiterDO. |
| apps/cloud/src/engine/execution-gate.test.ts | 19 unit tests covering allow/block paths, fail-open on error and timeout, cache TTL, per-org isolation, errors-never-cached, and resume pass-through. |
| apps/cloud/src/engine/execution-rate-limit.test.ts | Unit tests cover allow/block, window reset, fail-open on error and timeout, per-org isolation, and resume pass-through. |
| apps/cloud/src/server.ts | Exports ExecutionRateLimiterDO from the worker entry point so the Cloudflare runtime can instantiate it. |
| apps/cloud/src/env-augment.d.ts | Adds optional EXECUTION_RATE_LIMITER binding declaration; marked optional matching existing patterns, allowing graceful degradation when absent. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Client as MCP/HTTP Client
    participant RL as Rate Limiter (DO)
    participant BG as Balance Gate (Autumn, cached 60s)
    participant UT as Usage Tracker (fire-and-forget)
    participant E as Engine

    Client->>RL: execute(code)
    alt "count > 1000/hr"
        RL-->>Client: ExecuteResult.error (rate limit)
    else within limit
        RL->>BG: decide(orgId)
        alt Autumn error / timeout
            BG-->>BG: fail open (warn + Sentry)
            BG->>UT: allowed
        else "allowed = false"
            BG-->>Client: ExecuteResult.error (quota exceeded)
        else "allowed = true"
            BG->>UT: allowed
        end
        UT->>E: execute(code)
        E-->>UT: result
        UT->>UT: Effect.runFork(trackExecution) [async]
        UT-->>Client: result
    end

    Note over Client,E: resume() bypasses both guards entirely

Comments Outside Diff (2)

  1. apps/cloud/src/engine/execution-rate-limit.ts, line 793-822 (link)

    P2 Counter DO is incremented for every blocked execution, not just allowed ones

    increment is called unconditionally inside decide, before the count > limit check. Once the rate limit is reached, every subsequent blocked attempt still writes to the counter DO (advancing the count from limit+1 to limit+2, etc.). The abuse scenario this backstop targets — tight polling loops — could trigger thousands of additional DO writes per hour for already-blocked orgs. This is inherent to increment-then-check, but it means the DO write volume scales directly with abuse intensity. The current design is correct and the DO write cost is low, but worth noting for calibration.

  2. apps/cloud/src/engine/execution-gate.ts, line 405-432 (link)

    P2 nowMs used for cache-entry expiry is captured before the async billing call

    nowMs = Date.now() is read at the top of Effect.suspend, before checkBalance is awaited. writeCache(organizationId, allowed, nowMs) then sets expiresAtMs = nowMs + BALANCE_CACHE_TTL_MS. The effective TTL seen from the time the entry is written is therefore 60s - (billing call duration). With a 2s timeout budget the minimum effective TTL is ~58s. This is harmless in practice but worth knowing.
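
    The arithmetic behind that figure, as a small sketch. `effectiveTtlMs` is a hypothetical helper written for illustration; only `BALANCE_CACHE_TTL_MS` and the 60s / 2s values come from the comment above.

```typescript
const BALANCE_CACHE_TTL_MS = 60_000;

// Effective TTL measured from the moment the entry is written,
// when the expiry is anchored at the time the billing call started.
function effectiveTtlMs(callStartedMs: number, writtenAtMs: number): number {
  const expiresAtMs = callStartedMs + BALANCE_CACHE_TTL_MS;
  return expiresAtMs - writtenAtMs;
}

// A billing call that uses the full 2s budget leaves 58s of effective TTL.
const worstCase = effectiveTtlMs(0, 2_000); // 58_000
```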


Reviews (1): Last reviewed commit: "Use typed test errors and instanceof ass..."

Comment on lines +87 to +94
const checkExecutionBalance = (organizationId: string) =>
  Effect.gen(function* () {
    yield* Effect.annotateCurrentSpan({ "autumn.customer.id": organizationId });
    const check = yield* use((c) =>
      c.check({ customerId: organizationId, featureId: "executions" }),
    );
    return { allowed: check.allowed };
  }).pipe(Effect.withSpan("autumn.checkExecutionBalance"));

P1 Autumn check() for unlimited-plan orgs may silently block executions

checkExecutionBalance returns { allowed: check.allowed } verbatim. The balance gate's fail-open path only triggers on an AutumnError (i.e., when use(...) throws). If Autumn's check() returns { allowed: false } for customers on plans where the executions feature is unlimited or not feature-gated (rather than throwing), those orgs will be blocked without any fail-open. The PR description says it "fails open on missing customer", which implies Autumn throws for unknown customers — but orgs on enterprise or legacy plans that have executions either unconfigured or returned as false would be silently blocked with no warning. Worth verifying the exact Autumn response shape for unlimited/non-metered plan customers before this ships.
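
One defensive option, sketched under assumptions: normalize the billing response so the gate only blocks when the feature is actually metered for the customer. The `FeatureCheck` shape and its `metered` field are hypothetical and are not taken from Autumn's documented response; the real fields would need to be confirmed against the provider before relying on anything like this.

```typescript
// Hypothetical, provider-agnostic shape. Field names are assumptions,
// not Autumn's documented check() response.
type FeatureCheck = { allowed: boolean; metered: boolean };

// Block only when the feature is metered for this customer and the provider says no.
// Unmetered or unconfigured features fall through to "allow", matching the fail-open stance.
function toGateDecision(check: FeatureCheck): "allow" | "block" {
  return check.metered && !check.allowed ? "block" : "allow";
}
```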

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jul 2, 2026


Deploying with Cloudflare Workers

The latest updates on your project.

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ❌ Deployment failed | executor-cloud | 3b4b919 | Jul 02 2026, 09:25 PM |

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jul 2, 2026


Deploying with Cloudflare Workers

The latest updates on your project.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! | executor-marketing | 3b4b919 | Commit Preview URL, Branch Preview URL | Jul 02 2026, 09:25 PM |
