Feature request: provide an API to expose events as a stream to client code from an orchestration #754

Description

@kshyju

Is your feature request related to a problem?

Today the only orchestration state an external client can read while an orchestration is still running is the custom status (SetCustomStatus / OrchestrationMetadata.SerializedCustomStatus). The orchestration output is not available until the instance reaches a terminal state.

Custom status is a snapshot primitive: each write replaces the entire value (last-write-wins) — there is no append/merge — and it is capped at 16 KB (UTF-16) by the backend. That makes it a good fit for "progress: 42%" style status, but the wrong primitive for streaming an incremental, append-only sequence of events from a running orchestration to a live consumer.

As a result, anyone who wants live, mid-run event streaming is forced to (ab)use custom status as a stream:

  • They accumulate events and re-serialize the growing list into custom status on every step, then a poller diffs successive snapshots to recover the new events.
  • Once the accumulated payload crosses 16 KB, the next status write throws and fails the entire orchestration. Whether a run fails therefore depends only on how many events it produces and how large they are.
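
To make the failure mode concrete, here is a minimal sketch of the workaround in plain C# with no Durable Task dependency. `SnapshotGrowth`, `FirstOverflowStep`, and `NewEvents` are hypothetical names, JSON via System.Text.Json is an assumed serialization format, and the 16 KB UTF-16 figure is the cap cited above; the real backend's size accounting may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical helper, not part of any Durable Task SDK: models the
// "accumulate and re-serialize" workaround against a 16 KB UTF-16 cap.
public static class SnapshotGrowth
{
    public const int CapBytes = 16 * 1024; // cap cited in this issue

    // Size of the serialized snapshot when encoded as UTF-16.
    public static int Utf16Size(List<string> events) =>
        Encoding.Unicode.GetByteCount(JsonSerializer.Serialize(events));

    // Returns the 1-based step whose snapshot first exceeds the cap,
    // or -1 if every snapshot fits.
    public static int FirstOverflowStep(IEnumerable<string> eventsInOrder)
    {
        var accumulated = new List<string>();
        foreach (var evt in eventsInOrder)
        {
            accumulated.Add(evt);
            // A real orchestration would write the whole list to custom status here.
            if (Utf16Size(accumulated) > CapBytes) return accumulated.Count;
        }
        return -1;
    }

    // What a polling client has to do: recover the new events by skipping
    // the ones it has already seen in an earlier snapshot.
    public static List<string> NewEvents(List<string> snapshot, int alreadySeen)
    {
        int start = Math.Min(alreadySeen, snapshot.Count);
        return snapshot.GetRange(start, snapshot.Count - start);
    }
}
```

Under these assumptions, a run that emits 100-character events first exceeds the cap on its 80th event, regardless of what the orchestration is actually doing.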

This is a recurring pattern (progress feeds, log/event streaming, sub-step results, fan-out child updates) and there is currently no primitive in Durable Task designed for it. A first-class streaming API would let orchestrations expose live, incremental progress safely instead of every consumer reinventing a fragile, cap-bound workaround.

Describe the solution you'd like

Provide an API that exposes events from an orchestration to client code as a stream: the orchestration emits events as it runs, and client code consumes them as a live, incremental stream while the instance is still in flight, without the 16 KB snapshot cap and without overwrite semantics.

Key properties:

  • Append-only / incremental — the orchestration emits individual events; client code receives only what's new, with no client-side snapshot diffing.
  • Unbounded total volume: any size cap should apply per event (at some reasonable limit) rather than to the cumulative live payload.
  • Ordered and lossless — events delivered in emit order; a consumer that starts late or polls slowly can still retrieve everything from its last position (cursor/sequence-number based).
  • Push or efficient pull — ideally a subscription/long-poll; at minimum a cursor-based GetEventsSince(sequence) so clients don't re-fetch the whole history.
  • Replay-safe — emitting an event should be deterministic under orchestration replay (no duplicate emissions on replay), consistent with the rest of the programming model.
  • Independent of custom status and output — so it doesn't compete with the existing 16 KB status budget or require the orchestration to complete first.
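
The append-only, ordered, and cursor-based properties above can be sketched with an in-memory log. This is only a model of the semantics we are asking for, not a proposed implementation: `StreamEvent`, `AppendOnlyEventLog`, and the 8 K-character per-event cap are all hypothetical, and replay safety and durability are deliberately not modeled here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; none of these exist in the SDK.
public sealed record StreamEvent(long Sequence, string Name, string Payload);

public sealed class AppendOnlyEventLog
{
    private readonly List<StreamEvent> _events = new();
    private readonly int _maxPayloadChars;

    public AppendOnlyEventLog(int maxPayloadChars = 8 * 1024) =>
        _maxPayloadChars = maxPayloadChars; // assumed per-event cap

    // Appends one event and returns its sequence number (1, 2, 3, ...).
    // The limit applies to this event only, never to the running total.
    public long Append(string name, string payload)
    {
        if (payload.Length > _maxPayloadChars)
            throw new ArgumentException("Event payload exceeds the per-event cap.", nameof(payload));
        var evt = new StreamEvent(_events.Count + 1, name, payload);
        _events.Add(evt);
        return evt.Sequence;
    }

    // Cursor-based pull: returns events with Sequence > afterSequence,
    // in emit order, at most maxCount per call.
    public IReadOnlyList<StreamEvent> GetEventsSince(long afterSequence, int maxCount = 100)
    {
        int start = (int)Math.Clamp(afterSequence, 0, _events.Count);
        int count = Math.Min(maxCount, _events.Count - start);
        return _events.GetRange(start, count);
    }
}
```

A consumer keeps the last `Sequence` it processed and passes it back as `afterSequence`, so a late or slow reader picks up exactly where it left off without diffing snapshots.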

Illustrative (non-prescriptive) shapes:

// Orchestration side: emit events as the orchestration runs
// (illustrative signature: EmitEvent(string name, object payload))
context.EmitEvent(name, payload);            // append-only, ordered, replay-safe

// Client side: consume them as a stream
await foreach (var evt in client.WatchEventsAsync(instanceId, fromSequence, ct)) { ... }
// or pull:
OrchestrationEventPage page = await client.GetEventsAsync(instanceId, afterSequence, ct);

We're not attached to a specific mechanism (a dedicated append-only sub-channel, a larger/configurable status, a backend event feed, etc.) — the ask is the capability. The SDK/backend team is best placed to choose the implementation.

Describe alternatives you've considered

  • Bounded windowing in custom status — keep only a trailing window of recent events under 16 KB and backfill the full log from the output at completion. Lossless and order-preserving, but events that don't fit the live window are only delivered at completion, not live, and it's complex per-consumer boilerplate.
  • Raising/configuring the custom status cap — helps marginally but doesn't fix the fundamental mismatch (snapshot replace semantics, O(n) re-serialization per step, eventual overflow for long-running or chatty workflows).
  • External sink (queue / blob / SignalR) written as an orchestration side effect — works and scales, but pushes durability, ordering, replay-safety, and plumbing onto every app that wants streaming, and sits outside the orchestration's own consistency/replay model. A built-in API would let callers drop their bespoke workarounds.
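
For reference, the bounded-windowing alternative boils down to something like the sketch below: keep the longest trailing run of events whose serialized form fits the status budget. `TrailingWindow.Fit` is a hypothetical helper, and JSON via System.Text.Json with UTF-16 size accounting is an assumption rather than the backend's actual measurement.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// Hypothetical helper: selects the most recent events that still fit
// under the status cap when serialized together.
public static class TrailingWindow
{
    public static List<string> Fit(List<string> allEvents, int capBytes = 16 * 1024)
    {
        var window = new List<string>();
        // Walk backwards from the newest event, growing the window
        // until adding one more event would exceed the cap.
        for (int i = allEvents.Count - 1; i >= 0; i--)
        {
            window.Insert(0, allEvents[i]);
            int size = Encoding.Unicode.GetByteCount(JsonSerializer.Serialize(window));
            if (size > capBytes)
            {
                window.RemoveAt(0); // this event does not fit; stop here
                break;
            }
        }
        return window;
    }
}
```

Everything older than the returned window is invisible to a live consumer until completion, and the window is re-serialized on every step, which is the per-consumer cost this request is trying to eliminate.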

Additional context

The status-size limit is already biting users from the other direction, e.g. Azure/durabletask#1185, where a large orchestration status overflows the backend storage property limit and stalls the orchestration. That issue asks for the status write to not break when it grows; this request is complementary: rather than making the snapshot status bigger, provide a purpose-built append-only event stream so streaming scenarios don't depend on the status payload at all.

If this is better tracked against the backend/storage providers rather than the .NET client surface, please feel free to transfer/route it to Azure/durabletask.
