How Incidentary captures trace data before an alert fires, and what this means for the assembled artifact.

Pre-Alert Capture Window

Incidentary captures trace data continuously, before any alert fires. When an alert arrives, the system assembles the artifact from this already-captured data. This is what makes the artifact available within seconds — not because assembly is fast, but because the data already exists.

How it works from the outside

The SDK records events as your service handles requests. These events are held in a local buffer and flushed to Incidentary periodically in the background. When an alert fires — via Slack command, PagerDuty webhook, or OpsGenie webhook — Incidentary looks back into the recently captured events and assembles the causal chain for the alerting service's trace.
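The buffer-and-flush behavior described above can be pictured with a minimal sketch. This is not the Incidentary SDK: the `EventBuffer` class, its method names, the event fields, and the `max_events` limit are all hypothetical, chosen only to illustrate recording events locally and sending them in batches.

```python
import threading
from collections import deque
from typing import Callable


class EventBuffer:
    """Hypothetical in-process buffer; an illustration, not the real SDK."""

    def __init__(self, send: Callable[[list[dict]], None], max_events: int = 1000):
        # Bounded deque: when full, the oldest events are dropped first.
        self._events: deque[dict] = deque(maxlen=max_events)
        self._lock = threading.Lock()
        self._send = send  # assumed transport callback

    def record(self, event: dict) -> None:
        """Called on the request path; only appends to local memory."""
        with self._lock:
            self._events.append(event)

    def flush(self) -> int:
        """Called periodically in the background; sends one batch."""
        with self._lock:
            batch = list(self._events)
            self._events.clear()
        if batch:
            self._send(batch)
        return len(batch)


sent: list[list[dict]] = []  # stands in for the network call
buffer = EventBuffer(send=sent.append)
buffer.record({"trace_id": "t1", "service": "checkout", "kind": "HTTP_IN", "status": 200})
buffer.record({"trace_id": "t1", "service": "payments", "kind": "HTTP_IN", "status": 503})
flushed = buffer.flush()
```

In a real service, a background timer or thread would call `flush()` on an interval, so the request path never waits on the network.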

The result is that the artifact reflects what was happening before the alert fired, not just what happens after responders start looking.

What this means in practice

When you receive the Slack notification or open the incident URL, you are looking at the activity that led up to the incident. The first confirmed break in the artifact will typically precede the alert by some amount of time — often by the duration of the slowdown or error cascade that triggered the alert.

This is the core value: responders see the cause, not just the symptom.

What happens when coverage is partial

If the SDK was deployed recently, the capture window may not yet contain events from before the alert fired. The artifact will include events from the SDK's deployment onward, and the completeness label will reflect the coverage gap.

If a service in the causal chain does not have the SDK installed, that service will appear as a gap in the artifact. The chain will be intact up to the gap, and resume after it if downstream services are instrumented.
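As a rough illustration of how an uninstrumented service shows up, the sketch below labels each hop on a call path as captured or as a gap. The service names, the `mark_gaps` helper, and the label strings are hypothetical; how Incidentary actually represents gaps is not specified here.

```python
def mark_gaps(call_path: list[str], instrumented: set[str]) -> list[tuple[str, str]]:
    """Label each service on the path as 'captured' or 'gap'."""
    return [
        (service, "captured" if service in instrumented else "gap")
        for service in call_path
    ]


# Hypothetical call path where "payments" has no SDK installed.
path = ["gateway", "checkout", "payments", "ledger"]
chain = mark_gaps(path, instrumented={"gateway", "checkout", "ledger"})
```

Here the chain is intact through `checkout`, shows a gap at `payments`, and resumes at `ledger` because that downstream service is instrumented.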

Coverage is stated honestly in the artifact. "Partial" coverage means some of the chain is captured. "Low" coverage means significant portions are missing. You can act on this: install the SDK on the services that appear as gaps, and future incidents will have better coverage.

What the capture window does not do

It does not speculate about the past: The SDK records nothing before it is deployed, so there are no events from that period. The artifact does not estimate or fill in what happened before the SDK was present.
It does not capture events from unaffected traces: Events from other requests that were not part of the alerting trace are not included in the incident artifact. The artifact is scoped to the specific trace that triggered the alert.
It does not create incidents: The SDK is passive. It records events. Incidents are only created when an alert arrives via one of the connected integrations.
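The trace scoping described in the list above can be sketched as a simple filter. The event shape (`trace_id`, `service`, `ts`) and the `scope_to_trace` helper are assumptions made for illustration; they are not Incidentary's data model.

```python
def scope_to_trace(events: list[dict], trace_id: str) -> list[dict]:
    """Keep only events from the alerting trace, ordered by timestamp."""
    scoped = [e for e in events if e["trace_id"] == trace_id]
    return sorted(scoped, key=lambda e: e["ts"])


# Hypothetical captured events; "ts" is an arbitrary integer timestamp.
captured = [
    {"trace_id": "t-alert", "service": "payments", "ts": 102},
    {"trace_id": "t-other", "service": "search", "ts": 101},
    {"trace_id": "t-alert", "service": "checkout", "ts": 100},
]
artifact_events = scope_to_trace(captured, "t-alert")
```

The `search` event belongs to a different trace, so it is left out. Nothing is synthesized either: the result contains only events that were actually recorded.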

Effect on the artifact's first confirmed break

Because the capture window extends before the alert, the first confirmed break in the artifact will often show the original failure, not the alert condition. For example:

Alert condition: error_rate > 1% for 5 minutes
First confirmed break in artifact: payments HTTP_IN 503 at 14:22:03 (3 minutes before the alert fired)

This is the intended behavior. The alert tells you something is wrong. Incidentary tells you where it started.
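The example above can be worked through in a short sketch. It assumes, purely for illustration, that a "break" is any event with a 5xx status and that events carry a `ts` timestamp; the calendar date and the `first_break` helper are arbitrary and hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def first_break(events: list[dict], alert_time: datetime) -> Optional[dict]:
    """Return the earliest failing event at or before the alert, if any."""
    # Assumption: a 5xx status counts as a break.
    failures = [e for e in events if e["status"] >= 500 and e["ts"] <= alert_time]
    return min(failures, key=lambda e: e["ts"]) if failures else None


# Arbitrary date; the times mirror the example above.
events = [
    {"service": "checkout", "kind": "HTTP_IN", "status": 200, "ts": datetime(2024, 1, 1, 14, 21, 58)},
    {"service": "payments", "kind": "HTTP_IN", "status": 503, "ts": datetime(2024, 1, 1, 14, 22, 3)},
    {"service": "checkout", "kind": "HTTP_IN", "status": 503, "ts": datetime(2024, 1, 1, 14, 22, 4)},
]
alert_fired_at = datetime(2024, 1, 1, 14, 25, 3)

brk = first_break(events, alert_fired_at)
lead = alert_fired_at - brk["ts"]  # how long before the alert the break occurred
```

With these inputs the first break is the `payments` 503 at 14:22:03, three minutes before the alert, which matches the example.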
