Regression Manifests

Deterministic CI: capture one good scan as a manifest and replay the exact same path every time. Pair with visual regression for pixel-level checks.

Baseline Replays

Free-form scans aren't deterministic. Different runs reach different states because the SPA hydrates differently, or a button takes a different number of milliseconds to enable, or the AI chose a different element to click first. That's fine for discovery, but it makes scans unsuitable for CI failure signals — you'd see ghost regressions.

Baseline replay is the deterministic alternative. You take one good scan, save its exact path as a manifest, and the replay walks that exact path every time. The only thing that changes between runs is your code.

Older terminology: Baseline replays are also called regression manifests in older parts of the UI. Same thing.

When to use it

  • First-time setup, exploring what we can find: Full Site scan
  • You changed a few pages and want fresh tests: Single Page scan
  • CI on every PR (same path, every time, fast): Baseline Replay
  • Weekly refresh of the baseline: Full Site scan, then Set as Baseline
  • Visual regression in CI: Baseline Replay + visual regression on

Setting a baseline

  1. Run a full-site scan you're happy with — finished pages, no obvious flakes.
  2. On the scan result page, click Set as Baseline.
  3. AegisRunner compiles a manifest containing:
    • Every discovered page, in scan order.
    • Every recorded interaction per page (clicks, form submits, dropdowns).
    • Expected page count and state count.
    • The screenshot for each state, used as the visual regression baseline.
  4. Baseline Replay mode now appears in the scan modes for this project.
  5. Compare with Baseline appears on every future free-form scan.

One baseline per project. Setting a new one archives the previous. Old comparisons still work; new comparisons go against the new baseline.

What's in a manifest

The manifest is a JSON file (download it from Project Config → Baseline → Download manifest) that contains:

  • Pages — URL, page state count, expected screenshot hashes.
  • Interactions — for each page, the list of clicks, fills, selects in order.
  • Replay policy — strict-order mode, allow-new-pages flag, and timeout overrides.
  • Provenance — which scan it was compiled from, when, and by whom.

If you're versioning manifests in your own git repo (some teams do), the JSON is small enough to be tracked alongside source.
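If you do track manifests in git, a small script can sanity-check a downloaded manifest before committing it. The sketch below uses illustrative field names (pages, interactions, provenance) based on the contents listed above; check your own downloaded JSON for the exact schema.

```python
def summarize_manifest(manifest: dict) -> dict:
    """Summarize a downloaded baseline manifest.

    Field names here (pages, interactions, provenance) are assumptions
    drawn from the documented manifest contents, not a confirmed schema.
    """
    pages = manifest.get("pages", [])
    interactions = sum(len(p.get("interactions", [])) for p in pages)
    return {
        "pageCount": len(pages),
        "interactionCount": interactions,
        "compiledFrom": manifest.get("provenance", {}).get("scanId"),
    }

# Hand-written example in the shape described above
sample = {
    "pages": [
        {"url": "/", "stateCount": 2,
         "interactions": [{"type": "click", "target": "#nav-docs"}]},
        {"url": "/pricing", "stateCount": 1, "interactions": []},
    ],
    "provenance": {"scanId": "scan_123", "compiledAt": "2024-05-01"},
}
print(summarize_manifest(sample))
```

Diffing this summary between commits makes it obvious when a re-baseline changed the page or interaction count.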

Running a baseline replay

  1. Open Scan, pick the project.
  2. Choose Baseline Replay mode.
  3. Click Start. The replay walks the manifest top to bottom.

The replay:

  • Uses one browser context (no parallelism), ensuring the same page state on every run.
  • Skips discovery — no extra link-walking or new-page exploration.
  • Captures screenshots at every state for visual regression.
  • Reports exactly which pages and states matched, were missing, or had drift.

Replay results

The replay's result page is structured differently from a free-form scan:

  • Page check table — every manifest page with one of three statuses:
    • Matched — page reached and all interactions executed.
    • Missing — page in manifest, not reachable now.
    • Drifted — page reached but state count or interaction outcomes differ from baseline.
  • Visual diffs — if visual regression was on, side-by-side panels for any state with pixel differences.
  • New pages — only shown if allowNewPages is enabled. Otherwise treated as drift.
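The three statuses translate directly into a CI gate: fail the build if anything is missing or drifted. A minimal sketch, assuming a result shape with one entry per manifest page (the exact API response format is an assumption, not the real schema):

```python
def replay_passed(page_checks: list[dict], allow_new_pages: bool = False) -> bool:
    """Gate CI on a replay's page-check table.

    Each entry mirrors the result table above: a page with a `status` of
    "matched", "missing", or "drifted" (plus "new" when new pages appear).
    """
    bad = {"missing", "drifted"}
    if not allow_new_pages:
        bad.add("new")  # without allowNewPages, new pages count as drift
    return all(check["status"] not in bad for check in page_checks)

checks = [
    {"url": "/", "status": "matched"},
    {"url": "/pricing", "status": "drifted"},
]
print(replay_passed(checks))
```

A drifted page fails the gate; with `allow_new_pages=True`, a "new" status is informational only, matching the allowNewPages flag described below.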

Replay policy

Two flags control how strict the replay is:

  • strictPageOrder (default true): pages must be visited in manifest order. Disable if your app's order is non-deterministic but the destinations are.
  • allowNewPages (default false): if a new page appears, treat it as informational rather than drift. Useful when you've shipped a small feature and don't want to update the baseline yet.

Set both under Project Config → Baseline → Replay policy.
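In the downloaded manifest, the replay policy appears as a small JSON section. A sketch of what it might look like — the key names mirror the flags above, but the exact nesting and timeout keys are assumptions, so check your own download:

```json
{
  "replayPolicy": {
    "strictPageOrder": true,
    "allowNewPages": false,
    "timeoutOverrides": {}
  }
}
```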

When to refresh the baseline

Refresh the baseline when:

  • You ship a new feature and the baseline is missing it.
  • Drift count creeps up over weeks, a sign your baseline has rotted.
  • You ran a major site refactor — start from a fresh full scan.

Recommended cadence: full-site scans weekly, replays in CI per PR. Lock in a fresh baseline whenever the team agrees the weekly scan looks clean.

Pairing with visual regression

Baseline replay alone catches structural drift. Pair it with visual regression to also catch pixel-level changes:

  1. Set a baseline scan.
  2. Toggle Visual regression on for the replay.
  3. The replay walks the baseline path; pixel diffs are reported per state.
  4. Accept or reject diffs from the result page.

This is the most reliable CI configuration we have — deterministic path + per-state pixel comparison.

Triggering replays from CI

POST /api/v1/ci/trigger
Authorization: Bearer aegis_<token>
Content-Type: application/json

{
  "crawl": true,
  "runType": "regression",
  "wait": true,
  "baseUrl": "https://pr-123.preview.example.com"
}

See CI/CD Integration.
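The request above can be scripted with nothing but the Python standard library. The endpoint path and payload come from the example; the host and token are placeholders you'd supply from CI secrets:

```python
import json
import urllib.request

API = "https://app.example.com/api/v1/ci/trigger"  # assumed host; path from the docs

def build_trigger(token: str, base_url: str) -> urllib.request.Request:
    """Build the POST /api/v1/ci/trigger request shown above."""
    body = json.dumps({
        "crawl": True,
        "runType": "regression",
        "wait": True,            # block until the replay finishes
        "baseUrl": base_url,     # e.g. a per-PR preview deployment
    }).encode()
    return urllib.request.Request(
        API,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_trigger("aegis_example_token", "https://pr-123.preview.example.com")
# urllib.request.urlopen(req)  # uncomment in CI to actually fire the replay
```

Because a baseline is per-project rather than per-environment, the same script can target staging or production by changing only `base_url`.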

Common questions

Drift on every page — what's wrong?

Either the deployment is broken (check directly in a browser) or the baseline has rotted. Run a full-site scan; if it looks clean, set it as the new baseline.

The replay can't find page X anymore.

Page was removed or its URL changed. Either it's a real regression (your app removed something) or expected (you renamed a route — re-baseline).

Can multiple environments share one baseline?

Yes — a baseline is per-project, not per-environment. Run replays against staging and production using the same manifest, just override baseUrl.

Limitations

  • SPA hydration variance. Even with parallel=1, very dynamic SPAs can land in slightly different states between runs. Tighten with explicit waits in test steps if drift is consistent.
  • One baseline per project. If you genuinely need different baselines for different environments, create separate projects.
  • No partial replays from the UI — once you start a replay, it runs the full manifest. Use Single Page mode for one-page checks.
