Failure Triage

When a test run completes with failures, AegisRunner automatically analyzes them using AI to cluster failures by root cause, assign severity ratings, detect flaky tests, and provide actionable fix recommendations.

Automatic: Triage runs automatically when a test run finishes with failures. No manual action required. Results appear within seconds.

How It Works

Test run completes with failures — triage triggers automatically in the background
AI receives failure data — up to 100 failed test cases with names, page URLs, error messages, step details, and screenshots
Clustering — AI groups failures by root cause (e.g., "6 cart tests failed because cart is empty", "3 login tests failed because selectors changed")
Report generated — stored on the test run, visible in the dashboard immediately

Triage Report

Each report contains:

Summary

A one-paragraph overview of all failures (e.g., "All 12 failures are caused by elements not being found in the DOM, primarily affecting cart interactions (6), login functionality (3), and category pages (3)").

Failure Clusters

Failures are grouped into clusters. Each cluster contains:

Field	Description
Cluster Name	Short label (e.g., "Cart page elements not found")
Severity	`high`, `medium`, or `low`
Root Cause	Why these tests failed (e.g., "Tests assume cart items exist but cart is empty when tests run")
Recommendation	How to fix (e.g., "Add prerequisite step to add items to cart before testing cart interactions")
Flaky Pattern	Whether this looks like a flaky test (intermittent, timing-dependent) vs a genuine regression
Affected Cases	List of test case IDs in this cluster — click to filter the results table

Example Report

{
  "summary": "All 12 failures caused by elements not found in DOM...",
  "clusters": [
    {
      "cluster_name": "Cart page elements not found",
      "severity": "high",
      "root_cause": "Tests assume cart items exist but cart is empty",
      "recommendation": "Add prerequisite step to add items to cart",
      "is_flaky_pattern": false,
      "affected_case_ids": ["019d...", "019d...", ...]
    },
    {
      "cluster_name": "Login form elements missing",
      "severity": "high",
      "root_cause": "Login page selectors changed or form did not render",
      "recommendation": "Verify login page selectors match current implementation",
      "is_flaky_pattern": false,
      "affected_case_ids": ["019d...", ...]
    }
  ]
}

Auto-Heal & Fix Suggestions

Beyond identifying root causes, AegisRunner can generate fix suggestions for failed tests:

Selector fixes — when an element is not found, suggests updated selectors based on the current DOM
Prerequisite steps — when a test assumes prior state (e.g., items in cart), suggests adding setup steps
Wait adjustments — when failures are timing-related, suggests adding explicit waits
Test data updates — when tests use outdated data, suggests updated values

Fix suggestions are generated from the triage report and the detailed failure data (including DOM snapshots and screenshots).

Flaky Test Detection

The triage system identifies flaky tests — tests that fail intermittently due to timing, network, or rendering issues rather than actual regressions. Flaky clusters are marked with is_flaky_pattern: true and include specific patterns:

Timeout errors that occur inconsistently
Element visibility assertions that depend on animation timing
Network-dependent assertions that fail under load
Tests that passed in the previous run but fail now with no code change

Viewing Triage Results

Open any completed test run with failures
The Failure Triage panel appears above the results table
Clusters are sorted by severity (high → medium → low)
Click a cluster to filter the results table to just the affected test cases
Each cluster shows the count of affected tests, root cause, and recommendation

Manual Triage

You can also trigger triage manually from the API:

POST /api/v1/runs/:runId/triage
Authorization: Bearer YOUR_TOKEN

This re-runs the AI analysis, which is useful if the initial triage timed out or if you want a fresh analysis after updating test data.

Triage Status

Status	Meaning
`none`	No failures to triage (all tests passed)
`pending`	Triage is running
`completed`	Triage report is ready
`failed`	AI analysis failed (timeout, provider error)

Plan Availability

Failure triage is available on Starter plans and above. The free plan shows basic failure lists without AI clustering.

Failure Triage

Failure Triage

How It Works

Triage Report

Summary

Failure Clusters

Example Report

Auto-Heal & Fix Suggestions

Flaky Test Detection

Viewing Triage Results

Manual Triage

Triage Status

Plan Availability

Related Documentation

Need help?