Starting a New Crawl
Learn how to configure and start a website crawl, including all available options for different subscription tiers.
Crawling discovers pages, forms, interactive elements, and captures data for AI test generation.
Crawl Modes
Full Site
Discovers all pages from your base URL. Explores interactions on each page. Best for comprehensive testing.
Single Page
Analyzes one URL. Discovers all interactive elements and states. Ideal for focused testing.
Regression
Replays a baseline manifest — exact same pages, exact same interactions, every time. Deterministic and repeatable. Requires a baseline to be set first.
Regression mode appears only when your project has a baseline. To create one, complete a Full Site crawl and click Set as Baseline on the results page.
Crawl Settings
Basic
| Setting | Default | Description |
|---|---|---|
| Max Pages | 200 | Maximum pages to discover |
| Max Depth | 5 | Link depth from start URL |
| Device | Desktop HD | Device profile for viewport, user agent, and touch emulation. See Device Profiles below. |
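To illustrate how Max Pages and Max Depth bound a crawl, here is a minimal sketch over an in-memory link graph. The `discover` function, the `LINKS` graph, the breadth-first order, and the convention that the start URL sits at depth 0 are all assumptions made for illustration; the crawler's actual traversal strategy is not described here.

```python
from collections import deque

# Hypothetical link graph: each URL maps to the URLs it links to.
LINKS = {
    "/": ["/about", "/products"],
    "/about": ["/team"],
    "/products": ["/products/1", "/products/2"],
    "/team": [],
    "/products/1": ["/products/1/reviews"],
    "/products/2": [],
    "/products/1/reviews": [],
}


def discover(start, max_pages=200, max_depth=5):
    """Return discovered URLs, stopping at max_pages or max_depth."""
    seen = [start]
    queue = deque([(start, 0)])  # (url, link depth from the start URL)
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in LINKS.get(url, []):
            if link in seen:
                continue
            if len(seen) >= max_pages:
                return seen  # page budget exhausted
            seen.append(link)
            queue.append((link, depth + 1))
    return seen
```

With this toy graph, lowering either limit shrinks the result, which is why a small first crawl is a cheap way to check that the limits suit your site.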
Advanced
| Setting | Default | Description |
|---|---|---|
| Respect Robots.txt | Off | Skip disallowed URLs, honor Crawl-Delay |
| Fill Forms | Off | Auto-fill forms with test data during crawl |
| Skip Auth Forms | On | Avoid submitting login/register forms |
| Include/Exclude Patterns | None | Regex patterns to focus or skip URL paths |
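As a rough illustration of what Respect Robots.txt implies, the sketch below uses Python's standard `urllib.robotparser` to check disallowed paths and read a Crawl-Delay. The robots.txt content and the `example-crawler` user agent name are made up for the example, and this is not necessarily how the crawler implements the setting.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed from memory instead of fetched.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /admin/
Disallow: /cart
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="example-crawler"):
    """Return True when robots.txt permits this agent to fetch the path."""
    return parser.can_fetch(agent, path)


# Seconds to wait between requests, or None if no Crawl-delay is declared.
delay = parser.crawl_delay("example-crawler")
```

A crawler that honors these rules would skip any path for which `allowed` returns `False` and pause `delay` seconds between requests.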
Device Profiles
Choose a device profile to crawl your site as it appears on different devices. The crawler emulates the viewport, user agent, device scale factor, and touch capabilities.
| Profile | Viewport | Type |
|---|---|---|
| Desktop HD (default) | 1920 x 1080 | Desktop |
| Desktop 4K | 3840 x 2160 | Desktop |
| iPhone 12 | 390 x 844 | Mobile + Touch |
| iPhone 14 | 390 x 844 | Mobile + Touch |
| iPhone 14 Pro Max | 430 x 932 | Mobile + Touch |
| Pixel 7 | 412 x 915 | Mobile + Touch |
| Pixel 7 Pro | 412 x 892 | Mobile + Touch |
| Samsung Galaxy S23 | 360 x 780 | Mobile + Touch |
| iPad Pro 12.9" | 1024 x 1366 | Tablet + Touch |
| iPad Mini | 768 x 1024 | Tablet + Touch |
| Galaxy Tab S8 | 800 x 1280 | Tablet + Touch |
Mobile crawls discover mobile-specific elements (hamburger menus, bottom navigation, mobile modals) and generate mobile-specific tests. AI-generated tests from mobile crawls automatically use the mobile viewport.
How Crawling Works
Exploration (Full Site)
- Page Discovery: parses the sitemap and extracts links, starting from the start URL
- State Discovery: depth-first (DFS) interaction that clicks buttons, opens dropdowns, and fills forms to discover UI states
- DOM Extraction: records stable CSS selectors (ID, name, aria-label, data-testid) for forms, buttons, and links
- Audits: accessibility (axe-core, WCAG), SEO, security headers, and performance, run per page
- AI Test Generation: after the crawl completes, AI generates a test suite for each page
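The depth-first State Discovery step can be pictured with a toy model in which UI states are nodes and interactions are edges. The `TRANSITIONS` table and `discover_states` function below are hypothetical; a real crawler would drive a browser rather than a dictionary, so treat this only as a sketch of the DFS idea.

```python
# Hypothetical page model: each UI state maps interaction -> resulting state.
TRANSITIONS = {
    "home": {"click #menu": "menu-open", "click #login": "login-modal"},
    "menu-open": {"click #close": "home", "click #settings": "settings-panel"},
    "login-modal": {"click #cancel": "home"},
    "settings-panel": {},
}


def discover_states(state, visited=None, path=()):
    """Depth-first walk; returns {state: interactions needed to reach it}."""
    if visited is None:
        visited = {}
    visited[state] = list(path)
    for action, next_state in TRANSITIONS[state].items():
        if next_state not in visited:
            # Go deeper before trying the sibling interactions.
            discover_states(next_state, visited, path + (action,))
    return visited


states = discover_states("home")
```

Recording the interaction path to each state is what makes a state reproducible later, for example when a baseline manifest is replayed.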
Regression
- Visits manifest pages in recorded order
- Executes recorded interactions (click, select, fill)
- Captures screenshots for comparison
- Reports matched vs missing pages/interactions
- Uses a single worker with no parallelism, for maximum determinism
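The replay steps above can be sketched as a sequential loop that tallies matched and missing items. The manifest layout, the `LIVE_SITE` stand-in for the site under test, and the report keys are assumptions for illustration; the real manifest format is covered under Regression Manifests.

```python
# Hypothetical manifest: ordered pages, each with recorded interactions.
MANIFEST = [
    {"url": "/", "interactions": ["click #menu", "click #login"]},
    {"url": "/pricing", "interactions": ["select #plan"]},
    {"url": "/old-promo", "interactions": []},
]

# Hypothetical view of the live site: interactions still possible per page.
LIVE_SITE = {
    "/": {"click #menu"},
    "/pricing": {"select #plan"},
}


def replay(manifest, live_site):
    """Visit manifest pages in recorded order and tally the results."""
    report = {
        "matched_pages": [],
        "missing_pages": [],
        "matched_interactions": [],
        "missing_interactions": [],
    }
    for page in manifest:  # one page at a time, mirroring a single worker
        available = live_site.get(page["url"])
        if available is None:
            report["missing_pages"].append(page["url"])
            continue
        report["matched_pages"].append(page["url"])
        for action in page["interactions"]:
            found = action in available
            key = "matched_interactions" if found else "missing_interactions"
            report[key].append((page["url"], action))
    return report


report = replay(MANIFEST, LIVE_SITE)
```

Because the loop follows the manifest order and never runs pages concurrently, two runs against an unchanged site produce the same report.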
Setting a Baseline
- Run a Full Site crawl
- Click Set as Baseline on the results page
- The manifest is compiled (pages, interactions, expected states)
- Regression mode becomes available
See Regression Manifests for details.
CI/CD Integration
curl -X POST /api/v1/ci/trigger -H "Authorization: Bearer aegis_..." -d '{"crawl": true, "maxPages": 100}'
CI crawls inherit settings from the last UI crawl. See CI/CD Integration.
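For pipelines that prefer a script over curl, the same trigger request could be built with Python's standard library. This is a sketch: `build_trigger_request` is a hypothetical helper, `https://example.com` stands in for your actual host, and the token is the same placeholder used in the curl command.

```python
import json
import urllib.request


def build_trigger_request(base_url, token, max_pages=100):
    """Build the POST request from the curl example without sending it."""
    body = json.dumps({"crawl": True, "maxPages": max_pages}).encode("utf-8")
    # No Content-Type is set here, matching the curl command; add one
    # (for example application/json) if the API turns out to require it.
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/ci/trigger",
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical host and placeholder token; substitute your own values.
request = build_trigger_request("https://example.com", "aegis_...")
# To send it from a CI job: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the payload easy to inspect in a dry run before the pipeline triggers a crawl.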
Best Practices
- Start with a small Max Pages value, then increase it
- Exclude admin areas, logout links, and delete actions
- Enable Respect Robots.txt for production sites
- Enable Fill Forms for comprehensive form test coverage
- Set a baseline after a good crawl, then use Regression mode for CI
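To make the advice about excluding admin areas, logout links, and delete actions concrete, here is a sketch of how include and exclude regex patterns might be evaluated against URL paths. The patterns, the `should_crawl` helper, and the rule that exclude takes precedence over include are assumptions for illustration, not documented crawler behavior.

```python
import re

# Hypothetical patterns; adjust them to your site's URL structure.
INCLUDE = [r"^/shop/"]
EXCLUDE = [r"^/shop/admin", r"/logout", r"/delete"]


def should_crawl(path, include=INCLUDE, exclude=EXCLUDE):
    """Exclude patterns win; an empty include list means include everything."""
    if any(re.search(pattern, path) for pattern in exclude):
        return False
    return not include or any(re.search(pattern, path) for pattern in include)
```

Testing patterns against a handful of known URLs like this before a crawl helps catch a regex that is broader or narrower than intended.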