Edge Crawler & Indexer

Automatically warm your edge cache and build a structural index of your pages for AI features.

The Edge Crawler & Indexer is an asynchronous background system powered by Bastio AI (Firecrawl). It systematically navigates your website through the SerpWise Edge Proxy to accomplish two major goals:

  1. Cache Warming: By visiting your URLs, the crawler forces a "Cache Miss", causing SerpWise to run your rules and store the final HTML in memory. When real visitors or Googlebot arrive later, they get instant "Cache Hits".
  2. Page Indexing: The crawler extracts clean, LLM-ready Markdown, raw HTML, and Meta Titles from every page it visits. This structured index powers future SerpWise features (like AI Product Feeds, automatic XML Sitemap generation, and cross-site context for Gemini).

Credit Usage: The Edge Crawler consumes 1 Credit per successfully crawled and indexed page.
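
The cache-warming behavior described above can be sketched as a toy in-memory cache. This is illustrative only, not the SerpWise implementation; `render_page` is a hypothetical stand-in for SerpWise applying your rules and producing the final HTML:

```python
# Toy sketch of edge-cache warming (illustrative, not SerpWise's code).
# The crawler's first visit is a cache miss that populates the cache;
# later visitors get instant hits without re-rendering.

cache = {}        # url -> rendered HTML
render_calls = 0  # counts how often the "expensive" render runs

def render_page(url):
    """Hypothetical stand-in for SerpWise running rules on a page."""
    global render_calls
    render_calls += 1
    return f"<html><body>Rendered {url}</body></html>"

def fetch(url):
    """Return cached HTML on a hit; render and store on a miss."""
    if url not in cache:
        cache[url] = render_page(url)  # cache miss: do the work once
    return cache[url]

# The crawler warms the cache...
fetch("https://example.com/products/widget")
# ...so a later real visitor (or Googlebot) hits without re-rendering.
fetch("https://example.com/products/widget")
assert render_calls == 1
```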

Setting Up Crawlers

You can configure multiple independent crawlers for a single domain (e.g., a "Daily Products Crawl" and a "Weekly Blog Crawl").

  1. Navigate to your Domain Dashboard and click the Edge Crawler tab.
  2. Click Add Crawler.

Configuration Options

  • Name: A friendly identifier (e.g., "XML Sitemap Warmer").
  • Schedule: Choose how often this crawler should run automatically (Daily, Weekly, Monthly, or Manual Only).
  • Start URL: The entry point for the crawler. We highly recommend using an XML Sitemap URL (e.g., https://example.com/sitemap_products.xml), as this is the most efficient way to discover all your important pages. You can also use your homepage (/).
  • Max Pages Limit: A safety ceiling. The crawler will stop once it hits this many pages, preventing you from accidentally burning through your credits on a site with an effectively unlimited number of URLs.
  • Max Depth: How many "clicks" deep from the Start URL the crawler should go. If you use a Sitemap, a Depth of 1 is usually sufficient.
  • Include / Exclude Paths: You can restrict the crawler to only visit specific sections of your site (e.g., Include */products/*, Exclude */checkout/*).
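
One way to picture how Include / Exclude patterns restrict a crawl is shell-style glob matching against the URL path, as sketched below with Python's `fnmatch`. The exact matching semantics SerpWise applies are not documented here, so treat this as an assumption:

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

def allowed(url, include=("*/products/*",), exclude=("*/checkout/*",)):
    """Return True if the URL's path matches an include pattern and
    no exclude pattern (glob semantics assumed, not confirmed)."""
    path = urlparse(url).path
    if any(fnmatch(path, pat) for pat in exclude):
        return False
    return any(fnmatch(path, pat) for pat in include)

urls = [
    "https://example.com/products/widget",
    "https://example.com/checkout/cart",
    "https://example.com/blog/post-1",
]
# Only the /products/ URL survives the filters above.
print([u for u in urls if allowed(u)])
```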

Manual Execution

Even if a crawler is on a schedule, you can trigger it instantly at any time:

  1. Find the crawler in the Saved Crawlers table.
  2. Click the Run Now (Play) button.
  3. A banner will appear indicating the crawler is running, and you'll see the "Pages" count update in real time as webhooks arrive.

The Page Index

Below the crawler configurations, you will find the Indexed Pages table. This is a unified view of every URL the crawler has successfully discovered, fetched, and saved to the database.

It displays:

  • The URL Path
  • The final HTTP Status Code (returned after SerpWise rules are applied)
  • The extracted <title> tag
  • The relative time since it was last updated
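
The columns above could be modeled as a simple record. The field names in this sketch are hypothetical, not the actual SerpWise schema; it also includes a small helper showing one way the "relative time" column could be rendered:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class IndexedPage:
    # Hypothetical field names mirroring the Indexed Pages columns;
    # not a real SerpWise schema.
    path: str          # the URL path
    status_code: int   # final status after SerpWise rules are applied
    title: str         # extracted <title> tag
    updated_at: datetime

def relative_age(updated_at, now=None):
    """Render a timestamp as a coarse relative time, e.g. '3h ago'."""
    now = now or datetime.now(timezone.utc)
    delta = now - updated_at
    if delta < timedelta(hours=1):
        return f"{int(delta.total_seconds() // 60)}m ago"
    if delta < timedelta(days=1):
        return f"{int(delta.total_seconds() // 3600)}h ago"
    return f"{delta.days}d ago"

now = datetime.now(timezone.utc)
page = IndexedPage("/products/widget", 200, "Acme Widget", now - timedelta(hours=3))
print(relative_age(page.updated_at, now))  # prints "3h ago"
```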

When SerpWise generates automated XML sitemaps or AI-powered Meta Suggestions, it queries this exact database table instead of re-scraping your site, saving you credits and processing time.

What Crawling Enables

The page index built by the Edge Crawler powers a wide range of URL Intelligence features, such as automated XML sitemap generation, AI-powered Meta Suggestions, AI Product Feeds, and cross-site context for Gemini.
