The AI-powered brain for your product catalog.
A 4-tier cascading extraction pipeline, deterministic staleness detection on 14 product fields, event-driven feed regeneration, and automated AI re-enrichment with 30+ attributes. Plus a native MCP server so AI agents can query your entire catalog.
The definitive AI-powered e-commerce optimization platform.
Transform basic product feeds into a powerful growth engine. Serpwise combines text and vision AI to automatically enrich your products with 30+ attributes, generate comprehensive feeds, and inject pre-rendered content directly onto your pages via our edge proxy.
Explore E-commerce Features
Product Intelligence Engine
How It Actually Works
We don't just generate XML files. Serpwise is a complete edge architecture with a 4-tier extraction pipeline, deterministic change detection, event-driven feed sync, and automated AI re-enrichment.
4-Tier Cascading Data Extraction
Every product page passes through a cascading extraction pipeline. Each tier only fills fields that the previous tier missed, so a field one source omits can still be recovered by the next:
- JSON-LD Product Schema — Parses all application/ld+json blocks, walks @graph arrays, and extracts offers, aggregateRating, GTIN variants (gtin, gtin12, gtin13, gtin14, isbn), and brand objects.
- Open Graph Meta Tags — Fills remaining gaps from og:title, product:price:amount, product:brand, and og:image tags.
- AI-Mapped CSS Selectors — During the 3-step onboarding wizard, Serpwise learns your site's CSS selectors for title, price, image, and description. These are applied as a third fallback layer.
- Content Image Discovery — If no images were found, scans all <img> tags outside nav, footer, aside, and header regions, collecting up to 10 resolved URLs.
This pipeline extracts 14 core fields per product: title, description, price, currency, availability, condition, brand, images, GTIN, MPN, SKU, category, rating, and review count.
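The cascade above amounts to a merge where each tier writes only fields that are still empty. A minimal TypeScript sketch, not Serpwise's actual API — the tier functions, sample values, and the string-only field type are illustrative assumptions:

```typescript
// A subset of the 14 core fields, all modeled as strings for simplicity.
type Product = Partial<Record<
  "title" | "description" | "price" | "currency" | "availability" |
  "brand" | "gtin" | "sku" | "category", string>>;

type Tier = (html: string) => Product;

// Later tiers only fill fields the earlier tiers left empty.
function cascade(html: string, tiers: Tier[]): Product {
  const result: Product = {};
  for (const tier of tiers) {
    const extracted = tier(html);
    for (const [field, value] of Object.entries(extracted)) {
      if (result[field as keyof Product] === undefined && value !== undefined) {
        result[field as keyof Product] = value;
      }
    }
  }
  return result;
}

// Hypothetical tiers: JSON-LD finds title and brand, Open Graph adds price.
const jsonLd: Tier = () => ({ title: "Trail Shoe", brand: "Acme" });
const openGraph: Tier = () => ({ title: "Trail Shoe | Shop", price: "89.99" });

const product = cascade("<html>…</html>", [jsonLd, openGraph]);
// JSON-LD wins for title; Open Graph only fills the missing price.
```

Because earlier tiers take precedence, structured JSON-LD data is never overwritten by noisier fallbacks further down the cascade.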
Deterministic Staleness Detection
Most monitoring tools diff raw HTML, triggering false positives on layout changes. Serpwise takes a fundamentally different approach:
- After extraction, we compute a deterministic SHA-256 fingerprint of the 14 extracted product fields (not the raw HTML).
- Fast-path: If the fingerprint matches the stored value, processing stops immediately — zero wasted compute.
- Field-level diff: If fingerprints differ, we compare all 14 field pairs and record exactly which fields changed (e.g., ["price", "availability"]).
This means a CSS redesign won't trigger a false stale flag, but a $5 price drop or an out-of-stock change will be caught instantly.
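The fingerprint-then-diff flow can be sketched in a few lines of TypeScript. The serialization scheme and sample fields below are assumptions for illustration; the key point is that hashing a canonical form of the extracted fields, rather than the raw HTML, makes the result immune to markup changes:

```typescript
import { createHash } from "node:crypto";

type Fields = Record<string, string>;

// Deterministic SHA-256 over the extracted fields: keys are sorted so the
// same field values always produce the same fingerprint.
function fingerprint(fields: Fields): string {
  const canonical = JSON.stringify(
    Object.keys(fields).sort().map((key) => [key, fields[key]])
  );
  return createHash("sha256").update(canonical).digest("hex");
}

// Field-level diff, run only when fingerprints differ.
function changedFields(prev: Fields, next: Fields): string[] {
  return Object.keys(next).filter((key) => prev[key] !== next[key]);
}

const before = { title: "Trail Shoe", price: "94.99", availability: "in_stock" };
const after = { title: "Trail Shoe", price: "89.99", availability: "in_stock" };

const stale = fingerprint(before) !== fingerprint(after);
// stale === true; changedFields(before, after) → ["price"]
```

A redesigned template produces the same field values, hence the same fingerprint, and processing stops at the fast-path comparison.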
Event-Driven Feed Regeneration
When a product is flagged stale, feed regeneration fires immediately — not on a cron, not in a batch overnight:
- All active feeds for that domain (XML, CSV, JSON) are regenerated in a single pass with triggeredBy: "auto".
- Fault isolation: Errors in one feed don't block others — partial success is still progress.
- Single cache invalidation at the end (not per-feed) ensures the edge proxy serves the updated feeds within seconds.
Your Google Merchant feed reflects a price change within seconds of the crawler detecting it, not hours.
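The regeneration pass above can be sketched as a loop with per-feed error isolation and a single trailing cache invalidation. A synchronous TypeScript sketch under assumed names (`regenerateFeeds`, `build`, `invalidateCache` are not Serpwise's real functions, and the production pipeline is event-driven rather than a plain loop):

```typescript
type FeedFormat = "xml" | "csv" | "json";

// Build every active feed; a failure in one format is recorded but does
// not block the others. The cache is invalidated once, after the pass.
function regenerateFeeds(
  formats: FeedFormat[],
  build: (format: FeedFormat) => void,
  invalidateCache: () => void
): { ok: FeedFormat[]; failed: FeedFormat[] } {
  const ok: FeedFormat[] = [];
  const failed: FeedFormat[] = [];
  for (const format of formats) {
    try {
      build(format); // triggeredBy: "auto" in the real pipeline
      ok.push(format);
    } catch {
      failed.push(format); // fault isolation: keep going
    }
  }
  invalidateCache(); // single invalidation for the whole pass
  return { ok, failed };
}

let invalidations = 0;
const result = regenerateFeeds(
  ["xml", "csv", "json"],
  (format) => { if (format === "csv") throw new Error("bad template"); },
  () => { invalidations += 1; }
);
// result.ok is ["xml", "json"], result.failed is ["csv"], one invalidation.
```

Invalidating once at the end, instead of per feed, means the edge never serves a half-updated set of feeds mid-pass.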
Automated AI Re-Analysis
Staleness detection does more than update feeds. If you've configured a domain analysis schedule, the system automatically queues AI re-enrichment:
- Configurable schedule: Daily, weekly, or monthly intervals with a max products per run limit.
- Choose analysis type: Text-only analysis, vision-only analysis, or full (both).
- Credit-aware execution: The scheduler checks your balance and pre-deducts credits before enqueueing, so you never overspend.
- Stale-first priority: Products whose extracted data has changed since their last AI analysis are processed first.
When a scheduled run completes, your 30+ attribute profiles, injected HTML, and MCP server data are all up to date — no manual intervention required.
Multimodal AI Enrichment
Our AI text analysis extracts 30+ attributes: SEO-optimized titles, Google Product Taxonomy classification, highlights, specs, FAQ, keywords, and custom labels. Vision analysis examines product images to identify color, material, pattern, style, and visible brand markings. The system reconciles both sources logically — vision is prioritized for color and pattern, text for exact material and dimensions.
Native MCP Server
The edge proxy exposes a Model Context Protocol (MCP) server with structured tools for semantic product search, product detail retrieval, and category listing. Agents authenticate via scoped API keys, and semantic search runs cosine similarity against your product embeddings. Any MCP-compatible AI agent can query your full catalog using natural language.
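The ranking behind semantic search is plain cosine similarity between a query embedding and each product embedding. A self-contained TypeScript sketch with toy two-dimensional vectors (real embeddings have hundreds of dimensions, and the product IDs here are made up):

```typescript
// Cosine similarity: dot product over the product of vector norms.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every product against the query embedding and keep the top k.
function topK(
  query: number[],
  catalog: { id: string; embedding: number[] }[],
  k: number
): { id: string; score: number }[] {
  return catalog
    .map((p) => ({ id: p.id, score: cosine(query, p.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const hits = topK([1, 0], [
  { id: "trail-shoe", embedding: [0.9, 0.1] },
  { id: "rain-jacket", embedding: [0.1, 0.9] },
], 1);
// "trail-shoe" ranks first: its embedding points the same way as the query.
```

Because cosine similarity compares direction rather than magnitude, a short query and a long product description can still land close together in embedding space.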
Edge HTML Injection
Because Serpwise operates as a reverse proxy, it modifies the HTTP response in-transit. AI-generated Specs tables, FAQPage-structured Q&As, enhanced JSON-LD, and semantically related product blocks are injected directly into the page before it reaches the visitor or search engine crawler. Zero CMS plugins, zero theme edits.
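In-transit injection boils down to splicing generated markup into the origin's HTML before the response is forwarded. A minimal, non-streaming TypeScript sketch — the function name and the insert-before-`</body>` placement are assumptions for illustration, while the real proxy rewrites the response as a stream:

```typescript
// Insert generated blocks (specs tables, FAQ markup, JSON-LD) just before
// the closing </body> tag of the origin's HTML.
function inject(originHtml: string, blocks: string[]): string {
  const payload = blocks.join("\n");
  const idx = originHtml.lastIndexOf("</body>");
  if (idx === -1) return originHtml + payload; // no body tag: append
  return originHtml.slice(0, idx) + payload + "\n" + originHtml.slice(idx);
}

const page = inject(
  "<html><body><h1>Trail Shoe</h1></body></html>",
  ['<table class="specs">…</table>']
);
// The specs table now sits inside the body, before the closing tag.
```

Since the splice happens between origin and visitor, the CMS, theme, and templates never see — and never need to know about — the injected content.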
What We Replace
Serpwise consolidates an entire e-commerce optimization stack into a single edge platform.
Platform agnostic. Plays nice with your stack.
Because Serpwise works at the DNS/Edge level, it works natively with any platform, framework, or CMS. No platform rewrites required.
CDN Routing
Origin Platforms
...and any other platform. Serpwise is a reverse proxy, so it supports every HTTP origin out of the box.
Decision support
Technical Deep Dive FAQ
How does the cascading data extraction differ from traditional scrapers?
Most scrapers only read JSON-LD, so if your database doesn't expose a field, it's missing from your feed. Serpwise runs a 4-tier cascading pipeline: it first parses JSON-LD Product schema, then fills gaps with Open Graph meta tags (og:title, product:price:amount, etc.), then applies AI-mapped CSS selectors learned during onboarding, and finally falls back to content image discovery. Each tier only fills fields the previous tier missed, giving you the most complete extraction possible across 14 core product fields.
How does staleness detection work without monitoring raw HTML?
We don't diff raw HTML, which would trigger false positives on layout changes. Instead, we compute a deterministic fingerprint of the 14 extracted product fields (title, price, availability, brand, images, etc.). On each crawl, we compare the new fingerprint against the stored one. If they differ, we run a field-by-field comparison to identify exactly which fields changed (e.g., 'price' and 'availability') and record them. This means a CSS redesign won't trigger a false stale flag, but a $5 price drop will.
What happens automatically when a product is flagged as stale?
Two things fire immediately. First, all active feeds for that domain (XML, CSV, JSON) are regenerated in a single batch, followed by one gateway cache invalidation so your Google Merchant feed reflects the change within seconds. Second, if you've enabled a domain analysis schedule, the system queues an AI re-enrichment job (text analysis, vision analysis, or both) so your 30+ attribute profile, injected HTML, and MCP server data are updated without any manual intervention.
What exactly is the Native MCP Server and how do agents use it?
Serpwise exposes a Model Context Protocol (MCP) server at /mcp on your domain. AI agents like Claude Desktop, ChatGPT, and Perplexity connect via API keys and use 5 structured tools: search_content, get_page, search_products, get_product_details, and list_categories. Agents can semantically search your entire site content and product catalog, retrieve full page details, and browse categories. This makes your site natively queryable by any MCP-compatible AI agent — zero code changes required.
How does Edge Content Injection bypass my CMS?
Serpwise acts as a reverse proxy between your CDN and your origin server. After our AI generates pre-rendered HTML (like Specs tables or an SEO-optimized FAQ with FAQPage microdata), our edge proxy intercepts the HTTP response from your origin and injects this HTML stream right before it hits the visitor or search engine crawler. Zero CMS plugins or code changes required.
How does the scheduled re-analysis system work?
You configure a domain analysis schedule with an interval (daily, weekly, or monthly), an analysis type (text, vision, or full), and a max products per run limit. On each scheduled run, the system queries all products where the current fingerprint differs from the last-analyzed fingerprint, checks your credit balance (text costs 5 credits, vision 3, full 7 per product), pre-deducts the total, and bulk-inserts AI jobs into the queue. Everything is idempotent—if a run finds no stale products, it simply advances the next run timestamp.
Stop wrestling with legacy PIMs and feeds.
Make your catalog Agent-Ready today. Cascading extraction, deterministic change detection, event-driven feed sync, and a native MCP server — all from your edge proxy.