Tutorial · 8 min

Building an Autonomous Web Scraping Agent with Firecrawl

Step-by-step: build an agent that crawls websites, extracts structured data, and pays per page — all with x402 and Mithril.

Web scraping is one of the most common tasks for AI agents. This tutorial shows how to build a scraping agent that pays per-page via x402 — no Firecrawl API key required.

What We're Building

An autonomous agent that:

  • Takes a list of URLs or a search query
  • Scrapes each page using Firecrawl (paid per-page via x402)
  • Extracts structured data using an LLM
  • Returns clean, structured results

Setup

    import Mithril from "@mithril/sdk"
    
    const m = new Mithril({
      apiKey: process.env.MITHRIL_API_KEY,
    })

Scraping a Single Page

    async function scrapePage(url: string) {
      const result = await m.pay({
        url: "https://api.firecrawl.dev/v1/scrape",
        method: "POST",
        body: JSON.stringify({
          url,
          formats: ["markdown", "html"],
          onlyMainContent: true,
        }),
      })
      return result.data
    }

Cost: ~$0.01 per page. The x402 payment is handled automatically by the Mithril SDK.
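For reference, the shape of `result.data` that the rest of this tutorial relies on (e.g. `page.markdown` below) can be typed roughly as follows. The field names follow Firecrawl's v1 scrape response, but treat them as an assumption and check the current API docs:

```typescript
// Hedged sketch of the scrape result shape — field names are assumptions
// based on Firecrawl's v1 scrape response, not guarantees.
interface ScrapedPage {
  markdown?: string
  html?: string
  metadata?: {
    title?: string
    sourceURL?: string
  }
}

// Usage: const page = (await scrapePage(url)) as ScrapedPage
```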

Scraping Multiple Pages

    async function scrapeMany(urls: string[]) {
      const results = await Promise.all(
        urls.map(url => scrapePage(url))
      )
      return results
    }

For 100 pages, total cost: ~$1.00. No subscription needed.
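One caution: `Promise.all` launches every request at once, which at 100 URLs can trip the target site's or Firecrawl's rate limits. A bounded-concurrency map is a safer default. This generic helper is a sketch (the limit of 5 in the usage line is arbitrary), with `scrapePage` from above plugged in as the worker:

```typescript
// Run `fn` over `items` with at most `limit` calls in flight at once.
// Generic sketch — not part of the Mithril or Firecrawl SDKs.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0
  // Each worker repeatedly claims the next unprocessed index.
  // Safe without locks: JS is single-threaded between awaits.
  const worker = async () => {
    while (next < items.length) {
      const i = next++
      results[i] = await fn(items[i])
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker),
  )
  return results
}

// Usage: scrape 100 URLs with at most 5 requests in flight
// const results = await mapWithConcurrency(urls, 5, scrapePage)
```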

Extracting Structured Data

Combine Firecrawl scraping with LLM extraction:

    async function extractProducts(url: string) {
      const page = await scrapePage(url)
    
      const extraction = await m.pay({
        url: "https://openrouter.ai/api/v1/chat/completions",
        method: "POST",
        body: JSON.stringify({
          model: "anthropic/claude-sonnet-4-6",
          messages: [{
            role: "user",
            content: `Extract all products from this page as JSON:\n${page.markdown}`,
          }],
        }),
      })
    
      return JSON.parse(extraction.data.choices[0].message.content)
    }
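The bare `JSON.parse` above will throw whenever the model wraps its answer in a markdown code fence or adds commentary, which models frequently do. A small tolerant parser (an illustrative helper, not part of any SDK) makes the extraction step more robust:

```typescript
// Tolerant JSON extraction from an LLM reply: grab the outermost
// {...} or [...] span before parsing, ignoring surrounding fences
// or prose. Illustrative helper — not an SDK API.
function parseModelJson(text: string): unknown {
  const starts = ["{", "["]
    .map(c => text.indexOf(c))
    .filter(i => i !== -1)
  const start = starts.length > 0 ? Math.min(...starts) : -1
  const end = Math.max(text.lastIndexOf("}"), text.lastIndexOf("]"))
  if (start === -1 || end <= start) {
    throw new Error("no JSON found in model output")
  }
  return JSON.parse(text.slice(start, end + 1))
}
```

In `extractProducts`, the final line would become `return parseModelJson(extraction.data.choices[0].message.content)`.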

Crawling an Entire Site

Firecrawl also supports site-wide crawling:

    async function crawlSite(url: string, maxPages: number = 50) {
      const result = await m.pay({
        url: "https://api.firecrawl.dev/v1/crawl",
        method: "POST",
        body: JSON.stringify({
          url,
          limit: maxPages,
          scrapeOptions: {
            formats: ["markdown"],
            onlyMainContent: true,
          },
        }),
      })
      return result.data
    }
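Note that crawling a whole site takes time: Firecrawl's crawl endpoint typically responds with a job id that you poll for completion, rather than returning pages inline (verify against the current docs). The polling loop below is generic — the status check is injected as a callback so the x402 payment details stay with `m.pay`, and the `status`/`data` fields of the response are assumptions to confirm:

```typescript
// Assumed shape of a crawl-status response — check Firecrawl's docs.
interface CrawlStatus {
  status: "scraping" | "completed" | "failed"
  data?: unknown[]
}

// Poll `checkStatus` until the crawl job completes, fails, or times out.
async function waitForCrawl(
  checkStatus: () => Promise<CrawlStatus>,
  intervalMs = 5000,
  maxAttempts = 60,
): Promise<unknown[]> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const s = await checkStatus()
    if (s.status === "completed") return s.data ?? []
    if (s.status === "failed") throw new Error("crawl failed")
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
  throw new Error("crawl did not finish in time")
}
```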

Use Cases

  • Price monitoring: Scrape competitor product pages daily
  • Lead generation: Extract company information from websites
  • Content aggregation: Collect and summarize industry news
  • Market research: Analyze competitor messaging and positioning
  • Data enrichment: Supplement CRM data with web-sourced information

Cost Control

Set daily limits on the scraping agent's wallet:

  • Testing: $2/day (200 pages)
  • Production: $10/day (1,000 pages)
  • Heavy use: $50/day (5,000 pages)

Monitor spending via the Mithril dashboard to ensure costs align with expectations.
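Wallet limits are enforced on Mithril's side; as an extra local safety net, the agent can also track its own spend in-process and stop before hitting the cap. A minimal sketch (the class is illustrative and independent of the SDK; the $0.01-per-page figure comes from the estimate above):

```typescript
// Local budget guard — illustrative, not part of the Mithril SDK.
// Throws before a charge would push the day's spend past the cap.
class DailyBudget {
  private spentUsd = 0

  constructor(private readonly capUsd: number) {}

  charge(amountUsd: number): void {
    if (this.spentUsd + amountUsd > this.capUsd) {
      throw new Error(`daily budget of $${this.capUsd} exhausted`)
    }
    this.spentUsd += amountUsd
  }

  get remainingUsd(): number {
    return this.capUsd - this.spentUsd
  }
}

// Usage: call budget.charge(0.01) before each scrapePage() call
// const budget = new DailyBudget(2) // testing tier: $2/day ≈ 200 pages
```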