

Firecrawl offers two approaches for extracting structured data from web pages. Each serves different use cases with varying levels of automation and control.

Quick Comparison

| Feature | /agent | /scrape (JSON mode) |
|---|---|---|
| URL Required | No (optional) | Yes (single URL) |
| Scope | Web-wide discovery, multi-page, or single page | Single page |
| URL Discovery | Autonomous web search | None |
| Processing | Asynchronous | Synchronous |
| Schema Required | No (prompt or schema) | No (prompt or schema) |
| Pricing | Dynamic (5 free runs/day); 10 credits/cell on Parallel Agents fast path | 5 credits/page (1 base + 4 for JSON mode) |
| Best For | Research, discovery, multi-page or batch gathering | Known single-page extraction |

1. /agent Endpoint

The /agent endpoint is Firecrawl’s most advanced offering. It uses AI agents to autonomously search, navigate, and gather data from across the web.

Key Characteristics

  • URLs Optional: Just describe what you need via prompt; URLs are completely optional
  • Autonomous Navigation: The agent searches and navigates deep into sites to find your data
  • Deep Web Search: Autonomously discovers information across multiple domains and pages
  • Parallel Processing: Processes multiple sources simultaneously for faster results
  • Models Available: spark-1-fast (cheapest, used by Parallel Agents — 10 credits/cell), spark-1-mini (default, balanced cost/quality), and spark-1-pro (highest accuracy)

Example

from firecrawl import Firecrawl
from pydantic import BaseModel, Field
from typing import List, Optional

app = Firecrawl(api_key="fc-YOUR_API_KEY")

class Founder(BaseModel):
    name: str = Field(description="Full name of the founder")
    role: Optional[str] = Field(None, description="Role or position")
    background: Optional[str] = Field(None, description="Professional background")

class FoundersSchema(BaseModel):
    founders: List[Founder] = Field(description="List of founders")

result = app.agent(
    prompt="Find the founders of Firecrawl",
    schema=FoundersSchema,
    model="spark-1-mini",
    max_credits=100
)

print(result.data)

Best Use Case: Autonomous Research & Discovery

Scenario: You need to find information about AI startups that raised Series A funding, including their founders and funding amounts.

Why /agent: You don’t know which websites contain this information. The agent will autonomously search the web, navigate to relevant sources (Crunchbase, news sites, company pages), and compile the structured data for you.

Parallel Agents

For high-volume batch extraction — e.g., enriching a list of 1,000 companies with funding data — use Parallel Agents. They run an intelligent waterfall: spark-1-fast handles simple cells at a flat 10 credits/cell, escalating to spark-1-mini only when needed. This is the right tool for grid/batch workflows where you’d otherwise be looping over many similar prompts. For more details, see the Agent documentation.
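The flat fast-path rate makes batch budgets easy to estimate up front. A minimal sketch (an illustrative helper, not part of the Firecrawl SDK; escalations to spark-1-mini are priced dynamically, so this only bounds the fast-path portion):

```python
def fast_path_credits(cells: int, credits_per_cell: int = 10) -> int:
    """Flat fast-path cost: every cell handled by spark-1-fast."""
    return cells * credits_per_cell

# Enriching 1,000 companies, one cell each:
print(fast_path_credits(1_000))  # 10000 credits if no cell escalates
```

Any cell the waterfall escalates to spark-1-mini adds a dynamic charge on top of this floor.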

2. /scrape Endpoint with JSON Mode

The /scrape endpoint with JSON mode is the most controlled approach—it extracts structured data from a single known URL using an LLM to parse the page content into your specified schema.

Key Characteristics

  • Single URL Only: Designed for extracting data from one specific page at a time
  • Exact URL Required: You must know the precise URL containing the data
  • Schema Optional: Can use JSON schema OR just a prompt (LLM chooses structure)
  • Synchronous: Returns data immediately (no job polling needed)
  • Additional Formats: Can combine JSON extraction with markdown, HTML, screenshots in one request

Example

from firecrawl import Firecrawl
from pydantic import BaseModel

app = Firecrawl(api_key="fc-YOUR-API-KEY")

class CompanyInfo(BaseModel):
    company_mission: str
    supports_sso: bool
    is_open_source: bool
    is_in_yc: bool

result = app.scrape(
    'https://firecrawl.dev',
    formats=[{
      "type": "json",
      "schema": CompanyInfo.model_json_schema()
    }],
    only_main_content=False,
    timeout=120000
)

print(result)
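Because the schema is optional, a prompt alone also works and the LLM chooses the output structure. A hedged sketch — the `prompt` key in the format object is an assumption here, not confirmed by this page, so check the JSON mode docs for the exact field name:

```python
# Prompt-only JSON mode: no schema, the LLM decides the output shape.
# (The "prompt" key is assumed, not confirmed by this page.)
json_format = {
    "type": "json",
    "prompt": "Extract the company mission and whether the project is open source",
}

# The scrape call would then mirror the example above:
# result = app.scrape('https://firecrawl.dev', formats=[json_format])
print(sorted(json_format))  # ['prompt', 'type']
```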

Best Use Case: Single-Page Precision Extraction

Scenario: You’re building a price monitoring tool and need to extract the price, stock status, and product details from a specific product page you already have the URL for.

Why /scrape with JSON mode: You know exactly which page contains the data, need precise single-page extraction, and want synchronous results without job management overhead. For more details, see the JSON mode documentation.

Decision Guide

Do you know the exact URL(s) containing your data?
  • NO → Use /agent (autonomous web discovery)
  • YES
    • Single page? → Use /scrape with JSON mode
    • Multiple pages? → Use /agent with URLs (or batch /scrape)
    • Many similar prompts across a list? → Use /agent Parallel Agents
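The decision guide above can be encoded as a small helper — purely illustrative, not part of any SDK:

```python
def choose_endpoint(knows_urls: bool, pages: int = 1, grid_workflow: bool = False) -> str:
    """Map the decision guide onto inputs: URL knowledge, page count, batch shape."""
    if not knows_urls:
        return "/agent"                           # autonomous web discovery
    if grid_workflow:
        return "/agent (Parallel Agents)"         # many similar prompts over a list
    if pages == 1:
        return "/scrape (JSON mode)"              # known single-page extraction
    return "/agent with URLs (or batch /scrape)"  # multiple known pages

print(choose_endpoint(knows_urls=False))  # /agent
print(choose_endpoint(knows_urls=True))   # /scrape (JSON mode)
```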

Recommendations by Scenario

| Scenario | Recommended Endpoint |
|---|---|
| “Find all AI startups and their funding” | /agent |
| “Extract data from this specific product page” | /scrape (JSON mode) |
| “Get all blog posts from competitor.com” | /agent with URL |
| “Monitor prices across multiple known URLs” | /scrape with batch processing |
| “Research companies in a specific industry” | /agent |
| “Enrich 1,000 companies with funding data” | /agent Parallel Agents |
| “Extract contact info from 50 known company pages” | /scrape with batch processing |

Pricing

| Endpoint | Cost | Notes |
|---|---|---|
| /scrape (JSON mode) | 5 credits/page (1 base + 4 for JSON mode) | Fixed, predictable |
| /agent | Dynamic | 5 free runs/day; typical run ~100–500 credits |
| /agent (Parallel Agents) | 10 credits/cell on the fast path | Batch/grid workflows |
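For fixed-price /scrape jobs, the per-page breakdown (1 base + 4 for JSON mode) makes totals exact — a small illustrative calculator, not part of the SDK:

```python
BASE_CREDITS = 1        # base scrape cost per page
JSON_MODE_CREDITS = 4   # JSON mode surcharge per page

def scrape_json_cost(pages: int) -> int:
    """Fixed, predictable cost of /scrape with JSON mode."""
    return pages * (BASE_CREDITS + JSON_MODE_CREDITS)

# e.g. extracting contact info from 50 known company pages:
print(scrape_json_cost(50))  # 250 credits
```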

Example: “Find the founders of Firecrawl”

| Endpoint | How It Works | Credits Used |
|---|---|---|
| /scrape | You find the URL manually, then scrape 1 page | ~5 credits |
| /agent | Just send the prompt; the agent finds and extracts | ~100–500 credits |

Tradeoff: /scrape is cheapest but requires you to know the URL. /agent costs more but handles discovery automatically. For detailed pricing, see Firecrawl Pricing.

Key Takeaways

  1. Know the exact URL? Use /scrape with JSON mode—it’s the cheapest (5 credits/page), fastest (synchronous), and most predictable option.
  2. Need autonomous research? Use /agent—it handles discovery automatically with 5 free runs/day, then dynamic pricing based on complexity.
  3. Running batch workflows over a list? Use /agent Parallel Agents with spark-1-fast for a flat 10 credits/cell.
  4. Cost vs. convenience tradeoff: /scrape is most cost-effective when you know your URLs; /agent costs more but eliminates manual URL discovery.

Further Reading