
firecrawl-scrape

by firecrawl

firecrawl-scrape helps extract clean, LLM-friendly content from known URLs, including JS-rendered pages. Use it to scrape markdown, links, or page-specific answers with Firecrawl CLI or npx firecrawl.

Stars: 234
Favorites: 0
Comments: 0
Added: Mar 31, 2026
Category: Web Scraping
Install Command
npx skills add https://github.com/firecrawl/cli --skill firecrawl-scrape
Curation Score

This skill scores 72/100, which means it is acceptable to list for directory users who want a clear URL-scraping command, but it is not especially complete as an install-decision page. The repository evidence shows strong triggerability and practical command examples for scraping static or JS-rendered pages into markdown, including multi-URL use, output formats, and query-based extraction. However, adoption clarity is held back by very sparse top-level description text, no install command in SKILL.md, and no support files or deeper operational guidance.

72/100
Strengths
  • Strong trigger cues in the description explicitly map user intents like "scrape", "fetch", and "read this webpage" to this skill.
  • Quick-start examples show concrete usage patterns: basic scrape, main-content-only, JS wait, multiple URLs, alternate formats, and page querying.
  • Operational value is specific compared with a generic prompt: it directs agents to use `firecrawl scrape`/`npx firecrawl`, save outputs, and prefer this over WebFetch for webpage extraction.
Cautions
  • SKILL.md does not include an install command, so users still need outside context to set up the CLI before they can run it.
  • Repository support is thin beyond one markdown file; there are no scripts, references, or companion resources for troubleshooting, auth/setup, or edge-case handling.
Overview

Overview of firecrawl-scrape skill

What firecrawl-scrape does

The firecrawl-scrape skill is for extracting clean, LLM-friendly content from one or more web pages when you already know the URL. It is built for practical page retrieval, not broad site discovery: give it a page, and it returns structured output such as markdown, links, or a direct query answer based on that page.

Who should use firecrawl-scrape

This skill fits users who need reliable page content from:

  • documentation pages
  • blog posts
  • pricing pages
  • product pages
  • JavaScript-rendered sites and SPAs

It is especially useful if ordinary fetch tools fail on client-rendered pages or return noisy HTML that is awkward to pass into an LLM.

The real job-to-be-done

Most users do not want “web scraping” in the abstract. They want one of these outcomes:

  • read a page into markdown for later analysis
  • pull the main content without headers and footers
  • extract links alongside page text
  • ask a focused question about a known URL
  • scrape several known URLs in parallel

That is where firecrawl-scrape is stronger than a generic prompt that says “read this webpage.”

Why users pick this skill over generic fetch

The main differentiator is that firecrawl-scrape is purpose-built for webpage content extraction, including JS-rendered pages, and returns output optimized for LLM workflows. The upstream skill explicitly says to use it instead of WebFetch for that job. That matters if your usual browser or fetch path misses rendered content, navigation clutter, or link context.

Best-fit and misfit in one glance

Best fit:

  • you already have the URL
  • you want page content, not site-wide exploration
  • you need markdown or links in a machine-usable format
  • the page may require render time before content appears

Misfit:

  • you need to discover URLs first
  • you need whole-site traversal
  • you need interaction beyond page scraping
  • you only need a simple static HTML fetch and already trust another tool

How to Use firecrawl-scrape skill

firecrawl-scrape install context

This skill lives in the firecrawl/cli repository under skills/firecrawl-scrape. The skill itself is invocation guidance for the Firecrawl CLI, so the practical requirement is access to the firecrawl command or npx firecrawl. The examples in the skill use both forms:

  • firecrawl scrape ...
  • npx firecrawl ...

If your environment does not already have the CLI available, use the npx firecrawl form to reduce setup friction.
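If you want a script to work in both setups, a minimal sketch is to detect which form is available at runtime. This assumes both invocation forms accept the same arguments, which is what the skill's own examples suggest:

```shell
# Pick whichever invocation form is available: the installed CLI,
# or npx as a fallback. Assumes both accept the same arguments.
if command -v firecrawl >/dev/null 2>&1; then
  runner="firecrawl"
else
  runner="npx firecrawl"
fi
echo "runner: $runner"
```

You can then invoke `$runner scrape ...` everywhere else in the script.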

What input firecrawl-scrape needs

At minimum, firecrawl-scrape needs a concrete URL. From there, the quality of output depends on what else you specify:

  • output format needed: markdown, links, or both
  • whether to keep only main content
  • whether the page needs render delay with --wait-for
  • whether you want raw page content saved to a file
  • whether you want a targeted answer using --query

This is not a skill for vague goals like “research this company online.” It is for “scrape this exact page and return useful output.”

The fastest successful first command

If you just need readable page content, start here:

firecrawl scrape "<url>" -o .firecrawl/page.md

If the page is cluttered with navigation or sidebars, use:

firecrawl scrape "<url>" --only-main-content -o .firecrawl/page.md

If the page is a SPA or loads content after render:

firecrawl scrape "<url>" --wait-for 3000 -o .firecrawl/page.md

When to use main-content mode

--only-main-content is one of the highest-value options because it often improves downstream summarization and extraction quality. Use it when your goal is:

  • summarizing an article
  • extracting product or pricing details
  • feeding content into another LLM step
  • reducing token waste from menus, footers, and repeated page chrome

Skip it if you explicitly need navigation links or surrounding layout context.

How to handle JavaScript-rendered pages

A common adoption blocker is pages that look fine in a browser but return incomplete content through simple fetch methods. firecrawl-scrape addresses that with render-aware scraping. In practice, if content appears late, add --wait-for with a realistic delay such as 3000.

Use render waiting when:

  • product specs populate after page load
  • documentation content hydrates client-side
  • pricing tables appear after scripts run

Do not add long waits by default. Start small and only increase delay when output is clearly missing content.
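One way to decide whether a rerun with --wait-for is worth it is to check the size of the saved output. This is only a heuristic sketch: the 500-byte threshold is an arbitrary assumption, not a Firecrawl default, and should be tuned per site:

```shell
# Heuristic: flag a scrape as possibly under-rendered if the saved
# markdown is suspiciously small. Threshold is a guess; tune per site.
check_needs_wait() {
  file="$1"
  min_bytes="${2:-500}"
  size=$(cat "$file" 2>/dev/null | wc -c | tr -d ' ')
  [ "${size:-0}" -lt "$min_bytes" ]
}

# Example: rerun with --wait-for only when the first pass looks thin.
# check_needs_wait .firecrawl/page.md && firecrawl scrape "<url>" --wait-for 3000 -o .firecrawl/page.md
```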

How to scrape multiple URLs efficiently

The skill supports multiple URLs in one command and notes that they are scraped concurrently. That makes it useful for small known-page batches such as:

  • several docs pages
  • a homepage, pricing page, and FAQ
  • a blog post set you already selected

Example:

firecrawl scrape https://example.com https://example.com/blog https://example.com/docs

This is more appropriate than a crawl when you already know the exact targets.
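For batches where you want each page saved to its own file, one approach is a small loop that derives a filename from the URL. This sketch uses only the flags shown above; the slug scheme and the `echo` prefix (which prints the commands instead of running them) are our own conventions — drop the `echo` to actually execute:

```shell
# Build one output file per URL. The slug function and directory layout
# are conventions for this sketch, not part of the Firecrawl CLI.
slug() {
  printf '%s' "$1" | tr -c 'a-zA-Z0-9' '_'
}

for url in "https://example.com" "https://example.com/blog" "https://example.com/docs"; do
  echo firecrawl scrape "$url" -o ".firecrawl/$(slug "$url").md"
done
```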

If your next step depends on both readable content and page references, request multiple formats:

firecrawl scrape "<url>" --format markdown,links -o .firecrawl/page.json

This is a strong choice for workflows like:

  • extract content, then inspect outbound links
  • build citation-aware notes
  • separate body text from navigation and referenced destinations

Choose JSON output when you need structured post-processing rather than a single markdown file.

How to use firecrawl-scrape for targeted questions

One of the most practical firecrawl-scrape usage patterns is asking a page-specific question during scraping:

firecrawl scrape "https://example.com/pricing" --query "What is the enterprise plan price?"

This works best when:

  • the answer is likely on one page
  • you want a focused extraction instead of full-page review
  • you want to reduce manual reading time

It is weaker when the answer spans multiple pages or requires comparing several documents.

Turn a rough request into a strong prompt

Weak request:

  • “Scrape this site and tell me what matters.”

Strong request:

  • “Use firecrawl-scrape on https://example.com/pricing with --only-main-content. Save markdown to .firecrawl/pricing.md. Then extract plan names, monthly prices, annual billing notes, and enterprise contact language.”

Why this is better:

  • it gives a specific URL
  • it chooses the right output mode
  • it defines what to extract after scraping
  • it reduces ambiguity about scope

Suggested workflow for firecrawl-scrape

A good practical sequence is:

  1. Confirm you have the exact page URL.
  2. Start with markdown extraction.
  3. Add --only-main-content if the page is noisy.
  4. Add --wait-for if rendered content is missing.
  5. Switch to --format markdown,links if link structure matters.
  6. Use --query only when the task is narrow and page-bounded.

This follows the upstream positioning of scrape as a middle step in a broader workflow: search → scrape → map → crawl → interact.
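The sequence above can be sketched as a simple flag accumulator. The three decision variables are placeholders you would set after inspecting the first run; the flags themselves are the ones documented in SKILL.md:

```shell
# Accumulate options based on what the first run revealed.
# These three booleans are placeholders, not CLI-detected values.
noisy=yes          # page chrome polluted the output
missing_render=no  # content was absent until JS ran
need_links=no      # downstream step needs link structure

flags=""
[ "$noisy" = yes ] && flags="$flags --only-main-content"
[ "$missing_render" = yes ] && flags="$flags --wait-for 3000"
[ "$need_links" = yes ] && flags="$flags --format markdown,links"

echo "firecrawl scrape \"<url>\"$flags -o .firecrawl/page.md"
```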

Files to read first in the repository

Read skills/firecrawl-scrape/SKILL.md first. It contains nearly all of the practical value:

  • when to use the skill
  • quick-start commands
  • supported options
  • usage tips

Because this skill directory entry is install-oriented, the key pre-install takeaway is simple: the source document is concise, and there are no extra helper scripts or references you need to inspect before trying it.

Practical adoption tips that change output quality

A few choices matter disproportionately:

  • Prefer exact URLs over top-level domains.
  • Use --only-main-content for analysis-heavy tasks.
  • Use --wait-for only when output is visibly incomplete.
  • Save outputs to .firecrawl/ so you can inspect raw results before chaining more automation.
  • Use --query for page-local facts, not open-ended research.

These small decisions usually matter more than adding more prompt wording.
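To keep raw results inspectable across runs, one option is a dated naming scheme inside .firecrawl/. The layout here is a suggestion, not a Firecrawl convention:

```shell
# One file per run, so earlier raw scrapes survive a rerun.
mkdir -p .firecrawl
stamp=$(date +%Y%m%d-%H%M%S)
out=".firecrawl/page-${stamp}.md"
echo "would save to: $out"
# firecrawl scrape "<url>" -o "$out"
```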

firecrawl-scrape skill FAQ

Is firecrawl-scrape better than a normal prompt with a URL?

Usually yes, if the job is actual webpage extraction. The firecrawl-scrape skill gives a clear invocation path, supports JS-rendered pages, can return markdown or links, and exposes scraping-specific options. A normal prompt may work for simple reading tasks, but it is less reliable when pages need rendering or cleaner output structure.

When should I use firecrawl-scrape instead of WebFetch?

Use firecrawl-scrape when you want webpage content extraction. The upstream skill explicitly recommends it instead of WebFetch for that purpose. That recommendation is most relevant for rendered pages, cleaner markdown output, and scraping workflows that need repeatable CLI behavior.

Is firecrawl-scrape beginner-friendly?

Yes, relative to many scraping tools. The first-run path is short: provide a URL, run a command, inspect the output. You do not need to understand full crawling strategy to get value. The main thing beginners must know is that this is page scraping, not site-wide exploration.

Can firecrawl-scrape handle SPAs and dynamic pages?

Yes. That is one of its core reasons to exist. If a page relies on JavaScript rendering, use --wait-for when needed so the content has time to appear before extraction.

When is firecrawl-scrape the wrong choice?

Avoid it when:

  • you do not know the target URL yet
  • you need broad domain discovery
  • you need recursive site traversal
  • your task requires interaction rather than extraction
  • the answer must be synthesized across many pages you have not identified

In those cases, search, map, crawl, or other tools are a better first step.

Do I need to install the whole repository to use it?

You need access to the Firecrawl CLI behavior the skill references, but the skill itself is lightweight. For decision-making, there is little repo overhead here: the practical instructions are concentrated in SKILL.md, and there are no companion scripts or resource folders you need to master first.

How to Improve firecrawl-scrape skill

Give firecrawl-scrape narrower goals

The most common quality issue is overbroad intent. Better results come from requests like:

  • "extract the pricing table"
  • "return markdown plus links"
  • "answer this one question from the page"

not:

  • "scrape everything useful"

The narrower the page task, the less cleanup you need afterward.

Improve inputs with page-aware instructions

Strong inputs combine URL, output mode, and extraction target. Example:

firecrawl scrape "https://example.com/docs/auth" \
  --only-main-content \
  -o .firecrawl/auth.md

Then tell the agent exactly what to do with that file:

  • summarize setup steps
  • list required headers
  • extract code examples
  • compare auth methods

This two-step pattern is often more dependable than asking for scraping and analysis in one vague request.

Fix missing content before changing the whole workflow

If output looks thin, first test whether the page needs rendering time:

firecrawl scrape "<url>" --wait-for 3000 -o .firecrawl/page.md

Many users switch tools too early when the real issue is simply that the page had not finished rendering.

Reduce noise before downstream analysis

If the result is full of navigation, cookie text, or footer content, switch to:

firecrawl scrape "<url>" --only-main-content -o .firecrawl/page.md

This often improves:

  • summarization quality
  • extraction precision
  • token efficiency
  • consistency across similar pages

Use structured output when you plan to automate

If the scraped page feeds another step, ask for structured formats up front rather than reparsing markdown later:

firecrawl scrape "<url>" --format markdown,links -o .firecrawl/page.json

That makes firecrawl-scrape install decisions easier too: if your workflow depends on link-aware automation, this skill has a clearer fit than plain text fetch tools.

Iterate after the first run, not before

A productive iteration pattern is:

  1. run the simplest scrape
  2. inspect what is missing or noisy
  3. add one option to fix that specific issue
  4. rerun and compare

Typical iteration path:

  • baseline scrape
  • add --only-main-content
  • add --wait-for
  • add --format markdown,links
  • use --query for direct extraction

This is faster than designing a complex command before you have seen the page output.
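Step 4 (rerun and compare) can be as simple as a diff between the baseline and the rerun. The file contents below are stand-ins for real firecrawl outputs:

```shell
# Compare a baseline scrape with a rerun to see what the new option changed.
# Sample files stand in for real firecrawl outputs.
printf 'nav\nbody text\nfooter\n' > baseline.md
printf 'body text\n' > rerun.md

if diff -q baseline.md rerun.md >/dev/null; then
  echo "no change between runs"
else
  echo "outputs differ - inspect with: diff baseline.md rerun.md"
fi
```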

Common failure modes to watch for

The biggest practical issues are:

  • using a homepage when the real target is a subpage
  • expecting scrape to behave like crawl
  • not waiting for JS-rendered content
  • asking --query questions that require multiple pages
  • saving only final summaries instead of raw scrape output

Most of these are avoidable with clearer scope and one inspection pass.

How advanced users get more from firecrawl-scrape

Advanced users usually improve results by composing firecrawl-scrape with later steps, not by overcomplicating the scrape itself. A strong pattern is:

  • scrape exact pages cleanly
  • save raw outputs
  • run extraction, comparison, or synthesis afterward

That keeps firecrawl-scrape focused on the page-retrieval layer, where it performs best.

Ratings & Reviews

No ratings yet