
firecrawl-download

by firecrawl

firecrawl-download helps you download a site or docs section into organized local files under .firecrawl/. It combines site mapping and scraping, supports markdown, links, and screenshots, and is useful for offline docs copies, bulk page capture, and practical Web Scraping workflows.

Stars: 234
Favorites: 0
Comments: 0
Added: Mar 31, 2026
Category: Web Scraping
Install Command
npx skills add https://github.com/firecrawl/cli --skill firecrawl-download
Curation Score

This skill scores 73/100, which means it is listable for directory users: the trigger is clear and the workflow is real, but adoption still requires some guesswork because the repository provides only a single SKILL.md with limited operational detail beyond command examples.

Strengths
  • Strong triggerability: the description names concrete user intents like "download the site," "offline copy," and "download all the docs."
  • Real agent leverage: it combines site mapping and scraping into one command and documents useful options like formats, screenshots, include-paths, and limits.
  • Reasonably actionable examples: the SKILL.md includes quick-start commands and explicitly notes using `-y` to skip confirmation prompts.
Cautions
  • Operational depth is limited: there are no support files, references, install instructions, or decision rules for handling failures, scale limits, or output management.
  • The skill is explicitly marked experimental, which raises some trust and stability risk for production-style agent workflows.
Overview

Overview of firecrawl-download skill

What firecrawl-download does

The firecrawl-download skill is for one specific job: downloading a website or documentation section into organized local files. It combines site discovery and page scraping, then saves each page under .firecrawl/ as markdown, screenshots, or multiple output formats per page.

This is most useful if you want an offline copy of docs, a local research corpus, or a repeatable way to bulk-save pages for later analysis. Compared with a generic scraping prompt, firecrawl-download gives you a clearer path for whole-site capture instead of making you design a crawl workflow from scratch.

Who should use this firecrawl-download skill

Best-fit users are:

  • developers saving documentation locally
  • researchers collecting site content for review
  • teams building a lightweight content archive
  • agents that need a practical “download this site” workflow with less guesswork

If your real goal is “save this site as usable local files,” this skill is a better fit than a broad web scraping prompt.

What users care about before installing

Most install decisions for firecrawl-download come down to four questions:

  • Can it handle an entire site or docs section, not just one page?
  • Does it save output in a usable local structure?
  • Can it filter scope so you do not download the wrong pages?
  • Does it support multiple output types like markdown and screenshots?

Based on the skill source, the answer is yes to all four. The main caveat is that it is marked experimental, so treat it as a convenience workflow rather than a deeply hardened archival system.

Key differentiator for Web Scraping workflows

The differentiator of firecrawl-download for Web Scraping is not raw scraping power alone. It is that the command bundles:

  • site mapping first
  • scraping second
  • per-page file output
  • nested local directories
  • reuse of scrape options during download

That makes it more install-worthy for “download docs” use cases than a plain scrape command that only returns page content.

How to Use firecrawl-download skill

Install context for firecrawl-download

The repository evidence points to this skill living in firecrawl/cli under skills/firecrawl-download. A practical install path is:

npx skills add https://github.com/firecrawl/cli --skill firecrawl-download

After adding it, inspect:

  • skills/firecrawl-download/SKILL.md

This skill has minimal support files, so SKILL.md is the main source of truth.

Read this file first

Start with:

  • skills/firecrawl-download/SKILL.md

That file tells you the real scope quickly: firecrawl download is an experimental convenience command that combines map and scrape, saves results under .firecrawl/, and supports scrape options during download.

Basic firecrawl-download usage

The fastest way to use the firecrawl-download skill is to point it at a docs or content root:

firecrawl download https://docs.example.com

For unattended runs, the skill explicitly recommends:

firecrawl download https://docs.example.com -y

Use -y whenever you want to skip confirmation prompts in agentic or scripted workflows.

Inputs the skill needs to work well

A rough prompt like “download this site” is often too weak. Better inputs include:

  • the root URL
  • the section boundaries you actually want
  • max page count
  • output formats needed
  • whether screenshots matter
  • what to exclude

A stronger request looks like:

“Use firecrawl-download to save https://docs.example.com locally as markdown with screenshots, include only /guides and /api, limit to 50 pages, and skip translated pages.”

That gives the skill enough information to map the right scope before scraping.

Commands that matter most in practice

The source shows a few high-value patterns:

# With screenshots
firecrawl download https://docs.example.com --screenshot --limit 20 -y

# Multiple formats per page
firecrawl download https://docs.example.com --format markdown,links --screenshot --limit 20 -y

# Filter by section
firecrawl download https://docs.example.com --include-paths "/features,/sdks"

These examples matter because they reflect real adoption blockers: too much content, wrong sections, or not enough output fidelity.

What gets written locally

The skill saves output into nested directories under .firecrawl/. When you request multiple formats, each page can produce separate files such as:

  • index.md
  • links.txt
  • screenshot.png

That local file organization is one of the main reasons to install firecrawl-download rather than rely on a one-off scrape prompt.
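If a run requested all three outputs, a quick completeness check can flag pages that are missing one of them. This is a sketch only: the per-page directory layout under .firecrawl/ is inferred from the file names above, not documented in detail.

```shell
#!/bin/sh
# Flag pages that are missing one of the requested outputs.
# Assumes one subdirectory per page under .firecrawl/ containing
# index.md, links.txt, and screenshot.png (layout inferred, not documented).
for page in .firecrawl/*/; do
  for f in index.md links.txt screenshot.png; do
    [ -f "$page$f" ] || echo "missing: $page$f"
  done
done
```

Run it after a multi-format download; any "missing:" line points at a page worth re-scraping.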

How to turn a rough goal into a usable prompt

If your first thought is:

  • “download this docs site”

rewrite it as:

  • target URL
  • desired section filters
  • file formats
  • screenshot yes/no
  • page limit
  • any exclusions

Example prompt for an agent:

“Use the firecrawl-download skill to download https://docs.example.com for offline use. Save as markdown plus screenshots, include only /getting-started,/api, cap at 30 pages, and use -y so the run is non-interactive.”

This works better because it removes ambiguity around scope and output.
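That prompt maps almost directly onto the flags documented in this page's examples. A hedged translation, built as a string and printed for review rather than executed; note there is no separate "offline use" flag, since saving locally is the command's default behavior.

```shell
#!/bin/sh
# Build the command as a string and print it, so the scope can be
# reviewed before anything is downloaded. All flags below appear in
# this page's examples.
CMD='firecrawl download https://docs.example.com \
  --include-paths "/getting-started,/api" \
  --format markdown --screenshot --limit 30 -y'
echo "$CMD"
# When the scope looks right, run it with: eval "$CMD"
```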

Suggested workflow for reliable results

A practical firecrawl-download workflow is:

  1. Start with the smallest useful docs section.
  2. Add --include-paths before increasing page count.
  3. Run with --limit on the first pass.
  4. Check the .firecrawl/ output structure.
  5. Add --screenshot or multiple formats only if you actually need them.
  6. Expand the crawl after the first sample looks right.

This avoids the common failure mode of downloading too much, too soon.
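The six steps can be sketched as a small script. This is illustrative only: the flags come from this page's examples, the /api section is a placeholder for your smallest useful section, and the function body assumes the firecrawl CLI is on PATH.

```shell
#!/bin/sh
# Sample-first workflow sketch. Flags come from this page's examples;
# "/api" is a placeholder for your smallest useful section.
sample_first() {
  root="$1"
  # Steps 1-3: smallest useful section, scoped with --include-paths,
  # capped with --limit.
  firecrawl download "$root" --include-paths "/api" --limit 10 -y
  # Step 4: inspect the local output structure before expanding.
  find .firecrawl -type f | sort | head -20
  # Steps 5-6: add --screenshot or --format, and raise --limit, only
  # after this first sample looks right.
}

if command -v firecrawl >/dev/null 2>&1; then
  sample_first "https://docs.example.com"
else
  echo "firecrawl CLI not found; install it before running sample_first"
fi
```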

When to use firecrawl-download instead of a normal scrape

Use firecrawl-download when you need:

  • many pages, not one
  • local files, not just returned text
  • a browsable offline copy
  • a quick docs snapshot for review or reference

Use a normal scrape when you only need one page or highly custom extraction logic. The value of firecrawl-download is workflow speed for site-scale saving.

Constraints and tradeoffs to know early

The biggest practical constraints from the skill source are:

  • it is marked experimental
  • it is optimized as a convenience command
  • output quality still depends on target site structure and your scope filters
  • broad runs without limits can be noisy or excessive

So the skill is a strong fit for controlled docs downloads, but not a guarantee of perfect archival completeness.

firecrawl-download skill FAQ

Is firecrawl-download good for beginners?

Yes, especially if your task is simply “save docs locally.” The command examples are straightforward, and the interactive wizard helps. Beginners should still start with a small --limit and narrow --include-paths to avoid oversized downloads.

What is the real difference from a generic AI scraping prompt?

A generic prompt can describe the task, but firecrawl-download already encodes the useful pattern: map the site, scrape each page, and save files in directories. That reduces setup friction and makes the workflow more repeatable.

Is firecrawl-download only for documentation sites?

No, but documentation is the clearest fit. It works best on sites where page structure and paths are reasonably predictable. Highly dynamic or poorly scoped sites may require more filtering or a different approach.

Can firecrawl-download save more than markdown?

Yes. The skill source explicitly shows multiple formats per page and optional screenshots. That is important if you need both readable text and supporting visual capture.

When should I not use firecrawl-download?

Skip firecrawl-download if you only need:

  • one page
  • a custom extraction schema
  • deep post-processing during scrape time
  • a fully robust archival pipeline with stricter guarantees

In those cases, a narrower scrape command or a more custom workflow may be a better fit.

How to Improve firecrawl-download skill

Give firecrawl-download tighter scope first

The easiest way to improve firecrawl-download results is to reduce ambiguity. Use:

  • --include-paths
  • --limit
  • a clear docs root URL

A scoped 20-page run is usually more useful than an uncontrolled full-site run.

Choose outputs based on the actual downstream job

Do not request every format by default. Pick formats that match the next step:

  • markdown for reading, search, and LLM ingestion
  • links when structure matters
  • --screenshot when layout or UI evidence matters

This keeps runs lighter and output easier to review.

Use a sample run before full download

A strong iteration pattern is:

firecrawl download https://docs.example.com --include-paths "/api" --limit 10 -y

Review the saved files, then expand to more sections or higher limits. This catches bad scope decisions early.

Common failure modes and how to avoid them

Typical problems are:

  • downloading the wrong sections
  • collecting too many pages
  • forgetting -y in automated runs
  • asking for outputs you do not actually need

The fix is simple: specify scope, limit the first run, and choose outputs intentionally.
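One way to make those fixes habitual is a small guard-rail wrapper that refuses to run without explicit scope. The wrapper itself is hypothetical; only the flags it forwards come from this page's examples.

```shell
#!/bin/sh
# Hypothetical guard-rail wrapper: refuse to download without an
# explicit path filter, and default to a small page limit.
scoped_download() {
  url="$1"; paths="$2"; limit="${3:-20}"
  if [ -z "$url" ] || [ -z "$paths" ]; then
    echo "usage: scoped_download <url> <include-paths> [limit]" >&2
    return 1
  fi
  firecrawl download "$url" --include-paths "$paths" --limit "$limit" -y
}

# Example: scoped_download https://docs.example.com "/guides" 30
```

Defaulting the limit to a small number keeps an absent-minded third argument from turning into a full-site crawl.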

Improve prompt quality for agent-driven usage

If an agent is calling the skill, ask for:

  • exact start URL
  • desired local output purpose
  • sections to include
  • sections to avoid
  • output formats
  • run size limit

Good prompt:

“Use firecrawl-download to create an offline markdown copy of https://docs.example.com, only for /guides and /reference, with screenshots for each page, limited to 40 pages, and save non-interactively.”

That produces better execution than “download the docs.”

How to iterate after the first output

After the first pass, evaluate:

  • Did .firecrawl/ contain the pages you expected?
  • Were there too many irrelevant pages?
  • Did you need screenshots or only text?
  • Should the next run widen or narrow include paths?

The best way to improve the firecrawl-download skill is not to rerun blindly, but to adjust scope and output choices based on what the first batch actually produced.
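A quick tally of the first batch helps answer those questions before the next run. The file extensions are assumptions based on the output names mentioned earlier (index.md, links.txt, screenshot.png).

```shell
#!/bin/sh
# Tally the first batch by output type to guide the next run's
# --format and --screenshot choices.
echo "markdown pages: $(find .firecrawl -name '*.md' 2>/dev/null | wc -l)"
echo "screenshots:    $(find .firecrawl -name '*.png' 2>/dev/null | wc -l)"
echo "link lists:     $(find .firecrawl -name 'links.txt' 2>/dev/null | wc -l)"
```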

Ratings & Reviews

No ratings yet