firecrawl-map
by firecrawl

firecrawl-map helps agents discover and list URLs on a site, with options for search filtering, limits, JSON output, sitemap modes, and subdomain control before deeper scraping or crawling.
This skill scores 76/100, which means it is a solid directory listing candidate: agents get clear trigger cues, concrete CLI examples, and enough option coverage to use it with less guesswork than a generic prompt. Directory users can make a credible install decision, though they should expect a fairly lean skill page without much edge-case or setup guidance.
- Very strong triggerability: the description names explicit user intents like “map the site,” “find the URL for,” and “list all pages.”
- Operationally clear examples show real commands for both targeted search and full URL discovery, including output files and JSON mode.
- Useful leverage in a broader workflow: it positions map as a step in a search → scrape → map → crawl → interact pattern.
- Install/adoption clarity is limited because the skill does not include an install command or setup guidance in SKILL.md.
- Support material is minimal: no scripts, references, resources, or explicit constraints/edge-case guidance are present.
Overview of firecrawl-map skill
What firecrawl-map does
firecrawl-map is a focused skill for discovering URLs on a website. It is best used when you know the domain but do not know the exact page, or when you want a fast inventory of site structure before scraping, crawling, or extracting content.
Who should use firecrawl-map skill
The best fit for the firecrawl-map skill is anyone doing web research, site discovery, or pre-scrape planning:
- AI agents that need to find the right page before deeper extraction
- Developers building web scraping workflows
- Researchers auditing a site's public URL footprint
- Operators who need a quick list of URLs without launching a full crawl
The real job-to-be-done
Users typically do not want “all pages” as an end in itself. They want to answer questions like:
- “Where is the authentication doc on this site?”
- “What pages exist under this domain before I scrape?”
- “Is there a sitemap-backed shortcut to discover URLs quickly?”
- “Should I map first or jump straight to crawl?”
That makes firecrawl-map especially useful as a discovery step, not a final data extraction step.
Why people choose firecrawl-map
The main differentiator is speed and scope control. Compared with a generic prompt like “find the docs page,” the firecrawl-map skill gives you a reproducible CLI path for listing URLs, filtering by search terms, and exporting output for later steps.
Key strengths surfaced by the repository:
- Direct CLI usage with firecrawl map
- Optional --search filtering for large sites
- URL inventory output in text or JSON
- Supports sitemap strategy selection
- Useful as a middle step between search and deeper crawl/scrape work
What it is not for
firecrawl-map is not the right tool when you need:
- Full page content extraction
- Interactive browsing
- Detailed structured scraping from each page
- Rich site traversal logic beyond URL discovery
In those cases, mapping is the setup step, not the finish line.
How to Use firecrawl-map skill
Install context for firecrawl-map skill
This skill lives in the firecrawl/cli repository under skills/firecrawl-map. It is designed to be invoked in environments that can run:
firecrawl or npx firecrawl
If your agent or local workflow can execute Bash commands, this firecrawl-map install path is usually enough:
npx firecrawl map "<url>" --limit 100
If you already have the Firecrawl CLI available globally, use:
firecrawl map "<url>" --limit 100
Read this file first
Start with:
skills/firecrawl-map/SKILL.md
This repository slice is small, so there is not much supporting material to inspect. That is good for adoption speed, but it also means you should be explicit in your prompts about domain, goal, and output format.
Basic firecrawl-map usage patterns
The skill supports two common usage modes.
- Find a likely page by topic:
firecrawl map "https://example.com" --search "authentication" -o .firecrawl/filtered.txt
- Get a broader URL inventory:
firecrawl map "https://example.com" --limit 500 --json -o .firecrawl/urls.json
This is the core firecrawl-map usage pattern: start narrow with search if you are hunting for one page, or start broad with a capped URL list if you are planning the next scraping step.
What input the skill needs
To use the firecrawl-map skill well, provide these inputs clearly:
- The root URL or domain
- Whether you need one likely page or many URLs
- A search phrase, if you know the topic
- Desired limit on returned URLs
- Output format: plain text or JSON
- Whether subdomains should count
- How to treat sitemaps
Weak input:
- “Find docs on this site”
Strong input:
- “Map https://docs.example.com, search for authentication, return top matching URLs as JSON, and include subdomains only if the main docs domain has too few results.”
The stronger version reduces guesswork and makes the command choice obvious.
How to turn a rough request into a strong prompt
A good rule for prompting firecrawl-map is to specify five things in one sentence:
- site
- intent
- scope
- filter
- output
Example:
- “Use firecrawl-map on https://example.com to list up to 200 public URLs, prefer sitemap discovery, skip unrelated subdomains, and save JSON output for later scraping.”
Example for targeted discovery:
- “Use firecrawl-map to find the page on https://example.com most related to pricing API limits, and write matching URLs to a text file.”
Best workflow: map before scrape or crawl
A practical workflow looks like this:
- Use firecrawl map with --search if you are trying to locate one page.
- Use firecrawl map with --limit and --json if you need a broader URL set.
- Review the returned URLs.
- Select the most relevant pages.
- Move to scrape or crawl only after you know the site structure well enough.
This saves time and cost compared with scraping blindly.
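The “review the returned URLs” step above can be sketched in code. This is a minimal, hypothetical example, assuming the map output has already been loaded as a plain list of URL strings (the actual --json shape may differ): it ranks mapped URLs by how often topic terms appear in them, so the best scrape candidates surface first.

```python
# Rank mapped URLs by topic relevance before scraping.
# Assumes `urls` is a plain list of URL strings loaded from map output.
def select_candidates(urls, terms, top_n=5):
    def score(url):
        lowered = url.lower()
        return sum(lowered.count(t.lower()) for t in terms)

    ranked = sorted(urls, key=score, reverse=True)
    return [u for u in ranked if score(u) > 0][:top_n]

# Sample data standing in for real map output.
mapped = [
    "https://docs.example.com/guides/authentication",
    "https://docs.example.com/pricing",
    "https://docs.example.com/reference/auth-tokens",
    "https://example.com/blog/company-news",
]

print(select_candidates(mapped, ["authentication", "auth"]))
# The authentication guide ranks first; unrelated pages are dropped.
```

The scoring here is deliberately crude; the point is that a few lines of filtering between map and scrape keep the expensive steps focused on relevant pages.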
Options that materially change output quality
The most important options are:
- --search <query>: best for locating a topic page on a large site
- --limit <n>: prevents oversized result sets
- --json: makes downstream filtering and automation easier
- --sitemap <include|skip|only>: useful when sitemap coverage matters
- --include-subdomains: expands scope, but can add noise
- -o, --output <path>: makes results reusable in a pipeline
If results are noisy, the first things to tighten are search phrase, domain scope, and subdomain inclusion.
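A quick way to decide which of those to tighten is to count mapped URLs per hostname. This sketch assumes the URL list has already been loaded from the output file (names and sample data here are illustrative, not part of the CLI):

```python
from collections import Counter
from urllib.parse import urlsplit

# Count mapped URLs per hostname to see where noise comes from
# before tightening --search or dropping --include-subdomains.
def urls_per_host(urls):
    return Counter(urlsplit(u).hostname for u in urls)

# Sample data standing in for a real map result.
urls = [
    "https://docs.example.com/api",
    "https://docs.example.com/auth",
    "https://support.example.com/tickets",
    "https://www.example.com/about",
]

print(urls_per_host(urls))
```

If most hits land on hostnames you do not care about, drop subdomain inclusion or narrow the root URL before touching the search phrase.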
Choosing sitemap strategy
The --sitemap option matters more than many users expect:
- only: fastest when you trust the site's sitemap and want cleaner coverage
- include: good default when you want sitemap help without depending on it fully
- skip: useful when sitemap results are stale, incomplete, or misleading
For documentation sites, include or only often produces better mapping results than unconstrained discovery.
When to include subdomains
Use --include-subdomains only if the target content may live outside the main hostname, such as:
- docs.example.com
- developers.example.com
- support.example.com
Do not enable it by default for corporate sites unless you truly want broader coverage. It can flood your URL list with marketing, support, or app surfaces unrelated to your goal.
Practical examples users actually need
Find a login or auth doc page:
firecrawl map "https://docs.example.com" --search "authentication" -o .firecrawl/auth-pages.txt
Get a reusable JSON URL inventory:
firecrawl map "https://example.com" --limit 300 --json -o .firecrawl/site-map.json
Prefer sitemap-only discovery for a docs site:
firecrawl map "https://docs.example.com" --sitemap only --limit 500 --json
Broaden scope to subdomains when docs location is unclear:
firecrawl map "https://example.com" --search "API reference" --include-subdomains
Common adoption blockers
The main reasons people struggle with the firecrawl-map skill are not installation issues but request quality issues:
- Starting with too broad a domain
- Forgetting to add --search when hunting one page
- Pulling too many URLs without a limit
- Including subdomains too early
- Treating map as a content extraction tool
If the first result is messy, narrow the site and sharpen the topic before changing tools.
firecrawl-map skill FAQ
Is firecrawl-map better than a normal prompt?
Yes when the task is URL discovery on a known site. A normal prompt may guess likely pages, but firecrawl-map gives a concrete, repeatable way to enumerate and filter URLs from the target domain.
Is firecrawl-map skill good for beginners?
Yes, because the command surface is small. The easiest starting point is one of these two commands:
firecrawl map "https://example.com" --search "pricing"
firecrawl map "https://example.com" --limit 100 --json
The main beginner mistake is asking it to extract page content, which is outside the skill's core purpose.
When should I use firecrawl-map instead of crawling?
Use firecrawl-map first when you need to understand site structure or locate candidate pages. Use crawling later when you need broader traversal or page-level processing after discovery.
When should I not use firecrawl-map?
Skip it if:
- You already know the exact URL
- You need page text, metadata, or structured extraction
- You need browser interaction rather than URL listing
- The task is not site discovery
Does firecrawl-map work well for large sites?
Yes, but only if you control scope. Use --search, --limit, and sitemap strategy deliberately. Large sites are where firecrawl-map usage gets the most value, but also where loose prompts create the most noise.
What output format should I choose?
Choose plain text when a human just needs a quick page list. Choose --json when another tool, script, or downstream step will process the results.
How to Improve firecrawl-map skill
Start with a narrower target than you think
The easiest way to improve firecrawl-map results is to reduce scope early. If you know the content is likely in docs, use the docs hostname directly instead of the company's homepage.
Better:
https://docs.example.com
Worse:
https://example.com
Use search phrases that match page intent
For the firecrawl-map skill, search quality matters more than keyword quantity. Short intent phrases usually beat stuffed queries.
Better:
- authentication
- rate limits
- API reference
Worse:
where can I find complete developer authentication API reference and login documentation
The better version is easier for URL filtering and usually returns cleaner matches.
Pick JSON whenever results feed another step
If your next step is scrape, filter, classify, or deduplicate, use:
--json
This small choice makes the firecrawl-map workflow much more automation-friendly and reduces manual cleanup.
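As a concrete example of that downstream step, here is a hedged sketch of normalizing and deduplicating a URL inventory. It assumes the --json output can be loaded as a flat list of URL strings (an assumption; adjust the loading code to match the actual file shape):

```python
from urllib.parse import urlsplit, urlunsplit

# Collapse near-duplicate URLs (trailing slashes, fragments)
# before feeding the inventory to the next pipeline step.
def normalize(url):
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment entirely; it never changes the fetched page.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))

def dedupe(urls):
    seen, out = set(), []
    for u in urls:
        n = normalize(u)
        if n not in seen:
            seen.add(n)
            out.append(n)
    return out

# Sample data standing in for a parsed urls.json file.
raw = [
    "https://example.com/docs/",
    "https://example.com/docs",
    "https://example.com/docs#install",
    "https://example.com/pricing",
]

print(dedupe(raw))
# The three /docs variants collapse into one entry.
```

Running this kind of cleanup once, right after map, keeps every later scrape or crawl step from paying for the same page twice.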
Use map iteratively, not once
A strong workflow is:
- Run a narrow --search
- Inspect likely URLs
- Run a second map on the best subdomain or section
- Increase --limit only if needed
- Move to scrape/crawl after discovery stabilizes
This beats one huge run because it keeps signal high.
Watch for common failure modes
Typical failure modes with firecrawl-map:
- Too many irrelevant URLs from broad domains
- Missing target pages because search terms are vague
- Incomplete inventories from relying on the wrong sitemap strategy
- Noisy results from enabling subdomains unnecessarily
Each has a simple fix: tighten site, sharpen query, change sitemap mode, or shrink scope.
Improve prompts by specifying success criteria
Do not just ask for “all URLs.” Say what would count as success.
Example:
- “Use firecrawl-map to find pages related to authentication setup on https://docs.example.com. Return the most relevant URLs first, cap at 50, and save JSON output for follow-up scraping.”
That makes the tool choice, parameters, and stopping point much clearer.
Keep a simple escalation path
Use this practical decision path:
- Need one likely page: map --search
- Need a URL inventory: map --limit --json
- Need page content: scrape after map
- Need broader traversal: crawl after map
This is the most useful way to improve firecrawl-map outcomes without overcomplicating your workflow.
