web-to-markdown
by softaworks
web-to-markdown is a Format Conversion skill that turns live web pages into clean Markdown through the local `web2md` CLI, using a Chromium-family browser for JS-rendered pages, interactive flows, and batch URL conversion. It only runs when explicitly invoked by name.
This skill scores 77/100, which means it is a solid directory listing candidate for users who specifically want webpage-to-Markdown conversion via a local browser-driven CLI. It is clear enough for an agent to follow with less guesswork than a generic prompt, but install-decision clarity is held back by missing setup specifics in the skill itself and its dependence on an external local tool/browser environment.
- Strong operational framing: the skill clearly states what it does, what it will not do, and which inputs to collect before running.
- Real agent leverage over a generic prompt: it targets JS-rendered pages through a local browser stack and documents practical flags like `--print`, `--out`, `--chrome-path`, and `--interactive`.
- Repository evidence is substantive rather than placeholder content, with both SKILL.md and README explaining purpose, workflow, and usage constraints.
- Adoption is less turnkey because SKILL.md has no install command and the skill depends on a locally available `web2md` CLI plus a Chromium-family browser.
- The hard trigger gate requires the user to explicitly name `web-to-markdown`, which improves safety but makes the skill less naturally triggerable from ordinary web-extraction requests.
Overview of web-to-markdown skill
web-to-markdown is a narrowly scoped Format Conversion skill for turning live web pages into clean Markdown through a locally installed web2md CLI. Its value is not “summarize a page” but “render the actual page in a real browser, extract the main article or document body, and convert that result into portable Markdown.” That makes it a strong fit for users dealing with JavaScript-rendered pages, documentation sites, blog posts, gated flows that need interactive rendering, or archiving tasks where simple HTTP fetching is not enough.
Who web-to-markdown is best for
This web-to-markdown skill is best for users who need to:
- convert one or more URLs into readable Markdown
- handle pages that depend on client-side JavaScript
- save content to files for later analysis or reuse
- extract article-like content instead of scraping every page element
If your real goal is “get the main content from a page I can already access in a browser,” this skill is a better fit than a generic prompt.
What makes web-to-markdown different
The important differentiator is the pipeline:
- Puppeteer via a local Chromium-family browser
- Readability for main-content extraction
- Turndown for Markdown conversion
That combination is designed for rendered content, not raw HTML. In practice, that means the web-to-markdown skill can work on pages where ordinary fetch-based tools fail or return incomplete content.
The hard trigger gate matters
This skill has an unusual but important constraint: it must only be used when the user explicitly requests it by name, with wording like `use the skill web-to-markdown`. If that explicit trigger is missing, the skill should not be applied. For directory users, this means adoption is simple, but invocation discipline matters.
Real job-to-be-done
Most users are not looking for “a browser automation skill.” They want one of these outcomes:
- “Turn this article into Markdown I can keep.”
- “Convert this docs page, even though it renders client-side.”
- “Process a batch of URLs into `.md` files.”
- “Open the page in a real browser so I can get past login or verification, then save the content.”
That is the real use case web-to-markdown is optimized for.
When not to choose this skill
Skip web-to-markdown if:
- you only need a quick summary, not Markdown output
- a plain HTTP fetch already gives you the content cleanly
- you need a full crawler or site scraper
- you want Playwright-based automation; this skill explicitly uses `web2md`, not other browser stacks
How to Use web-to-markdown skill
Install context before first use
Treat web-to-markdown as two dependencies:
- the skill itself in your agent environment
- a working local `web2md` CLI plus an available Chromium-family browser
A practical skill install path is:
npx skills add softaworks/agent-toolkit --skill web-to-markdown
The repository is at:
https://github.com/softaworks/agent-toolkit/tree/main/skills/web-to-markdown
Just adding the skill is not enough if your machine cannot run web2md or launch Chrome/Chromium/Brave/Edge. That local browser requirement is the main adoption blocker to check early.
Read these files first
This skill is small, so the best reading order is:
- `skills/web-to-markdown/SKILL.md`
- `skills/web-to-markdown/README.md`
SKILL.md gives you the trigger rule, required inputs, and workflow shape. README.md is where you confirm intended use cases such as JS-rendered pages, interactive mode, and batch conversion.
What input web-to-markdown needs
For reliable web-to-markdown usage, provide:
- a `url` or list of URLs
- output mode:
  - print to stdout with `--print`
  - write to a file with `--out ./file.md`
  - write to a directory with `--out ./some-dir/`
- optional browser controls when needed:
  - `--chrome-path <path>` if browser detection fails
  - `--interactive` for login walls, consent screens, or human verification
If you do not specify output behavior, the agent has to guess. That friction is unnecessary, because output behavior is usually the easiest thing to state explicitly.
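The inputs above map fairly directly onto a command line. The following is a minimal sketch of that mapping, not the skill's actual implementation: the function name `build_web2md_commands` is hypothetical, and it assumes that `web2md` takes the URL as a positional argument and that a batch is run as one command per URL. Only the flags `--print`, `--out`, `--chrome-path`, and `--interactive` come from the skill's documentation.

```python
from typing import List, Optional


def build_web2md_commands(
    urls: List[str],
    out: Optional[str] = None,
    chrome_path: Optional[str] = None,
    interactive: bool = False,
) -> List[List[str]]:
    """Return one argv list per URL; nothing is executed here."""
    if not urls:
        raise ValueError("at least one URL is required")

    commands = []
    for url in urls:
        # Assumption: the URL is passed as a positional argument.
        argv = ["web2md", url]
        # Output mode: --out when a target is given, otherwise --print.
        argv += ["--out", out] if out else ["--print"]
        if chrome_path:
            argv += ["--chrome-path", chrome_path]
        if interactive:
            argv.append("--interactive")
        commands.append(argv)
    return commands
```

Building argv lists instead of a single shell string avoids quoting problems with URLs that contain `&` or `?`. Each list can be passed to `subprocess.run` once `web2md` is confirmed to be installed.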
The exact invocation requirement
This web-to-markdown skill should only be triggered when the user explicitly writes something like:
- `use the skill web-to-markdown ...`
- `use a skill web-to-markdown ...`
If you are testing the skill, say the name directly. This is not optional repository etiquette; it is core execution logic.
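To make the gate concrete, here is a small sketch of how an agent wrapper might check for the trigger before doing anything else. It is illustrative only: the skill expresses this rule in prose, and the function name `is_explicitly_invoked` and the exact pattern are assumptions based on the two example phrasings above.

```python
import re

# Assumption: these two phrasings, taken from the examples above, are the
# accepted triggers; the skill's own rule is written in prose.
_TRIGGER = re.compile(
    r"\buse\s+(?:the|a)\s+skill\s+web-to-markdown\b", re.IGNORECASE
)


def is_explicitly_invoked(request: str) -> bool:
    """Return True only when the request names the skill explicitly."""
    return _TRIGGER.search(request) is not None
```

With this check, a request such as `convert this page` is rejected, while `use the skill web-to-markdown to convert https://example.com/article` passes.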
Turn a rough request into a strong prompt
Weak request:
convert this page
Strong request:
use the skill web-to-markdown to convert https://example.com/article to Markdown and save it to ./notes/article.md
Even better:
use the skill web-to-markdown to convert these 5 docs URLs to Markdown, save them in ./docs-md/, and use interactive mode if a consent screen appears
Good prompts reduce failure by telling the skill:
- what page(s) to process
- where output should go
- whether browser interaction may be needed
- whether this is a one-off or a batch job
Practical command patterns to ask for
Useful web-to-markdown usage patterns include:
- single page to terminal: `--print`
- single page to file: `--out ./page.md`
- many pages to a folder: `--out ./pages/`
- difficult page with visible browser: `--interactive`
- explicit browser binary path: `--chrome-path <path>`
The repository guidance makes these patterns more valuable than open-ended requests like “scrape this site,” which are broader than the skill’s design.
Best workflow for one page
A high-success workflow looks like this:
- confirm the user explicitly invoked `web-to-markdown`
- collect the URL
- decide whether output should print or save
- use `--interactive` only for pages that need human help
- review the Markdown result for missing sections or navigation noise
- rerun with better browser settings if extraction was incomplete
This is faster than trying to overdesign the prompt up front.
Best workflow for multiple URLs
For batch work:
- give the skill a list of URLs
- choose a directory output target
- expect filenames to be derived from page titles when saving to a folder
- spot-check a few outputs before running a large batch
The main reason to batch is consistency. The main risk is assuming every page template on a site extracts equally well.
Common local setup blockers
Most failed web-to-markdown installs are not prompt problems. They are local environment issues:
- `web2md` is not installed or not on `PATH`
- no supported browser is available locally
- browser auto-detection fails, requiring `--chrome-path`
- the page needs a visible browser and human interaction
If you want a quick adoption test, try one public article page and one JS-heavy page before using the skill in production workflows.
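The first two blockers can be checked before any conversion is attempted. The sketch below is a hypothetical preflight helper, not part of the skill. The browser executable names are assumptions that vary by operating system, and a browser that is installed but not on `PATH` (common on macOS and Windows) will be reported as missing, which is exactly the case where `--chrome-path <path>` applies.

```python
import shutil
from typing import Callable, List, Optional

# Assumption: common executable names for Chromium-family browsers; actual
# names vary by operating system and install method.
BROWSER_CANDIDATES = [
    "google-chrome",
    "chromium",
    "chromium-browser",
    "brave-browser",
    "microsoft-edge",
]


def preflight(which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if which("web2md") is None:
        problems.append("web2md is not on PATH")
    if not any(which(name) for name in BROWSER_CANDIDATES):
        problems.append(
            "no Chromium-family browser found on PATH; pass --chrome-path <path>"
        )
    return problems
```

Calling `preflight()` with no arguments uses the real `PATH`. The `which` parameter exists only so the check can be exercised without touching the local machine.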
Output quality expectations
web-to-markdown aims for clean main-content Markdown, not a pixel-perfect copy of the original page. That means:
- article and documentation body content should come through well
- headers, footers, ads, and page chrome are usually de-emphasized
- unusual widgets, app shells, and embedded tools may not convert neatly
That tradeoff is usually desirable for archiving and analysis, but it is worth knowing before you install.
web-to-markdown skill FAQ
Is web-to-markdown better than an ordinary prompt?
Yes, when the real need is rendered-page conversion. A generic prompt can discuss a URL, but it does not inherently open a browser, wait for JavaScript, extract the readable body, and produce Markdown. This web-to-markdown skill is useful because it operationalizes that workflow.
Is web-to-markdown good for beginners?
Yes, if your task is simple: one URL, one output file, straightforward page. The main beginner challenge is local setup, not the skill design. If you can run a local browser automation CLI, the skill is approachable.
Does web-to-markdown handle JavaScript-heavy pages?
That is one of its main reasons to exist. It uses a real local browser through Puppeteer, so it is more suitable for JS-rendered pages than raw-fetch approaches.
Can web-to-markdown get past login or verification screens?
Sometimes, with --interactive. The repository explicitly supports a mode where Chrome is shown and paused so the user can complete human steps. This is a practical advantage for protected or semi-protected pages.
When should I not use the web-to-markdown skill?
Do not use it when:
- the user did not explicitly request `web-to-markdown`
- a simple page fetch would already solve the task
- you need structured scraping across many page components
- you want a non-browser conversion path
The skill is specialized, and that specialization is a strength, not a weakness.
Does it work with any browser?
The documented fit is Chromium-family browsers such as Chrome, Chromium, Brave, or Edge via puppeteer-core. If auto-detection fails, expect to supply a path manually.
Is this only for articles?
No. Articles are the easiest fit, but the web-to-markdown skill can also help with docs pages and other content-heavy pages where “main body extraction” is the right output model. It is less ideal for dashboards or highly interactive apps.
How to Improve web-to-markdown skill
Give web-to-markdown explicit output instructions
A better request is not just “convert this URL,” but:
- print it
- save it to `./tmp/page.md`
- save all results under `./exports/`
This removes guesswork and makes the first run more likely to match your workflow.
Use interactive mode only when the page needs it
--interactive is valuable for consent gates, login flows, and verification prompts, but it is slower and less automatable. For routine public pages, avoid it. For blocked pages, use it early instead of retrying blind.
Test browser detection early
If the first run fails to launch a browser, do not keep changing the prompt. Fix the execution context:
- confirm a Chromium-family browser exists
- provide `--chrome-path <path>` when needed
For many users, this is the single most important web-to-markdown install tip.
Choose representative pages before a big rollout
Before converting hundreds of URLs, test:
- one simple article page
- one JS-rendered page
- one page behind consent or login friction
This tells you whether the skill is a fit for your actual site mix, not just for ideal cases.
Strengthen prompts with page-specific constraints
If you know a page is tricky, say so:
- use the skill web-to-markdown on this docs page; it renders client-side, save to ./docs/intro.md
- use the skill web-to-markdown on this member page with interactive mode because I need to pass a verification screen first
That extra context changes execution quality more than adding generic wording.
Validate the first Markdown result, then iterate
After the first output, check:
- was the main content captured?
- did the output include too much nav or boilerplate?
- was the page only partially rendered?
- did the filename or folder behavior match expectations?
Then rerun with better controls. web-to-markdown usually improves through one targeted retry, not through long speculative prompting.
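Some of these checks can be roughly automated when reviewing many outputs. The following is a heuristic sketch only: the function name `review_markdown` and its thresholds are arbitrary assumptions, not anything the skill or `web2md` provides, and a clean result here does not replace reading the output.

```python
from typing import List


def review_markdown(
    markdown: str, min_words: int = 150, max_link_line_ratio: float = 0.5
) -> List[str]:
    """Return warnings for a converted page; thresholds are starting points."""
    warnings = []
    lines = [line.strip() for line in markdown.splitlines() if line.strip()]
    words = markdown.split()

    if len(words) < min_words:
        warnings.append("very short output: page may be partially rendered")

    # Lines that are only a list-item link usually come from navigation menus.
    link_lines = [
        line for line in lines
        if line.startswith(("- [", "* [")) and line.endswith(")")
    ]
    if lines and len(link_lines) / len(lines) > max_link_line_ratio:
        warnings.append("mostly link lines: output may be navigation or boilerplate")

    if not any(line.startswith("#") for line in lines):
        warnings.append("no headings found: main content may be missing")
    return warnings
```

An empty list suggests the output at least has the shape of an article. Any warning is a cue to open the file and decide whether a rerun with `--interactive` or `--chrome-path <path>` is worth trying.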
Know the main failure modes
Common failure modes are:
- no explicit trigger phrase, so the skill should not run
- local browser launch issues
- pages that need visible interaction
- pages whose “main content” is ambiguous to Readability
- users expecting full-site scraping instead of page conversion
Recognizing these early helps you decide whether to keep using web-to-markdown or switch tools.
Use web-to-markdown for the right output standard
You will get the best results when your success criterion is:
- clean, readable Markdown
- main content over page chrome
- portable output for notes, archives, analysis, or downstream AI processing
If your success criterion is “preserve every layout detail,” this skill is the wrong tool. Matching your expectation to its design is the fastest way to improve results.
