web-to-markdown

by softaworks

web-to-markdown is a Format Conversion skill that turns live web pages into clean Markdown through the local web2md CLI, using a Chromium-family browser for JS-rendered pages, interactive flows, and batch URL conversion. It only runs when explicitly invoked by name.

Stars: 1.3k
Favorites: 0
Comments: 0
Added: Apr 1, 2026
Category: Format Conversion
Install Command
npx skills add softaworks/agent-toolkit --skill web-to-markdown
Curation Score

This skill scores 77/100, which means it is a solid directory listing candidate for users who specifically want webpage-to-Markdown conversion via a local browser-driven CLI. It is clear enough for an agent to follow with less guesswork than a generic prompt, but install-decision clarity is held back by missing setup specifics in the skill itself and its dependence on an external local tool/browser environment.

Strengths
  • Strong operational framing: the skill clearly states what it does, what it will not do, and which inputs to collect before running.
  • Real agent leverage over a generic prompt: it targets JS-rendered pages through a local browser stack and documents practical flags like `--print`, `--out`, `--chrome-path`, and `--interactive`.
  • Repository evidence is substantive rather than placeholder content, with both SKILL.md and README explaining purpose, workflow, and usage constraints.
Cautions
  • Adoption is less turnkey because SKILL.md has no install command and the skill depends on a locally available `web2md` CLI plus a Chromium-family browser.
  • The hard trigger gate requires the user to explicitly name `web-to-markdown`, which improves safety but makes the skill less naturally triggerable from ordinary web-extraction requests.
Overview of web-to-markdown skill

web-to-markdown is a narrowly scoped Format Conversion skill for turning live web pages into clean Markdown through a locally installed web2md CLI. Its value is not “summarize a page” but “render the actual page in a real browser, extract the main article or document body, and convert that result into portable Markdown.” That makes it a strong fit for users dealing with JavaScript-rendered pages, documentation sites, blog posts, gated flows that need interactive rendering, or archiving tasks where simple HTTP fetching is not enough.

Who web-to-markdown is best for

This web-to-markdown skill is best for users who need to:

  • convert one or more URLs into readable Markdown
  • handle pages that depend on client-side JavaScript
  • save content to files for later analysis or reuse
  • extract article-like content instead of scraping every page element

If your real goal is “get the main content from a page I can already access in a browser,” this skill is a better fit than a generic prompt.

What makes web-to-markdown different

The important differentiator is the pipeline:

  • Puppeteer via a local Chromium-family browser
  • Readability for main-content extraction
  • Turndown for Markdown conversion

That combination is designed for rendered content, not raw HTML. In practice, that means the web-to-markdown skill can work on pages where ordinary fetch-based tools fail or return incomplete content.

The hard trigger gate matters

This skill has an unusual but important constraint: it must only be used when the user explicitly requests it by name, with wording like "use the skill web-to-markdown". If that explicit trigger is missing, the skill should not be applied. For directory users, this means the skill will not fire on ordinary web-extraction requests, so invocation discipline matters.

Real job-to-be-done

Most users are not looking for “a browser automation skill.” They want one of these outcomes:

  • “Turn this article into Markdown I can keep.”
  • “Convert this docs page, even though it renders client-side.”
  • “Process a batch of URLs into .md files.”
  • “Open the page in a real browser so I can get past login or verification, then save the content.”

That is the real use case web-to-markdown is optimized for.

When not to choose this skill

Skip web-to-markdown if:

  • you only need a quick summary, not Markdown output
  • a plain HTTP fetch already gives you the content cleanly
  • you need a full crawler or site scraper
  • you want Playwright-based automation; this skill explicitly uses web2md, not other browser stacks

How to Use web-to-markdown skill

Install context before first use

Treat web-to-markdown as two dependencies:

  1. the skill itself in your agent environment
  2. a working local web2md CLI plus an available Chromium-family browser

A practical skill install path is:

npx skills add softaworks/agent-toolkit --skill web-to-markdown

The repository is at:
https://github.com/softaworks/agent-toolkit/tree/main/skills/web-to-markdown

Just adding the skill is not enough if your machine cannot run web2md or launch Chrome/Chromium/Brave/Edge. That local browser requirement is the main adoption blocker to check early.

Read these files first

This skill is small, so the best reading order is:

  1. skills/web-to-markdown/SKILL.md
  2. skills/web-to-markdown/README.md

SKILL.md gives you the trigger rule, required inputs, and workflow shape. README.md is where you confirm intended use cases such as JS-rendered pages, interactive mode, and batch conversion.

What input web-to-markdown needs

For reliable web-to-markdown usage, provide:

  • a url or list of URLs
  • output mode:
    • print to stdout with --print
    • write to a file with --out ./file.md
    • write to a directory with --out ./some-dir/
  • optional browser controls when needed:
    • --chrome-path <path> if browser detection fails
    • --interactive for login walls, consent screens, or human verification

If you do not specify output behavior, the agent has to guess. That is unnecessary friction, and the output mode is the easiest input to make explicit.
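A minimal Python sketch of how these inputs could map onto a command line. It assumes web2md takes the URL as a positional argument, which this listing does not confirm. The function name `build_web2md_args` is hypothetical, and only the four flags come from the listing, so check the web2md documentation for the real syntax.

```python
from typing import List, Optional


def build_web2md_args(
    url: str,
    out: Optional[str] = None,
    chrome_path: Optional[str] = None,
    interactive: bool = False,
) -> List[str]:
    """Assemble an argument list for one web2md run.

    Assumption: the URL is a positional argument. Only the flags
    --print, --out, --chrome-path and --interactive are taken from the listing.
    """
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an http(s) URL, got {url!r}")

    args = ["web2md", url]
    # Always state the output mode so nothing has to be guessed.
    if out is None:
        args.append("--print")  # print Markdown to stdout
    else:
        args += ["--out", out]  # a file path, or a directory such as ./some-dir/
    if chrome_path:
        args += ["--chrome-path", chrome_path]  # only when browser detection fails
    if interactive:
        args.append("--interactive")  # login walls, consent screens, verification
    return args
```

The list form can be handed to `subprocess.run` without shell quoting concerns. For example, `build_web2md_args("https://example.com/article", out="./notes/article.md")` yields the URL followed by `--out ./notes/article.md`.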

The exact invocation requirement

This web-to-markdown skill should only be triggered when the user explicitly writes something like:

  • use the skill web-to-markdown ...
  • use a skill web-to-markdown ...

If you are testing the skill, say the name directly. This is not optional repository etiquette; it is core execution logic.
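The gate can be pictured as a simple text check. The sketch below is illustrative only: the regular expression, the case-insensitive matching, and the name `is_explicitly_invoked` are assumptions, not the skill's actual logic, which lives in SKILL.md.

```python
import re

# Matches "use the skill web-to-markdown" or "use a skill web-to-markdown".
# Assumption: matching is case-insensitive and tolerant of extra whitespace.
_TRIGGER = re.compile(r"\buse\s+(?:the|a)\s+skill\s+web-to-markdown\b", re.IGNORECASE)


def is_explicitly_invoked(request: str) -> bool:
    """Return True only when the request names the skill explicitly."""
    return _TRIGGER.search(request) is not None
```

A request such as "convert this page to markdown" does not pass this check, even though it describes the same task.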

Turn a rough request into a strong prompt

Weak request:

  • convert this page

Strong request:

  • use the skill web-to-markdown to convert https://example.com/article to Markdown and save it to ./notes/article.md

Even better:

  • use the skill web-to-markdown to convert these 5 docs URLs to Markdown, save them in ./docs-md/, and use interactive mode if a consent screen appears

Good prompts reduce failure by telling the skill:

  • what page(s) to process
  • where output should go
  • whether browser interaction may be needed
  • whether this is a one-off or a batch job
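If prompts are generated by a script, the same four pieces of information can be assembled mechanically. This is a hypothetical helper (`build_prompt` is not part of the skill) that always starts with the explicit trigger phrase. The wording it produces is one reasonable choice, not a required format.

```python
from typing import Optional, Sequence


def build_prompt(
    urls: Sequence[str],
    out: Optional[str] = None,
    interactive_reason: Optional[str] = None,
) -> str:
    """Compose a request that names the skill, the pages, and the output target."""
    if not urls:
        raise ValueError("at least one URL is required")

    batch = len(urls) > 1
    target = "these URLs: " + ", ".join(urls) if batch else urls[0]
    prompt = f"use the skill web-to-markdown to convert {target} to Markdown"

    if out is None:
        prompt += " and print the result"
    elif batch:
        prompt += f" and save the files in {out}"
    else:
        prompt += f" and save it to {out}"

    if interactive_reason:
        prompt += f"; use interactive mode if {interactive_reason}"
    return prompt
```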

Practical command patterns to ask for

Useful web-to-markdown usage patterns include:

  • single page to terminal: --print
  • single page to file: --out ./page.md
  • many pages to a folder: --out ./pages/
  • difficult page with visible browser: --interactive
  • explicit browser binary path: --chrome-path <path>

These patterns follow the repository guidance directly, which makes them more reliable than open-ended requests like “scrape this site,” which are broader than the skill’s design.

Best workflow for one page

A high-success workflow looks like this:

  1. confirm the user explicitly invoked web-to-markdown
  2. collect the URL
  3. decide whether output should print or save
  4. use --interactive only for pages that need human help
  5. review the Markdown result for missing sections or navigation noise
  6. rerun with better browser settings if extraction was incomplete

Running once and reviewing the result is faster than trying to overdesign the prompt up front.

Best workflow for multiple URLs

For batch work:

  1. give the skill a list of URLs
  2. choose a directory output target
  3. expect filenames to be derived from page titles when saving to a folder
  4. spot-check a few outputs before running a large batch

The main reason to batch is consistency. The main risk is assuming every page template on a site extracts equally well.
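A small Python sketch of the batch steps above. It assumes one web2md invocation per URL with a positional URL argument, and that a trailing slash marks the `--out` target as a directory. The listing shows that form but does not say it is required. Both function names are hypothetical.

```python
from typing import List, Sequence


def batch_commands(urls: Sequence[str], out_dir: str) -> List[List[str]]:
    """Build one argument list per URL, all writing into the same directory."""
    # Assumption: a trailing slash signals a directory target, as in ./some-dir/.
    target = out_dir if out_dir.endswith("/") else out_dir + "/"
    return [["web2md", url, "--out", target] for url in urls]


def spot_check_sample(urls: Sequence[str], size: int = 3) -> List[str]:
    """Pick evenly spaced URLs to convert and review before the full batch."""
    if size <= 0 or not urls:
        return []
    if len(urls) <= size:
        return list(urls)
    step = len(urls) / size
    return [urls[int(i * step)] for i in range(size)]
```

Sampling across the list rather than taking the first few URLs makes it more likely that different page templates on the same site get checked.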

Common local setup blockers

Most web-to-markdown failures are not prompt problems. They are local environment issues:

  • web2md is not installed or not on PATH
  • no supported browser is available locally
  • browser auto-detection fails, requiring --chrome-path
  • the page needs a visible browser and human interaction

If you want a quick adoption test, try one public article page and one JS-heavy page before using the skill in production workflows.
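The first two blockers can be checked before any conversion is attempted. The sketch below uses Python's standard `shutil.which`. The browser executable names are assumptions based on common Linux installs, and they differ on macOS and Windows, where a path passed through `--chrome-path` is the more likely fix.

```python
import shutil
from typing import Callable, Dict, List, Optional, Sequence

# Assumption: typical Linux executable names for Chromium-family browsers.
BROWSER_CANDIDATES = (
    "google-chrome",
    "chromium",
    "chromium-browser",
    "brave-browser",
    "microsoft-edge",
)


def preflight(
    which: Callable[[str], Optional[str]] = shutil.which,
    browsers: Sequence[str] = BROWSER_CANDIDATES,
) -> Dict[str, Optional[str]]:
    """Locate the web2md CLI and the first available browser on PATH."""
    browser_path = next((path for path in map(which, browsers) if path), None)
    return {"web2md": which("web2md"), "browser": browser_path}


def problems(result: Dict[str, Optional[str]]) -> List[str]:
    """Translate a preflight result into human-readable blockers."""
    found = []
    if result["web2md"] is None:
        found.append("web2md is not installed or not on PATH")
    if result["browser"] is None:
        found.append("no Chromium-family browser found on PATH; install one or pass --chrome-path <path>")
    return found
```

Calling `problems(preflight())` returns an empty list when both dependencies are found. The `which` parameter exists only so the lookup can be replaced in tests.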

Output quality expectations

web-to-markdown aims for clean main-content Markdown, not a pixel-perfect copy of the original page. That means:

  • article and documentation body content should come through well
  • headers, footers, ads, and page chrome are usually de-emphasized
  • unusual widgets, app shells, and embedded tools may not convert neatly

That tradeoff is usually desirable for archiving and analysis, but it is worth knowing before you install.

web-to-markdown skill FAQ

Is web-to-markdown better than an ordinary prompt?

Yes, when the real need is rendered-page conversion. A generic prompt can discuss a URL, but it does not inherently open a browser, wait for JavaScript, extract the readable body, and produce Markdown. This web-to-markdown skill is useful because it operationalizes that workflow.

Is web-to-markdown good for beginners?

Yes, if your task is simple: one URL, one output file, straightforward page. The main beginner challenge is local setup, not the skill design. If you can run a local browser automation CLI, the skill is approachable.

Does web-to-markdown handle JavaScript-heavy pages?

That is one of its main reasons to exist. It uses a real local browser through Puppeteer, so it is more suitable for JS-rendered pages than raw-fetch approaches.

Can web-to-markdown get past login or verification screens?

Sometimes, with --interactive. The repository explicitly supports a mode where Chrome is shown and paused so the user can complete human steps. This is a practical advantage for protected or semi-protected pages.

When should I not use the web-to-markdown skill?

Do not use it when:

  • the user did not explicitly request web-to-markdown
  • a simple page fetch would already solve the task
  • you need structured scraping across many page components
  • you want a non-browser conversion path

The skill is specialized, and that specialization is a strength, not a weakness.

Does it work with any browser?

The documented fit is Chromium-family browsers such as Chrome, Chromium, Brave, or Edge via puppeteer-core. If auto-detection fails, expect to supply a path manually.

Is this only for articles?

No. Articles are the easiest fit, but the web-to-markdown skill can also help with docs pages and other content-heavy pages where “main body extraction” is the right output model. It is less ideal for dashboards or highly interactive apps.

How to Improve web-to-markdown skill

Give web-to-markdown explicit output instructions

A better request is not just “convert this URL,” but:

  • print it
  • save it to ./tmp/page.md
  • save all results under ./exports/

This removes guesswork and makes the first run more likely to match your workflow.

Use interactive mode only when the page needs it

--interactive is valuable for consent gates, login flows, and verification prompts, but it is slower and less automatable. For routine public pages, avoid it. For blocked pages, use it early instead of retrying blind.

Test browser detection early

If the first run fails to launch a browser, do not keep changing the prompt. Fix the execution context:

  • confirm a Chromium-family browser exists
  • provide --chrome-path <path> when needed

For many users, this is the single most important web-to-markdown install tip.

Choose representative pages before a big rollout

Before converting hundreds of URLs, test:

  • one simple article page
  • one JS-rendered page
  • one page behind consent or login friction

This tells you whether the skill is a fit for your actual site mix, not just for ideal cases.

Strengthen prompts with page-specific constraints

If you know a page is tricky, say so:

  • use the skill web-to-markdown on this docs page; it renders client-side, save to ./docs/intro.md
  • use the skill web-to-markdown on this member page with interactive mode because I need to pass a verification screen first

That extra context changes execution quality more than adding generic wording.

Validate the first Markdown result, then iterate

After the first output, check:

  • was the main content captured?
  • did the output include too much nav or boilerplate?
  • was the page only partially rendered?
  • did the filename or folder behavior match expectations?

Then rerun with better controls. web-to-markdown usually improves through one targeted retry, not through long speculative prompting.
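Some of these checks can be approximated automatically before a human review. The following sketch is a rough heuristic, not part of web2md. The function name `review_markdown` and the thresholds (150 words, 50% link-only lines) are illustrative assumptions that should be tuned against real pages.

```python
import re
from typing import List

# A line that is nothing but a Markdown link, optionally as a list item.
_LINK_ONLY = re.compile(r"(?:[-*+]\s+)?\[[^\]]*\]\([^)]*\)")


def review_markdown(
    markdown: str,
    min_words: int = 150,
    max_link_ratio: float = 0.5,
) -> List[str]:
    """Return warnings for a converted page; an empty list means no red flags."""
    warnings = []
    lines = [line.strip() for line in markdown.splitlines() if line.strip()]
    words = re.findall(r"\w+", markdown)

    if len(words) < min_words:
        warnings.append("short output: main content may be missing or only partially rendered")
    if not any(line.startswith("#") for line in lines):
        warnings.append("no headings: extraction may have missed the document structure")

    link_only = [line for line in lines if _LINK_ONLY.fullmatch(line)]
    if lines and len(link_only) / len(lines) > max_link_ratio:
        warnings.append("mostly link-only lines: output may be navigation or boilerplate")
    return warnings
```

A warning is a prompt to look at the file, not proof of failure. Short pages and link collections can trigger it legitimately.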

Know the main failure modes

Common failure modes are:

  • no explicit trigger phrase, so the skill should not run
  • local browser launch issues
  • pages that need visible interaction
  • pages whose “main content” is ambiguous to Readability
  • users expecting full-site scraping instead of page conversion

Recognizing these early helps you decide whether to keep using web-to-markdown or switch tools.

Use web-to-markdown for the right output standard

You will get the best results when your success criterion is:

  • clean, readable Markdown
  • main content over page chrome
  • portable output for notes, archives, analysis, or downstream AI processing

If your success criterion is “preserve every layout detail,” this skill is the wrong tool. Matching your expectation to its design is the fastest way to improve results.
