The pdf skill guides PDF Processing tasks like text extraction, merge and split operations, rendering pages to images, and PDF form workflows. It is especially useful for checking fillable fields, extracting form metadata, and validating non-fillable form layouts with scripts.
This skill scores 84/100, which means it is a strong directory listing candidate for agents that need to work with PDFs. Directory users get broad trigger coverage, substantial procedural content, and concrete helper scripts (especially for form filling), so an agent can usually act with less guesswork than with a generic prompt, though environment and setup expectations are not fully spelled out in the skill itself.
- Very strong triggerability: the description explicitly says to use it whenever the user mentions a .pdf or asks to produce one, and names many common PDF tasks.
- Operationally useful workflow content: SKILL.md provides examples for core PDF operations, while forms.md gives ordered instructions and command-level steps for fillable vs non-fillable forms.
- Real execution leverage from included scripts: the repo ships multiple utilities for checking form fields, extracting structure, converting PDFs to images, validating bounding boxes, and filling forms.
- Install/runtime requirements are implied rather than clearly packaged: SKILL.md has no install command, even though the skill relies on Python libraries and command-line tooling.
- The scope is very broad, but some advanced capabilities are pushed into reference material, so users may still need to choose among libraries and approaches.
Overview of pdf skill
What the pdf skill does
The pdf skill is a practical guide for PDF Processing tasks, with the strongest value in routine operations and form workflows. It helps an agent choose working tools and steps for reading PDFs, extracting text, merging or splitting files, rendering pages to images, and especially filling PDF forms correctly.
Who should install this pdf skill
This pdf skill is best for users who regularly handle PDFs in automation, data entry, document pipelines, or agent workflows. It is a strong fit if you want more than a generic “use a PDF library” answer and need concrete paths for fillable vs non-fillable forms, page rendering, and validation.
Real job-to-be-done
Most users do not need a broad PDF theory guide. They need a dependable way to answer questions like:
- “How do I extract text from this PDF?”
- “How do I merge or split pages safely?”
- “Does this form have actual fillable fields?”
- “If not, how do I locate where values should be placed?”
- “How do I validate that my field boxes do not overlap?”
This skill is useful because it turns those questions into a workflow instead of leaving the agent to guess.
What makes pdf different from a generic prompt
The main differentiator is form handling discipline. The repository includes dedicated instructions in forms.md and helper scripts such as:
- scripts/check_fillable_fields.py
- scripts/extract_form_field_info.py
- scripts/extract_form_structure.py
- scripts/fill_fillable_fields.py
- scripts/fill_pdf_form_with_annotations.py
- scripts/check_bounding_boxes.py
- scripts/create_validation_image.py
That means the pdf guide is not just about libraries; it gives a decision path for forms and validation, which is where many PDF automations fail.
Best-fit and misfit cases
Use the pdf skill when you need actionable instructions for Python-based workflows, image conversion, rendering, or form filling.
It is less compelling if you only need a one-line reminder for a standard library call, or if your stack is entirely outside Python and you do not want to translate examples from reference.md.
How to Use pdf skill
Install context for pdf
Install the skill from the Anthropic skills repository:
npx skills add https://github.com/anthropics/skills --skill pdf
After install, work from the skill directory rather than only skimming the top file, because the most valuable guidance is split across SKILL.md, forms.md, reference.md, and the scripts/ folder.
Read these files first
For a fast adoption path, open files in this order:
- SKILL.md
- forms.md
- reference.md
- scripts/check_fillable_fields.py
- scripts/extract_form_field_info.py
- scripts/fill_fillable_fields.py
Why this order matters:
- SKILL.md covers common operations and library direction.
- forms.md contains the strict branching logic for form tasks.
- reference.md expands into rendering and JavaScript options.
- The scripts show the real expected inputs and outputs.
Choose the right workflow before writing code
A good pdf usage pattern starts with task classification:
- Text extraction
- Page manipulation
- Render PDF pages as images
- Fill a form
- Build a PDF from data
Do this first because form tasks follow a very different path from merge/split/extract tasks. The repository is explicit that form filling should not start with ad hoc code.
How to handle ordinary PDF operations
For basic PDF Processing, the skill points first to pypdf. That is the default path for:
- reading PDFs
- counting pages
- extracting text
- merging files
- splitting pages
If your task is “combine these files” or “extract the text page by page,” the examples in SKILL.md are the quickest starting point.
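As a rough illustration of that default path, here is a minimal sketch that assumes pypdf is installed. It is not copied from SKILL.md; the function names and the output naming scheme are hypothetical.

```python
from pathlib import Path


def page_output_name(stem: str, page_number: int) -> str:
    """Build a zero-padded file name so split pages sort in page order."""
    return f"{stem}_page_{page_number:03d}.pdf"


def extract_text_by_page(pdf_path: str) -> list[str]:
    """Return one string per page; pages with no text layer yield ''."""
    from pypdf import PdfReader  # lazy import; assumes `pip install pypdf`

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


def merge_pdfs(input_paths: list[str], output_path: str) -> None:
    """Append every page of each input, in the given order, to one output file."""
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    for path in input_paths:
        for page in PdfReader(path).pages:
            writer.add_page(page)
    with open(output_path, "wb") as handle:
        writer.write(handle)


def split_pdf(pdf_path: str, output_dir: str) -> list[str]:
    """Write one single-page PDF per page and return the paths written."""
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(pdf_path)
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for number, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)
        target = out_dir / page_output_name(Path(pdf_path).stem, number)
        with open(target, "wb") as handle:
            writer.write(handle)
        written.append(str(target))
    return written
```

Splitting into one file per page preserves the original page order through the zero-padded names, which keeps later re-merging predictable.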
How to handle rendering and image conversion
If your goal is page screenshots, previews, visual inspection, or image-based downstream processing, use the rendering-oriented materials:
- reference.md for pypdfium2
- scripts/convert_pdf_to_images.py for PNG conversion
This matters when text extraction alone is insufficient, such as scanned PDFs, visual form review, or validating page layout before annotation.
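For orientation, a rendering pass with pypdfium2 might look like the sketch below. It assumes pypdfium2 and Pillow are installed and is not the repository's convert_pdf_to_images.py (which, per this guide, uses pdf2image); the function names, the 144 DPI default, and the output file names are illustrative assumptions.

```python
def dpi_to_scale(dpi: float) -> float:
    """PDF user space is 72 points per inch, so scale 1.0 renders at 72 DPI."""
    return dpi / 72.0


def render_pages_to_png(pdf_path: str, output_dir: str, dpi: float = 144) -> list[str]:
    """Render every page to a PNG file and return the paths written."""
    from pathlib import Path

    import pypdfium2 as pdfium  # lazy import; assumes pypdfium2 and Pillow

    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pdf = pdfium.PdfDocument(pdf_path)
    written = []
    for index in range(len(pdf)):
        # render() returns a bitmap; to_pil() converts it to a Pillow image
        bitmap = pdf[index].render(scale=dpi_to_scale(dpi))
        image = bitmap.to_pil()
        target = out_dir / f"page_{index + 1}.png"
        image.save(target)
        written.append(str(target))
    return written
```

A higher DPI gives sharper images for visual form review at the cost of larger files, so it is worth choosing per task rather than fixing one value.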
The critical branch for PDF forms
For forms, the skill gives a stricter process than generic prompting. Start with:
python scripts/check_fillable_fields.py <file.pdf>
This answers the first decision that blocks many automations:
- If the PDF has fillable fields, extract field info and populate those fields directly.
- If it does not, use the non-fillable workflow from forms.md, which relies on visual structure and bounding boxes.
Skipping this check is the most common way to waste time.
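The branch itself can be sketched in a few lines. The version below is an assumption-laden illustration using pypdf's `get_fields()`, not the implementation of scripts/check_fillable_fields.py; the function names and the returned labels are hypothetical, and the repository script remains the preferred check.

```python
def choose_form_workflow(fields) -> str:
    """Map a field dictionary (or None) to one of the two form branches."""
    return "fillable" if fields else "non-fillable"


def detect_form_workflow(pdf_path: str) -> str:
    """Inspect a PDF for interactive form fields and pick the branch."""
    from pypdf import PdfReader  # lazy import; assumes `pip install pypdf`

    # get_fields() returns None when the document has no interactive fields
    fields = PdfReader(pdf_path).get_fields()
    return choose_form_workflow(fields)
```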
Inputs that produce better pdf results
When invoking the pdf skill, provide:
- the exact file path or file names
- whether the PDF is digital or scanned
- the intended output format
- whether forms are fillable
- whether you need text fidelity, layout fidelity, or visual output
- whether you can run Python scripts locally
A weak request:
- “Help with this PDF.”
A strong request:
- “I need to fill a 6-page government form PDF. First determine whether it has fillable fields. If yes, extract field metadata to JSON. If no, convert pages to images, identify entry regions, and generate a validation image before placing values.”
The stronger version lets the agent trigger the right path immediately.
How to prompt the pdf skill well
A reliable prompt format is:
- goal
- file(s)
- constraints
- desired output
- validation requirement
Example:
- Goal: extract tables and page text from report.pdf
- Constraints: Python only, no cloud OCR
- Desired output: CSV tables plus a text dump per page
- Validation: preserve page numbers and report pages with no text
This is better than just asking for “PDF extraction” because the skill covers multiple methods and quality depends on choosing the correct one.
Form workflow for fillable PDFs
If the PDF has real fields, the useful next step is:
python scripts/extract_form_field_info.py <input.pdf> <field_info.json>
The extracted JSON includes field IDs, page numbers, rectangles, and field types such as:
- text
- checkbox
- radio_group
- choice
This is the practical core of the pdf guide for forms, because it gives structured targets instead of relying on visual guessing.
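Once the field info is extracted, grouping it by type makes it easier to review before filling. The sketch below assumes the JSON is a list of objects with keys named `field_id`, `page`, `rect`, and `type`; those key names and the sample records are assumptions for illustration, so check the real output of the script before relying on them.

```python
import json
from collections import defaultdict


def load_field_info(json_path: str) -> list[dict]:
    """Read the extracted field info file (assumed to hold a JSON list)."""
    with open(json_path, encoding="utf-8") as handle:
        return json.load(handle)


def group_fields_by_type(fields: list[dict]) -> dict[str, list[str]]:
    """Collect field IDs under their field type, keeping input order."""
    grouped = defaultdict(list)
    for field in fields:
        grouped[field.get("type", "unknown")].append(field["field_id"])
    return dict(grouped)


# Hypothetical records shaped like the assumed schema above
sample_fields = [
    {"field_id": "last_name", "page": 1, "rect": [72, 700, 300, 718], "type": "text"},
    {"field_id": "is_citizen", "page": 1, "rect": [72, 660, 84, 672], "type": "checkbox"},
    {"field_id": "first_name", "page": 1, "rect": [310, 700, 540, 718], "type": "text"},
]
grouped = group_fields_by_type(sample_fields)
print(grouped)
```

Listing checkbox and radio fields separately from text fields is useful because they typically accept a restricted set of values rather than free text.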
Form workflow for non-fillable PDFs
If the PDF is not fillable, forms.md indicates that you must visually determine where values belong. The supporting scripts suggest a workflow like:
- convert the PDF to images
- infer form structure and bounding boxes
- validate box placement
- write annotations or filled output
This is slower than fillable-field handling, but the repository gives a more realistic path than “just OCR it.”
Use validation scripts before trusting output
Two scripts materially improve reliability:
- scripts/check_bounding_boxes.py
- scripts/create_validation_image.py
Use them when working with non-fillable forms or inferred field locations. They help catch overlapping entry areas, label collisions, and placement mistakes before you generate final output.
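To make the overlap idea concrete, here is a minimal sketch of an axis-aligned rectangle intersection test. It is not the logic of scripts/check_bounding_boxes.py; the `(left, top, right, bottom)` box layout, the function names, and the sample boxes are assumptions for illustration.

```python
from itertools import combinations

Box = tuple[float, float, float, float]  # assumed layout: (left, top, right, bottom)


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when two rectangles share interior area; touching edges do not count."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def find_overlaps(boxes: dict[str, Box]) -> list[tuple[str, str]]:
    """Return every pair of box names whose rectangles overlap."""
    return [
        (name_a, name_b)
        for (name_a, box_a), (name_b, box_b) in combinations(boxes.items(), 2)
        if boxes_overlap(box_a, box_b)
    ]


# Hypothetical entry regions on one rendered page
entry_boxes = {
    "name": (100, 50, 300, 70),
    "date": (280, 60, 400, 80),  # intentionally collides with "name"
    "city": (100, 90, 300, 110),
}
overlaps = find_overlaps(entry_boxes)
print(overlaps)
```

The pairwise comparison is quadratic in the number of boxes, which is acceptable for the handful of entry regions on a typical form page.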
That is a real advantage of installing this pdf skill: it includes validation helpers, not just transformation code.
Libraries and tool choices inside the skill
The repository’s practical tool split is:
- pypdf for standard document operations
- pypdfium2 for rendering and image-oriented work
- pdf2image in the helper script for conversion to PNG
- pdf-lib in reference.md if you prefer JavaScript for creation/manipulation
If you are deciding whether to install this pdf skill, that tool coverage is useful: it is not locked to one library, but it still has a clear default path.
pdf skill FAQ
Is this pdf skill only for form filling?
No. The pdf skill also covers extraction, merge/split operations, rendering, creation, and general PDF manipulation. But form workflows are where it adds the most decision value over an ordinary prompt.
Is pdf good for beginners?
Yes, if you can run Python scripts. The best beginner path is to start with SKILL.md for simple operations, then use forms.md only when your task is actually a form. The scripts reduce guesswork, but they do assume a local Python environment and basic command-line comfort.
What does this skill do better than a normal LLM prompt?
It gives a concrete workflow for branching between fillable and non-fillable PDFs, plus validation tooling. A normal prompt may suggest libraries; this skill shows when to inspect fields, when to render pages, and how to verify bounding boxes.
When should I not use this pdf guide?
Do not rely on this pdf guide if:
- you need a fully packaged end-user app rather than a skill/workflow
- you cannot execute local scripts
- you need advanced OCR-first pipelines beyond what the repository explicitly supports
- you want a single opinionated production framework instead of a mixed-reference toolkit
Does pdf support JavaScript too?
Partly. The main workflow is Python-first, but reference.md includes pdf-lib examples for JavaScript. If your team is JS-native, the skill still helps with concepts and task decomposition, but the strongest operational support is in Python.
Can this skill handle scanned PDFs?
Partially. It can help render pages to images and structure workflows around visual processing. But scanned PDFs often require OCR or visual placement logic, so results depend heavily on document quality and your chosen downstream tools.
How to Improve pdf skill
Start with the right PDF diagnosis
The best way to improve pdf usage is to classify the document before acting:
- text-based vs scanned
- fillable vs non-fillable
- document extraction vs form completion
- visual fidelity vs text fidelity
Most failures come from choosing the wrong path, not from bad code syntax.
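One way to approximate the text-based vs scanned check is to count how much text each page yields after extraction. The sketch below works on a list of per-page strings however they were extracted; the 20-character minimum and the 0.8 ratio are arbitrary assumptions, not values from the repository.

```python
def pages_without_text(page_texts: list[str], min_chars: int = 20) -> list[int]:
    """Return 1-based page numbers whose extracted text is shorter than min_chars."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]


def looks_scanned(page_texts: list[str], min_chars: int = 20, ratio: float = 0.8) -> bool:
    """Heuristic: treat the document as scanned if most pages have almost no text."""
    if not page_texts:
        return False
    empty = pages_without_text(page_texts, min_chars)
    return len(empty) / len(page_texts) >= ratio
```

When this heuristic flags a document, rendering the pages to images is usually a better next step than retrying text extraction.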
Provide stronger task inputs
Better inputs produce better outputs. Include:
- sample file name
- number of pages
- whether there are tables, forms, or signatures
- whether you need editable output or just extracted data
- the exact fields to fill, preferably as a JSON mapping
For forms, this is much better than a prose list because the scripts and workflows naturally map to structured data.
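A structured mapping can be as simple as the sketch below. The field IDs and values are hypothetical, and the exact input format expected by the repository's fill scripts is not stated here, so treat this as a way to hand clean data to the agent rather than as that format.

```python
import json

# Hypothetical field IDs mapped to the values that should be entered
field_values = {
    "last_name": "Rivera",
    "first_name": "Dana",
    "is_citizen": True,
}


def unknown_field_ids(values: dict, known_field_ids: set) -> list[str]:
    """Return payload keys that do not match any field ID found in the PDF."""
    return sorted(set(values) - set(known_field_ids))


payload = json.dumps(field_values, indent=2)
print(payload)
```

Comparing the payload keys against the extracted field IDs before filling catches misspelled or missing fields early.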
Validate before scaling up
Do not test on 200 PDFs first. Run the pdf skill on one representative file and inspect:
- extracted text quality
- field metadata completeness
- page image rendering
- bounding box overlap warnings
- final visual output
This small-batch validation catches the errors that become expensive later.
Common failure modes in pdf workflows
Watch for these:
- assuming a PDF is fillable without checking
- using text extraction on scanned files and getting near-empty output
- writing field values without first inspecting field IDs and field types
- skipping validation images for non-fillable forms
- treating rendering output as if it were structured text extraction
These are exactly the areas where the repository’s scripts help.
Improve prompts by asking for the full workflow
A better prompt for the pdf skill asks the agent to:
- identify the document type
- select the library/tool path
- show intermediate outputs
- validate before finalizing
Example:
“Use the pdf skill to inspect application.pdf. First check if it has fillable fields. If yes, extract field metadata and propose a JSON payload for completion. If no, convert each page to images, identify entry regions, generate a validation image for page 1, and only then suggest the filling approach.”
This kind of prompt improves both accuracy and trust.
Iterate after the first output
If the first result is weak, do not just ask for “better.” Ask for a narrower correction:
- “Re-run using rendered images because text extraction returned little content.”
- “List all checkbox and radio fields separately.”
- “Generate validation overlays for pages 2 and 3.”
- “Preserve original page order and output one file per page.”
Specific iteration requests make the pdf skill much more effective than broad retries.
Use repository scripts as truth anchors
When agent output and document reality differ, trust the repository scripts over freeform reasoning. For this skill, the scripts are the strongest source of operational truth because they define expected inputs, field structures, and validation checks.
Know the adoption tradeoff
Installing the pdf skill is worth it if PDF forms, layout-sensitive workflows, or repeated document handling are part of your work. If your use case is only occasional page merging, a generic prompt may be enough. The skill pays off most when you need repeatable, validated PDF Processing rather than one-off advice.
