docx
by anthropics
Install and use the docx skill to create, inspect, edit, validate, comment on, and convert Word .docx files with practical document workflows.
Overview
What the docx skill does
The docx skill is built for end-to-end Microsoft Word .docx workflows. It is intended for situations where you need to create, read, edit, analyze, validate, comment on, or repackage Word documents rather than just generate plain text.
Repository materials show a practical XML-based workflow for Office files, with Word-focused support for unpacking .docx archives, editing the internal XML, repacking files, validating structure, adding comments, and handling tracked changes. The skill also includes guidance and helpers for converting legacy .doc files to .docx before editing.
Who this skill is for
This skill is a good fit for:
- teams producing polished Word deliverables such as reports, memos, letters, and templates
- users who need to inspect or transform existing .docx files instead of writing from scratch
- workflows that must preserve Word-native features like comments or tracked changes
- document automation tasks where direct XML access is more reliable than manual editing
Problems the docx skill helps solve
Use docx when you need to:
- read document content from an existing .docx
- unpack a Word file into editable XML
- make structured edits and then rebuild the document
- add review comments to a document package
- accept tracked changes with LibreOffice-based tooling
- validate a rebuilt Office file before handing it off
- convert an older .doc file into .docx so it can be processed safely
How it works at a high level
The core idea behind docx is that a .docx file is a ZIP archive containing XML and related assets. The repository includes scripts such as:
- scripts/office/unpack.py to extract and pretty-print Office document contents
- scripts/office/pack.py to rebuild .docx, .pptx, or .xlsx files from an unpacked directory
- scripts/office/validate.py and validator modules under scripts/office/validators/ to check document structure
- scripts/comment.py to add Word comments into an unpacked document
- scripts/accept_changes.py to accept tracked changes using LibreOffice
- scripts/office/soffice.py to run soffice more reliably in constrained environments
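The ZIP-plus-XML idea can be seen with nothing but the Python standard library. The sketch below builds a tiny stand-in package in memory and lists its parts. The two parts shown are only illustrative; a real .docx contains more, and this is not how the skill's own scripts are implemented.

```python
import io
import zipfile

# Minimal stand-in for a .docx package; real files contain more parts.
CONTENT_TYPES = (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types"/>'
)
DOCUMENT_XML = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:document>"
)


def build_sample_package() -> bytes:
    """Return the bytes of a tiny ZIP laid out like a .docx (illustrative only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", CONTENT_TYPES)
        archive.writestr("word/document.xml", DOCUMENT_XML)
    return buffer.getvalue()


def list_parts(package: bytes) -> list:
    """List the member names stored inside a .docx-style ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return archive.namelist()


if __name__ == "__main__":
    for name in list_parts(build_sample_package()):
        print(name)
```

To inspect a real file, pass its bytes to list_parts, for example the result of reading document.docx in binary mode.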
When docx is a strong fit
Choose docx if your main job is Word document manipulation. It is especially useful when a user explicitly asks for a Word file, references .docx, or needs Word-specific features such as:
- headings, page numbers, and professional formatting
- extraction or reorganization of document content
- comments and review workflows
- tracked changes handling
- XML-level edits for precise transformations
When docx is not the best fit
This skill is not the right choice for every file workflow. It is a weaker fit if you primarily need:
- PDF-first processing
- spreadsheets or presentation work as the main task
- Google Docs collaboration rather than Office package editing
- generic programming help unrelated to document generation or transformation
Installation snapshot
To install the docx skill from the Anthropic skills repository, use:
npx skills add https://github.com/anthropics/skills --skill docx
After installation, start by reviewing SKILL.md, then inspect the supporting scripts under scripts/ to understand the available document operations.
How to Use
Install the docx skill
Install docx with:
npx skills add https://github.com/anthropics/skills --skill docx
Once added, review these files first:
- SKILL.md
- scripts/office/unpack.py
- scripts/office/pack.py
- scripts/office/validate.py
- scripts/comment.py
- scripts/accept_changes.py
- scripts/office/soffice.py
These files reflect the real working path of the skill and are the best starting point for deciding whether to install it.
Check prerequisites before you commit
The repository points to a few practical dependencies:
- Python is required for the included scripts
- LibreOffice soffice is required for some operations, including the tracked-changes acceptance script and the .doc to .docx conversion workflow
- pandoc is referenced for text extraction from .docx
If your environment cannot run Python, the included scripts will not run. Without LibreOffice, you lose at least tracked-changes acceptance and .doc conversion. The skill's guidance may still be useful conceptually in either case.
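A quick way to check these prerequisites before committing is to look for the executables on PATH. The sketch below assumes the tools are installed under the names soffice and pandoc. Some platforms install LibreOffice under a different name or outside PATH, so a False result is a hint rather than proof.

```python
import shutil
import sys


def check_tools(names=("soffice", "pandoc")) -> dict:
    """Map each tool name to True if an executable with that name is on PATH."""
    return {name: shutil.which(name) is not None for name in names}


if __name__ == "__main__":
    print(f"python: {sys.version.split()[0]}")
    for tool, found in check_tools().items():
        print(f"{tool}: {'found' if found else 'not found on PATH'}")
```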
Typical workflow: inspect, edit, rebuild
A common docx workflow is:
- Convert old .doc files to .docx if needed.
- Unpack the .docx archive into a working directory.
- Edit the extracted XML and related assets.
- Optionally add comments or handle redlines.
- Repack the directory into a new .docx.
- Validate the rebuilt document.
This approach is well suited to repeatable document transformations where precision matters more than interactive editing in Word.
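The unpack, edit, and repack steps can be sketched with the standard library alone. This is an illustration of the pattern, not the skill's unpack.py or pack.py; the round_trip and mark_final names are hypothetical, and the sketch skips the pretty-printing, validation, and repair that the real scripts are described as offering.

```python
import io
import tempfile
import zipfile
from pathlib import Path
from typing import Callable


def round_trip(package: bytes, edit: Callable[[Path], None]) -> bytes:
    """Unpack a .docx-style ZIP, let edit() change the files, then repack it."""
    with tempfile.TemporaryDirectory() as workdir:
        root = Path(workdir)
        with zipfile.ZipFile(io.BytesIO(package)) as archive:
            archive.extractall(root)  # unpack into the working directory
        edit(root)  # caller edits the extracted XML in place
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            for path in sorted(root.rglob("*")):
                if path.is_file():  # store files only, with forward-slash names
                    archive.write(path, path.relative_to(root).as_posix())
        return buffer.getvalue()


def mark_final(root: Path) -> None:
    """Hypothetical edit: swap one word inside word/document.xml."""
    part = root / "word" / "document.xml"
    text = part.read_text(encoding="utf-8")
    part.write_text(text.replace("draft", "final"), encoding="utf-8")
```

Passing the edit as a callback keeps the packaging logic separate from the document-specific change, which is the same separation the unpack and pack scripts provide on the command line.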
Convert legacy .doc files first
The skill documentation explicitly notes that legacy .doc files should be converted before editing. The documented command is:
python scripts/office/soffice.py --headless --convert-to docx document.doc
If your incoming files are older Word binaries rather than modern .docx, this conversion step is important for a stable workflow.
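For batch conversion, the documented command can be wrapped in a small helper. The sketch below only reuses the flags shown above; the function names are hypothetical, and it assumes you run it from the skill's root directory so the relative script path resolves.

```python
import subprocess
import sys

# Relative path to the helper, as given in the skill documentation.
SOFFICE_HELPER = "scripts/office/soffice.py"


def build_convert_command(source: str) -> list:
    """Build the documented conversion command for one legacy .doc file."""
    if not source.lower().endswith(".doc"):
        raise ValueError(f"expected a legacy .doc file, got: {source}")
    return [sys.executable, SOFFICE_HELPER, "--headless", "--convert-to", "docx", source]


def convert(source: str) -> None:
    """Run the conversion; raises CalledProcessError if the helper fails."""
    subprocess.run(build_convert_command(source), check=True)


if __name__ == "__main__":
    print(" ".join(build_convert_command("document.doc")))
```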
Read document content
For document reading and analysis, the repository points to two practical options:
- use pandoc when you want extracted text, including tracked changes handling
- unpack the document when you need raw XML access
This makes docx useful both for content analysis and for structure-aware editing.
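When raw XML access is the chosen route, visible text lives in w:t elements inside w:p paragraphs of word/document.xml. The following is a minimal sketch using the standard library, with a made-up sample document. It ignores tracked deletions, tables, and other structure, so treat it as a starting point rather than a replacement for pandoc.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Made-up sample standing in for the contents of word/document.xml.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def paragraph_texts(document_xml: str) -> list:
    """Return one string per w:p paragraph, joining the text of its w:t runs."""
    root = ET.fromstring(document_xml)
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in paragraph.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


if __name__ == "__main__":
    for line in paragraph_texts(SAMPLE):
        print(line)
```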
Unpack a Word document for editing
The unpack script is the foundation of the editing workflow. It extracts the Office archive, pretty-prints XML files, and for DOCX can optionally merge adjacent runs or simplify tracked changes.
A typical usage pattern from the repository is:
python unpack.py document.docx unpacked/
The actual script file is scripts/office/unpack.py, so in practice you will usually run it from that location or adapt it to your environment.
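Pretty-printing matters because Word typically writes each XML part as one very long line, which is hard to read or diff. The effect can be approximated with xml.dom.minidom, as sketched below. This is an assumption about the general technique, not a copy of what unpack.py does, and the sample XML is made up.

```python
from xml.dom import minidom

# Made-up single-line sample, similar in shape to word/document.xml.
SAMPLE = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:document>"
)


def pretty_print(xml_text: str) -> str:
    """Re-serialize XML with one element per line and two-space indentation."""
    return minidom.parseString(xml_text).toprettyxml(indent="  ")


if __name__ == "__main__":
    print(pretty_print(SAMPLE))
```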
Edit XML carefully
After unpacking, you work directly with the WordprocessingML files inside the extracted directory. This is best for controlled changes such as:
- replacing text at known XML locations
- adjusting document metadata or structure
- inserting references for comments
- preparing a cleaned package for downstream generation
This is a strong fit for automation engineers and agent workflows, but less ideal for casual one-off editing by nontechnical users.
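As an illustration of replacing text at known XML locations, the sketch below rewrites matching w:t elements with ElementTree. The function name is hypothetical. Two caveats apply: text that Word has split across several runs will not match, and ElementTree may rewrite namespace prefixes it has not been told about, which real Word documents with many namespaces can be sensitive to. For production edits, working on the unpacked files more conservatively is likely safer.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the conventional w: prefix on output

# Made-up sample standing in for the contents of word/document.xml.
SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>draft report</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def replace_run_text(document_xml: str, old: str, new: str):
    """Replace old with new inside each w:t element; return (xml, change count)."""
    root = ET.fromstring(document_xml)
    changed = 0
    for node in root.iter(f"{{{W_NS}}}t"):
        if node.text and old in node.text:
            node.text = node.text.replace(old, new)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    updated, count = replace_run_text(SAMPLE, "draft", "final")
    print(count, updated)
```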
Add comments to a DOCX package
The repository includes scripts/comment.py for adding comments to unpacked DOCX content. The script documentation shows that comments can be added and replies can be attached with a parent comment reference.
A practical detail supported by the source: comment text must be XML-escaped, and comment markers must also be placed in document.xml correctly. This means docx is useful for programmatic review workflows, but it expects careful handling of Word XML conventions.
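XML-escaping comment text can be done with the standard library before the text is handed to any tooling. The sketch below covers the three characters that must be escaped in element content; whether scripts/comment.py expects additional entities is not stated here, so treat the helper name and scope as assumptions.

```python
from xml.sax.saxutils import escape


def escape_comment_text(text: str) -> str:
    """Escape &, <, and > so the text is safe inside an XML element."""
    return escape(text)


if __name__ == "__main__":
    print(escape_comment_text("Check Q&A: is x < y here?"))
```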
Accept tracked changes
If you need a clean version of a reviewed Word document, scripts/accept_changes.py is designed to accept all tracked changes using LibreOffice. The source explicitly states that LibreOffice is required.
This is one of the stronger reasons to install docx: if your workflow depends on resolving tracked changes, the included script gives you an automated path without requiring manual acceptance in Word.
Repack and validate the final file
Once edits are complete, scripts/office/pack.py rebuilds the Office file. According to the source, it can validate, auto-repair, condense XML formatting, and write a .docx, .pptx, or .xlsx package.
For Word workflows, the main value is producing a valid .docx after direct XML edits. Validation support matters because Office files can fail in subtle ways after manual package changes.
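To show the kind of subtle failure validation catches, the sketch below runs two basic checks on a rebuilt package: that the conventional parts are present and that every XML part is well-formed. It is far shallower than schema-level validation and is not the skill's validate.py; the function name and the list of required parts are assumptions for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Parts a Word package conventionally contains (assumption for this sketch).
REQUIRED_PARTS = ("[Content_Types].xml", "word/document.xml")


def find_problems(package: bytes) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(package))
    except zipfile.BadZipFile:
        return ["not a ZIP archive"]
    problems = []
    with archive:
        names = set(archive.namelist())
        for part in REQUIRED_PARTS:
            if part not in names:
                problems.append(f"missing part: {part}")
        for name in sorted(names):
            if name.endswith((".xml", ".rels")):
                try:
                    ET.fromstring(archive.read(name))  # well-formedness only
                except ET.ParseError as error:
                    problems.append(f"malformed XML in {name}: {error}")
    return problems
```

A single unclosed tag introduced during a manual edit is enough for Word to refuse the file, which is why running a check like this before delivery is worthwhile.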
Environment and sandbox considerations
scripts/office/soffice.py includes helper logic for running LibreOffice in environments where AF_UNIX sockets may be restricted. That is a useful implementation detail if you run document workflows in containers, remote sandboxes, or VM-based automation setups.
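One way to find out whether a sandbox restricts AF_UNIX sockets is to try binding one in a temporary directory. The probe below is a sketch of that idea under the assumption that a failed bind signals a restriction; it is not taken from soffice.py, and the function name is hypothetical.

```python
import os
import socket
import tempfile


def unix_sockets_available() -> bool:
    """Return True if an AF_UNIX socket can be created and bound here."""
    if not hasattr(socket, "AF_UNIX"):
        return False  # platform does not expose AF_UNIX at all
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "probe.sock")
        try:
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as probe:
                probe.bind(path)
        except OSError:
            return False  # creation or bind was blocked or failed
    return True


if __name__ == "__main__":
    print("AF_UNIX sockets available:", unix_sockets_available())
```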
In short, docx is not just about document editing commands; it also includes operational tooling for making those commands work in less predictable runtime environments.
Best use cases for teams
The docx skill is a strong match when your team needs:
- repeatable Word document generation or cleanup
- automated review and comment insertion
- XML-level transformations that ordinary Office scripting cannot handle cleanly
- validation before delivery to clients or internal stakeholders
- migration from old .doc assets into modern .docx workflows
Reasons you might choose another approach
You may want another tool if:
- you only need simple text export and do not care about Word-native fidelity
- users will mostly edit interactively in Word rather than through automation
- you need a fully open, redistributable library workflow rather than skill-bound materials
- your environment cannot support the supporting Python and LibreOffice toolchain
FAQ
What is the docx skill mainly used for?
docx is mainly used for Word .docx workflows: creating, reading, editing, validating, commenting on, and transforming Word documents. It is especially useful when a task depends on Word-specific structure rather than plain text.
How do I install the docx skill?
Install the skill with npx skills add https://github.com/anthropics/skills --skill docx. After that, review SKILL.md and the scripts under scripts/ to understand the supported workflow.
Does docx support legacy .doc files?
Yes, but indirectly. The repository guidance says legacy .doc files should be converted to .docx before editing, using the LibreOffice-based scripts/office/soffice.py workflow.
Can docx work with tracked changes?
Yes. Repository evidence shows support for tracked-changes-related workflows. scripts/accept_changes.py accepts tracked changes using LibreOffice, and the unpack workflow can simplify adjacent tracked changes in DOCX files.
Can I add comments with the docx skill?
Yes. scripts/comment.py is specifically included for adding comments to DOCX documents, including reply relationships. This is one of the clearest Word review features provided by the skill.
Does docx validate documents after editing?
Yes. The repository includes validation logic under scripts/office/validate.py and scripts/office/validators/. The pack process also supports validation when rebuilding files.
Is docx only for Word files?
The docx skill is Word-centered, but some helper scripts under scripts/office/ also support .pptx and .xlsx packaging and validation. For installation decisions, though, the main value of docx is DOCX document work.
Is docx a good fit for nontechnical users?
Usually not as a primary editing tool. docx is best for technical, agent-assisted, or automation-heavy workflows because it relies on unpacking Office files, editing XML, and repacking them. If someone just wants to make a quick manual edit, Word itself is often simpler.
What should I inspect in the repository first?
Start with SKILL.md, then check scripts/office/unpack.py, scripts/office/pack.py, scripts/comment.py, scripts/accept_changes.py, and the validator modules. That gives a realistic picture of whether the docx skill matches your workflow and runtime environment.
