Install and use the docx skill to create, inspect, edit, validate, comment on, and convert Word .docx files with practical document workflows.

Category: DOCX Workflows
Install Command
npx skills add https://github.com/anthropics/skills --skill docx
Overview

What the docx skill does

The docx skill is built for end-to-end Microsoft Word .docx workflows. It is intended for situations where you need to create, read, edit, analyze, validate, comment on, or repackage Word documents rather than just generate plain text.

Repository materials show a practical XML-based workflow for Office files, with Word-focused support for unpacking .docx archives, editing the internal XML, repacking files, validating structure, adding comments, and handling tracked changes. The skill also includes guidance and helpers for converting legacy .doc files to .docx before editing.

Who this skill is for

This skill is a good fit for:

  • teams producing polished Word deliverables such as reports, memos, letters, and templates
  • users who need to inspect or transform existing .docx files instead of writing from scratch
  • workflows that must preserve Word-native features like comments or tracked changes
  • document automation tasks where direct XML access is more reliable than manual editing

Problems the docx skill helps solve

Use docx when you need to:

  • read document content from an existing .docx
  • unpack a Word file into editable XML
  • make structured edits and then rebuild the document
  • add review comments to a document package
  • accept tracked changes with LibreOffice-based tooling
  • validate a rebuilt Office file before handing it off
  • convert an older .doc file into .docx so it can be processed safely

How it works at a high level

The core idea behind docx is that a .docx file is a ZIP archive containing XML and related assets. The repository includes scripts such as:

  • scripts/office/unpack.py to extract and pretty-print Office document contents
  • scripts/office/pack.py to rebuild .docx, .pptx, or .xlsx files from an unpacked directory
  • scripts/office/validate.py and validator modules under scripts/office/validators/ to check document structure
  • scripts/comment.py to add Word comments into an unpacked document
  • scripts/accept_changes.py to accept tracked changes using LibreOffice
  • scripts/office/soffice.py to run soffice more reliably in constrained environments
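The ZIP-plus-XML idea can be seen with nothing but the Python standard library. The sketch below is not one of the repository scripts; it builds a small stand-in package in memory (placeholder contents, typical part names) and lists its parts. Passing a path to a real .docx to the hypothetical `list_parts` helper should work the same way.

```python
import io
import zipfile

# Build a tiny stand-in package in memory so the example runs without a real file.
# The part names mirror a typical .docx layout; the contents are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
    package.writestr("[Content_Types].xml", "<Types/>")
    package.writestr("_rels/.rels", "<Relationships/>")
    package.writestr("word/document.xml", "<document/>")


def list_parts(source):
    """Return the part names stored in a .docx (ZIP) package."""
    with zipfile.ZipFile(source) as package:
        return package.namelist()


parts = list_parts(buffer)
print(parts)
```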

When docx is a strong fit

Choose docx if your main job is Word document manipulation. It is especially useful when a user explicitly asks for a Word file, references .docx, or needs Word-specific features such as:

  • headings, page numbers, and professional formatting
  • extraction or reorganization of document content
  • comments and review workflows
  • tracked changes handling
  • XML-level edits for precise transformations

When docx is not the best fit

This skill is not the right choice for every file workflow. It is a weaker fit if you primarily need:

  • PDF-first processing
  • spreadsheets or presentation work as the main task
  • Google Docs collaboration rather than Office package editing
  • generic programming help unrelated to document generation or transformation

Installation snapshot

To install the docx skill from the Anthropic skills repository, use:

npx skills add https://github.com/anthropics/skills --skill docx

After installation, start by reviewing SKILL.md, then inspect the supporting scripts under scripts/ to understand the available document operations.

How to Use

Install the docx skill

Install docx with:

npx skills add https://github.com/anthropics/skills --skill docx

Once added, review these files first:

  • SKILL.md
  • scripts/office/unpack.py
  • scripts/office/pack.py
  • scripts/office/validate.py
  • scripts/comment.py
  • scripts/accept_changes.py
  • scripts/office/soffice.py

These files implement the skill's actual workflow, so they are the best place to judge whether it fits your environment before you adopt it.

Check prerequisites before you commit

The repository points to a few dependencies:

  • Python is required for the included scripts
  • LibreOffice (the soffice binary) is required for some operations, including the tracked-changes acceptance script and the .doc to .docx conversion workflow
  • pandoc is referenced for text extraction from .docx

If your environment cannot run Python scripts or LibreOffice, the bundled scripts that depend on them will not run as provided, and docx will be useful mainly as a description of the workflow.
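A quick way to check these prerequisites is to look for the executables on PATH. This is a minimal sketch, assuming the tools are installed under the names `soffice` and `pandoc`; some LibreOffice installations use a different binary name or location, so a "not found" result is a hint, not proof. The `check_tools` helper is hypothetical and not part of the skill.

```python
import shutil
import sys


def check_tools(names=("soffice", "pandoc")):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("python:", sys.version.split()[0])
    for name, path in check_tools().items():
        print(f"{name}: {path or 'not found'}")
```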

Typical workflow: inspect, edit, rebuild

A common docx workflow is:

  1. Convert old .doc files to .docx if needed.
  2. Unpack the .docx archive into a working directory.
  3. Edit the extracted XML and related assets.
  4. Optionally add comments or handle redlines.
  5. Repack the directory into a new .docx.
  6. Validate the rebuilt document.

This approach is well suited to repeatable document transformations where precision matters more than interactive editing in Word.

Convert legacy .doc files first

The skill documentation explicitly notes that legacy .doc files should be converted before editing. The documented command is:

python scripts/office/soffice.py --headless --convert-to docx document.doc

If your incoming files are older Word binaries rather than modern .docx, this conversion step is important for a stable workflow.
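For batch conversion, the documented command can be wrapped in a small helper. The sketch below only reproduces the arguments shown above; the function names are hypothetical, the script path assumes you run from the skill's root directory, and where the converted file is written is not stated here, so check the output location yourself.

```python
import subprocess
import sys


def build_convert_command(doc_path, script="scripts/office/soffice.py"):
    """Return the argv list for the documented .doc to .docx conversion command."""
    return [sys.executable, script, "--headless", "--convert-to", "docx", doc_path]


def convert(doc_path):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_convert_command(doc_path), check=True)


print(" ".join(build_convert_command("document.doc")[1:]))
```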

Read document content

For document reading and analysis, the repository points to two practical options:

  • use pandoc when you want extracted text, including control over how tracked changes appear in the output
  • unpack the document when you need raw XML access

This makes docx useful both for content analysis and for structure-aware editing.
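When neither pandoc nor the unpack script is at hand, paragraph text can also be pulled straight from word/document.xml. This is a rough sketch using only the standard library, not the skill's own reader: it joins the `w:t` text of each `w:p` element and ignores everything else (formatting, deleted text, footnotes, headers). The sample document is a hand-written stand-in.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraphs(docx_source):
    """Return the visible text of each w:p paragraph in word/document.xml."""
    with zipfile.ZipFile(docx_source) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        for p in root.iter(f"{{{W}}}p")
    ]


# Stand-in package with two paragraphs, the first split across two runs.
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body>'
    "<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("word/document.xml", SAMPLE)

print(paragraphs(buffer))  # ['Hello, world', 'Second paragraph']
```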

Unpack a Word document for editing

The unpack script is the foundation of the editing workflow. It extracts the Office archive, pretty-prints XML files, and for DOCX can optionally merge adjacent runs or simplify tracked changes.

A typical usage pattern from the repository is:

python unpack.py document.docx unpacked/

The script lives at scripts/office/unpack.py, so either run it from that directory or pass the full path to the script.
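To make the "extract and pretty-print" step concrete, here is a simplified stand-in written with the standard library. It is not the repository's unpack.py and does none of the run merging or tracked-change simplification; it only extracts the archive and re-indents the XML parts. The `unpack_pretty` name and the sample document are assumptions for illustration.

```python
import io
import tempfile
import zipfile
from pathlib import Path
from xml.dom import minidom


def unpack_pretty(docx_source, out_dir):
    """Extract a .docx package and re-indent its XML parts in place."""
    out = Path(out_dir)
    with zipfile.ZipFile(docx_source) as package:
        package.extractall(out)
    for path in sorted(out.rglob("*")):
        if path.is_file() and path.suffix in {".xml", ".rels"}:
            pretty = minidom.parseString(path.read_bytes()).toprettyxml(indent="  ")
            path.write_text(pretty, encoding="utf-8")
    return sorted(p.relative_to(out).as_posix() for p in out.rglob("*") if p.is_file())


# Stand-in package containing a single one-line XML part.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr(
        "word/document.xml",
        f'<w:document xmlns:w="{W}"><w:body><w:p><w:r>'
        "<w:t>Hi</w:t></w:r></w:p></w:body></w:document>",
    )

with tempfile.TemporaryDirectory() as workdir:
    files = unpack_pretty(buffer, workdir)
    pretty_xml = (Path(workdir) / "word" / "document.xml").read_text(encoding="utf-8")

print(files)
print(pretty_xml)
```

After this step each element sits on its own line, which makes diffs and targeted edits much easier to review than the single-line XML Word normally writes.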

Edit XML carefully

After unpacking, you work directly with the WordprocessingML files inside the extracted directory. This is best for controlled changes such as:

  • replacing text at known XML locations
  • adjusting document metadata or structure
  • inserting references for comments
  • preparing a cleaned package for downstream generation

This is a strong fit for automation engineers and agent workflows, but less ideal for casual one-off editing by nontechnical users.
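A controlled text replacement might look like the sketch below. It assumes the target string sits inside a single `w:t` element; Word often splits visible text across several runs, in which case this simple approach finds nothing. Note also that ElementTree renames namespace prefixes it has not been told about, which can matter for real document.xml files, so treat this as an illustration of the idea and not a drop-in editor. The `replace_text` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the w: prefix when serializing


def replace_text(xml_text, old, new):
    """Replace `old` with `new` inside individual w:t elements; return (xml, count)."""
    root = ET.fromstring(xml_text)
    count = 0
    for t in root.iter(f"{{{W}}}t"):
        if t.text and old in t.text:
            count += t.text.count(old)
            t.text = t.text.replace(old, new)
    return ET.tostring(root, encoding="unicode"), count


SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p><w:r>'
    "<w:t>Draft report</w:t></w:r></w:p></w:body></w:document>"
)
updated, count = replace_text(SAMPLE, "Draft", "Final")
print(count)
print(updated)
```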

Add comments to a DOCX package

The repository includes scripts/comment.py for adding comments to unpacked DOCX content. The script documentation shows that comments can be added and replies can be attached with a parent comment reference.

A practical detail supported by the source: comment text must be XML-escaped, and comment markers must also be placed in document.xml correctly. This means docx is useful for programmatic review workflows, but it expects careful handling of Word XML conventions.
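XML escaping itself is a one-liner in Python. The sketch below uses the standard library's `xml.sax.saxutils.escape`, which replaces `&`, `<`, and `>`; how scripts/comment.py expects to receive the escaped text is not shown here, so the wrapper name is only illustrative.

```python
from xml.sax.saxutils import escape


def escape_comment_text(text):
    """Escape &, <, and > so the text can sit safely inside an XML element."""
    return escape(text)


escaped = escape_comment_text('Use <b> & "quotes" here')
print(escaped)  # Use &lt;b&gt; &amp; "quotes" here
```

Double quotes are left alone by default, which is fine for element content; they only need escaping inside attribute values.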

Accept tracked changes

If you need a clean version of a reviewed Word document, scripts/accept_changes.py is designed to accept all tracked changes using LibreOffice. The source explicitly states that LibreOffice is required.

This matters when deciding whether to install docx: if your workflow depends on resolving tracked changes, the included script gives you an automated path without manual acceptance in Word.
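To show what "accepting" means at the XML level, the sketch below handles the simplest case only: run-level `w:ins` and `w:del` wrappers that are direct children of a paragraph. It is not how scripts/accept_changes.py works (that script delegates to LibreOffice), and it ignores formatting changes, moves, and paragraph-mark revisions. The function name and sample markup are assumptions.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the w: prefix when serializing


def accept_run_level_changes(xml_text):
    """Drop w:del children of paragraphs and unwrap w:ins children."""
    root = ET.fromstring(xml_text)
    for paragraph in list(root.iter(f"{{{W}}}p")):
        kept = []
        for child in list(paragraph):
            if child.tag == f"{{{W}}}del":
                continue  # accepted deletion: the deleted runs disappear
            if child.tag == f"{{{W}}}ins":
                kept.extend(list(child))  # accepted insertion: keep runs, drop wrapper
            else:
                kept.append(child)
        paragraph[:] = kept
    return ET.tostring(root, encoding="unicode")


SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    "<w:r><w:t>Keep </w:t></w:r>"
    '<w:del w:id="1"><w:r><w:delText>old</w:delText></w:r></w:del>'
    '<w:ins w:id="2"><w:r><w:t>new</w:t></w:r></w:ins>'
    "</w:p></w:body></w:document>"
)
accepted = accept_run_level_changes(SAMPLE)
print(accepted)
```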

Repack and validate the final file

Once edits are complete, scripts/office/pack.py rebuilds the Office file. According to the source, it can validate, auto-repair, condense XML formatting, and write a .docx, .pptx, or .xlsx package.

For Word workflows, the main value is producing a valid .docx after direct XML edits. Validation support matters because Office files can fail in subtle ways after manual package changes.
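The sketch below illustrates the rebuild-then-check idea with the standard library. It is far simpler than the repository's pack.py and validators: the only "validation" is that every XML part parses, and there is no auto-repair or schema checking. Function names are hypothetical.

```python
import tempfile
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path


def malformed_parts(unpacked_dir):
    """Return relative paths of .xml/.rels files that fail to parse."""
    root = Path(unpacked_dir)
    bad = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in {".xml", ".rels"}:
            try:
                ET.parse(path)
            except ET.ParseError:
                bad.append(path.relative_to(root).as_posix())
    return bad


def pack(unpacked_dir, output_path):
    """Zip an unpacked directory into a package, refusing malformed XML."""
    root = Path(unpacked_dir)
    bad = malformed_parts(root)
    if bad:
        raise ValueError(f"malformed XML parts: {bad}")
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as package:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                package.write(path, path.relative_to(root).as_posix())


# Demo: pack a one-part directory, then break the part and check again.
with tempfile.TemporaryDirectory() as workdir, tempfile.TemporaryDirectory() as outdir:
    part = Path(workdir) / "word" / "document.xml"
    part.parent.mkdir()
    part.write_text("<document><body/></document>", encoding="utf-8")
    output = Path(outdir) / "rebuilt.docx"
    pack(workdir, output)
    with zipfile.ZipFile(output) as package:
        packed_names = package.namelist()
    part.write_text("<document><body>", encoding="utf-8")  # unclosed tags
    broken = malformed_parts(workdir)

print(packed_names)
print(broken)
```

A well-formedness check catches only the crudest breakage; a file can parse cleanly and still be rejected by Word, which is why the dedicated validators exist.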

Environment and sandbox considerations

scripts/office/soffice.py includes helper logic for running LibreOffice in environments where AF_UNIX sockets may be restricted. That is a useful implementation detail if you run document workflows in containers, remote sandboxes, or VM-based automation setups.

In short, docx is not just about document editing commands; it also includes operational tooling for making those commands work in less predictable runtime environments.
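If you suspect a sandbox restricts AF_UNIX sockets, a small probe can tell you before LibreOffice fails in a less obvious way. This is an illustrative check, not the detection logic used by scripts/office/soffice.py, and it only tests socket creation: an environment could allow creation but still block binding or connecting.

```python
import socket


def af_unix_available():
    """Return True if this process can create an AF_UNIX stream socket."""
    if not hasattr(socket, "AF_UNIX"):
        return False  # platform does not expose AF_UNIX at all
    try:
        probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    except OSError:
        return False  # creation blocked, for example by a sandbox policy
    probe.close()
    return True


print("AF_UNIX sockets available:", af_unix_available())
```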

Best use cases for teams

The docx skill is a strong match when your team needs:

  • repeatable Word document generation or cleanup
  • automated review and comment insertion
  • XML-level transformations that ordinary Office scripting cannot handle cleanly
  • validation before delivery to clients or internal stakeholders
  • migration from old .doc assets into modern .docx workflows

Reasons you might choose another approach

You may want another tool if:

  • you only need simple text export and do not care about Word-native fidelity
  • users will mostly edit interactively in Word rather than through automation
  • you need a fully open, redistributable library workflow rather than skill-bound materials
  • your environment cannot support the supporting Python and LibreOffice toolchain

FAQ

What is the docx skill mainly used for?

docx is mainly used for Word .docx workflows: creating, reading, editing, validating, commenting on, and transforming Word documents. It is especially useful when a task depends on Word-specific structure rather than plain text.

How do I install the docx skill?

Install the skill with npx skills add https://github.com/anthropics/skills --skill docx. After that, review SKILL.md and the scripts under scripts/ to understand the supported workflow.

Does docx support legacy .doc files?

Yes, but indirectly. The repository guidance says legacy .doc files should be converted to .docx before editing, using the LibreOffice-based scripts/office/soffice.py workflow.

Can docx work with tracked changes?

Yes. scripts/accept_changes.py accepts all tracked changes using LibreOffice, and the unpack step can optionally simplify adjacent tracked changes in DOCX files.

Can I add comments with the docx skill?

Yes. scripts/comment.py is specifically included for adding comments to DOCX documents, including reply relationships. This is one of the clearest Word review features provided by the skill.

Does docx validate documents after editing?

Yes. The repository includes validation logic under scripts/office/validate.py and scripts/office/validators/. The pack process also supports validation when rebuilding files.

Is docx only for Word files?

The docx skill is Word-centered, but some helper scripts under scripts/office/ also support .pptx and .xlsx packaging and validation. For installation decisions, though, the main value of docx is DOCX document work.

Is docx a good fit for nontechnical users?

Usually not as a primary editing tool. docx is best for technical, agent-assisted, or automation-heavy workflows because it relies on unpacking Office files, editing XML, and repacking them. If someone just wants to make a quick manual edit, Word itself is often simpler.

What should I inspect in the repository first?

Start with SKILL.md, then check scripts/office/unpack.py, scripts/office/pack.py, scripts/comment.py, scripts/accept_changes.py, and the validator modules. That gives a realistic picture of whether the docx skill matches your workflow and runtime environment.
