OCR

Browse agent skills tagged with OCR and compare related workflows across the directory.

12 skills

visa-doc-translate

by affaan-m

visa-doc-translate translates images of visa application documents into English and creates a bilingual PDF containing the original page and its translation. It is built for structured visa paperwork and handles OCR fallback, page rotation, and preservation of names, dates, and amounts.

Translation

Favorites 0 | GitHub 156.3k

nutrient-document-processing

by affaan-m

The nutrient-document-processing skill covers PDF processing and document automation with the Nutrient DWS API. Use it to convert, OCR, extract from, redact, sign, watermark, and fill files such as PDF, DOCX, XLSX, PPTX, HTML, and images.

PDF Processing

Favorites 0 | GitHub 156.2k

pdf

by anthropics

The pdf skill guides PDF processing tasks such as text extraction, merging and splitting, rendering pages to images, and PDF form workflows. It is especially useful for checking fillable fields, extracting form metadata, and validating non-fillable form layouts with scripts.

PDF Processing

Favorites 0 | GitHub 105.1k

azure-ai-vision-imageanalysis-py

by microsoft

The azure-ai-vision-imageanalysis-py skill helps you install and use the Azure AI Vision Image Analysis SDK for Python. It covers captions, tags, objects, OCR, people detection, and smart cropping, with backend-focused setup, authentication, and environment guidance for Azure-based image understanding workflows.

Backend Development

Favorites 0 | GitHub 2.3k

azure-ai-document-intelligence-ts

by microsoft

azure-ai-document-intelligence-ts is a TypeScript skill for extracting text, tables, key-value fields, and structured data with Azure Document Intelligence. Use it for OCR extraction from invoices, receipts, IDs, and forms, or when you need prebuilt and custom model workflows in Node.js with Azure REST SDK authentication.

OCR Extraction

Favorites 0 | GitHub 2.3k

azure-ai-contentunderstanding-py

by microsoft

azure-ai-contentunderstanding-py is the Python skill for Azure AI Content Understanding. It extracts structured content from documents, images, audio, and video for RAG workflows and automation. Use it when you need reliable multimodal extraction, Azure authentication, and repeatable pipeline-ready output.

RAG Workflows

Favorites 0 | GitHub 2.2k

azure-ai-vision-imageanalysis-java

by microsoft

azure-ai-vision-imageanalysis-java helps you build Java image analysis apps with Azure AI Vision. Use it for captioning, OCR, object detection, tagging, people detection, and smart cropping. It covers SDK setup, authentication, and examples for API development.

API Development

Favorites 0 | GitHub 2.2k

azure-ai-formrecognizer-java

by microsoft

The azure-ai-formrecognizer-java skill helps Java developers use Azure AI Document Intelligence for OCR extraction, tables, key-value pairs, invoices, receipts, IDs, and custom document models. It aligns with the current com.azure:azure-ai-documentintelligence SDK and is useful when you need practical Java setup, API guidance, and repeatable document analysis.

OCR Extraction

Favorites 0 | GitHub 2.2k

azure-ai-document-intelligence-dotnet

by microsoft

azure-ai-document-intelligence-dotnet helps .NET developers install and use Azure AI Document Intelligence to extract text, tables, key-value pairs, and structured fields from invoices, receipts, IDs, and custom documents. It includes practical guidance on setup, authentication, and OCR extraction for reliable document analysis.

OCR Extraction

Favorites 0 | GitHub 2.2k

pdf

by K-Dense-AI

The pdf skill is a practical guide to PDF processing for when you need to read, extract, transform, or create PDF files in a workflow you can ship. It covers text extraction, merging, splitting, rotation, form filling, encryption, image extraction, and OCR for scanned PDFs. Use it when you need a repeatable guide rather than a one-off prompt.

PDF Processing

Favorites 0 | GitHub 0

markitdown

by K-Dense-AI

markitdown converts files and office documents to Markdown for easier reading, chunking, search, and LLM workflows. It supports PDF, DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, ZIP, EPUB, images with OCR, and audio transcription, which makes it a practical choice for format conversion.

Format Conversion

Favorites 0 | GitHub 0

nutrient-document-processing

by PSPDFKit-labs

nutrient-document-processing is a workflow skill for PDF processing with Nutrient DWS. It helps you install, understand, and use repeatable document workflows for converting, merging, splitting, OCR, extraction, redaction, signing, optimization, and compliance outputs such as PDF/A or PDF/UA.

PDF Processing

Favorites 0 | GitHub 0