read-file
by duckdb

read-file helps an agent read and inspect CSV, JSON, Parquet, Avro, Excel, SQLite, spatial files, or remote URLs with DuckDB. Use it to preview rows, check schema, profile data, and answer “what’s in this file?” It is best used on real data artifacts, not source code.
This skill scores 74/100, which means it is worth listing for directory users: it has a real, usable workflow for reading many file types and remote URLs with DuckDB, but it is still somewhat limited in discoverability and adoption guidance. Users can likely trigger it successfully, yet they may need some extra judgment around setup and fit.
- Strong triggerability: the frontmatter says it is for reading data files or remote URLs and explicitly excludes source code, which helps agents route requests correctly.
- Concrete operational workflow: it gives a step-by-step DuckDB command pattern, including a single inline macro and protocol-specific handling for HTTP, S3, GCS, and Azure.
- Good agent leverage: it covers many data formats in one skill (CSV, JSON, Parquet, Avro, Excel, spatial, SQLite, blob), reducing guesswork versus a generic prompt.
- Install decision context is somewhat thin: the description is very short, and there are no support files, references, or README to help users evaluate edge cases or integration fit.
- The file is workflow-heavy but not fully self-contained in the preview; users may still need to inspect the full SQL/bash example to understand exact behavior and limits.
Overview of read-file skill
The read-file skill helps an agent read and inspect data files with DuckDB instead of guessing from filename alone. It is best for users who need a fast preview, schema check, or lightweight profile of CSV, JSON, Parquet, Avro, Excel, SQLite, spatial files, or a remote URL. If your job is “tell me what’s in this file” or “summarize this dataset,” the read-file skill is a strong fit; if you need to edit source code, it is not.
What read-file is for
The core job-to-be-done is quick data understanding: read the file, identify the format, and answer a question about contents, shape, or obvious issues. This is more useful than a generic prompt because the skill is built around DuckDB’s file readers and supports local paths plus common remote sources such as https:// and s3://.
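The underlying pattern can be sketched in DuckDB SQL. The file and column names below are illustrative, and the skill's actual inline macro lives in SKILL.md, so treat this as a minimal sketch of the read path rather than the skill's exact implementation:

```sql
-- Preview and schema check for a local file; DuckDB picks the reader
-- from the extension for CSV, JSON, and Parquet paths.
SELECT * FROM 'sales_q1.csv' LIMIT 10;
DESCRIBE SELECT * FROM 'sales_q1.csv';

-- Remote files need the httpfs extension for https:// and s3:// URLs.
INSTALL httpfs;
LOAD httpfs;
SELECT count(*) FROM 's3://bucket/events.parquet';
```

The same two-step pattern, DESCRIBE for shape then a small SELECT for content, is what “preview rows and check schema” means in practice.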
When it fits best
Use the read-file skill when the input is a real data artifact and you need an answer grounded in the file itself. It is especially useful for first-pass analysis before loading data into a notebook, pipeline, or BI tool.
Key differentiators
The main advantage of read-file is its format breadth and its one-command workflow. It is designed to reduce setup friction, resolve bare filenames, and handle multiple storage backends without asking the agent to invent a parser from scratch.
How to Use read-file skill
Install and invoke read-file
Install the read-file skill in the repository’s skill system, then call it with a path or URL plus a short question. A practical invocation looks like: read-file sales_q1.csv “what columns exist and are there nulls?” The install flow matters because the skill expects a DuckDB-backed environment, not a generic chat-only prompt.
Give the skill the right input
The best read-file usage starts with a concrete file reference and a question that matches the file type. Strong inputs name the file, the source, and the outcome you want: read-file s3://bucket/events.parquet “summarize row count, key columns, and date range.” Weak inputs like “analyze this” force the skill to guess what matters.
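A request like the Parquet example above maps onto a few short DuckDB queries. The bucket path and the event_date column are hypothetical; substitute whatever the DESCRIBE output reveals:

```sql
-- Row count, schema, and date range for a remote Parquet file.
SELECT count(*) AS rows FROM 's3://bucket/events.parquet';
DESCRIBE SELECT * FROM 's3://bucket/events.parquet';
SELECT min(event_date) AS first_day,
       max(event_date) AS last_day
FROM 's3://bucket/events.parquet';  -- assumes an event_date column
```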
Read the repository files first
When working with the read-file repository, start with SKILL.md and then inspect any adjacent repo files that explain conventions or agent behavior. In this repository, SKILL.md is the primary source of truth; there are no supporting rules/, resources/, or scripts/ folders to widen the workflow. That means the most important preparation is understanding the macro-based DuckDB read path and the remote-file prefixes.
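The macro-based read path can be approximated with a DuckDB table macro. The skill's real macro dispatches across formats; this sketch shows only the CSV case, and the macro name is made up for illustration:

```sql
-- Minimal sketch: a table macro wrapping a single reader.
-- The skill's actual macro handles many formats; this is CSV only.
CREATE OR REPLACE MACRO preview_csv(path) AS TABLE
    SELECT * FROM read_csv_auto(path) LIMIT 20;

-- Usage:
SELECT * FROM preview_csv('sales_q1.csv');
```

Defining the read path as a macro is what keeps the skill a one-command workflow: the agent substitutes a path instead of composing a fresh query per format.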
Workflow tips that improve output
Transform a vague task into a specific analysis request before invoking the skill. Ask for the exact slice you need, such as “show columns, types, first 20 rows, and suspicious blanks” or “compare sheets in this Excel file.” For Office documents, be explicit about the workbook or sheet if you already know it, because that reduces misreads and saves tool calls.
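When the sheet is known, that hint maps onto DuckDB's Excel support. Recent DuckDB releases ship an excel extension with a read_xlsx function; older setups read workbooks through the spatial extension instead, so check which path your environment uses. File and sheet names here are illustrative:

```sql
-- Read one named sheet from a workbook (excel extension,
-- recent DuckDB versions).
INSTALL excel;
LOAD excel;
SELECT * FROM read_xlsx('report.xlsx', sheet = 'Q1') LIMIT 10;
```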
read-file skill FAQ
Is read-file only for data files?
Yes. The skill is intended for structured or semi-structured data, not for application source code or prose docs. If the user wants code review, use a different skill or a direct code-reading prompt.
Do I need DuckDB knowledge to use it?
No. The skill hides most of the DuckDB complexity, but better results come from giving a focused question. Beginners can use it safely if they can point to a file and say what they want to know.
How is this different from asking an AI to “open the file”?
read-file is more reliable because it uses an explicit file-reading workflow and format-aware loaders. That reduces hallucinated summaries and improves behavior on mixed file types, remote URLs, and larger datasets.
When should I not use read-file?
Do not use it when the file is source code, when you need heavy transformation, or when the input is not actually a file or URL. It is also a poor fit if you need full database operations rather than inspection and summary.
How to Improve read-file skill
Ask for the analysis you actually need
The biggest quality jump comes from narrowing the task. Instead of “summarize this spreadsheet,” try “identify the top 10 categories, missing values by column, and any suspicious outliers.” The read-file skill responds best to questions that map cleanly to table inspection.
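A narrowed request like that translates into concrete inspection queries. Column and file names below are hypothetical:

```sql
-- Top 10 categories by frequency.
SELECT category, count(*) AS n
FROM 'data.csv'
GROUP BY category
ORDER BY n DESC
LIMIT 10;

-- Per-column profile (min, max, null percentage) via DuckDB's
-- built-in SUMMARIZE.
SUMMARIZE SELECT * FROM 'data.csv';
```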
Provide format-specific hints
If the file is an Excel workbook, say whether you care about one sheet or all sheets. If it is a remote file, include the full URL and, when relevant, the storage type. These details help the skill choose the correct read path and avoid wasted probing.
Watch for common failure modes
The most common issue is ambiguity: bare filenames, multiple similar files, or asking for a business answer without defining the dataset slice. Another failure mode is treating read-file like an editing or ETL skill. Keep the task centered on reading, profiling, and explaining the file contents.
Iterate after the first pass
Use the first output to refine the next prompt. If the initial read reveals columns, ask for deeper checks on only the important fields: duplicates, null patterns, date coverage, or group-level totals. That is the fastest way to get better read-file results without overloading the first call.
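Once the first pass reveals the schema, those follow-up checks read naturally as short queries. The id and event_date columns are assumed for illustration:

```sql
-- Duplicate keys.
SELECT id, count(*) AS copies
FROM 'events.parquet'
GROUP BY id
HAVING count(*) > 1;

-- Date coverage: distinct days present vs. the full span.
SELECT count(DISTINCT event_date) AS days_present,
       max(event_date) - min(event_date) AS span_days
FROM 'events.parquet';
```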
