read-file
by duckdb

read-file helps an agent read and inspect CSV, JSON, Parquet, Avro, Excel, SQLite, spatial files, or remote URLs with DuckDB. Use it to preview rows, check schema, profile data, and answer “what’s in this file?” It is best used on real data artifacts, not source code.
This skill scores 74/100, which means it is worth listing for directory users: it has a real, usable workflow for reading many file types and remote URLs with DuckDB, but it is still somewhat limited in discoverability and adoption guidance. Users can likely trigger it successfully, yet they may need some extra judgment around setup and fit.
- Strong triggerability: the frontmatter says it is for reading data files or remote URLs and explicitly excludes source code, which helps agents route requests correctly.
- Concrete operational workflow: it gives a step-by-step DuckDB command pattern, including a single inline macro and protocol-specific handling for HTTP, S3, GCS, and Azure.
- Good agent leverage: it covers many data formats in one skill (CSV, JSON, Parquet, Avro, Excel, spatial, SQLite, blob), reducing guesswork versus a generic prompt.
- Install decision context is somewhat thin: the description is very short, and there are no support files, references, or README to help users evaluate edge cases or integration fit.
- The file is workflow-heavy but not fully self-contained in the preview; users may still need to inspect the full SQL/bash example to understand exact behavior and limits.
Overview of read-file skill
The read-file skill helps an agent read and inspect data files with DuckDB instead of guessing from filename alone. It is best for users who need a fast preview, schema check, or lightweight profile of CSV, JSON, Parquet, Avro, Excel, SQLite, spatial files, or a remote URL. If your job is “tell me what’s in this file” or “summarize this dataset,” the read-file skill is a strong fit; if you need to edit source code, it is not.
What read-file is for
The core job-to-be-done is quick data understanding: read the file, identify the format, and answer a question about contents, shape, or obvious issues. This is more useful than a generic prompt because the skill is built around DuckDB’s file readers and supports local paths plus common remote sources such as https:// and s3://.
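The underlying pattern can be sketched in DuckDB SQL. The file and column names below are illustrative, and the skill's actual inline macro lives in SKILL.md, so treat this as a minimal sketch of the read path rather than the skill's exact implementation:

```sql
-- Preview and schema check for a local file; DuckDB picks the reader
-- from the extension for CSV, JSON, and Parquet paths.
SELECT * FROM 'sales_q1.csv' LIMIT 10;
DESCRIBE SELECT * FROM 'sales_q1.csv';

-- Remote files need the httpfs extension for https:// and s3:// URLs.
INSTALL httpfs;
LOAD httpfs;
SELECT count(*) FROM 's3://bucket/events.parquet';
```

The same two-step pattern, DESCRIBE for shape then a small SELECT for content, is what “preview rows and check schema” means in practice.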
When it fits best
Use the read-file skill when the input is a real data artifact and you need an answer grounded in the file itself. It is especially useful for first-pass analysis before loading data into a notebook, pipeline, or BI tool.
Key differentiators
The main advantage of read-file is its format breadth and its one-command workflow. It is designed to reduce setup friction, resolve bare filenames, and handle multiple storage backends without asking the agent to invent a parser from scratch.
How to Use read-file skill
Install and invoke read-file
Install the read-file skill in the repository’s skill system, then call it with a path or URL plus a short question. A practical invocation looks like: read-file sales_q1.csv “what columns exist and are there nulls?” The install flow matters because the skill expects a DuckDB-backed environment, not a generic chat-only prompt.
Give the skill the right input
The best read-file usage starts with a concrete file reference and a question that matches the file type. Strong inputs name the file, the source, and the outcome you want: read-file s3://bucket/events.parquet “summarize row count, key columns, and date range.” Weak inputs like “analyze this” force the skill to guess what matters.
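A request like the Parquet example above maps onto a few short DuckDB queries. The bucket path and the event_date column are hypothetical; substitute whatever the DESCRIBE output reveals:

```sql
-- Row count, schema, and date range for a remote Parquet file.
SELECT count(*) AS rows FROM 's3://bucket/events.parquet';
DESCRIBE SELECT * FROM 's3://bucket/events.parquet';
SELECT min(event_date) AS first_day,
       max(event_date) AS last_day
FROM 's3://bucket/events.parquet';  -- assumes an event_date column
```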
Read the repository files first
When working with the read-file repository, start with SKILL.md and then inspect any adjacent repo files that explain conventions or agent behavior. In this repository, SKILL.md is the primary source of truth; there are no supporting rules/, resources/, or scripts/ folders to widen the workflow. That means the most important preparation is understanding the macro-based DuckDB read path and the remote-file prefixes.
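The macro-based read path can be approximated with a DuckDB table macro. The skill's real macro dispatches across formats; this sketch shows only the CSV case, and the macro name is made up for illustration:

```sql
-- Minimal sketch: a table macro wrapping a single reader.
-- The skill's actual macro handles many formats; this is CSV only.
CREATE OR REPLACE MACRO preview_csv(path) AS TABLE
    SELECT * FROM read_csv_auto(path) LIMIT 20;

-- Usage:
SELECT * FROM preview_csv('sales_q1.csv');
```

Defining the read path as a macro is what keeps the skill a one-command workflow: the agent substitutes a path instead of composing a fresh query per format.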
Workflow tips that improve output
Transform a vague task into a specific analysis request before invoking the skill. Ask for the exact slice you need, such as “show columns, types, first 20 rows, and suspicious blanks” or “compare sheets in this Excel file.” For Office documents, be explicit about the workbook or sheet if you already know it, because that reduces misreads and saves tool calls.
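When the sheet is known, that hint maps onto DuckDB's Excel support. Recent DuckDB releases ship an excel extension with a read_xlsx function; older setups read workbooks through the spatial extension instead, so check which path your environment uses. File and sheet names here are illustrative:

```sql
-- Read one named sheet from a workbook (excel extension,
-- recent DuckDB versions).
INSTALL excel;
LOAD excel;
SELECT * FROM read_xlsx('report.xlsx', sheet = 'Q1') LIMIT 10;
```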
read-file skill FAQ
Is read-file only for data files?
Yes. The skill is intended for structured or semi-structured data, not for application source code or prose docs. If the user wants code review, use a different skill or a direct code-reading prompt.
Do I need DuckDB knowledge to use it?
No. The skill hides most of the DuckDB complexity, but better results come from giving a focused question. Beginners can use it safely if they can point to a file and say what they want to know.
How is this different from asking an AI to “open the file”?
read-file is more reliable because it uses an explicit file-reading workflow and format-aware loaders. That reduces hallucinated summaries and improves behavior on mixed file types, remote URLs, and larger datasets.
When should I not use read-file?
Do not use it when the file is source code, when you need heavy transformation, or when the input is not actually a file or URL. It is also a poor fit if you need full database operations rather than inspection and summary.
How to Improve read-file skill
Ask for the analysis you actually need
The biggest quality jump comes from narrowing the task. Instead of “summarize this spreadsheet,” try “identify the top 10 categories, missing values by column, and any suspicious outliers.” The read-file skill responds best to questions that map cleanly to table inspection.
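A narrowed request like that translates into concrete inspection queries. Column and file names below are hypothetical:

```sql
-- Top 10 categories by frequency.
SELECT category, count(*) AS n
FROM 'data.csv'
GROUP BY category
ORDER BY n DESC
LIMIT 10;

-- Per-column profile (min, max, null percentage) via DuckDB's
-- built-in SUMMARIZE.
SUMMARIZE SELECT * FROM 'data.csv';
```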
Provide format-specific hints
If the file is an Excel workbook, say whether you care about one sheet or all sheets. If it is a remote file, include the full URL and, when relevant, the storage type. These details help the skill choose the correct read path and avoid wasted probing.
Watch for common failure modes
The most common issue is ambiguity: bare filenames, multiple similar files, or asking for a business answer without defining the dataset slice. Another failure mode is treating read-file like an editing or ETL skill. Keep the task centered on reading, profiling, and explaining the file contents.
Iterate after the first pass
Use the first output to refine the next prompt. If the initial read reveals columns, ask for deeper checks on only the important fields: duplicates, null patterns, date coverage, or group-level totals. That is the fastest way to get better read-file results without overloading the first call.
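Once the first pass reveals the schema, those follow-up checks read naturally as short queries. The id and event_date columns are assumed for illustration:

```sql
-- Duplicate keys.
SELECT id, count(*) AS copies
FROM 'events.parquet'
GROUP BY id
HAVING count(*) > 1;

-- Date coverage: distinct days present vs. the full span.
SELECT count(DISTINCT event_date) AS days_present,
       max(event_date) - min(event_date) AS span_days
FROM 'events.parquet';
```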
