imaging-data-commons
by K-Dense-AI
imaging-data-commons helps you query and download public cancer imaging data from NCI Imaging Data Commons with idc-index. Use it across CT, MR, PET, and pathology datasets for metadata search, browser preview, licensing checks, and AI training or data analysis workflows. No authentication required.
This skill scores 82/100, which means it is a solid directory listing for users who need IDC cancer imaging access. The repository gives enough operational detail for an agent to trigger the skill correctly, understand when to use idc-index versus BigQuery/DICOMweb/cloud storage, and execute common workflows with less guesswork than a generic prompt.
- Strong triggerability: the frontmatter clearly says it is for querying and downloading public cancer imaging data from NCI IDC, with no authentication required.
- Good workflow depth: the SKILL.md is large and supported by 10 reference guides covering CLI, clinical data, DICOMweb, BigQuery, cloud storage, pathology, index tables, and SQL patterns.
- High practical leverage: includes version pinning and explicit guidance for when to use each access path, reducing agent ambiguity for real tasks.
- No install command in SKILL.md, so users may need to infer setup steps from the references and code snippets.
- The repository is heavily reference-driven rather than script-backed, so some advanced workflows may still require the agent to synthesize steps from multiple docs.
Overview of imaging-data-commons skill
What imaging-data-commons does
The imaging-data-commons skill helps you query and download public cancer imaging data from the NCI Imaging Data Commons using idc-index. It is best for researchers, ML engineers, and analysts who need radiology or pathology cohorts without first building a custom data ingestion stack.
Who should install it
Use the imaging-data-commons skill if you need to find studies by metadata, inspect available collections, check licensing, preview data in a browser, or pull data for AI training and analysis. It is a strong fit when you want public IDC data with no authentication required.
Why it is different
This skill is not just a generic prompt for “find medical images.” It is anchored to IDC’s data model, versioning, and access patterns, so it can guide you toward the right path for CT, MR, PET, and digital pathology. The main value is reducing guesswork around where to query, what to download, and when to use index tables versus broader access methods.
How to Use imaging-data-commons skill
Install imaging-data-commons
Install the imaging-data-commons skill from the directory package first, then open the skill file and follow its linked references:
npx skills add K-Dense-AI/claude-scientific-skills --skill imaging-data-commons
Start with the right inputs
The skill works best when you provide a concrete target, not a vague “help me explore IDC.” Good inputs include the modality, cancer type, collection name, desired output format, and whether you need metadata only or actual file downloads.
Example of a strong prompt:
“Use the imaging-data-commons skill to find public CT lung cancer collections with clinical labels, then show the best collection IDs and the download path for a small pilot cohort.”
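The inputs listed above can also be collected in a small structure before the prompt is written, which makes a missing input easy to spot. The sketch below is illustrative only: `CohortRequest`, its fields, and `to_prompt` are hypothetical names, not part of the skill or of idc-index.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CohortRequest:
    """Hypothetical container for the inputs a strong prompt should carry."""

    modality: str                     # e.g. "CT", "MR", "PET"
    cancer_type: str                  # free text, e.g. "lung cancer"
    output_format: str                # what the answer should contain
    collection: Optional[str] = None  # known collection name, if any
    download_files: bool = False      # False means metadata only

    def to_prompt(self) -> str:
        """Render the request as a single prompt string."""
        if self.collection:
            scope = f"in collection {self.collection}"
        else:
            scope = "across public collections"
        need = "metadata and file downloads" if self.download_files else "metadata only"
        return (
            f"Use the imaging-data-commons skill to find {self.modality} "
            f"{self.cancer_type} data {scope}. I need {need}. "
            f"Return {self.output_format}."
        )


request = CohortRequest(
    modality="CT",
    cancer_type="lung cancer",
    output_format="the best collection IDs and a download path for a small pilot cohort",
)
print(request.to_prompt())
```

Leaving `collection` unset keeps the request open across collections; setting `download_files=True` signals that the answer should include a download route, not only metadata.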
Read these files first
For practical execution, read SKILL.md first, then inspect references/use_cases.md, references/cli_guide.md, references/index_tables_guide.md, and the domain guide that matches your task, such as references/digital_pathology_guide.md or references/cloud_storage_guide.md. Those files tell you whether to use the CLI, SQL patterns, index tables, BigQuery, DICOMweb, or direct cloud storage.
Use a decision-first workflow
A good workflow is decision-first: identify the data type, choose the least complex access method that fits, confirm collection-level licensing, then query or download only the subset you need. For data extraction tasks, ask the skill to return the exact collection or series filters, the expected file counts, and the recommended access route before you move to download.
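The "check scope before download" step can be sketched as a small query builder that counts series and sums their size per collection and license. This is a minimal sketch under stated assumptions: the column names (`collection_id`, `Modality`, `SeriesInstanceUID`, `series_size_MB`, `license_short_name`) are assumed to match the idc-index index table and should be verified against the skill's references/index_tables_guide.md; `index_table` is a stand-in table name; and the rows are synthetic, loaded into an in-memory SQLite database only so the example runs without network access. Series count is used here as a rough proxy for file count.

```python
import sqlite3
from typing import List, Optional, Tuple


def build_scope_query(modality: str, collection_id: Optional[str] = None) -> Tuple[str, List[str]]:
    """Build a parameterized query summarizing series count and size.

    Column names are assumptions about the idc-index index table.
    """
    sql = (
        "SELECT collection_id, license_short_name, "
        "COUNT(*) AS series_count, SUM(series_size_MB) AS total_mb "
        "FROM index_table WHERE Modality = ?"
    )
    params = [modality]
    if collection_id is not None:
        sql += " AND collection_id = ?"
        params.append(collection_id)
    sql += " GROUP BY collection_id, license_short_name ORDER BY collection_id"
    return sql, params


# Synthetic stand-in rows; these are NOT real IDC collections or UIDs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE index_table ("
    "collection_id TEXT, Modality TEXT, SeriesInstanceUID TEXT, "
    "series_size_MB REAL, license_short_name TEXT)"
)
conn.executemany(
    "INSERT INTO index_table VALUES (?, ?, ?, ?, ?)",
    [
        ("demo_a", "CT", "uid-1", 120.5, "CC BY 4.0"),
        ("demo_a", "CT", "uid-2", 80.0, "CC BY 4.0"),
        ("demo_a", "MR", "uid-3", 45.0, "CC BY 4.0"),
        ("demo_b", "CT", "uid-4", 300.0, "CC BY-NC 4.0"),
    ],
)

sql, params = build_scope_query("CT")
summary = conn.execute(sql, params).fetchall()
for row in summary:
    print(row)
```

With real data, a similar query would be run through whichever query interface the skill recommends for the index table; the point of the sketch is only that the size and license summary comes back before any file is downloaded.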
imaging-data-commons skill FAQ
Is imaging-data-commons only for radiology?
No. The imaging-data-commons skill covers radiology and pathology workflows, including slide microscopy, segmentations, and related metadata access. If your task is pathology-heavy, use the matching reference guide rather than assuming the same query pattern fits every dataset.
Do I need cloud credentials or special access?
Usually no. The core workflow is designed around public data access, and many common queries do not require authentication. You may need extra setup only for specific paths such as BigQuery or cloud-native workflows.
When should I not use this skill?
Do not use it if you need private hospital data, fully harmonized clinical data across unrelated sources, or a one-line generic image search. It is also a poor fit if you have not decided whether you need metadata discovery, browser visualization, or actual download automation.
Is it beginner friendly?
Yes, if you begin with a concrete objective and let the skill choose the access method. Beginners usually struggle when they ask for “everything in IDC”; they get better results when they specify a disease area, modality, and the intended downstream task.
How to Improve imaging-data-commons skill
Give the skill a tighter target
The fastest way to get better results is to state the cohort boundary and the output you need up front. Compare “find IDC data” with “find 50 public PET-CT series for NSCLC, favor collections with clinical labels, and give me a download-ready shortlist.”
Include constraints that change the path
Tell the skill about licensing limits, commercial use restrictions, storage limits, and whether you prefer CLI, Python, SQL, or browser-based inspection. These constraints matter because they determine whether idc-index, BigQuery, DICOMweb, or direct cloud storage is the right route.
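A commercial-use restriction is one constraint that can be checked mechanically once collection licenses are known. The sketch below is a simplification, not the skill's own logic: it assumes licenses are given as Creative Commons short names such as "CC BY 4.0" or "CC BY-NC 4.0", and treats any name containing the NonCommercial element "NC" as unsuitable for commercial use. The collection names are placeholders, and the full license text should always be read before relying on a result.

```python
from typing import Dict, List


def allows_commercial_use(license_short_name: str) -> bool:
    """Return False when the short name carries the NonCommercial (NC) element.

    Assumption: short names follow the Creative Commons pattern,
    e.g. "CC BY 4.0" or "CC BY-NC 4.0".
    """
    tokens = license_short_name.upper().replace("-", " ").split()
    return "NC" not in tokens


def filter_collections(licenses: Dict[str, str], commercial_use: bool) -> List[str]:
    """Keep every collection, or only commercially usable ones, sorted by name."""
    if not commercial_use:
        return sorted(licenses)
    return sorted(name for name, lic in licenses.items() if allows_commercial_use(lic))


# Placeholder collection names mapped to example license short names.
candidates = {
    "demo_a": "CC BY 4.0",
    "demo_b": "CC BY-NC 4.0",
    "demo_c": "CC BY 3.0",
}
print(filter_collections(candidates, commercial_use=True))
```

Stating the constraint up front lets the shortlist exclude unsuitable collections before any query or download is planned.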
Ask for a two-step output
For better data analysis results, ask for discovery first and execution details second: first the relevant collections and recommended filters, then the exact command or query skeleton. That reduces false starts and makes it easier to validate the first answer before downloading large datasets.
Iterate with evidence, not guesswork
If the first result is too broad, narrow it by modality, anatomy, license, or collection name, then ask for a smaller cohort or an alternative access path. The best improvement signal is usually not “more detail,” but a better-defined retrieval target and a clearer handoff from discovery to download.
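When the narrowing goal is simply a smaller cohort that fits a storage limit, the trimming step can be sketched as below. This is one possible rule, not the skill's method: `trim_to_budget` is a hypothetical helper, the series identifiers and sizes are made up, and picking the smallest series first biases the cohort toward small acquisitions, so random sampling may be the better choice for model training.

```python
from typing import List, Tuple


def trim_to_budget(series: List[Tuple[str, float]], budget_mb: float) -> List[str]:
    """Select series smallest-first until adding one more would exceed the budget."""
    selected: List[str] = []
    used_mb = 0.0
    for uid, size_mb in sorted(series, key=lambda item: item[1]):
        if used_mb + size_mb > budget_mb:
            break  # every remaining series is at least this large
        selected.append(uid)
        used_mb += size_mb
    return selected


# Made-up identifiers and sizes in MB, for illustration only.
pilot = trim_to_budget(
    [("uid-1", 120.0), ("uid-2", 40.0), ("uid-3", 300.0), ("uid-4", 75.0)],
    budget_mb=250.0,
)
print(pilot)  # ['uid-2', 'uid-4', 'uid-1']
```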
