diffdock

by K-Dense-AI

diffdock is a docking skill for predicting protein-ligand binding poses from PDB structures or protein sequences plus ligands in SMILES, SDF, or MOL2. Use the diffdock skill for structure-based drug design, virtual screening, and confidence-scored pose analysis. It is not for binding affinity prediction.

Stars: 21.3k

Favorites: 0

Comments: 0

Added: May 14, 2026

Category: Data Analysis

Install Command

npx skills add K-Dense-AI/claude-scientific-skills --skill diffdock

Curation Score

This skill scores 78/100, which means it is a solid listing candidate for Agent Skills Finder. Directory users get enough real workflow content to decide on install: the skill clearly targets DiffDock protein-ligand docking, includes batch and single-complex workflows, and adds supporting scripts plus reference docs that reduce guesswork beyond a generic prompt.

Strengths

Clear task trigger: the frontmatter and overview explicitly frame the skill for diffusion-based molecular docking from PDB/SMILES inputs.
Operational workflow support: repository includes 3 scripts plus batch CSV and inference config templates, which help agents prepare inputs and analyze outputs.
Good guidance depth: reference docs cover parameters, workflows/examples, and confidence/limitations, improving install decision value and execution clarity.

Cautions

No install command in SKILL.md, so users may need to infer setup from referenced workflows rather than follow an in-repo one-step install path.
The skill is focused on pose prediction and confidence, not affinity prediction, so users seeking binding-energy estimation will need additional tools.

Bioinformatics Machine Learning Python Scientific Drug Sensitivity Protein Biology Pharmaceutical

Overview of diffdock skill

What diffdock is for

DiffDock is a docking-focused skill for predicting protein-ligand binding poses from a protein structure or sequence plus a ligand input. Use the diffdock skill when you need a practical answer to “where and how might this compound bind?” rather than a binding-affinity estimate.

Best fit and decision boundary

It fits structure-based drug design, virtual screening, and pose generation for downstream analysis. It is a weaker fit if you only need ranking by potency, if your protein target is highly flexible, or if you want a generic chemistry workflow instead of a pose-prediction workflow.

What makes it useful

The main value is that diffdock combines single-complex docking, batch screening, confidence scoring, and sequence-based protein input in one workflow. That makes the skill worth installing when you want both an executable docking path and enough guidance to avoid misreading the scores.

How to Use diffdock skill

Install and inspect the workflow

Install the diffdock skill in your Claude skills setup, then open SKILL.md first. After that, read references/workflows_examples.md, references/parameters_reference.md, and references/confidence_and_limitations.md to understand the actual input shapes, defaults, and score interpretation before running a job.

Turn your task into a usable prompt

State the protein format, ligand format, and job type up front. Good input is specific, for example: “Dock this SMILES to this PDB and return the top 5 poses with confidence interpretation,” or “Prepare batch docking for these ligands against one receptor.” Weak input is just “run diffdock,” because it leaves open whether the skill should use a file, a sequence, or a CSV batch.

Use the right files and outputs

For single docking, start with a protein PDB and a ligand in SMILES, SDF, or MOL2. For batch work, use the CSV template in assets/batch_template.csv and check scripts/prepare_batch_csv.py if you need validation before execution. After a run, scripts/analyze_results.py helps summarize pose ranks and confidence scores so you do not manually inspect every output file.
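As a rough illustration of the batch input, the sketch below writes one CSV row per ligand against a single receptor. The column names are an assumption taken from upstream DiffDock's example CSV (`complex_name`, `protein_path`, `ligand_description`, `protein_sequence`); check them against assets/batch_template.csv before relying on them. The helper names `build_batch_rows` and `write_batch_csv` and the file name `receptor.pdb` are hypothetical.

```python
import csv
import io

# Assumed column names (from upstream DiffDock's example CSV); verify them
# against assets/batch_template.csv in the skill before use.
COLUMNS = ["complex_name", "protein_path", "ligand_description", "protein_sequence"]


def build_batch_rows(protein_path, ligands):
    """Return one row per ligand, all docked against the same receptor file."""
    rows = []
    for name, ligand in ligands.items():
        rows.append({
            "complex_name": name,
            "protein_path": protein_path,
            "ligand_description": ligand,  # SMILES string or path to an SDF/MOL2 file
            "protein_sequence": "",        # left empty because a structure file is given
        })
    return rows


def write_batch_csv(rows, handle):
    """Write the rows with a header line to any text file handle."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


ligands = {
    "aspirin": "CC(=O)OC1=CC=CC=C1C(=O)O",
    "caffeine": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
}
rows = build_batch_rows("receptor.pdb", ligands)

buffer = io.StringIO()  # swap for open("batch.csv", "w", newline="") to write a file
write_batch_csv(rows, buffer)
csv_text = buffer.getvalue()
```

Keeping row construction separate from writing makes it easy to validate or filter rows before anything touches disk.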

Practical setup tips

Installation and the first run can be slow because of model-weight setup and lookup-table generation, so plan for that one-time cost. If your protein is not available as a structure, the skill supports sequence-based folding, but that adds uncertainty; use it when no experimental structure exists, not as a default shortcut. Increase sampling only when the task is hard, because more samples improve search coverage but also increase compute and post-processing work.
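The sampling trade-off can be made explicit by building the run command in one place. This is a minimal sketch, assuming the skill drives upstream DiffDock's `python -m inference` entry point with the `--config`, `--protein_ligand_csv`, `--out_dir`, and `--samples_per_complex` flags; flag names and the default config file can differ between DiffDock versions, so confirm them in references/parameters_reference.md. The function name `build_batch_command` and the paths are hypothetical.

```python
import shlex


def build_batch_command(csv_path, out_dir, samples_per_complex=10,
                        config="default_inference_args.yaml"):
    """Build an argv list for a batch docking run (flag names are assumptions)."""
    if samples_per_complex < 1:
        raise ValueError("samples_per_complex must be at least 1")
    return [
        "python", "-m", "inference",
        "--config", config,
        "--protein_ligand_csv", csv_path,
        "--out_dir", out_dir,
        "--samples_per_complex", str(samples_per_complex),
    ]


# A quick first pass, then a more thorough rerun for a hard target.
quick = build_batch_command("batch.csv", "results/quick")
thorough = build_batch_command("batch.csv", "results/thorough", samples_per_complex=40)

print(shlex.join(thorough))
```

Returning a list rather than a shell string means the command can be passed straight to `subprocess.run` without quoting problems.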

diffdock skill FAQ

Is diffdock only for PDB files?

No. The diffdock skill supports protein structures and, in some workflows, protein sequences that are folded before docking. It is still best to use an actual PDB when you have one, because sequence-derived structures add another source of error.

Does diffdock predict affinity?

No. DiffDock predicts binding poses and confidence, not binding affinity. If you need affinity-like prioritization, pair diffdock with a scoring or rescoring step instead of treating confidence as potency.
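To keep confidence and affinity apart in practice, it can help to label scores as pose-confidence bands and nothing more. The thresholds below (above 0 high, between -1.5 and 0 moderate, below -1.5 low) follow the rough guideline given in the upstream DiffDock documentation and are an assumption here; confirm them against references/confidence_and_limitations.md. The function name `confidence_band` is hypothetical.

```python
def confidence_band(score):
    """Map a DiffDock confidence score to a coarse pose-confidence label.

    The label describes how much to trust the pose geometry. It says nothing
    about binding strength or potency.
    """
    if score > 0:
        return "high"
    if score > -1.5:
        return "moderate"
    return "low"


scores = [0.42, -0.80, -2.10]
labels = [confidence_band(s) for s in scores]
```

A compound with a "high" label still needs a separate scoring or rescoring step before it is ranked by expected affinity.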

Is the diffdock skill beginner friendly?

Yes, if your job is straightforward: one receptor, one ligand, one pose question. It becomes harder when you need batch curation, flexible proteins, or careful interpretation of low-confidence samples. The skill lowers the barrier to running a docking job, but it does not replace domain judgment.

When should I not use it?

Do not rely on diffdock for targets where conformational change is the main binding mechanism, or when you only have a very uncertain ligand representation. It is also a poor substitute for a full medicinal chemistry analysis workflow if your real question is SAR, selectivity, or ADMET.

How to Improve diffdock skill

Give the skill better molecular context

The strongest diffdock results usually come from clean inputs: a correct receptor file, a ligand with a known protonation assumption, and a clear definition of the binding problem. If the site is known, say so. If it is a blind docking task, say that too, because the search strategy and expected confidence differ.

Ask for the output you will actually use

Specify whether you want the top pose, the top 5 poses, batch screening, or confidence-ranked candidates. If you plan to compare results later, ask for consistent file naming and a summary table. This reduces ambiguity and makes the output easier to integrate into downstream data analysis or screening reports.

Watch the common failure modes

The most common mistakes are treating confidence as affinity, using poor ligand preparation, and overtrusting runs on proteins outside the model’s comfort zone. If results look unstable, rerun with more samples, compare multiple top poses, and inspect whether the ligand chemistry or protein state is the actual blocker rather than the model.

Iterate with targeted follow-up prompts

After the first run, build the next diffdock prompt around the specific problem: bad site placement, inconsistent pose clustering, or low confidence scores. That is more useful than asking for a generic rerun. When the results feed a data analysis, name the metric you want extracted from the outputs, such as rank distribution, score thresholds, or per-complex summaries.
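A per-complex summary of the kind described above can be sketched in a few lines. This example assumes output poses are named like `rank1_confidence-0.80.sdf`, which is the upstream DiffDock convention; the skill's own scripts/analyze_results.py may work differently, and the names `parse_pose` and `summarize` are hypothetical.

```python
import re
from statistics import mean

# Assumed naming convention: rank<N>_confidence<score>.sdf
PATTERN = re.compile(r"rank(\d+)_confidence(-?\d+(?:\.\d+)?)\.sdf$")


def parse_pose(filename):
    """Return (rank, confidence) for a pose file, or None if the name does not match."""
    match = PATTERN.search(filename)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def summarize(files_by_complex, threshold=0.0):
    """Summarize pose counts and confidence scores for each complex."""
    summary = {}
    for name, files in files_by_complex.items():
        # Sorting the (rank, score) tuples orders poses by rank.
        poses = sorted(p for p in map(parse_pose, files) if p is not None)
        scores = [score for _, score in poses]
        summary[name] = {
            "n_poses": len(poses),
            "top_score": scores[0] if scores else None,
            "mean_score": mean(scores) if scores else None,
            "n_above_threshold": sum(score > threshold for score in scores),
        }
    return summary


example = {
    "complex_a": [
        "rank1.sdf",  # no confidence in the name, so it is skipped
        "rank1_confidence0.42.sdf",
        "rank2_confidence-0.80.sdf",
        "rank3_confidence-2.10.sdf",
    ],
}
result = summarize(example)
```

The `threshold` argument is the score cutoff to count against; the default of 0.0 is only a placeholder and should match whatever cutoff your analysis actually uses.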

