rdkit

by K-Dense-AI

The rdkit skill supports precise cheminformatics workflows: parsing SMILES, SDF, MOL, PDB, and InChI; calculating descriptors; generating fingerprints; running substructure searches; handling reactions; and building 2D/3D coordinates. Use it when you need advanced control, custom sanitization, or RDKit-based data analysis.

Stars: 21.4k

Added: May 14, 2026

Category: Data Analysis

Install Command

npx skills add K-Dense-AI/claude-scientific-skills --skill rdkit

Curation Score

This skill scores 84/100, which means it is a solid directory listing for users who need RDKit-specific cheminformatics control. The repository shows real workflow content, clear trigger guidance, and helper scripts that reduce guesswork versus a generic prompt, though it is more reference-heavy than turnkey.

Strengths

Explicitly scopes when to use rdkit vs. datamol, helping agents choose the right tool for advanced molecular control.
Includes substantial workflow coverage in SKILL.md plus three supporting scripts for properties, similarity search, and substructure filtering.
Backed by reference files for API calls, descriptors, and SMARTS patterns, which make it easier for an agent to trigger the skill at the right time and apply it correctly.

Cautions

No install command in SKILL.md, so users may need to handle environment setup separately.
Some content is reference-oriented rather than step-by-step, so first-time adoption may still require RDKit familiarity.

Tags: Python, Chemistry, Drug Discovery, Bioinformatics

Overview of rdkit skill

What rdkit is for

The rdkit skill is for cheminformatics work that needs precise molecular handling: parsing SMILES, SDF/MOL/PDB/InChI, computing descriptors, generating fingerprints, running substructure search, and working with reactions or 2D/3D coordinates. It is most useful when a simple prompt is not enough and you need the rdkit skill to apply the right API patterns, sanitization steps, and file formats.

Best-fit users and jobs

Use the rdkit skill if you are doing molecule cleanup, property calculation, similarity screening, library filtering, or structure-based data preparation for drug discovery and computational chemistry. It is also a strong fit for data analysis that needs reproducible batch processing over many molecules rather than one-off notebook exploration.

Why this skill is different

This rdkit skill favors fine-grained control over convenience. The repository supports direct Python API use plus helper scripts and reference files for descriptors, SMARTS, and similarity workflows. That makes it better for advanced control, custom sanitization, and specialized algorithms than a generic prompt or a lightweight wrapper.

How to Use rdkit skill

Install and trigger context

Install the skill in your Claude skills environment, then make your request explicit about the molecule source, output goal, and constraints. State both the chemistry task and the data shape, such as SMILES in a CSV, an SDF file, a batch library, or a single query molecule.

Give the skill the right input

Strong inputs include the exact structure format, the target operation, and any chemistry rules. For example: “Use rdkit to read this SDF, remove invalid molecules, calculate MW/LogP/TPSA, and export a CSV with canonical SMILES.” If you need substructure work, include the SMARTS pattern and say whether matching molecules should be kept or removed.
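As an illustration of why the SMARTS pattern and the keep-or-remove choice both matter, here is a minimal sketch of a substructure filter. The function name `filter_by_smarts`, the `keep_matches` flag, and the sample library are hypothetical and are not taken from the repository's scripts; only the RDKit calls themselves are standard.

```python
from rdkit import Chem


def filter_by_smarts(smiles_list, smarts, keep_matches=True):
    """Return the SMILES that match (or, if keep_matches is False, do not match) a SMARTS pattern."""
    pattern = Chem.MolFromSmarts(smarts)
    if pattern is None:
        raise ValueError(f"Invalid SMARTS: {smarts!r}")
    kept = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # Unparsable input is skipped here, not repaired (an assumption of this sketch).
            continue
        if mol.HasSubstructMatch(pattern) == keep_matches:
            kept.append(smi)
    return kept


# Hypothetical sample data: acetic acid, ethanol, benzoic acid, ethylamine.
library = ["CC(=O)O", "CCO", "c1ccccc1C(=O)O", "CCN"]
carboxylic_acid = "[CX3](=O)[OX2H1]"

acids = filter_by_smarts(library, carboxylic_acid, keep_matches=True)
non_acids = filter_by_smarts(library, carboxylic_acid, keep_matches=False)
print(acids)
print(non_acids)
```

The same pattern yields complementary result sets depending on the flag, which is why a prompt that omits the direction of the filter is ambiguous.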

Read these files first

Start with SKILL.md, then inspect references/api_reference.md, references/descriptors_reference.md, and references/smarts_patterns.md for the supported methods and pattern syntax. If you plan to automate batch work, read scripts/molecular_properties.py, scripts/similarity_search.py, and scripts/substructure_filter.py to see the repo’s practical workflow shape.

Workflow tips that improve output

Prefer a staged prompt: parse, validate, transform, then export. State whether sanitization should be strict or permissive, whether stereochemistry matters, and whether you want canonical SMILES or the original atom ordering preserved. This prevents a common failure mode in which molecules parse successfully but downstream descriptors or fingerprints are computed on the wrong form.
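The parse and validate stages can be sketched as follows. This is one possible way to separate syntax errors from sanitization failures, assuming a hypothetical helper `parse_smiles` with a `strict` flag; the repository's own scripts may handle this differently.

```python
from rdkit import Chem


def parse_smiles(smi, strict=True):
    """Parse a SMILES string and return (mol, problem); problem is None on success."""
    # Stage 1: parse without sanitization so syntax errors are reported separately.
    mol = Chem.MolFromSmiles(smi, sanitize=False)
    if mol is None:
        return None, "syntax error"
    # Stage 2: sanitize, asking RDKit to report the failing step instead of raising.
    failed = Chem.SanitizeMol(mol, catchErrors=True)
    if failed != Chem.SanitizeFlags.SANITIZE_NONE:
        if strict:
            return None, f"sanitization failed at {failed}"
        # Permissive mode: keep the molecule but flag it for the caller.
        return mol, f"kept unsanitized ({failed})"
    return mol, None


ok_mol, ok_problem = parse_smiles("OCC")
print(Chem.MolToSmiles(ok_mol), ok_problem)

# A carbon with five bonds parses syntactically but fails the valence check.
bad_strict = parse_smiles("C(C)(C)(C)(C)C", strict=True)
bad_permissive = parse_smiles("C(C)(C)(C)(C)C", strict=False)
print(bad_strict[1])
print(bad_permissive[1])
```

In permissive mode the returned molecule has not passed sanitization, so descriptors or fingerprints computed from it should be treated with caution.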

rdkit skill FAQ

Is rdkit better than a normal prompt?

Usually yes, when the task depends on exact APIs, file I/O, SMARTS syntax, or batch processing. A normal prompt can describe cheminformatics concepts, but the rdkit skill is better when you need concrete code paths and fewer assumptions about molecule formats.

When should I not use rdkit?

Do not choose rdkit if you only need high-level molecule summaries with minimal control. The repository itself notes that datamol can be a simpler wrapper around RDKit for standard workflows, so rdkit is the better fit when you need direct API control rather than convenience.

Is it beginner-friendly?

Yes, if the task is scoped tightly. Beginners can ask for simple rdkit usage like converting SMILES to properties or filtering molecules by a SMARTS pattern. The main blocker is usually not chemistry knowledge but ambiguous input: unclear file type, missing charge/stereo rules, or no target output schema.

What should I expect from the ecosystem?

Expect Python-first workflows with RDKit modules, helper scripts, and reference tables rather than a large app framework. The rdkit skill works best when you already know the molecule source and want a practical analysis or transformation pipeline.

How to Improve rdkit skill

Start with the decision that matters most

The biggest quality gain comes from specifying the molecular representation and the success criterion. Tell the rdkit skill whether the task is descriptor calculation, similarity search, substructure filtering, or structure conversion, and define what counts as a valid result, such as “only sanitized molecules” or “keep stereochemistry intact.”

Provide chemistry constraints up front

Common failure modes are hidden assumptions about salts, tautomers, explicit hydrogens, aromaticity, and invalid structures. If those matter, say so directly: for example, “strip salts before descriptors,” “preserve original stereochemistry,” or “treat failed sanitization as a rejection instead of repairing it.”
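To show how much a constraint like "strip salts before descriptors" can change a result, here is a small sketch that keeps only the largest fragment of a molecule. Choosing the largest fragment by heavy-atom count is a simplifying assumption of this example, not the skill's documented method, and `largest_fragment` is a hypothetical helper name.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors


def largest_fragment(mol):
    """Return the fragment with the most heavy atoms (a simple stand-in for salt stripping)."""
    fragments = Chem.GetMolFrags(mol, asMols=True)
    return max(fragments, key=lambda frag: frag.GetNumHeavyAtoms())


# Sodium acetate written as two disconnected fragments.
salt = Chem.MolFromSmiles("CC(=O)[O-].[Na+]")
parent = largest_fragment(salt)

mw_with_salt = Descriptors.MolWt(salt)
mw_parent = Descriptors.MolWt(parent)
print(Chem.MolToSmiles(parent))
print(round(mw_with_salt, 2), round(mw_parent, 2))
```

The molecular weight differs by the mass of the counterion, so descriptor tables built with and without this step are not comparable.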

Use concrete prompt patterns

Stronger prompts look like this: “Using rdkit, read molecules.smi, reject invalid SMILES, compute MW, LogP, and TPSA, and write a CSV with canonical SMILES and a passed flag.” That is better than “analyze these molecules,” because it tells the skill what to parse, what to calculate, and how to format the result.
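A prompt of that shape maps onto code roughly like the sketch below. It works on an in-memory list instead of reading molecules.smi, and the column names, the `describe` and `to_csv` helpers, and the rounding to two decimals are assumptions made for illustration.

```python
import csv
import io

from rdkit import Chem
from rdkit.Chem import Crippen, Descriptors, rdMolDescriptors

FIELDS = ["input", "canonical_smiles", "MW", "LogP", "TPSA", "passed"]


def describe(smiles_list):
    """Compute MW, LogP, and TPSA per molecule; invalid SMILES get passed=False."""
    rows = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            # Rejected, not repaired: the row is kept so failures stay visible.
            rows.append({"input": smi, "passed": False})
            continue
        rows.append({
            "input": smi,
            "canonical_smiles": Chem.MolToSmiles(mol),
            "MW": round(Descriptors.MolWt(mol), 2),
            "LogP": round(Crippen.MolLogP(mol), 2),
            "TPSA": round(rdMolDescriptors.CalcTPSA(mol), 2),
            "passed": True,
        })
    return rows


def to_csv(rows):
    """Serialize rows to CSV text; missing columns are written as empty strings."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical input: ethanol, benzene, and a SMILES with an unclosed ring.
rows = describe(["OCC", "c1ccccc1", "C1CC"])
csv_text = to_csv(rows)
print(csv_text)
```

Keeping rejected inputs in the output with a passed flag makes it possible to audit how many molecules were dropped and why.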

Iterate from output quality, not just code

After the first pass, check whether the output matches your chemistry rules and downstream toolchain. If results look off, refine the prompt with one additional constraint at a time: fingerprint type, SMARTS library, descriptor set, or export format. For data analysis work, this usually improves reproducibility more than asking for more features.
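Fingerprint type is a good example of a single constraint worth pinning down, because the same pair of molecules can score differently under different fingerprints. The sketch below compares two choices; the `FINGERPRINTS` registry, the `tanimoto` helper, and the chosen parameters (Morgan radius 2, 2048 bits) are assumptions for illustration, and it requires an RDKit version that ships `rdFingerprintGenerator`.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import MACCSkeys, rdFingerprintGenerator

# Morgan generator with explicit parameters so the choice is visible and repeatable.
_morgan = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)

# Hypothetical registry mapping a name used in prompts to a fingerprint function.
FINGERPRINTS = {
    "morgan_r2": _morgan.GetFingerprint,
    "maccs": MACCSkeys.GenMACCSKeys,
}


def tanimoto(smi_a, smi_b, fp_name):
    """Tanimoto similarity between two SMILES under the named fingerprint."""
    make_fp = FINGERPRINTS[fp_name]
    mol_a = Chem.MolFromSmiles(smi_a)
    mol_b = Chem.MolFromSmiles(smi_b)
    if mol_a is None or mol_b is None:
        raise ValueError("both SMILES must parse")
    return DataStructs.TanimotoSimilarity(make_fp(mol_a), make_fp(mol_b))


aspirin = "CC(=O)Oc1ccccc1C(=O)O"
salicylic_acid = "O=C(O)c1ccccc1O"

scores = {name: tanimoto(aspirin, salicylic_acid, name) for name in FINGERPRINTS}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Because the scores are not interchangeable across fingerprint types, any similarity threshold in a prompt should be stated together with the fingerprint it applies to.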
