
data-analytics

by markdown-viewer

The data-analytics skill creates PlantUML diagrams for data analysis workflows, including ETL, ELT, data lakes, warehouses, streaming pipelines, log analytics, and BI dashboards. It is optimized for clear source-to-destination flow, AWS analytics/database stencils, and practical data-analytics guide output—not generic software or cloud architecture diagrams.

Stars: 1.1k
Favorites: 0
Comments: 0
Added: Apr 13, 2026
Category: Data Analysis
Install command: npx skills add markdown-viewer/skills --skill data-analytics
Curation Score

This skill scores 78/100, which makes it a solid listing candidate for directory users. It gives enough concrete workflow guidance to help an agent trigger the right kind of output (data analytics and pipeline diagrams in PlantUML) with less guesswork than a generic prompt, though users should expect a few adoption gaps, such as a missing install command and limited supporting files.

78/100
Strengths
  • Strong triggerability: the frontmatter clearly scopes the skill to data analytics and pipeline diagrams, with explicit NOT-use guidance against general UML/cloud modeling.
  • Operationally useful workflow: it gives a quick start, critical rules, and PlantUML-specific constraints like @startuml/@enduml, left-to-right flow, and async dashed links.
  • Good install decision value: multiple example files cover real analytics patterns such as ETL, data lakes, warehouses, CDC, log analytics, and BI dashboards.
Cautions
  • No support files or install command are provided, so adoption depends mainly on SKILL.md and examples rather than executable tooling.
  • The skill is narrowly specialized to AWS/MxGraph analytics stencils, so it is less useful for non-AWS analytics architectures or general diagramming.
Overview

Overview of data-analytics skill

The data-analytics skill helps you generate PlantUML diagrams for analytics systems: ETL flows, data lakes, warehouses, streaming pipelines, log analytics, and BI dashboards. It is the right fit when you need a data-analytics guide for turning a rough architecture into a clear diagram with AWS analytics and database stencils, not just a generic prompt that names components.

Use this data-analytics skill if you want fast, readable diagrams for data analysis workflows where the pipeline order matters: source, ingest, transform, store, and visualize. It is especially useful when you need to show governance, staging, cataloging, or near-real-time movement across systems.

Best fit for pipeline and warehouse diagrams

The skill is strongest when the output should communicate how data moves, not just what tools exist. That includes ETL/ELT, CDC, lakehouse-style layouts, Redshift-centered warehouses, and operational-to-analytics handoffs. If your goal is a data-analysis diagram that stakeholders can skim quickly, this skill is a good match.

What makes this skill different

The repository is opinionated about diagram structure and syntax: it expects PlantUML fences, @startuml / @enduml, left-to-right flow, and mxgraph.aws4.* stencil icons. That makes the resulting diagrams more consistent than a free-form prompt, and it reduces guesswork around icon choice and layout.
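As a rough illustration of those conventions, a minimal stencil-free sketch might look like the following. The service names are placeholders chosen for this example, and a real output from the skill would additionally attach the mxgraph.aws4.* icons it specifies:

```plantuml
@startuml
left to right direction

' Source-to-destination pipeline, read left to right
rectangle "RDS (source)" as src
rectangle "Glue ETL" as etl
rectangle "Redshift" as dw
rectangle "QuickSight" as bi

' Solid arrows: synchronous path; dashed link: async convention
src --> etl : extract
etl --> dw : load
dw ..> bi : async refresh
@enduml
```

The `left to right direction` line and the dashed `..>` link follow the flow and async conventions the skill enforces.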

When not to use it

Do not use data-analytics for general software architecture, UML class diagrams, or broad cloud infrastructure maps. If the main story is application components rather than data movement, a different skill will produce a better result and fewer corrections.

How to Use data-analytics skill

Install and verify the skill context

For a normal data-analytics install, add the skill from the repo and then inspect the top-level instruction file first:

  1. Install with npx skills add markdown-viewer/skills --skill data-analytics.
  2. Open SKILL.md to confirm the diagram rules.
  3. Check the example files in examples/ before drafting your own prompt.

The skill is compact, so the examples matter more than a long rules section. They show the actual syntax patterns the model is expected to follow.

Start from the workflow, not the tool list

A strong request to the data-analytics skill describes the data story in stages, not as a bag of AWS services. For example, instead of “make a warehouse diagram with Redshift and Glue,” use a prompt that specifies:

  • sources: RDS, S3, Kafka, DynamoDB
  • ingest path: batch, streaming, CDC, or scheduled ETL
  • transforms: validation, schema mapping, enrichment
  • destination: S3 lake, Redshift, Athena, or OpenSearch
  • consumers: dashboards, analysts, ML features, or alerts

That structure helps the skill choose the right stencils and arrows.
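A staged prompt along those lines might render as a sketch like this one. The systems, arrows, and labels here are illustrative placeholders, not actual skill output:

```plantuml
@startuml
left to right direction

' Stages: source -> ingest/transform -> store -> query -> consume
rectangle "Kafka" as kafka
rectangle "RDS" as rds
rectangle "Validation + schema mapping" as xform
rectangle "S3 lake" as lake
rectangle "Athena" as athena
rectangle "Dashboards" as dash

kafka --> xform : streaming ingest
rds --> xform : batch ingest
xform --> lake : write Parquet
lake --> athena : query
athena ..> dash : scheduled refresh
@enduml
```

Naming each stage in the prompt maps directly onto a node or arrow in the diagram, which is what keeps the output focused.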

Read the right examples first

For the fastest ramp-up, preview these files in order:

  • SKILL.md
  • examples/etl-pipeline.md
  • examples/data-lake.md
  • examples/data-warehouse.md
  • examples/real-time-streaming.md
  • examples/multi-source-bi.md

If your use case is specialized, also inspect examples/cdc-pipeline.md, examples/log-analytics.md, or examples/ml-feature-pipeline.md. These examples show how the data-analytics skill handles edge cases like asynchronous flow, warehouse loading, and feature engineering.

Prompting tips that improve output quality

A good prompt for this skill gives enough domain detail to prevent generic diagrams. Include the source systems, whether flow is batch or streaming, and what “done” means for the data. For example, “show daily orders from PostgreSQL into S3 Parquet, then Glue ETL into Redshift for QuickSight reporting” is much better than “draw an analytics pipeline.”
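For instance, that daily-orders prompt could plausibly yield a diagram along these lines (a hand-written sketch of the shape, not verified skill output):

```plantuml
@startuml
left to right direction

' "Done" here means data visible in QuickSight reports
rectangle "PostgreSQL (orders)" as pg
rectangle "S3 (Parquet)" as s3
rectangle "Glue ETL" as glue
rectangle "Redshift" as rs
rectangle "QuickSight" as qs

pg --> s3 : daily export
s3 --> glue : crawl + transform
glue --> rs : load
rs ..> qs : report refresh
@enduml
```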

If you need a tighter result, specify the stages you want visible and the ones you want omitted. That keeps the diagram focused and avoids unnecessary boxes.

data-analytics skill FAQ

Is this only for AWS-based diagrams?

Mostly yes. The data-analytics skill is built around mxgraph.aws4.* stencils, so it is best when AWS services are part of the architecture or when you want AWS-style analytics symbols. If your stack is mostly non-AWS, the skill may still work, but the output will be less natural.

How is this different from a normal prompt?

A normal prompt can describe a pipeline, but the data-analytics skill encodes diagram syntax, flow direction, and icon conventions. That matters when you want reliable PlantUML output instead of a one-off sketch. The skill produces more repeatable results because it nudges the model toward a consistent structure.

Is it beginner-friendly?

Yes, if you can describe your data flow in plain language. You do not need to know PlantUML deeply, but you do need to name the major stages and endpoints clearly. Beginners usually get the best results by copying one example pattern and replacing the systems with their own.

When should I choose a different skill?

Use something else if you need generic UML, app service topology, or provider-neutral cloud infrastructure. data-analytics is strongest when the primary subject is the movement and transformation of data, not the deployment of applications.

How to Improve data-analytics skill

Give the skill the business outcome

The best data-analytics results come from prompts that explain why the diagram exists. Say whether the audience is an engineer, analyst, or executive, and whether the diagram must emphasize latency, governance, cost, or reporting. That changes which stages deserve visual prominence.

Include the constraints that affect the design

If the pipeline has schema drift, late-arriving events, compliance boundaries, or multiple consumers, mention that up front. Those constraints help the skill choose meaningful elements like crawlers, catalogs, staging buckets, or async arrows instead of a simplistic straight line.

Use concrete inputs and preferred shape

Stronger inputs look like this:

  • “Batch ETL from Salesforce and PostgreSQL into S3, then Redshift, with a Glue crawler and data quality gate”
  • “Real-time clickstream from Kinesis to Lambda enrichment, then OpenSearch and S3 archive”
  • “CDC from Aurora and DynamoDB into a warehouse with staging and replay handling”

These are better than vague requests because they define the path, not just the destination.

Iterate by checking the weakest stage first

After the first diagram, review the part that most often breaks trust: source labeling, transform naming, or sink selection. If the flow is correct but too broad, narrow the prompt to a single pipeline. If the diagram is correct but too thin, add one more stage that matters operationally, such as a catalog, validation step, or BI consumer.

Ratings & Reviews

No ratings yet
Share your review
Sign in to leave a rating and comment for this skill.
G
0/10000
Latest reviews
Saving...