datadog-cli

by softaworks

datadog-cli helps agents run Datadog CLI workflows for logs, traces, metrics, services, and dashboards. This guide covers setup with DD_API_KEY and DD_APP_KEY, running commands through npx @leoflores/datadog-cli, passing --site for non-US accounts, and updating dashboards safely during incident triage.

Added: Apr 1, 2026
Category: Observability
Install Command
npx skills add softaworks/agent-toolkit --skill datadog-cli
Curation Score

This skill scores 82/100, which means it is a solid directory listing candidate for users who want Datadog debugging workflows an agent can invoke with less guesswork than a generic prompt. The repository gives substantial command coverage, concrete examples, and reference docs, though install/setup guidance is slightly fragmented between the skill and README.

Strengths
  • Strong operational references cover logs, metrics, query syntax, dashboards, and common workflows, reducing command guesswork for agents.
  • Good triggerability: the description and examples clearly map to real debugging tasks like incident triage, trace following, log tailing, and dashboard work.
  • Trust-building safety guidance is explicit, especially the dashboards reference warning that updates are destructive and should follow a backup-first workflow.
Cautions
  • The setup/install path is split between SKILL.md's direct `npx @leoflores/datadog-cli` usage and the README's plugin install flow, so new users may be unsure which one to follow.
  • The skill depends on users already having valid Datadog API/app keys and familiarity with Datadog queries; there are no bundled automation or helper scripts.
Overview of datadog-cli skill

The datadog-cli skill helps an agent use Datadog from the command line for practical observability work: searching logs, tracing requests, querying metrics, listing services, and managing dashboards. It is best for engineers, SREs, platform teams, and AI-assisted incident responders who already have Datadog access and want faster triage without manually clicking through the UI.

What datadog-cli is for

Use datadog-cli when the real job is not “summarize Datadog,” but “investigate a production symptom with repeatable commands.” The skill is strongest when you need to:

  • narrow an incident by service, error type, or time window
  • pivot from logs to trace context
  • check whether a spike is new or normal
  • pull metrics quickly for a service or environment
  • inspect or update dashboards with CLI-driven workflows

Best-fit users

This datadog-cli skill fits users who:

  • already use Datadog for logs, metrics, traces, or dashboards
  • want an agent to generate correct commands instead of vague search suggestions
  • need incident triage workflows, not generic observability advice
  • are comfortable providing service names, time ranges, trace IDs, or dashboard IDs

If you do not have Datadog keys or do not know your service/tag conventions, setup and prompt quality will matter more than the skill itself.

Why this skill is more useful than a generic prompt

A normal prompt might say “look at Datadog logs.” This skill gives the agent a command-level path: logs search, logs tail, logs trace, logs context, logs patterns, logs compare, metrics query, errors, services, and dashboard operations. It also points to reference docs that matter for correct execution, especially query syntax and the dashboard update warnings.

Key adoption blockers to know first

The main blockers are operational, not conceptual:

  • DD_API_KEY and DD_APP_KEY are required
  • non-US Datadog accounts may need --site, such as datadoghq.eu
  • results depend heavily on correct Datadog query syntax
  • dashboard updates are destructive if fields are omitted

Verify these four points first, before judging the quality of the skill's output.

How to Use datadog-cli skill

Install and runtime context

The skill itself lives in softaworks/agent-toolkit, but the actual CLI it teaches the agent to run is:

npx @leoflores/datadog-cli <command>

Set credentials first:

export DD_API_KEY="your-api-key"
export DD_APP_KEY="your-app-key"

For non-US Datadog sites, pass --site:

npx @leoflores/datadog-cli logs search --query "*" --site datadoghq.eu

Before adopting the skill, validate two things: that the external CLI runs, and that your keys give working Datadog API access.
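The credential requirement can be checked before any command runs. The following is a minimal preflight sketch, assuming a POSIX shell; the function name `dd_preflight` is hypothetical and not part of the CLI. It only confirms that the two documented variables are non-empty and does not validate them against Datadog.

```shell
# Hypothetical helper: fail early when the documented credentials are missing.
# It checks presence only; it cannot tell whether the keys are valid.
dd_preflight() {
  missing=0
  if [ -z "${DD_API_KEY:-}" ]; then
    echo "DD_API_KEY is not set" >&2
    missing=1
  fi
  if [ -z "${DD_APP_KEY:-}" ]; then
    echo "DD_APP_KEY is not set" >&2
    missing=1
  fi
  return "$missing"
}
```

A typical use would be `dd_preflight && npx @leoflores/datadog-cli errors --from 1h --pretty`, so the CLI is only invoked once both variables exist.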

Read these files before first real use

This skill is unusually reference-driven. Read in this order:

  1. SKILL.md
  2. references/query-syntax.md
  3. references/logs-commands.md
  4. references/metrics.md
  5. references/workflows.md
  6. references/dashboards.md

That order heads off the most common first-run mistakes: bad filters, poorly chosen time windows, and unsafe dashboard edits.

Inputs the skill needs to work well

The datadog-cli skill performs best when your request includes at least some of:

  • service name, team name, or environment
  • time window like 15m, 1h, or 24h
  • symptom type: errors, latency, failed requests, deployment regression
  • trace ID, request ID, or timestamp if you have one
  • whether you want logs, metrics, dashboards, or a triage workflow
  • Datadog site if not default US

Weak input: “Check Datadog.”
Strong input: “Investigate payment-api 5xx errors in prod for the last hour, compare against the previous hour, then pull any related traces and CPU metrics.”

Turn a rough goal into a usable prompt

A good prompt tells the agent both the objective and the dimensions to narrow by.

Try this pattern:

Use datadog-cli for Observability triage.
Goal: identify why checkout failures increased after the last deploy.
Scope: service:payment-api env:prod
Time: last 1h, compare with previous 1h
Need: error summary, common log patterns, likely trace IDs, and key metrics
Site: datadoghq.eu

Why this works:

  • it gives the agent a workflow, not a single command
  • it includes query tags the CLI can actually use
  • it prevents the agent from searching too broadly

Best first commands for common jobs

For incident triage, start broad, then narrow:

npx @leoflores/datadog-cli errors --from 1h --pretty
npx @leoflores/datadog-cli logs compare --query "status:error" --period 1h --pretty
npx @leoflores/datadog-cli logs patterns --query "status:error" --from 1h --pretty

Then scope to service:

npx @leoflores/datadog-cli logs search --query "service:payment-api status:error env:prod" --from 1h --pretty

If you already have a trace:

npx @leoflores/datadog-cli logs trace --id "TRACE_ID" --from 24h --pretty

For service health:

npx @leoflores/datadog-cli metrics query --query "avg:system.cpu.user{env:prod,service:payment-api}" --from 1h --pretty
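
The commands above share the same prefix and, for non-US accounts, the same --site flag. A small wrapper can reduce repetition. This is a sketch only: the function name `ddcli` and the variables `DD_CLI_SITE` and `DD_DRY_RUN` are made up for illustration and are not settings the CLI itself reads.

```shell
# Hypothetical wrapper around the documented npx invocation.
# DD_CLI_SITE and DD_DRY_RUN are names invented for this sketch.
ddcli() {
  if [ -n "${DD_CLI_SITE:-}" ]; then
    # Append --site at the end, as in the documented --site example.
    set -- "$@" --site "$DD_CLI_SITE"
  fi
  if [ -n "${DD_DRY_RUN:-}" ]; then
    # Print the command for review instead of running it.
    # Note: the printed form does not preserve shell quoting.
    printf '%s\n' "npx @leoflores/datadog-cli $*"
  else
    npx @leoflores/datadog-cli "$@"
  fi
}
```

With `DD_DRY_RUN=1` set, `ddcli errors --from 1h --pretty` prints the command it would run, which is useful when reviewing what an agent proposes before executing it.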

Query syntax matters more than most users expect

Many weak results from datadog-cli are really query-quality problems. The skill relies on Datadog search syntax like:

  • service:api status:error
  • @http.status_code:>=500
  • service:api OR service:payment
  • @duration:[1000 TO 5000]
  • -status:info

If you know your fields, include them explicitly. If you do not, ask the agent to start with broader discovery queries, then tighten based on returned attributes.
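
Tightening a query usually means adding one more space-separated term, as in `service:api status:error`. The sketch below joins whatever filters are known into a single query string and skips empty ones; `dd_query` is a hypothetical helper name, not a CLI feature.

```shell
# Hypothetical helper: join non-empty filters into one space-separated query,
# matching the form of the examples above (e.g. "service:api status:error").
dd_query() {
  query=""
  for term in "$@"; do
    # Skip filters the caller could not fill in.
    [ -n "$term" ] || continue
    query="${query:+$query }$term"
  done
  printf '%s\n' "$query"
}
```

For example, `npx @leoflores/datadog-cli logs search --query "$(dd_query service:payment-api env:prod '@http.status_code:>=500')" --from 1h --pretty` passes all three filters as one query. Quote terms such as `@http.status_code:>=500` so the shell does not treat `>` as a redirect.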

Practical workflow for incident response

A strong investigation loop with datadog-cli is:

  1. get error overview with errors
  2. compare current period with prior period using logs compare
  3. cluster repeated failures with logs patterns
  4. narrow by service/env using logs search
  5. inspect surrounding activity with logs context
  6. pivot into distributed flow using logs trace
  7. confirm resource or throughput signals with metrics query

This is much better than repeatedly asking for “more logs,” because each command answers a different diagnostic question.
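
The seven steps can be sketched as a plan that is printed for review rather than executed. The function name `triage_plan` is hypothetical, and the flags are limited to those shown elsewhere in this guide. `logs context` appears only as a comment because its options are not shown here, and `TRACE_ID` is a placeholder to fill in from earlier results.

```shell
# Hypothetical planner: prints the triage sequence; it runs nothing.
# Usage: triage_plan SERVICE ENV [WINDOW]
triage_plan() {
  service=$1
  env=$2
  window=${3:-1h}
  cli="npx @leoflores/datadog-cli"
  cat <<EOF
$cli errors --from $window --pretty
$cli logs compare --query "status:error" --period $window --pretty
$cli logs patterns --query "status:error" --from $window --pretty
$cli logs search --query "service:$service status:error env:$env" --from $window --pretty
# logs context: see references/logs-commands.md for its options
$cli logs trace --id "TRACE_ID" --from 24h --pretty
$cli metrics query --query "avg:system.cpu.user{env:$env,service:$service}" --from $window --pretty
EOF
}
```

Running `triage_plan payment-api prod 1h` prints one line per step, which can be pasted into a prompt or executed one at a time as each answer narrows the next query.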

Dashboards need extra caution

The most important safety note in this repo is that dashboards update replaces the whole dashboard, not just changed fields. If fields like template variables, description, or notify list are omitted, they can be removed.

Before any update, the safe workflow is:

  1. fetch the dashboard to a temp file with --output
  2. preserve existing fields
  3. update using the full retained structure

This makes the datadog-cli skill suitable for dashboard work only if you are disciplined about backups and full-state updates.
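
One way to enforce step 2 is to compare the backup file with the update payload before sending anything. The check below is a crude sketch: `dashboard_fields_kept` is a hypothetical name, the field names `template_variables`, `description`, and `notify_list` are assumed from the Datadog dashboard JSON format, and matching is done with grep on quoted key names rather than real JSON parsing.

```shell
# Crude sketch: fail when a field present in the backup is absent from the
# update payload. grep on quoted key names can be fooled by string values
# that contain them; a JSON-aware tool would be more robust.
# Usage: dashboard_fields_kept BACKUP_FILE PAYLOAD_FILE
dashboard_fields_kept() {
  backup=$1
  payload=$2
  status=0
  for key in template_variables description notify_list; do
    if grep -q "\"$key\"" "$backup" && ! grep -q "\"$key\"" "$payload"; then
      echo "missing in payload: $key" >&2
      status=1
    fi
  done
  return "$status"
}
```

The function returns non-zero if the payload dropped any of the listed fields, so it can gate the update command in a script. The exact fetch and update commands are documented in references/dashboards.md.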

Output-quality tips that change results

To get better answers from the agent:

  • specify whether you want discovery, explanation, or exact commands
  • include service and env tags together when possible
  • choose a bounded time window first; widen only if needed
  • ask for comparison against a previous period when evaluating regressions
  • prefer a trace ID or timestamp if you already have one
  • ask for --pretty when human review matters

The biggest quality gain usually comes from giving a precise query target, not from asking for more verbose analysis.

When to use logs vs metrics vs dashboards

Use logs when you need concrete events, errors, or request details.
Use metrics when you need trends, resource usage, or rate/latency signals.
Use dashboards when you need existing operational context or want to package a view for a team.

If you ask the agent for all three at once, tell it the decision goal: root cause, blast radius, regression check, or dashboard creation.

datadog-cli skill FAQ

Is datadog-cli good for beginners?

Yes, if you already have Datadog access and understand basic concepts like services, tags, and time windows. No, if you are still learning what logs, traces, and metrics represent. The skill reduces command guesswork, but it does not remove the need to know your environment names and observability conventions.

What makes this different from using Datadog UI directly?

datadog-cli is better when you want repeatable, scriptable, agent-generated investigation steps. It is especially useful for rapid triage, prompt-driven debugging, and sharing exact commands. The UI is still better for deep visual exploration and ad hoc browsing.

When is datadog-cli not a good fit?

Do not use this skill if:

  • your organization blocks Datadog API key use
  • you need UI-only features not exposed by the CLI workflow
  • you want broad observability theory rather than Datadog-specific execution
  • you cannot provide enough context for the agent to form valid queries

Do I need to install anything besides the skill?

Yes. The critical runtime dependency is the Datadog CLI invoked as:

npx @leoflores/datadog-cli <command>

You also need DD_API_KEY and DD_APP_KEY. For some accounts, you must pass --site.

Is datadog-cli read-only, or can it change things too?

Mostly it helps inspect and investigate, but dashboard commands can modify state. That is where caution matters most. Read references/dashboards.md before allowing any update flow.

Is it better than asking an agent to “check logs”?

Yes, because the skill gives the agent concrete command families and reference docs. That usually means faster narrowing, fewer malformed queries, and more useful incident workflows than ordinary freeform prompting.

How to Improve datadog-cli skill

Start prompts with operational constraints

The fastest way to improve datadog-cli output is to include the constraints the CLI actually needs:

  • Datadog site
  • environment
  • service names
  • time range
  • identifiers like trace ID or dashboard ID
  • whether the task is read-only or allowed to modify dashboards

Without that, the agent often defaults to broad, low-signal commands.

Ask for a workflow, not just one command

A common failure mode is prompting for a single lookup when the problem needs a sequence. Better prompt:

Use datadog-cli to triage a spike in 5xx responses for service:checkout in env:prod over the last hour.
First compare against the prior hour, then identify top error patterns, then pull relevant traces, then check CPU and memory metrics.

This produces better investigations because it maps onto the repo's workflow references.

Provide stronger query ingredients

Good inputs include actual Datadog fields:

  • service:payment-api
  • env:prod
  • @http.status_code:>=500
  • @error.kind:TimeoutError
  • @duration:>=1000

If you only provide natural language like “the API is slow,” the agent must guess field names and filters. Field-level inputs let it produce accurate commands on the first attempt.

Handle dashboard edits with a safety-first prompt

If your task touches dashboards, explicitly require a backup-first workflow:

Use datadog-cli to update dashboard abc-def-ghi, but first export the current dashboard to a temp file, preserve template variables and description, and show the exact safe update command.
Do not produce a partial update.

This sharply reduces the biggest destructive risk in the skill.

Iterate after first output instead of broadening blindly

After the first command set, improve results by narrowing:

  • from all errors to one service
  • from 24h to the exact failure window
  • from generic logs to pattern grouping
  • from symptom to trace-level evidence
  • from logs to confirming metrics

This is better than asking the agent for “more detail,” which often just expands noise.

Common mistakes to avoid

The most common adoption and output problems are:

  • missing DD_API_KEY or DD_APP_KEY
  • forgetting --site for non-US Datadog
  • using weak or invalid query syntax
  • searching too wide a time range first
  • treating dashboard update as patch-like instead of full replacement
  • asking for observability help without naming the affected service or env

What to inspect in the repo when results feel weak

If the agent seems generic, go back to:

  • references/query-syntax.md for filter precision
  • references/logs-commands.md for command choice
  • references/workflows.md for investigation order
  • references/dashboards.md for safe modification patterns

That reading path usually fixes poor prompts faster than rewriting the whole request from scratch.

Best way to evaluate datadog-cli after installation

A practical acceptance test after installing datadog-cli is:

  1. run a known logs search
  2. run a scoped metrics query
  3. test one workflow command like errors or logs patterns
  4. confirm --site behavior if your account is outside the US
  5. avoid dashboard writes until the backup workflow is verified

If those succeed, the datadog-cli skill is likely ready for real incident and observability work.
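
The read-only checks in that list can be scripted. The runner below is a sketch: `run_check` and `failures` are hypothetical names, and it assumes the CLI exits with a non-zero status when a command fails, which this guide does not confirm.

```shell
# Hypothetical runner: execute each check, report pass/fail, count failures.
# Assumes a failing command exits non-zero (not confirmed for this CLI).
failures=0
run_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS $label"
  else
    echo "FAIL $label"
    failures=$((failures + 1))
  fi
}

# Example usage (requires credentials and network access):
# run_check "logs search" npx @leoflores/datadog-cli logs search --query "*" --from 1h
# run_check "metrics query" npx @leoflores/datadog-cli metrics query --query "avg:system.cpu.user{env:prod}" --from 1h
# run_check "errors" npx @leoflores/datadog-cli errors --from 1h
# echo "failed checks: $failures"
```

Dashboard writes are deliberately left out, in line with step 5.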
