Don't store it. Stream it. DING it.
One binary that wraps your CI job, training run, or batch script — and pings you when it matters.
```sh
brew install ding-labs/tap/ding
# or
curl -sf https://start.ding.ing | sh
```
Drop a `ding.yaml` next to your job. Two rules: one fires the moment a test fails, one fires once at the end with the totals.
```yaml
rules:
  # during the run — ping Slack the second a test fails
  - name: test_failed
    match:
      metric: test.failed
      condition: value > 0
    message: "❌ {{ .test_name }} failed on {{ .branch }}"
    alert:
      - notifier: slack

  # at end-of-run — one summary alert with the totals
  - name: run_summary
    mode: end-of-run
    match:
      metric: run.exit
      condition: exit_code != 0
    message: "Job failed in {{ .duration_seconds }}s on {{ .commit }}"
    alert:
      - notifier: slack
```
Then wrap your job:

```sh
ding run -- npm test
```
DING auto-detects GitHub Actions, GitLab CI, CircleCI, Jenkins, Buildkite, Argo, MLflow, and bare Kubernetes — and attaches `run_id`, `branch`, `commit`, `workflow`, `job`, and `actor` labels automatically. They're available in every alert message.
DING is a streaming alert engine that ships with your workload, not next to it. The job emits events; DING evaluates rules in-process; alerts fire during the run and a summary fires when the job exits. Both die together.
Two ways to run it:
- `ding run -- <cmd>` — wraps an ephemeral job (CI, training, batch). Captures its event stream, evaluates rules in real time, fires a summary at exit.
- `ding serve` — a long-running HTTP daemon for steady-state apps. Accepts events on `POST /ingest`, hot-reloads config.
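A wrapped job only needs to print events for `ding run` to evaluate. Here is a minimal sketch of an emitting script, assuming the JSON-lines event shape DING accepts and a made-up `train.loss` metric:

```python
import json

# Hypothetical training loop: each step appends one event and prints it
# as a JSON line. `ding run -- python train.py` would read these from
# stdout and evaluate its rules against them in real time.
events = []
for step, loss in enumerate([0.91, 0.44, 0.12]):
    event = {"metric": "train.loss", "value": loss, "step": step}
    events.append(event)
    print(json.dumps(event))
```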
Single static binary. No database, no agents, no cloud account. MIT licensed. No telemetry.
One YAML file. Lives in your repo. Ships with your code.
```yaml
rules:
  # during-run: fires whenever the condition is true
  - name: cpu_spike
    match:
      metric: cpu_usage
      condition: value > 95
    cooldown: 1m
    message: "CPU spike on {{ .host }}: {{ .value }}%"
    alert:
      - notifier: stdout

  # windowed: avg over 5m, fires while sustained
  - name: cpu_sustained
    match:
      metric: cpu_usage
      condition: avg(value) over 5m > 80
    cooldown: 10m
    message: "Sustained high CPU: {{ .avg }}% on {{ .host }}"
    alert:
      - notifier: stdout

  # end-of-run: fires once at exit, against accumulated state
  - name: slow_run
    mode: end-of-run
    match:
      metric: request.latency
      condition: avg(value) over 1h > 200
    message: "Avg latency this run: {{ .avg }}ms"
    alert:
      - notifier: slack
```
All condition forms:

```
value > 95                    # single event
avg(value) over 5m > 80       # average over window
max(value) over 1m >= 100
min(value) over 10s < 10
sum(value) over 30s > 0
count(value) over 2m > 50     # number of events, not sum of values
condition_a AND condition_b   # compound
condition_a OR condition_b
```
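A compound condition can pair a spike check with a sustained-window check in a single rule. An illustrative sketch (the `error_rate` metric and its threshold are assumptions):

```yaml
- name: errors_spiking
  match:
    metric: error_rate
    # fires only when a single event spikes AND the 5m average is elevated
    condition: value > 5 AND avg(value) over 5m > 1
  message: "Error rate {{ .value }}% (5m avg {{ .avg }}%)"
  alert:
    - notifier: slack
```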
Template variables available in `message`:
| Variable | When | Description |
|---|---|---|
| `.metric` | always | metric name |
| `.value` | always | raw event value |
| `.rule` | always | rule name |
| `.fired_at` | always | RFC3339 timestamp |
| `.host`, `.region`, … | always | any label from the event |
| `.run_id`, `.branch`, `.commit`, `.workflow`, … | `ding run` | auto-detected run-context labels |
| `.exit_code`, `.duration_seconds` | `run.exit` event | synthetic event emitted at end of run |
| `.avg`, `.max`, `.min`, `.sum`, `.count` | windowed only | aggregate result |
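A rule's message can mix aggregates with event labels. An illustrative windowed rule (the `disk_usage` metric and its labels are assumptions):

```yaml
- name: disk_filling
  match:
    metric: disk_usage
    condition: avg(value) over 10m > 90
  # {{ .avg }} is the window aggregate; {{ .host }} comes from the event
  message: "Disk on {{ .host }} averaging {{ .avg }}% ({{ .rule }} at {{ .fired_at }})"
  alert:
    - notifier: stdout
```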
`stdout` and `github_actions` are built in — no config needed. Everything else is declared once and referenced by name.

- `stdout` — one JSON line per alert. Pipe it anywhere.
- `github_actions` — writes `::warning::` annotations to the live log and appends markdown to `$GITHUB_STEP_SUMMARY`. Falls back to stdout outside Actions.
```yaml
notifiers:
  slack:
    type: slack
    url: ${SLACK_WEBHOOK_URL}
  on-call:
    type: discord
    url: ${DISCORD_WEBHOOK_URL}
  custom:
    type: webhook
    url: https://example.com/hook
    max_attempts: 3      # retries on 5xx (default: 3)
    initial_backoff: 1s  # doubles each attempt (default: 1s)
```
Also supported: `type: teams`, `type: pagerduty`, `type: telegram`. Slack, Discord, and Teams auto-surface run-context fields (exit code, duration, branch, commit, workflow, actor), so alerts from a CI job arrive ready to act on.

`${VAR}` expansion works in any string value and fails fast if the variable is unset.
The generic webhook payload:

```json
{
  "rule": "cpu_spike",
  "message": "CPU spike on web-01: 97%",
  "metric": "cpu_usage",
  "value": 97.0,
  "fired_at": "...",
  "host": "web-01"
}
```
4xx responses are dropped. 5xx responses retry with exponential backoff.
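On the receiving end, anything that answers 2xx counts as delivered. A minimal receiver sketch using Python's standard library (the port, path, and handler name are arbitrary):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # alerts seen so far

class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        received.append(json.loads(body))
        self.send_response(200)  # 2xx: DING treats the alert as delivered
        self.end_headers()

    def log_message(self, *_):  # keep the demo quiet
        pass

# Bind to an ephemeral port; call serve_forever() (or run it in a
# thread) to start accepting webhook POSTs.
server = HTTPServer(("127.0.0.1", 0), HookHandler)
```

Answering 5xx here instead would trigger the exponential-backoff retries described above; a 4xx drops the alert.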
- `avg(value) over 5m` works with no database, just memory.
- `web-01` being loud doesn't silence `web-02`.
Benchmarked 2026-03-23 on Apple M3.
Ready-made configs for the platforms you already use. Drop them in, edit the rules, ship.
ding serve mode
For steady-state apps — anything that runs longer than the workload it watches. Starts an HTTP server on `:8080`, accepts events, evaluates rules continuously, and hot-reloads config without restarting.
```sh
ding serve --config ./ding.yaml

# or pipe stdin
your-app | ding serve

# or POST events
curl -X POST http://localhost:8080/ingest \
  -H "Content-Type: application/json" \
  -d '{"metric":"cpu_usage","value":97,"host":"web-01"}'
```
Accepts JSON lines:

```json
{"metric": "cpu_usage", "value": 92.5, "host": "web-01"}
```

or Prometheus text:

```
cpu_usage{host="web-01"} 92.5
```
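Both formats carry the same information. A rough sketch of the mapping from the simple `name{labels} value` Prometheus form to the JSON event shape (no label escaping, timestamps, or comments handled):

```python
import re

def prom_to_event(line):
    """Map `cpu_usage{host="web-01"} 92.5` to the JSON event shape.

    Simplified sketch: handles only `name{labels} value`.
    """
    m = re.match(r'(\w+)(?:\{([^}]*)\})?\s+(\S+)$', line.strip())
    name, labels, value = m.groups()
    event = {"metric": name, "value": float(value)}
    # each `key="val"` label pair becomes a top-level event field
    for key, val in re.findall(r'(\w+)="([^"]*)"', labels or ""):
        event[key] = val
    return event
```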
| Method | Path | Description |
|---|---|---|
| `POST` | `/ingest` | Send events |
| `GET` | `/health` | Liveness probe |
| `GET` | `/rules` | List rules + cooldown state |
| `POST` | `/reload` | Hot-reload config |
| `GET` | `/metrics` | Prometheus-format self-metrics |
Reload config without restarting:

```sh
kill -HUP <pid>
# or
curl -X POST http://localhost:8080/reload
```
Survive restarts — persist cooldown state and windowed buffers to disk:

```yaml
persistence:
  state_file: /var/lib/ding/state.json
  flush_interval: 30s
```
On SIGTERM / SIGINT, DING drains in-flight requests, flushes state, and exits 0.
Homebrew:

```sh
brew install ding-labs/tap/ding
```

Binary:

```sh
curl -sf https://start.ding.ing | sh
```

Docker:

```sh
docker run -v ./ding.yaml:/etc/ding/ding.yaml \
  ghcr.io/ding-labs/ding
```
GitHub Actions: ding-labs/ding-action on the marketplace — one step in your workflow.
Kubernetes initContainer (self-copy into a shared volume):

```sh
ding install /shared/ding
```
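A sketch of that pattern in a pod spec; the image tags, volume paths, and job command are assumptions to adapt to your own pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: batch-job
spec:
  volumes:
    - name: shared
      emptyDir: {}
  initContainers:
    # copy the ding binary into the shared volume before the job starts
    - name: install-ding
      image: ghcr.io/ding-labs/ding
      command: ["ding", "install", "/shared/ding"]
      volumeMounts:
        - { name: shared, mountPath: /shared }
  containers:
    # the workload then wraps itself with the copied binary
    - name: job
      image: my-batch-image   # assumption: your own job image
      command: ["/shared/ding", "run", "--", "python", "job.py"]
      volumeMounts:
        - { name: shared, mountPath: /shared }
```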