Prometheus Metrics

Requires the Enterprise Edition.

What it does

Exposes a /metrics endpoint in the standard Prometheus text format. It is the third observability pillar next to JSON Logging (logs) and the Audit Log (events), covering numeric dashboards, SLO timers, and freshness alerts.

Designed to drop straight into any modern monitoring stack:

  • Prometheus / VictoriaMetrics / Grafana Cloud — add a scrape job
  • OpenMetrics-compatible agents (Datadog, New Relic, Dynatrace) — configure an OpenMetrics endpoint
  • Alertmanager — build alerts on the freshness gauges below

Authentication

/metrics uses the same Basic-Auth layer as the rest of the Syncer REST API:

  1. Create a Syncer User for the scraper (Settings → Users → New).
  2. Assign the API role Prometheus metrics scrape (metrics) — or Full rights (all) if the user should be able to reach every API endpoint.
  3. Set a password and use it in the scraper's Basic-Auth config.

HTTPS is required (the same gate that protects the rest of the API); set ALLOW_INSECURE_API_AUTH = True in local_config.py only for non-production debugging.
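As a quick sanity check before wiring up a scraper, the endpoint can be fetched by hand. The hostname and credentials below are placeholders for your own deployment:

```shell
# Basic-Auth is just base64("user:password"); curl's -u flag builds the
# header for you. Placeholder credentials -- substitute your Syncer user.
printf 'prometheus:secret' | base64    # cHJvbWV0aGV1czpzZWNyZXQ=

# Fetch the first lines of the metrics body (-f fails loudly on HTTP errors):
curl -fsS -u prometheus:secret https://syncer.example.com/metrics | head
```

A 401 here means the Syncer user or its `metrics` API role is missing; a certificate error means the HTTPS gate is not satisfied.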

Scrape config

scrape_configs:
  - job_name: cmdbsyncer
    metrics_path: /metrics
    scheme: https
    static_configs:
      - targets: ['syncer.example.com']
    basic_auth:
      username: prometheus            # the Syncer user name
      password_file: /etc/prometheus/cmdbsyncer.pass

Metrics

Info

Name              Labels                      Meaning
cmdbsyncer_info   customer, license_id, exp   Constant 1 carrying license metadata

Cron groups

One time-series per CronGroup:

Name                                                   Meaning
cmdbsyncer_cron_group_enabled                          1 if enabled in UI, else 0
cmdbsyncer_cron_group_running                          1 while a run is in flight
cmdbsyncer_cron_group_failure                          1 if the last completed run failed
cmdbsyncer_cron_group_last_start_timestamp_seconds     Unix timestamp of the last start
cmdbsyncer_cron_group_last_end_timestamp_seconds       Unix timestamp of the last end
cmdbsyncer_cron_group_last_success_timestamp_seconds   Unix timestamp of the last successful end
cmdbsyncer_cron_group_last_duration_seconds            Duration of the last completed run
cmdbsyncer_cron_group_next_run_timestamp_seconds       Unix timestamp when the group is next eligible to run
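The timestamp gauges above compose into derived freshness signals with plain PromQL. A sketch of an optional recording rule (the recorded name `cmdbsyncer:cron_group_age_seconds` is a suggestion, not something the Syncer ships):

```yaml
groups:
  - name: cmdbsyncer-derived
    rules:
      # Seconds since each cron group last finished successfully;
      # the group label from the source series is preserved.
      - record: cmdbsyncer:cron_group_age_seconds
        expr: time() - cmdbsyncer_cron_group_last_success_timestamp_seconds
```

Recording the age once keeps dashboards and alerts on the same number instead of repeating the `time() - …` expression everywhere.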

Hosts

Name                               Meaning
cmdbsyncer_hosts_total             Total host documents (excluding object-mode)
cmdbsyncer_hosts_stale_24h_total   Hosts not seen by any importer in the last 24 h
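A ratio of the two gauges is often more robust for alerting than an absolute count, since the threshold does not need retuning as the inventory grows. A sketch, with the 5% threshold as an arbitrary example:

```yaml
- alert: CmdbSyncerStaleHostRatio
  expr: cmdbsyncer_hosts_stale_24h_total / cmdbsyncer_hosts_total > 0.05
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: 'More than 5% of CMDBsyncer hosts unseen by any importer for 24h'
```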

Self

Name                                         Meaning
cmdbsyncer_metrics_scrape_duration_seconds   Time this scrape spent building the body
cmdbsyncer_scrape_error                      1 if the scrape failed (only emitted then)

Example alerts

Sync hasn't succeeded in 90 minutes — combines last_success_timestamp_seconds with wall clock:

- alert: CmdbSyncerCronStale
  expr: time() - cmdbsyncer_cron_group_last_success_timestamp_seconds > 5400
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: 'CMDBsyncer cron group {{ $labels.group }} has not succeeded for 90 min'

Last run failed:

- alert: CmdbSyncerCronFailure
  expr: cmdbsyncer_cron_group_failure == 1
  for: 1m
  labels:
    severity: error
  annotations:
    summary: 'CMDBsyncer cron group {{ $labels.group }} failed its last run'

Hosts rotting (importer is silent):

- alert: CmdbSyncerStaleHosts
  expr: cmdbsyncer_hosts_stale_24h_total > 10
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: '{{ $value }} hosts have not been seen by any importer in the last 24h'
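The self metrics from above can also guard the metrics pipeline itself. A sketch (alert name and `for` duration are suggestions):

```yaml
- alert: CmdbSyncerScrapeError
  expr: cmdbsyncer_scrape_error == 1
  for: 5m
  labels:
    severity: error
  annotations:
    summary: 'CMDBsyncer /metrics endpoint is reporting scrape errors'
```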

Design notes

  • Metrics are built on each scrape from the Mongo state. There are no in-process counters to persist, which makes the endpoint HA-safe across replicas.
  • Scrape cost scales linearly with the number of CronGroups (one query + one loop). At typical sizes (< 1000 groups) a single scrape is well under 100 ms.
  • Counters for "events since start" (audit, notifications, webhook triggers) are deliberately not exposed here — they depend on in-process state that doesn't survive a restart and would diverge between replicas. Use the Audit Log CSV export or your JSON-log aggregator for that axis.