Prometheus Metrics
Requires the Enterprise Edition.
What it does
Exposes a /metrics endpoint in the standard Prometheus text format.
It is the third observability pillar alongside JSON Logging (logs) and the Audit Log (events), covering numeric dashboards, SLO timers, and freshness alerts.
Designed to drop straight into any modern monitoring stack:
- Prometheus / VictoriaMetrics / Grafana Cloud — add a scrape job
- OpenMetrics-compatible agents (Datadog, New Relic, Dynatrace) — configure an OpenMetrics endpoint (see the sketch after this list)
- Alertmanager — build alerts on the freshness gauges below
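For the OpenMetrics-agent route, a rough sketch of a Datadog Agent check instance could look like the following. Field names differ between agent versions, so treat the keys and the password handling here as assumptions and verify them against the agent's OpenMetrics check documentation; New Relic and Dynatrace have equivalent scraper configs.

```yaml
# conf.d/openmetrics.d/conf.yaml on the Datadog Agent host (sketch only)
instances:
  - openmetrics_endpoint: https://syncer.example.com/metrics
    namespace: cmdbsyncer
    metrics:
      - "cmdbsyncer_.*"
    username: prometheus            # the Syncer user, see Authentication below
    password: "<scraper password>"  # better: pull from your secrets backend
```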
Authentication
/metrics uses the same Basic-Auth layer as the rest of the Syncer
REST API:
- Create a Syncer User for the scraper (Settings → Users → New).
- Assign the API role Prometheus metrics scrape (metrics), or Full rights (all) if the user should be able to reach every API endpoint.
- Set a password and use it in the scraper's Basic-Auth config.
HTTPS is required (the same gate that protects the rest of the API);
set ALLOW_INSECURE_API_AUTH = True in local_config.py only for
non-production debugging.
Scrape config
```yaml
scrape_configs:
  - job_name: cmdbsyncer
    metrics_path: /metrics
    scheme: https
    static_configs:
      - targets: ['syncer.example.com']
    basic_auth:
      username: prometheus                            # the Syncer user name
      password_file: /etc/prometheus/cmdbsyncer.pass
```
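If the Syncer presents a certificate from an internal CA, the job above can point Prometheus at that CA via the standard tls_config block; the CA path is only an example.

```yaml
    # Added under the same job as above.
    tls_config:
      ca_file: /etc/prometheus/internal-ca.pem   # example path to your CA bundle
      # insecure_skip_verify: true               # lab / debugging only
```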
Metrics
Info
| Name | Labels | Meaning |
|---|---|---|
| cmdbsyncer_info | customer, license_id, exp | Constant 1 carrying license metadata |
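Because cmdbsyncer_info is a constant 1, it can be multiplied onto other series to pull the license labels into dashboards. A minimal sketch as a recording rule, assuming a single scraped Syncer target; the rule and group names are examples, and cmdbsyncer_hosts_total is described under Hosts below.

```yaml
groups:
  - name: cmdbsyncer-info
    rules:
      # Copy the customer / license_id labels onto the host count.
      - record: cmdbsyncer:hosts_total:labeled
        expr: cmdbsyncer_hosts_total * on(instance, job) group_left(customer, license_id) cmdbsyncer_info
```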
Cron groups
One time-series per CronGroup:
| Name | Meaning |
|---|---|
| cmdbsyncer_cron_group_enabled | 1 if enabled in UI, else 0 |
| cmdbsyncer_cron_group_running | 1 while a run is in flight |
| cmdbsyncer_cron_group_failure | 1 if the last completed run failed |
| cmdbsyncer_cron_group_last_start_timestamp_seconds | Unix timestamp of the last start |
| cmdbsyncer_cron_group_last_end_timestamp_seconds | Unix timestamp of the last end |
| cmdbsyncer_cron_group_last_success_timestamp_seconds | Unix timestamp of the last successful end |
| cmdbsyncer_cron_group_last_duration_seconds | Duration of the last completed run |
| cmdbsyncer_cron_group_next_run_timestamp_seconds | Unix timestamp when the group is next eligible to run |
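The timestamp gauges combine naturally with time() on dashboards. A small sketch as a recording rule (rule and group names are examples); the first example alert below builds on the same expression.

```yaml
groups:
  - name: cmdbsyncer-cron
    rules:
      # Seconds since each cron group last finished successfully.
      - record: cmdbsyncer:cron_group_success_age_seconds
        expr: time() - cmdbsyncer_cron_group_last_success_timestamp_seconds
```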
Hosts
| Name | Labels | Meaning |
|---|---|---|
| cmdbsyncer_hosts_total | — | Total host documents (excluding object-mode) |
| cmdbsyncer_hosts_stale_24h_total | — | Hosts not seen by any importer in the last 24 h |
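For alerting that is independent of installation size, the stale share is often more useful than the absolute count. A sketch (rule and group names are examples):

```yaml
groups:
  - name: cmdbsyncer-hosts
    rules:
      # Fraction of all hosts no importer has seen in the last 24 h.
      - record: cmdbsyncer:hosts_stale_ratio
        expr: cmdbsyncer_hosts_stale_24h_total / cmdbsyncer_hosts_total
```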
Self
| Name | Meaning |
|---|---|
| cmdbsyncer_metrics_scrape_duration_seconds | Time this scrape spent building the body |
| cmdbsyncer_scrape_error | 1 if the scrape failed (only emitted then) |
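To keep an eye on the exporter's own cost (see the design notes at the end), the build time can be tracked over a window. A sketch (rule and group names are examples):

```yaml
groups:
  - name: cmdbsyncer-self
    rules:
      # Slowest /metrics build observed in the last hour.
      - record: cmdbsyncer:metrics_build_seconds:max_over_time_1h
        expr: max_over_time(cmdbsyncer_metrics_scrape_duration_seconds[1h])
```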
Example alerts
Sync hasn't succeeded in 90 minutes — combines last_success_timestamp_seconds
with wall clock:
```yaml
- alert: CmdbSyncerCronStale
  expr: time() - cmdbsyncer_cron_group_last_success_timestamp_seconds > 5400
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: 'CMDBsyncer cron group {{ $labels.group }} has not succeeded for 90 min'
```
Last run failed:
```yaml
- alert: CmdbSyncerCronFailure
  expr: cmdbsyncer_cron_group_failure == 1
  for: 1m
  labels:
    severity: error
  annotations:
    summary: 'CMDBsyncer cron group {{ $labels.group }} failed its last run'
```
Hosts rotting (importer is silent):
```yaml
- alert: CmdbSyncerStaleHosts
  expr: cmdbsyncer_hosts_stale_24h_total > 10
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: '{{ $value }} hosts have not been seen by any importer in the last 24h'
```
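Scrape problems themselves (the cmdbsyncer_scrape_error gauge from the Self section, plus Prometheus's built-in up series) can be caught the same way. A sketch, assuming the job_name from the scrape config above:

```yaml
- alert: CmdbSyncerScrapeBroken
  expr: cmdbsyncer_scrape_error == 1 or up{job="cmdbsyncer"} == 0
  for: 5m
  labels:
    severity: error
  annotations:
    summary: 'CMDBsyncer metrics endpoint is failing or unreachable'
```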
Design notes
- Metrics are built on scrape from the Mongo state. No in-process counters to persist; HA-safe across replicas.
- Scrape cost scales linearly with the number of CronGroups (one query + one loop). At typical sizes (< 1000 groups) a single scrape is well under 100 ms.
- Counters for "events since start" (audit, notifications, webhook triggers) are deliberately not exposed here — they depend on in-process state that doesn't survive a restart and would diverge between replicas. Use the Audit Log CSV export or your JSON-log aggregator for that axis.