Monitoring


Krenalis collects its own internal service metrics and exposes them under the /metrics path.

You can control whether this endpoint is available by using the KRENALIS_PROMETHEUS_METRICS_ENABLED environment variable or, when configuring Krenalis through AWS Parameter Store, the /prometheus-metrics-enabled parameter.

By default, this setting is false, so the /metrics endpoint is disabled unless you explicitly enable it.

In Docker Compose installations, however, KRENALIS_PROMETHEUS_METRICS_ENABLED is set to true by default, so the /metrics endpoint is enabled automatically in that setup.

Metrics

| Name | Type | Labels | Description |
| --- | --- | --- | --- |
| krenalis_db_acquired_conns | Gauge | | Current number of connections in use by the database connection pool. |
| krenalis_db_max_conns | Gauge | | Configured maximum number of simultaneous database connections. |
| krenalis_db_acquire_duration_seconds_total | Counter | | Cumulative seconds spent acquiring connections from the database pool. |
| krenalis_db_acquire_count_total | Counter | | Total number of successful database connection acquisitions. |
| krenalis_db_new_conns_count_total | Counter | | Total number of newly created database connections (connection churn). |
| krenalis_lambda_errors_total | CounterVec | type | Total number of Lambda errors, categorized by error type. |
| krenalis_lambda_duration_seconds | Histogram | | Duration of successful Lambda executions, in seconds. |
| krenalis_lambda_records_total | Counter | | Total number of input records processed by successful Lambda executions. |
| krenalis_sender_queue_available | GaugeVec | | Number of available events in the event queue. |
| krenalis_sender_queue_wait | HistogramVec | connector, connection | Time spent waiting in the event queue, in seconds. |
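
The database pool metrics are often easier to read as ratios than as raw values. The following Python sketch is illustrative only and not part of Krenalis: it parses a /metrics response in the Prometheus text format and derives pool utilization (krenalis_db_acquired_conns divided by krenalis_db_max_conns) and the average time per acquisition (krenalis_db_acquire_duration_seconds_total divided by krenalis_db_acquire_count_total). The helper names parse_unlabeled_samples and pool_summary are hypothetical, and the values in SAMPLE are made up.

```python
# Illustrative sample of a /metrics response; the values are invented.
SAMPLE = """\
# TYPE krenalis_db_acquired_conns gauge
krenalis_db_acquired_conns 5
krenalis_db_max_conns 20
krenalis_db_acquire_duration_seconds_total 12.5
krenalis_db_acquire_count_total 5000
krenalis_lambda_errors_total{type="network"} 3
"""


def parse_unlabeled_samples(text):
    """Return {metric_name: value} for samples that carry no labels."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments (# HELP / # TYPE), and labeled samples.
        if not line or line.startswith("#") or "{" in line:
            continue
        parts = line.split()
        if len(parts) >= 2:
            # parts[0] is the name, parts[1] the value; an optional
            # timestamp in parts[2] is ignored.
            samples[parts[0]] = float(parts[1])
    return samples


def pool_summary(samples):
    """Derive pool utilization and mean acquisition time from raw samples."""
    acquired = samples["krenalis_db_acquired_conns"]
    max_conns = samples["krenalis_db_max_conns"]
    wait_total = samples["krenalis_db_acquire_duration_seconds_total"]
    acquire_count = samples["krenalis_db_acquire_count_total"]
    return {
        "utilization": acquired / max_conns if max_conns else 0.0,
        "avg_acquire_seconds": wait_total / acquire_count if acquire_count else 0.0,
    }


if __name__ == "__main__":
    print(pool_summary(parse_unlabeled_samples(SAMPLE)))
```

Because both counters accumulate from process start, this ratio is a lifetime average. To see recent behavior, compare two scrapes and divide the difference in duration by the difference in count.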

Go runtime metrics

In addition to the metrics listed above, the /metrics endpoint also exposes standard Go runtime metrics (such as memory, garbage collection, and goroutine statistics) provided by the Prometheus Go client. To see the full list of metrics available in your installation, query the /metrics endpoint directly.

Notes

  • The total number of successful Lambda executions is provided by the metric krenalis_lambda_duration_seconds_count, which is part of the krenalis_lambda_duration_seconds histogram. It represents the total number of observations (i.e., completed executions) recorded in the histogram.
  • Possible values for the type label in krenalis_lambda_errors_total: network, lambda_internal, function_not_found, serialization, and function_exec.
  • Buckets defined for krenalis_lambda_duration_seconds: 0.1, 0.5, 1, 2.5, and 5 (in seconds).
  • Buckets defined for krenalis_sender_queue_wait: 0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.15, 0.2, 0.3, 0.5, 0.75, 1.0, and 2.0 (in seconds).
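
Histogram buckets in Prometheus are cumulative, and the final +Inf bucket equals the _count series. The sketch below shows how a quantile can be estimated from the krenalis_lambda_duration_seconds buckets listed above by interpolating linearly inside the bucket that contains the target rank, similar in spirit to PromQL's histogram_quantile. It is a minimal illustration, not Krenalis code: estimate_quantile is a hypothetical helper, and the counts in EXAMPLE_COUNTS are invented.

```python
import math

# Upper bounds documented for krenalis_lambda_duration_seconds, plus the
# implicit +Inf bucket that every Prometheus histogram exposes.
LAMBDA_BUCKETS = [0.1, 0.5, 1.0, 2.5, 5.0, math.inf]

# Invented cumulative counts, one per bucket (le="0.1", le="0.5", ...).
EXAMPLE_COUNTS = [40, 90, 98, 100, 100, 100]


def estimate_quantile(q, cumulative_counts, bounds=LAMBDA_BUCKETS):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative bucket counts."""
    total = cumulative_counts[-1]
    if total == 0:
        return math.nan  # no observations recorded yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0  # assume the lowest bucket starts at 0
    for bound, count in zip(bounds, cumulative_counts):
        if count >= rank:
            if math.isinf(bound):
                # Cannot interpolate into +Inf; report the highest finite bound.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Linear interpolation within the bucket holding the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    print(estimate_quantile(0.5, EXAMPLE_COUNTS))
    print(estimate_quantile(0.95, EXAMPLE_COUNTS))
```

The estimate is only as fine as the bucket layout: with these bounds, any execution slower than 5 seconds falls into the +Inf bucket, and the function then reports 5 as a lower bound.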

Prometheus

To monitor these metrics with Prometheus, configure it with the following scrape job:

scrape_configs:
  - job_name: 'krenalis'
    static_configs:
      - targets: [ "127.0.0.1:2022" ]
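
Before pointing Prometheus at the target, it can help to confirm that the endpoint responds. The Python sketch below is one way to do that under a few assumptions: Krenalis listens on 127.0.0.1:2022 as in the scrape job above, the endpoint is served over plain HTTP, and the path is /metrics (which is also the Prometheus default for metrics_path). The function names are hypothetical.

```python
from urllib.request import urlopen


def metrics_url(target, path="/metrics", scheme="http"):
    """Build the URL Prometheus would scrape for a static target."""
    return f"{scheme}://{target}{path}"


def metric_names(body):
    """Return the set of metric names found in a text-format response."""
    names = set()
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and # HELP / # TYPE comments
        # The name ends at the first "{" (labels) or at the first space.
        names.add(line.split("{")[0].split()[0])
    return names


def fetch_metric_names(target, timeout=5.0):
    """Fetch /metrics from the target and list the metric names it exposes."""
    with urlopen(metrics_url(target), timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumes a running Krenalis instance with the metrics endpoint enabled.
    for name in sorted(fetch_metric_names("127.0.0.1:2022")):
        print(name)
```

If the request fails, check that the metrics endpoint is enabled and that the host and port match your installation.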

Grafana

After setting up Prometheus to collect the metrics, configure Grafana to display them in dashboards.