Observability
RIOT-X exposes several metrics over a Prometheus endpoint that can be useful for troubleshooting and performance tuning.
Getting Started
The riotx-dist repository includes a Docker Compose configuration that sets up Prometheus and Grafana.
git clone https://github.com/redis/riotx-dist.git
cd riotx-dist
docker compose up
Prometheus is configured to scrape the host every second.
You can access the Grafana dashboard at localhost:3000.
Now start RIOT-X with the following command:
riotx replicate ... --metrics
This enables the Prometheus metrics exporter endpoint. Once Prometheus scrapes it, the Grafana dashboard starts to populate.
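To check that the exporter is up before opening Grafana, you can list the metric names it serves. The following is a minimal sketch, not part of RIOT-X. It assumes the endpoint serves the Prometheus text format at `/metrics` on the default port 8080; the path and the function names `parse_metric_names` and `fetch_metric_names` are illustrative assumptions.

```python
import urllib.request


def parse_metric_names(exposition):
    """Collect metric names from Prometheus text exposition output."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line is `name{labels} value` or `name value`.
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def fetch_metric_names(url="http://localhost:8080/metrics"):
    # Assumption: the exporter answers at /metrics on port 8080.
    # Adjust the URL if --metrics-port or your setup differs.
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_metric_names(response.read().decode("utf-8"))


if __name__ == "__main__":
    for name in sorted(fetch_metric_names()):
        print(name)
```

If the script prints nothing or fails to connect, check that RIOT-X was started with `--metrics` and that the port matches.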
Configuration
Use the --metrics* options to enable and configure metrics:
--metrics
Enable metrics
--metrics-jvm
Enable JVM and system metrics
--metrics-redis
Enable command latency metrics. See https://github.com/redis/lettuce/wiki/Command-Latency-Metrics#micrometer
--metrics-name=<name>
Application name tag that will be applied to all metrics
--metrics-port=<int>
Port that Prometheus HTTP server should listen on (default: 8080)
--metrics-prop=<k=v>
Additional properties to pass to the Prometheus client. See https://prometheus.github.io/client_java/config/config/
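When RIOT-X is launched from a script, the metrics flags can be assembled programmatically. The sketch below only builds the flag list from the options documented above. The helper `metrics_args` is hypothetical, and it assumes that `--metrics-prop` may be given more than once, which this page does not confirm.

```python
def metrics_args(port=8080, name=None, jvm=False, redis=False, props=None):
    """Return the --metrics* flags documented above as a list of arguments."""
    args = ["--metrics", f"--metrics-port={port}"]
    if jvm:
        args.append("--metrics-jvm")
    if redis:
        args.append("--metrics-redis")
    if name is not None:
        args.append(f"--metrics-name={name}")
    # Assumption: one --metrics-prop flag per key=value pair.
    for key, value in (props or {}).items():
        args.append(f"--metrics-prop={key}={value}")
    return args


if __name__ == "__main__":
    # "my-replication" is a placeholder application name.
    print(" ".join(metrics_args(port=9090, name="my-replication", jvm=True)))
```

The resulting list can be appended to the rest of the `riotx` command line, for example when invoking it through `subprocess.run`.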
Metrics
Below you can find a list of all metrics declared by RIOT-X.
Replication Metrics
| Name | Type | Description |
|---|---|---|
| | Counter | Number of bytes replicated (needs memory usage with …) |
| | Summary | Replication end-to-end latency |
| | Summary | Replication read latency |
| | Timer | Batch writing duration |
| | Timer | Item processing duration |
| | Timer | Item reading duration |
| | Timer | Active jobs |
| | Counter | Job launch count |
| | Gauge | Gauge reflecting the remaining capacity of the queue |
| | Gauge | Gauge reflecting the size (depth) of the queue |
| | Counter | Number of keys scanned |
| | Timer | Operation execution duration |
| | Gauge | Gauge reflecting the chunk size of the reader |
| | Gauge | Gauge reflecting the remaining capacity of the queue |
| | Gauge | Gauge reflecting the size (depth) of the queue |
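The two queue gauges are most useful when read together: size divided by size plus remaining capacity gives the fraction of the queue in use. A queue that stays near full suggests the writer is the bottleneck, while one that stays near empty suggests the reader is. The following is a minimal sketch of that calculation; the function and parameter names are illustrative, not RIOT-X identifiers.

```python
def queue_utilization(size, remaining_capacity):
    """Fraction of the queue in use, from the size and remaining-capacity gauges."""
    total = size + remaining_capacity
    # Guard against a queue that reports no capacity at all.
    if total <= 0:
        return 0.0
    return size / total


if __name__ == "__main__":
    # Sample gauge readings (made-up values for illustration).
    print(queue_utilization(size=25, remaining_capacity=75))
```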
JVM Metrics
Use the --metrics-jvm option to enable the following additional metrics:
| Name | Type | Description |
|---|---|---|
| | Gauge | An estimate of the number of buffers in the pool |
| | Gauge | An estimate of the memory that the Java virtual machine is using for this buffer pool |
| | Gauge | An estimate of the total capacity of the buffers in this pool |
| | Timer | Time spent in concurrent phase |
| | Gauge | Size of long-lived heap memory pool after reclamation |
| | Gauge | Max size of long-lived heap memory pool |
| | Gauge | Incremented for an increase in the size of the (young) heap memory pool after one GC to before the next |
| | Counter | Count of positive increases in the size of the old generation memory pool before GC to after GC |
| | Timer | Time spent in GC pause |
| | Gauge | The amount of memory in bytes that is committed for the Java virtual machine to use |
| | Gauge | The maximum amount of memory in bytes that can be used for memory management |
| | Gauge | The amount of used memory |
| | Gauge | The current number of live daemon threads |
| | Gauge | The current number of live threads including both daemon and non-daemon threads |
| | Gauge | The peak live thread count since the Java virtual machine started or peak was reset |
| | Counter | The total number of application threads started in the JVM |
| | Gauge | The current number of threads |
| | Counter | The "cpu time" used by the Java Virtual Machine process |
| | Gauge | The "recent cpu usage" for the Java Virtual Machine process |
| | Gauge | Start time of the process since unix epoch. |
| | Gauge | The uptime of the Java virtual machine |
| | Gauge | The number of processors available to the Java virtual machine |
| | Gauge | The "recent cpu usage" of the system the application is running in |
| | Gauge | The sum of the number of runnable entities queued to available processors and the number of runnable entities running on the available processors averaged over a period of time |
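A common use of the memory gauges is to compute how full a memory area is by dividing used memory by maximum memory. The JVM reports a maximum of -1 when no limit is defined, so the ratio has to be guarded. The following is a minimal sketch with illustrative names, not RIOT-X code.

```python
def memory_usage_ratio(used_bytes, max_bytes):
    """Used memory as a fraction of the maximum, or None if the maximum is undefined."""
    # The JVM reports -1 for memory areas without a defined maximum.
    if max_bytes <= 0:
        return None
    return used_bytes / max_bytes


if __name__ == "__main__":
    # Sample gauge readings in bytes (made-up values for illustration).
    print(memory_usage_ratio(used_bytes=512 * 1024**2, max_bytes=2 * 1024**3))
```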