Remote-write rule forwarding
Grafana Enterprise Metrics (GEM) can forward metrics evaluated by the Ruler to any Prometheus remote-write compatible backend.
This works by loading rule groups into the Ruler with an extra configuration field, as shown in the following example:
# A regular Grafana Mimir rule group
groups:
  - name: group_one
    interval: 5m
    rules:
      - expr: 'rate(prometheus_remote_storage_samples_in_total[5m])'
        record: 'prometheus_remote_storage_samples_in_total:rate5m'
  - name: group_two
    interval: 1m
    rules:
      - expr: 'rate(prometheus_remote_storage_samples_in_total[1m])'
        record: 'prometheus_remote_storage_samples_in_total:rate1m'
    remote_write:
      - url: 'http://user:pass@example.com/api/v1/push'
In the above example, when group_two is loaded into Grafana Enterprise Metrics, the Ruler module evaluates the expression rate(prometheus_remote_storage_samples_in_total[1m]) every 1m and forwards the resulting metric, prometheus_remote_storage_samples_in_total:rate1m, to example.com. Meanwhile, group_one continues to work as expected: its evaluated metric, prometheus_remote_storage_samples_in_total:rate5m, is stored within the same GEM tenant that is running the Ruler.
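To confirm that forwarding works, you can query the receiving backend for the forwarded series. The following is a minimal sketch that assumes the backend behind example.com exposes a Prometheus-compatible query API at /api/v1/query and accepts the same user:pass credentials for reads; adjust the path and credentials for your backend:
# Query the remote backend for the forwarded series (hypothetical endpoint and credentials).
curl -G -u user:pass 'http://example.com/api/v1/query' \
  --data-urlencode 'query=prometheus_remote_storage_samples_in_total:rate1m'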
Configuration
Rule Storage
Remote-write rules are compatible with the following rule storage backends:
- Azure Blob Storage
- GCS
- S3
- Swift
The following backends are not supported:
- local filesystem
- ConfigDB
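For example, to use GCS as the rule storage backend, the configuration follows the same shape as the S3 example later in this document. The following is a minimal sketch that assumes the Cortex-style GCS client options used by GEM and a hypothetical bucket name:
ruler:
  storage:
    type: gcs
    gcs:
      # Hypothetical bucket name; replace with your own.
      bucket_name: gem-ruler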
Write-ahead log (WAL)
When a rule group is configured with a remote-write config, GEM buffers the generated metrics in a write-ahead log (WAL) before forwarding them to the remote-write endpoint. This increases reliability in case either GEM or the remote endpoint crashes. If GEM crashes, it reads from the WAL and continues to forward metrics to the configured backend from the last sent timestamp. If the remote endpoint crashes, GEM continues to retry requests until the endpoint is available again. If multiple rule groups are configured to send to the same remote-write endpoint, GEM uses a common WAL for the metrics generated by those rule groups.
The WAL is truncated at the interval specified by the ruler.remote-write.wal-truncate-frequency setting. WAL entries older than the time specified by the ruler.remote-write.max-wal-time setting are removed; WAL entries younger than ruler.remote-write.min-wal-time are not removed.
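The same settings can also be supplied on the command line. The following sketch assumes that the flag names match the dotted setting names above, each with a leading dash, and reuses the values from the YAML examples in this document:
-ruler.remote-write.wal-truncate-frequency=1h
-ruler.remote-write.max-wal-time=5h
-ruler.remote-write.min-wal-time=1h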
By default, the WAL is stored in the wal folder within the working directory of the GEM binary.
$ ls
enterprise-metrics-binary wal/
The directory can be configured as shown:
ruler:
  remote_write:
    enabled: true
    wal_dir: /tmp/wal
    min_wal_time: 1h
    max_wal_time: 5h
    wal_truncate_frequency: 1h
Example
The following is a complete example of the previously mentioned configuration options, using a Ruler with sharding enabled and S3 as its rule storage backend:
ruler:
  external_url: localhost:9090
  rule_path: "/tmp/rules"
  storage:
    type: s3
    s3:
      endpoint: minio:9000
      access_key_id: cortex
      secret_access_key: supersecret
      bucketnames: "gem-ruler"
      insecure: true
      s3forcepathstyle: true
  poll_interval: 10s
  enable_api: true
  enable_sharding: true
  ring:
    kvstore:
      store: memberlist
  remote_write:
    enabled: true
    wal_dir: /tmp/wal
    min_wal_time: 1h
    max_wal_time: 5h
    wal_truncate_frequency: 1h
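To apply this configuration, start GEM with the configuration file. The following invocation is a sketch: it assumes the snippet above is saved as config.yaml and that the binary is named enterprise-metrics; any other required options, such as licensing, are omitted:
# Start GEM with the example configuration (hypothetical file and binary names).
./enterprise-metrics -config.file=config.yaml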
Loading remote-write groups
The mimirtool command-line tool is compatible with Prometheus rule files that contain the remote-write rule group syntax. You can download and use the latest version of mimirtool from the Grafana Mimir releases.
You can also use the mimirtool Docker image: docker pull grafana/mimirtool:latest
Example usage
Once GEM is running with the Ruler's remote write enabled, you can load remote-write rule groups using the following procedure.
- Save the following file to your workspace as rules.yaml:
groups:
  - name: remote_write_group
    interval: 5m
    rules:
      - expr: "sum(up)"
        record: "sum_up"
    remote_write:
      - url: "http://user:pass@example.com/api/v1/push"
- Run the following command with mimirtool:
mimirtool rules sync \
  --rule-files=rules.yaml \
  --id=<tenant-name> \
  --address=<gem-url> \
  --key=<valid-gem-write-token>
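To verify that the rule group was loaded, you can list the rule groups for the tenant. This sketch reuses the tenant, address, and token values from the sync command and assumes the token also has read access to the rules API:
# List the rule groups currently loaded for the tenant.
mimirtool rules list \
  --id=<tenant-name> \
  --address=<gem-url> \
  --key=<valid-gem-write-token>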