Grafana Mimir version 2.14 release notes
Grafana Labs is excited to announce version 2.14 of Grafana Mimir.
The highlights that follow include the top features, enhancements, and bug fixes in this release. For the complete list of changes, refer to the CHANGELOG.
Features and enhancements
The streaming of chunks from store-gateways to queriers is now enabled by default. This reduces the memory usage in queriers. This was an experimental feature since Mimir 2.10, and is now considered stable.
Compactor adds a new cortex_compactor_disk_out_of_space_errors_total
counter metric that tracks how many times a compaction fails
due to the compactor being out of disk.
The distributor now replies with the Retry-After
header on retryable errors by default.
This protects Mimir from clients, including Prometheus, that default to retrying very quickly, making recovering from an outage easier.
The feature was originally added as experimental in Mimir 2.11.
Incoming OTLP requests were previously size-limited with the distributor’s -distributor.max-recv-msg-size
configuration.
The distributor has a new -distributor.max-otlp-request-size
configuration for limiting OTLP requests. The default value is 100 MiB.
Ingesters can be marked as read-only as part of their downscaling procedure. The new prepare-instance-ring-downscale
endpoint updates the read-only
status of an ingester in the ring.
Important changes
In Grafana Mimir 2.14, the following behavior has changed:
When running a remote read request, the querier honors the time range specified in the read hints.
The default inactivity timeout of active series in ingesters, controlled by the -ingester.active-series-metrics-idle-timeout
configuration,
is increased from 10m
to 20m
.
The following featues of store-gateway are changed: -blocks-storage.bucket-store.max-concurrent-queue-timeout
is set to five seconds;
-blocks-storage.bucket-store.index-header.lazy-loading-concurrency-queue-timeout
is set to five seconds;
-blocks-storage.bucket-store.max-concurrent
is set to 200;
The experimental support for Redis caching is now deprecated and set to be removed in the next major release. Users are encouraged to switch to use Memcached.
The following deprecated configuration options were removed in this release:
- The
-ingester.return-only-grpc-errors
option in the ingester - The
-ingester.client.circuit-breaker.*
options in the ingester - The
-ingester.limit-inflight-requests-using-grpc-method-limiter
option in the ingester - The
-ingester.client.report-grpc-codes-in-instrumentation-label-enabled
option in the distributor and ruler - The
-distributor.limit-inflight-requests-using-grpc-method-limiter
option in the distributor - The
-distributor.enable-otlp-metadata-storage
option in the distributor - The
-ruler.drain-notification-queue-on-shutdown
option in the ruler - The
-querier.max-query-into-future
option in the querier - The
-querier.prefer-streaming-chunks-from-store-gateways
option in the querier and the store-gateway - The
-query-scheduler.use-multi-algorithm-query-queue
option in the querier-scheduler - The YAML configuration
frontend.align_queries_with_step
in the query-frontend
Experimental features
Grafana Mimir 2.14 includes some features that are experimental and disabled by default. Use these features with caution and report any issues that you encounter:
The ingester added an experimental -ingester.ignore-ooo-exemplars
configuration. When set, out-of-order exemplars are no longer reported
to the remote write client.
The querier supports the experimental limitk()
and limit_ratio()
PromQL functions. This feature is disabled by default,
but you can enable it with the -querier.promql-experimental-functions-enabled=true
setting in the query-frontend and the querier.
Bug fixes
- Alertmanager: fix configuration validation gap around unreferenced templates.
- Alertmanager: fix goroutine leak when stored configuration fails to apply and there is no existing tenant alertmanager.
- Alertmanager: fix receiver firewall to detect
0.0.0.0
and IPv6 interface-local multicast address as local addresses. - Alertmanager: fix per-tenant silence limits not reloaded during runtime.
- Alertmanager: fix bugs in silences that could cause an existing silence to expire/be deleted when updating the silence fails. This could happen when the updated silence was invalid or exceeded limits.
- Alertmanager: fix help message for utf-8-strict-mode.
- Compactor: fix a race condition between different compactor replicas that may cause a deleted block to be referenced as non-deleted in the bucket index.
- Configuration: multi-line environment variables are flattened during injection to be compatible with YAML syntax.
- HA Tracker: store correct timestamp for the last-received request from the elected replica.
- Ingester: fix the sporadic
not found
error causing an internal server error if label names are queried with matchers during head compaction. - Ingester, store-gateway: fix case insensitive regular expressions not correctly matching some Unicode characters.
- Ingester: fixed timestamp reported in the “the sample has been rejected because its timestamp is too old” error when the write request contains only histograms.
- Query-frontend: fix
-querier.max-query-lookback
and-compactor.blocks-retention-period
enforcement in query-frontend when one of the two is not set. - Query-frontend: “query stats” log includes the actual
status_code
when the request fails due to an error occurring in the query-frontend itself. - Query-frontend: ensure that internal errors result in an HTTP 500 response code instead of a 422 response code.
- Query-frontend: return annotations generated during evaluation of sharded queries.
- Query-scheduler: fix a panic in request queueing.
- Querier: fix the issue where “context canceled” is logged for trace spans for requests to store-gateways that return no series when chunks streaming is enabled.
- Querier: fix issue where queries can return incorrect results if a single store-gateway returns overlapping chunks for a series.
- Querier: do not return
grpc: the client connection is closing
errors as HTTP499
. - Querier: fix issue where some native histogram-related warnings were not emitted when
rate()
was used over native histograms. - Querier: fix invalid query results when multiple chunks are merged.
- Querier: support optional start and end times on
/prometheus/api/v1/labels
,/prometheus/api/v1/label/<label>/values
, and/prometheus/api/v1/series
whenmax_query_into_future: 0
. - Querier: fix issue where both recently compacted blocks and their source blocks can be skipped during querying if store-gateways are restarting.
- Ruler: add support for draining any outstanding alert notifications before shutting down. Enable this setting with the
-ruler.drain-notification-queue-on-shutdown=true
CLI flag. - Store-gateway: fixed a case where, on a quick subsequent restart, the previous lazy-loaded index header snapshot was overwritten by a partially loaded one.
- Store-gateway: store sparse index headers atomically to disk.
- Ruler: map invalid org-id errors to the 400 status code.
Helm chart improvements
The Grafana Mimir and Grafana Enterprise Metrics Helm charts are released independently. Refer to the Grafana Mimir Helm chart documentation.