Version 1.6 release notes
The Grafana Enterprise Metrics (GEM) team is excited to announce the release of GEM 1.6.
Features and enhancements
- Two new API endpoints help you to explore the metrics cardinality of tenants within a cluster. These endpoints make it possible to identify the highest cardinality metrics in a tenant, as well as the label names with the highest value count. The GEM plug-in (starting with version 3.3.0) includes a set of cardinality dashboards built on these endpoints, which allows users to explore the cardinality data visually. For more information, see Cardinality management.
- Experimental support for query sharding was added. Query sharding splits a single query into smaller queries, each covering a subset of the series, and runs them in parallel. This improves the performance of high-cardinality queries that touch many series. This feature is under active development, so use it with caution.
- Experimental support was added for a new compaction strategy known as split and merge compaction. This strategy enables users to parallelize the compaction of a large tenant horizontally across multiple machines. It also enables users to surpass the 64GiB block index limit inherent in Prometheus, and raise the ceiling on how many time series a single tenant can support. This feature is under development; use with caution.
- The term instance, which had been used as a synonym for tenant, was eliminated. This allows GEM to more closely align with terminology used in Mimir. Previously, instance had multiple meanings besides tenant, such as an instance of an ingester.
- All references to “instance” in the GEM documentation and GEM Grafana plug-in have been replaced with the word “tenant”.
- We’ve added new v2 API endpoints that specify “tenant” in the request and response bodies. The original endpoints, which use the word “instance”, are deprecated and will eventually be removed. For more detail, see the admin-api page.
- GEM 1.6 renames the “default” auth option to “trust”. This reflects the fact that the “trust” option does not perform any authentication; it simply trusts the value of the HTTP header X-Scope-OrgID to determine which tenant to send a request to. We strongly recommend that GEM users use auth type “enterprise”, not “trust”.
- We’ve made some efficiency improvements, including reduced CPU and memory utilization on the ingesters and for queries, as well as when handling large volumes of out-of-order samples.
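The idea behind query sharding can be illustrated with a minimal sketch. This is not GEM's implementation: the function names, the hash-based assignment of series to shards, and the sample data are all assumptions made for illustration. It shows a sum over many series being split into per-shard partial sums that run in parallel and are then merged.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def shard_of(labels: dict, num_shards: int) -> int:
    """Assign a series to a shard by hashing its sorted label set (illustrative choice)."""
    key = ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    return crc32(key.encode()) % num_shards


def sharded_sum(series, num_shards: int = 4):
    """Sum sample values by evaluating each shard separately, then merging partials."""
    shards = [[] for _ in range(num_shards)]
    for labels, value in series:
        shards[shard_of(labels, num_shards)].append(value)

    # Each shard touches only a subset of the series, so shards can run in parallel.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        partials = list(pool.map(sum, shards))

    # Merging works because sum is associative: the sum of partial sums is the total.
    return sum(partials)


# Hypothetical high-cardinality metric: one series per pod.
series = [
    ({"__name__": "http_requests_total", "pod": f"pod-{i}"}, i)
    for i in range(1000)
]
total = sharded_sum(series)
```

The sketch works only because the aggregation can be recombined from partial results; queries that cannot be decomposed this way would not benefit from the same approach.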
Upgrade considerations
- To get the dashboards associated with the Cardinality management feature, upgrade your Grafana GEM administrative plug-in to version 3.3.0.
- To ensure you receive the most up-to-date dashboards, you must disable and then re-enable the plug-in. Therefore, after you install the GEM plug-in version 3.3.0, navigate to its Plugin configuration page, and select Disable plugin. After Grafana reloads, select Enable plugin.
- See the note under Features and enhancements about the auth option being renamed from “default” to “trust”.
- The default maximum number of inflight ingester push requests (-ingester.instance-limits.max-inflight-push-requests) changed to 30000. This change is unlikely to affect the steady-state operation of any GEM user. The limit acts as a safeguard against clusters being overloaded by a spike in requests.
- The option -querier.ingester-streaming has been removed. The querier and ruler now always use the streaming method to query ingesters.
- GEM 1.6 has removed support for block deletion marks migration (-compactor.block-deletion-marks-migration-enabled). If you’re upgrading to GEM 1.6 from a version earlier than GEM 1.1.0, first upgrade to GEM 1.5 and run the migration at least once. Then you can upgrade to 1.6.
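Before upgrading, it can help to scan your existing command-line arguments for the options removed in this release. The following is a minimal sketch, not an official tool: the helper name and the example argument list are hypothetical, and only the two removed option names come from the notes above.

```python
# Options that the notes above list as removed in GEM 1.6.
REMOVED_IN_1_6 = (
    "-querier.ingester-streaming",
    "-compactor.block-deletion-marks-migration-enabled",
)


def find_removed_flags(args):
    """Return the removed options that appear in a list of command-line arguments."""
    found = []
    for arg in args:
        # Normalize "--flag=value" and "-flag=value" to "-flag".
        name = "-" + arg.lstrip("-").split("=", 1)[0]
        if name in REMOVED_IN_1_6:
            found.append(name)
    return found


# Hypothetical argument list, for illustration only.
example_args = [
    "-querier.ingester-streaming=true",
    "-ingester.instance-limits.max-inflight-push-requests=30000",
]
removed = find_removed_flags(example_args)
```

If the returned list is not empty, remove those options from your deployment before starting GEM 1.6.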
Bug fixes
1.6.2
- GEM 1.6.2 fixes a bug that caused tenant limits set via the Admin API to be invalidated when upgrading to GEM 1.6.0 or GEM 1.6.1 from previous versions of GEM.
1.6.1
1.6.1 fixes one important bug in the 1.6.0 release:
- GEM 1.6.1 fixes an issue with the way default tenant limits were applied when setting tenant limits using the Admin API.
1.6.0
- GEM 1.6 changes the way per-tenant limits are stored to avoid breaking changes between version updates.
- GEM 1.6 has a bugfix to ensure that the recording rules executed by the __system__ tenant (part of the self-monitoring feature) are properly split across rulers. This should eliminate out-of-order errors on the __system__ tenant.
- GEM now enforces that a tenant must exist even when a wildcard access policy is used. Writes to non-existent tenants with a wildcard write access policy will fail. Similarly, read requests to a non-existent tenant using a wildcard access policy will fail.
- If a GEM tenant is deleted but the access policies to that GEM tenant are not, GEM will now fail read and write requests to that tenant.
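The tenant-existence behavior described in the 1.6.0 fixes can be sketched as a simple authorization check. This is a simplified model under stated assumptions, not GEM's code: the AccessPolicy type, its fields, the scope strings, and the tenant names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical, simplified access policy."""
    tenants: frozenset  # tenant names, or {"*"} for a wildcard policy
    scopes: frozenset   # illustrative scope names such as "metrics:read"


def is_allowed(policy, tenant, scope, existing_tenants):
    """Allow a request only if the tenant exists and the policy covers it."""
    # The tenant must exist, even when the policy is a wildcard.
    if tenant not in existing_tenants:
        return False
    if scope not in policy.scopes:
        return False
    return "*" in policy.tenants or tenant in policy.tenants


existing = {"team-a"}
wildcard_write = AccessPolicy(frozenset({"*"}), frozenset({"metrics:write"}))
# Policy left behind after its tenant "team-b" was deleted.
stale_read = AccessPolicy(frozenset({"team-b"}), frozenset({"metrics:read"}))

ok = is_allowed(wildcard_write, "team-a", "metrics:write", existing)
missing = is_allowed(wildcard_write, "ghost", "metrics:write", existing)
deleted = is_allowed(stale_read, "team-b", "metrics:read", existing)
```

In this model, the existence check runs first, so neither a wildcard policy nor a leftover policy for a deleted tenant can authorize a request.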