How Asserts processes data
This topic describes what happens after you connect your observability data stores such as Prometheus and CloudWatch to Asserts.
Discovery
Asserts inspects labels to identify entities and populate their properties. It also establishes relationships between these entities by comparing their properties or using specified metrics to establish direct connections. This allows Asserts to determine which pod is hosted on which node, which pods form a Service, and how services interact with each other.
Asserts builds a knowledge graph of all entities, properties, and relationships, which represents its understanding of the system. The graph is indexed so that it can be searched. The discovery process continually updates the graph while maintaining a record of its historical changes.
Normalization
In the next phase, Asserts uses a curated collection of rules to normalize the incoming heterogeneous time series data. This process converts the data into a cohesive set of essential metrics, such as Request, Error, Duration (RED) metrics for application components, and utilization metrics for infrastructure components.
For example, Asserts records the RED metrics from Spring Boot as the Prometheus counters asserts:request:total, asserts:latency:total, and asserts:error:total, using recording rules such as the following:
- record: asserts:request:total
  expr: http_server_requests_seconds_count
  labels:
    asserts_request_type: inbound
    asserts_source: spring_boot
- record: asserts:latency:total
  expr: http_server_requests_seconds_sum
  labels:
    asserts_request_type: inbound
    asserts_source: spring_boot
- record: asserts:error:total
  expr: http_client_requests_seconds_count{status=~"5.."}
  labels:
    asserts_request_type: outbound
    asserts_error_type: server_errors
    asserts_source: spring_boot
Asserts adds labels such as asserts_request_type and asserts_error_type to indicate the level of granularity for further processing during instrumentation.
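Once metrics from different frameworks share the same names, one query can cover all of them. The following is a minimal sketch, not something Asserts ships: the rule group name, the example:* rule names, the 5-minute window, and the job label are assumptions made for illustration. It derives a per-job request rate and an average latency from the normalized counters.

```yaml
groups:
  - name: red_examples                      # hypothetical group name
    rules:
      # Requests per second, averaged over 5 minutes.
      - record: example:request:rate5m      # hypothetical rule name
        expr: sum by (job, asserts_request_type) (rate(asserts:request:total[5m]))

      # Average latency in seconds: total time spent divided by request count.
      - record: example:latency:average5m   # hypothetical rule name
        expr: |
          sum by (job, asserts_request_type) (rate(asserts:latency:total[5m]))
          /
          sum by (job, asserts_request_type) (rate(asserts:request:total[5m]))
```

Because the expressions reference only the normalized names, the same rules would apply unchanged to any framework whose metrics have been mapped to asserts:request:total and asserts:latency:total.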
To capture additional dynamic and contextual information, such as HTTP paths, Asserts applies a Prometheus relabeling rule during data ingestion and stores the result in the asserts_request_context label.
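As an illustration of what such a relabeling rule can look like, the sketch below copies the request path into asserts_request_context at scrape time. It is an assumption-laden example rather than the rule Asserts actually applies: the job name and target are hypothetical, and it assumes the path is exposed in the uri label that Spring Boot's Micrometer metrics normally carry.

```yaml
scrape_configs:
  - job_name: spring-boot-app                 # hypothetical job name
    metrics_path: /actuator/prometheus
    static_configs:
      - targets: ["app.example.internal:8080"]  # hypothetical target
    metric_relabel_configs:
      # Joins the metric name and the uri label with ";" (the default
      # separator). If the whole string matches the regex, the captured
      # uri value is written to asserts_request_context.
      - source_labels: [__name__, uri]
        regex: "http_server_requests_seconds_.*;(.+)"
        target_label: asserts_request_context
        replacement: "$1"
        action: replace
```

Series that do not match the regex, including those without a uri label, pass through unchanged.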
If you have different environments (dev, stage, and prod), each with one or more sites, you can use external labels or relabeling rules to add asserts_env and asserts_site labels that scope the metrics and the entities discovered from them.
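Both approaches can be sketched in a Prometheus configuration. The label values, job name, and target below are hypothetical; only the asserts_env and asserts_site label names come from this topic. In practice you would pick one of the two approaches per Prometheus instance.

```yaml
# Option 1: external labels, set once per Prometheus instance.
# Prometheus attaches these when sending data to external systems,
# for example through remote write or federation.
global:
  external_labels:
    asserts_env: prod          # hypothetical value
    asserts_site: us-west-2    # hypothetical value

# Option 2: relabeling rules, set per scrape job. Useful when one
# Prometheus instance scrapes targets from several environments.
scrape_configs:
  - job_name: checkout-dev                       # hypothetical job name
    static_configs:
      - targets: ["checkout.dev.example.internal:8080"]  # hypothetical target
    relabel_configs:
      # With no source_labels, the default regex matches and the
      # static replacement is written to the target label.
      - target_label: asserts_env
        replacement: dev
      - target_label: asserts_site
        replacement: us-east-1
```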
Assertions
Asserts applies its extensive domain knowledge to instrument these normalized metrics. Asserts automatically instruments application frameworks like Spring Boot, Flask, and LoopBack, infrastructure components like Kubernetes resources, and third-party services like Redis servers, Kafka clusters, and many more.
With instrumentation in place, Asserts forms a SAAFE model to capture events as assertions.
- Saturation indicates whether a resource (CPU, memory, and so on) is saturated
- Amend captures changes in the system, like deployment, scaling, and config map changes
- Anomaly captures abnormal shifts in request rate, latency, or resource consumption
- Failure records failure states in the system, like primary-standby sync failures and pod crash looping
- Error records problematic requests, for example, requests that return 5xx or 4xx status codes or that breach latency thresholds
Assertions are condensed time series data that capture only significant, non-trivial events within the system. They provide insight into the health of the different components of a modern application and reflect the domain knowledge built into Asserts.
Assertions serve a distinct purpose compared to traditional alerts, as they are not designed to notify on-call personnel. Instead, they act as automated vital signs provided by Asserts, readily available for troubleshooting purposes. However, you can subscribe to specific assertions and use them as traditional alerts if desired.
For more information, refer to About the SAAFE model.
Correlation
The Asserts story doesn’t end with automatic instrumentation. When assertions arise, Asserts:
- Attaches them back to the graph and indexes them for search. This way, a single graph search phrase can become a powerful way to navigate both entities and their health status.
- Enriches the assertions with contextual information from the graph. For example, an assertion raised on a pod is tagged back to the node and service the pod belongs to. This way, assertions that happened on ephemeral entities (for example, pods) can bubble up to long-lived entities (for example, nodes and services), thus forming an aggregated view with a continuous timeline.
Because Asserts condenses and contextualizes assertions, they are much faster to query and aggregate and much easier to correlate or rank, which enables quick and precise root cause analysis.