How we're making it easier to use the Loki logging system with AWS Lambda and other short-lived services
There are so many great things that can be said about Loki – I recently wrote about them here. But today, I want to talk about something technical that has been difficult for Loki users, and how we might make it easier: using Loki for short-lived services.
Historically, one of Loki’s blind spots has been ingesting logs from infrastructure you don’t control, because you can’t co-locate a forwarding agent like promtail with your application logs.
We’ve already made a number of strides to work around the issue with plugins, our docker driver, and other tools. In this post I’m going to highlight the next stage in our efforts: how we’ve made promtail, our log forwarding agent, able to act as an intermediate step in processing pipelines.
I’m also going to take a look at a common use case: how promtail can collect logs from AWS Lambda (Amazon’s event-driven, serverless computing platform) and why it has latent complexities.
Ephemeral services and why they’re difficult
Ephemeral services are short-lived, coming in and out of existence depending on a number of factors, like scaling needs and infrastructure availability. These characteristics make them a great fit for stateless applications, but for a few reasons, they can be a bit more complicated for ingestion services like promtail to handle. Maybe they run on infrastructure outside your control. Or maybe they only expose their logs from a third-party service and you can’t tail files. Even if you do control everything, perhaps the services aren’t alive long enough to monitor effectively.
Ephemeral jobs commonly run on cheaper infrastructure, which may disappear unexpectedly and result in your service being rescheduled. We refer to this as preemptible infrastructure. As a result, instead of paying monetary costs, you end up paying complexity ones (building applications to handle rescheduling).
One of these complexity costs is understanding how cardinality and ordering affect each other in Loki.
Ordering vs Cardinality
Loki has what we call an ordering constraint. That is to say that for each log stream (generally analogous to a log file), Loki requires that lines are ingested in order, so that each successive line has a timestamp greater than or equal to the previous one. This is necessary to understand when ingesting data into Loki, but it ends up being particularly important for ephemeral jobs.
As for cardinality, in Loki it represents the number of unique log streams. Streams are composed of a set of Prometheus-style label names and label values tied to a set of logs. These labels are generally topological traits, such as {app="api", env="production", cluster="eu-west2"}, which help narrow down where logs come from.
Let’s take a look at the following set:
{app="frontend", env="dev", cluster="eu-west2"}
{app="frontend", env="prod", cluster="eu-west2"}
{app="backend", env="staging" cluster="eu-west2"}
{app="backend", env="prod", cluster="eu-west2"}
The cardinality for the cluster label is 1, since it only has one unique key/value pair (eu-west2).
The cardinality for the app label is 2, as it has two unique key/value pairs (frontend & backend).
The cardinality for the env label is 3, as it has three unique key/value pairs (dev, staging, & prod).
The total cardinality for this set across all labels is 4: the number of unique collections of key/value label pairs.
There’s already a well-thought-out piece by Ed on how to design your labels, so I won’t delve into this bit any further. But to sum things up, a collection of key/value pairs creates a stream, and all log lines in that stream must be ingested in increasing chronological order.
Case Study: AWS Lambda
Let’s take a look at how cardinality and ordering affect the commonly-used pattern of a scalable service backed by AWS Lambda.
Ephemeral jobs can quite easily run afoul of cardinality best practices. Under high load, an AWS Lambda function might balloon in concurrency, creating many log streams in AWS CloudWatch Logs.
This creates a problem: how should all of those short-lived CloudWatch log streams map onto Loki streams? What’s the fix?
If you add a label for this source, like {invocation="<id>"}, you risk creating cardinality issues, because you’re using a label with an unbounded number of values. Each new value creates a new log stream, and those streams may only contain a few log lines each. This is an anti-pattern in Loki! Remember, we try to maintain a small index; adding a label that’s basically a UUID will hurt in the long run.
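For example, per-invocation labels would produce a stream per Lambda invocation, something like the following (the invocation IDs here are made up purely for illustration):
{app="api", invocation="8a7f13c0"}
{app="api", invocation="2b64d991"}
{app="api", invocation="f03e55a2"}
Every new invocation adds another entry to the index, and each of these streams might only ever hold a handful of lines.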
Our other option is to ignore these labels entirely. Unfortunately, that will likely cause us to run into out-of-order errors, because logs from different invocations would be treated as part of the same stream.
Here’s where using promtail as an intermediary step comes into play.
The promtail basics
We recently added support for what’s called the Push API in promtail. This enables ingesting logs via HTTP or gRPC network requests rather than just by tailing local files.
With that in your tool belt, let’s look at how we can pipeline CloudWatch logs to a set of promtails, which can mitigate the problems in two ways:
1. Using promtail’s Push API along with the use_incoming_timestamp: false config (see the sketch after this list), we let promtail determine the timestamp based on when it ingests the logs, not the timestamp assigned by CloudWatch. Obviously, this means that you’ll lose the origin timestamp because promtail now assigns it, but it’s a relatively small difference in a real-time ingestion system like this.
2. In conjunction with (1), promtail can coalesce logs across CloudWatch log streams, because it’s no longer susceptible to out-of-order errors when combining multiple sources such as different instances of the same Lambda function.
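Here’s a minimal sketch of what that scrape config might look like. The job name and ports are arbitrary choices for this example, and use_incoming_timestamp is spelled out explicitly for clarity; check the promtail docs for your version for the exact defaults and field placement.
scrape_configs:
  - job_name: lambda_push
    loki_push_api:
      server:
        http_listen_port: 3500
        grpc_listen_port: 3600
      # Let promtail assign timestamps at ingestion time instead of
      # trusting the timestamps that arrive with the pushed entries.
      use_incoming_timestamp: false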
One important aspect to keep in mind when you’re running a set of promtails behind a load balancer is that you’re effectively moving the cardinality problem from the number_of_lambdas to the number_of_promtails.
To avoid running into out-of-order errors when each promtail sends data for an otherwise identical label set to Loki, you’ll need to assign a promtail-specific label on each promtail. This can easily be done via a config like --client.external-labels=promtail=${HOSTNAME} passed to promtail.
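If you prefer keeping this in the config file rather than on the command line, promtail’s clients section also accepts an external_labels map. A minimal sketch, assuming your promtail supports the -config.expand-env=true flag so that ${HOSTNAME} gets expanded (otherwise substitute a literal value per instance):
clients:
  - url: http://<loki_address>:3100/loki/api/v1/push
    # Attach a per-promtail label so that replicas behind the load
    # balancer never write to the same Loki stream.
    external_labels:
      promtail: ${HOSTNAME}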
To get logs out of CloudWatch, we’ll take advantage of Loki’s lambda-promtail AWS SAM template, which includes a Lambda function that reads logs from AWS CloudWatch Logs and pushes them onward (in this setup, to our promtails, which then write to Loki). The docs for this can be found here.
Labeling
When using this Lambda forwarder, incoming logs will have three special labels assigned to them, which can be used in relabeling or later stages in a promtail pipeline:
__aws_cloudwatch_log_group: The associated CloudWatch Log Group for this log.
__aws_cloudwatch_log_stream: The associated CloudWatch Log Stream for this log.
__aws_cloudwatch_owner: The AWS ID of the owner of this event.
Our simplistic example will only take advantage of the log group, but they’re all included for more complex use cases.
Configuration
For the sake of brevity, let’s assume that the lambda-promtail Lambda function has already been given the log group(s) to tail and is configured to send to our bank of promtails via the load balancer. Now we’ll configure our set of promtails with the following configuration and run multiple replicas behind a load balancer for availability.
Note: The following should be run in conjunction with a promtail-specific label attached, ideally via a flag argument like --client.external-labels=promtail=${HOSTNAME}. It will receive writes via the Push API on ports 3500 (HTTP) and 3600 (gRPC).
clients:
  - url: http://<loki_address>:3100/loki/api/v1/push

scrape_configs:
  - job_name: push1
    loki_push_api:
      server:
        http_listen_port: 3500
        grpc_listen_port: 3600
      labels:
        # Adds a label on all streams indicating it was processed by the lambda-promtail workflow.
        promtail: 'lambda-promtail'
    relabel_configs:
      # Maps the cloudwatch log group into a label called `log_group` for use in Loki.
      - source_labels: ['__aws_cloudwatch_log_group']
        target_label: 'log_group'
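With this in place, every log line that arrives through the Lambda forwarder carries a log_group label, so you can pull up an application’s logs in Grafana with a selector along the lines of {log_group="/aws/lambda/<your-function>"} (the log group name here is just a placeholder).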
Limitations
The above approach does come with its own set of issues…
Promtail labels
As stated earlier, this workflow moves the worst-case stream cardinality from number_of_log_streams to number_of_log_groups * number_of_promtails, which is the Cartesian product of log groups and promtails. For this reason, each promtail must have a unique label attached to the logs it processes (ideally via something like --client.external-labels=promtail=${HOSTNAME}). It’s also smart to run a small number of promtails behind a load balancer, sized according to your throughput and redundancy needs.
This trade-off is very effective when you have a large number of log streams but want to aggregate them by a lower-cardinality label (such as log group). This is very common in AWS Lambda, where the log group is the “application” and the log streams are the individual application containers that are spun up and torn down on a whim, possibly just for a single function invocation.
Data Persistence
Since promtail batches writes to Loki for performance, it’s possible that promtail will receive a log, return a successful 204 HTTP status code for the write, and then be killed before it forwards that log upstream to Loki. This should be rare, but it’s a downside to this workflow. For availability, run a set of promtails behind a load balancer.
Future plans
I hope this has helped explain the nuances between ordering and cardinality, and why it’s an unfortunate area of complexity. As we move forward, we’re looking at ways to improve this. We want it to be easier and more intuitive to ingest logs into Loki, and possibly obviate the balancing act between ordering and cardinality entirely.
One approach we’re considering – which I’m a fan of – is relaxing the ordering constraints in Loki. Doing so would enable ingesting log lines from the same stream without ensuring timestamp order. We expect that removing the need to understand these nuances would be attractive for Loki adopters.
In terms of the Lambda use case described in this post, that change would completely remove the need for a set of promtails to fan in the Lambda logs. Instead, we’d be able to send all the CloudWatch logs to Loki without worrying about ensuring their order first.
This approach is still just a twinkle in the eyes of a few maintainers at this point, but I have high hopes for it as a way to reduce the operational complexity of running Loki.