Loki v1.4.0 released, with query statistics and up to 300x regex optimization
It has been a little over 2 months since 1.3.0 was released. We started prepping for the 1.4.0 release several weeks ago; however, when I was writing this very blog post for the release, we discovered some confusing stats from the new statistics objects (which we’ll talk about in a bit). After sorting that out, we played the usual game of, “Wait, don’t release yet! I want to include one more thing!” This resets the clock for the amount of time we like to test changes before tagging and releasing them – and so a few more weeks have gone by, but finally here we are!
Loki v1.4.0 brings with it more than 130 PRs from over 40 contributors! I think we say this a lot, but Loki wouldn’t be half what it is today if it weren’t for the help and support provided by the community! These 130 changes cover a pretty big range of features and fixes and also come with some important notes about upgrading, but first let’s start with the fun stuff:
What’s new
Query statistics
1.3.0 introduced the query frontend, which allows for big performance gains via parallelization of queries. But just how big are those gains? We realized we needed better statistics to answer that, and we wanted to expose them via the API so that clients could see just how much data a query processed.
As an example, let me run a large query against Loki using logcli with the --stats flag (also wrapped with time so we can see how long the total query took).
work-pc:~ ewelch$ time logcli query '{container_name="ingress-nginx"} |= "pppppppp" ' --since=1h --stats
https://logs-dev-ops-tools1.grafana.net/loki/api/v1/query_range?direction=BACKWARD&end=1585742819881616000&limit=30&query=%7Bcontainer_name%3D%22ingress-nginx%22%7D+%7C%3D+%22pppppppp%22+&start=1585739219881616000
Ingester.TotalReached 84
Ingester.TotalChunksMatched 7435
Ingester.TotalBatches 0
Ingester.TotalLinesSent 0
Ingester.HeadChunkBytes 26 MB
Ingester.HeadChunkLines 89057
Ingester.DecompressedBytes 2.3 GB
Ingester.DecompressedLines 7599926
Ingester.CompressedBytes 598 MB
Ingester.TotalDuplicates 0
Store.TotalChunksRef 6064
Store.TotalChunksDownloaded 6064
Store.ChunksDownloadTime 14.038946798s
Store.HeadChunkBytes 0 B
Store.HeadChunkLines 0
Store.DecompressedBytes 30 GB
Store.DecompressedLines 97844591
Store.CompressedBytes 7.7 GB
Store.TotalDuplicates 0
Summary.BytesProcessedPerSeconds 6.2 GB
Summary.LinesProcessedPerSeconds 20169102
Summary.TotalBytesProcessed 32 GB
Summary.TotalLinesProcessed 105533574
Summary.ExecTime 5.232437764s
real 0m5.811s
user 0m0.092s
sys 0m0.104s
This query is intentionally crafted so that it will not match or return any log lines (I’m pretty sure our ingress pods don’t have any pppppppp’s in them). This forces Loki to open every chunk in the time range and search every single log line for any that match pppppppp, up to the requested limit.
We can see we processed quite a bit of data! The stats are quite verbose as we have been using them to fine tune all the moving parts in Loki. However, the main stats most people would be interested in are likely the Summary:
Summary.BytesProcessedPerSeconds 6.2 GB
Summary.LinesProcessedPerSeconds 20169102
Summary.TotalBytesProcessed 32 GB
Summary.TotalLinesProcessed 105533574
Summary.ExecTime 5.232437764s
This is the power of parallelization: Our settings in this environment sharded that query request down into 5m blocks, which were all executed in parallel over 12 queriers. At least for me, it’s still impressive to see 20 million log lines and 6.2GB being processed per second. :)
I had to ask about the Ingester.TotalReached value of 84 myself, so I think it’s worth noting: we don’t actually have 84 ingesters running in this environment; we have 7. However, each of the 12 queriers made a request to each of the 7 ingesters for the sharded time range it was responsible for, which adds up to the 84 ingester requests (12 x 7).
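For context, the splitting interval is controlled in the query frontend configuration. The sketch below is only an illustration, under the assumption that your Loki version exposes a query_range block with a split_queries_by_interval setting (check the docs for your release); the value simply mirrors the 5m splits described above rather than being a recommendation:

query_range:
  # Assumed key name; verify against your Loki version's docs.
  split_queries_by_interval: 5m  # shard each query into 5-minute sub-queries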
This information is now returned as additional JSON in the query result object. More information can be found in the docs.
In the intro I mentioned that this results object was the source of the delay in the 1.4.0 release, so let me explain that a little here. Further below in this post is another new feature that allows promtail to add a metric stage which counts the lines of every stream it sends. I was querying that data to compare against the processed-line stats for accuracy, and the numbers were not aligning well at all!
As an example, if I run this query sum(rate(promtail_custom_line_count_total{container_name="ingress-nginx"}[1m]))*60*60 – which takes the per-second logging rate of that container, multiplies it by 60 to get the number of lines in a minute, and by 60 again to get the lines in an hour – the result is 77,309,672.
However, compare that with the stat from above (commas added for readability):
Summary.TotalLinesProcessed 105,533,574
And it shows us processing some 28,223,902 extra lines! This was especially confusing because both the Ingester and Store TotalDuplicates are zero!
I’m happy to say that after an exhaustive investigation, we did not find any bugs in the statistics code. What we did find, though, is an implementation detail around the duplicate counters: we only count duplicate lines after they pass the filter expression, and in my contrived example nothing matches or passes the filter expression. The investigation also uncovered that there were a lot more duplicate logs in the chunk store (a result of replication) than we were expecting. We are still looking into ways we can further reduce duplicates.
Additionally, there is the overhead of overlapping chunks for a request like this: we have many ingress-nginx pods across many clusters, all logging in the same timeframe, which Loki has to read and sort in order to return results in the proper order. This causes some amount of re-processing of chunks. Although it wasn’t as significant as the duplicates in our investigation, it is an area we may be able to optimize further in the future.
Binary operators and numeric literals in LogQL
There were a couple of PRs in this release that expand LogQL to support binary operators and numeric literals. Put more simply, you can now do math on your metric queries like this:
sum(rate({app="foo"}[5m])) * 2
sum(rate({app="foo"}[5m]))/1e6
For more information on what binary operations can be done, check out this PR.
For more information on numeric literals, check out this PR.
Regex optimizations
After staring at a bunch of traces and trying to optimize queries, we found that certain types of regular-expression-based filters performed much more slowly than their string-matching counterparts. An example query:
{app="foo"} |~ "val1|val2"
Testing this regex against each log line to see whether it contains val1 OR val2 was very slow compared to the bytes.Contains we use for a similar filter query that does not use a regex:
{app="foo"} |= "val1"
In fact, for this case we found it significantly faster to replace the regex val1|val2 with an equivalent bytes.Contains("val1") || bytes.Contains("val2").
A lot more details can be found in the PR, but for various use cases we saw benchmark improvements from 5x to over 300x!
The best part is that all of this is implemented completely transparently to the user, and we now have a mechanism for finding and optimizing more of these issues in the future!
Pipe data to promtail and --dry-run
Update 2020/04/07: We discovered shortly after releasing 1.4.0 that the pipe feature was not working on Linux and Windows! 1.4.1 was released to fix this issue!
This is a much-overdue feature for setting up and debugging promtail pipelines: you can now do this to test a single log line:
work-pc:loki ewelch$ echo -n 'level=debug msg="test log (200)"' | promtail -config.file=cmd/promtail/promtail-local-config.yaml --dry-run -log.level=debug 2>&1 | sed 's/^.*stage/stage/g'
level=info ts=2020-03-31T21:20:58.74432Z caller=server.go:147 http=[::]:80 grpc=[::]:9095 msg="server listening on addresses"
level=info ts=2020-03-31T21:20:58.746217Z caller=main.go:67 msg="Starting Promtail" version="(version=step-log-query-94d6cf9-WIP, branch=step-log-query, revision=94d6cf97)"
stage=0 name=regex labels={} time=2020-03-31T17:20:58.743671-04:00 entry="level=debug msg=\"test log (200)\""
stage=1 name=labels labels={} time=2020-03-31T17:20:58.743671-04:00 entry="level=debug msg=\"test log (200)\""
stage=2 name=regex labels="{level=\"debug\"}" time=2020-03-31T17:20:58.743671-04:00 entry="level=debug msg=\"test log (200)\""
stage=3 name=labels labels="{level=\"debug\"}" time=2020-03-31T17:20:58.743671-04:00 entry="level=debug msg=\"test log (200)\""
stage=4 name=regex labels="{level=\"debug\", status_code=\"200\"}" time=2020-03-31T17:20:58.743671-04:00 entry="level=debug msg=\"test log (200)\""
stage=5 name=output labels="{level=\"debug\", status_code=\"200\"}" time=2020-03-31T17:20:58.743671-04:00 entry="level=debug msg=\"test log (200)\""
level=debug ts=2020-03-31T21:20:58.748601Z caller=pipeline.go:97 component=pipeline component=pipeline msg="finished processing log line" labels="{level=\"debug\", status_code=\"200\"}" time=2020-03-31T17:20:58.743671-04:00 entry="test log (200)" duration_s=0.004902197
2020-03-31T17:20:58 {level="debug", status_code="200"} test log (200)
The config file in this example is a stripped-down promtail config to test our pipeline:
clients:
  - url: required but ignored by dry-run
scrape_configs:
  - job_name: test
    pipeline_stages:
      - regex:
          expression: level=(?P<level>\w+)
      - labels:
          level: ""
      - regex:
          expression: \((?P<status_code>\d{3})\)
      - labels:
          status_code: ""
      - regex:
          expression: msg="(?P<message>.*)"
      - output:
          source: message
Notice we enabled debug logging, redirected stderr to stdout, and applied a very simple sed filter to remove some log noise in front of the pipeline stages:
-log.level=debug 2>&1 | sed 's/^.*stage/stage/g'
You can also cat a file into promtail's stdin to process a file. (Or you can use stdout from any process!) --dry-run is also not required; you can run promtail with data piped to it via stdin and have it ship the lines to Loki.
We realize there is still a lot of room for improvement when setting up and debugging pipeline stages, but we hope this helps a little!
Clients retry 429 rate limited failures
This only applies to the clients in the loki repo:
- Promtail
- Docker logging driver
- Fluent Bit
- Fluentd
Now any pushes to Loki that are met with a 429 HTTP response code will be retried; previously they were dropped.
As we continue to improve the limits on Loki to make sure the server side is resistant to misbehaving clients, we debated what exactly the clients should do if Loki tells them to back off for a bit.
Initially we decided to solve this problem by having clients drop any logs that hit a rate limit. However, we have since changed our minds on this, favoring instead to retry these logs with backoffs.
This pushes the handling of failed sends to two places. The first is the retry limits. The defaults in promtail (and thus also the Docker logging driver and Fluent Bit, which share the same underlying code) will retry 429s (and 500s) on an exponential backoff for up to about 8.5 minutes with the default configuration. (This can be changed; see the config docs for more info.)
The second place would be the log file itself. At some point, most log files roll based on size or time. Promtail makes an attempt to read a rolled log file but will only try once. If you are very sensitive to lost logs, give yourself really big log files with size-based rolling rules and increase those retry timeouts. This should protect you from Loki server outages or network issues.
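For reference, here is a rough sketch of where those retry knobs live in the promtail client configuration. The values shown are my understanding of the defaults (roughly 500ms doubling up to a 5m cap over 10 retries, which works out to the ~8.5 minutes mentioned above); treat the exact numbers as an assumption and consult the config docs, and note the URL is just a placeholder:

clients:
  - url: http://loki:3100/loki/api/v1/push
    backoff_config:
      min_period: 500ms   # initial backoff after a 429 or 500 response
      max_period: 5m      # cap on the backoff between retries
      max_retries: 10     # the batch is dropped after this many failed attempts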
Honorable mentions
max_chunk_age
PR 1558 allows configuring a max_chunk_age, which will force-flush any chunk that reaches this age regardless of whether it is idle. This is useful for limiting Loki's memory usage as well as limiting the risk of lost logs in the event of ingester crashes.
Be careful though! Setting this value too low will cause a lot of small chunks to be flushed. Loki hates small chunks (and we are investigating ways to improve this). The default of 1h is probably good for most people, but consider increasing it to 2h if your log streams write slowly.
For example, if we set chunk_target_size to 1.5MB, a log stream would have to log approximately 1500 bytes/sec to fill a 1.5MB chunk in 1h (chunk_target_size targets the compressed chunk size, so this assumes a compression ratio of a few times). Our average ingress-nginx log line is 300 bytes, meaning we would need a rate of at least 5 lines/sec to fill chunks completely with a max_chunk_age of 1h.
Looking over all of our streams, we found that many of them do not write at a rate that would fill a chunk completely, and this is reduced further when labels are added, breaking a log file up into more, smaller streams.
It’s not a huge problem if they aren’t completely filled. Basically what I’m saying is: If you set this value to something like 5m you’re gonna have a bad time.
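For reference, here is a minimal sketch of where these knobs sit in the ingester block of the Loki config; the values are illustrative, echoing the discussion above, not recommendations:

ingester:
  chunk_idle_period: 30m      # flush chunks that have stopped receiving new lines
  chunk_target_size: 1536000  # target (compressed) chunk size in bytes, ~1.5MB
  max_chunk_age: 1h           # new in 1.4.0: force-flush chunks older than this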
query_ingesters_within
PR 1572 adds a flag that tells the queriers to skip querying the ingesters for requests older than a certain period. This is very useful combined with the max_chunk_age parameter above: you can safely set query_ingesters_within to a value greater than max_chunk_age and avoid unnecessary queries to the ingesters for data they will no longer have.
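And the corresponding querier setting, sized to comfortably cover the 1h max_chunk_age from the sketch above (again, the value is illustrative):

querier:
  query_ingesters_within: 2h  # don't ask ingesters about data older than this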
Runtime config
PR 1484 uses the new runtimeconfig package from Cortex, which allows Loki to reload certain configuration values at runtime without being restarted. More info can be found in the docs.
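As a rough illustration of the idea only (the key names and path below are assumptions on my part, so treat the linked docs as the authority): Loki is pointed at a separate file that it re-reads on an interval, and values in that file can change without a restart.

runtime_config:
  file: /etc/loki/runtime.yaml   # hypothetical path to the reloadable config file
  period: 10s                    # how often Loki re-reads the file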
Promtail metric for stream line counter
PR 1678 lets you configure a metrics stage that outputs a counter for every log stream promtail is creating:
- metrics:
    line_count_total:
      config:
        action: inc
        match_all: true
      description: A running counter of all lines with their corresponding labels
      type: Counter
This can help you better understand which applications are logging the most. (The counter above is exposed with a promtail_custom_ prefix; it is the promtail_custom_line_count_total metric queried earlier in this post.)
Loki-canary DDoS fix
PR 1581 fixes a bug in the loki-canary that was responsible for the second DDoS attack the canaries have made on our cluster. :)
Inspect chunks by hand
PR 1577 comes with a very handy tool for opening and inspecting chunk files directly!
Upgrade considerations
Every attempt is made to keep Loki backwards compatible, such that upgrades should be low risk and low friction.
Unfortunately Loki is software, and software is hard, and sometimes things are not as easy as we want them to be.
In version 1.4.0 you may need to take extra steps when upgrading from a previous version. We have created a dedicated page in the docs outlining this in detail.
Mainly, you need to be concerned if you are running a version of Loki that is 1.2.0 or older! (If you are already running 1.3.0, you can ignore this and go directly to 1.4.0.)
Want to learn more about Loki?
Sign up for our Intro to Loki webinar scheduled for April 22 at 9:30am PT. The agenda includes an overview of how Loki works, basic configs and setup to run Loki and test it out, how to use Loki from Grafana, an introduction to querying, and a Q&A with Loki team members Ed Welch and Cyril Tovena.