10 things you didn’t know about LogQL
For this edition of my ongoing Grafana Loki how-to series, I wanted to offer up some helpful — and perhaps surprising — facts about using LogQL, Loki’s query language.
In case you’re new to Grafana Loki, it’s a log aggregation system created in 2018, and the Loki team has worked with the community ever since to introduce new features and make it easier to deploy. LogQL is heavily inspired by PromQL, the language used to query metrics in Prometheus, the popular CNCF Project.
Back in 2020, LogQL received a lot of new extensions allowing it to extract, transform, and filter your logs. In this blog post, we’ll look at 10 features you probably didn’t know about.
1. You can use the unwrapped rate to count sample values per second.
Unwrapped range vector aggregations are a popular way to extract sample values from your log lines. Usually you use them with the sum_over_time or quantile_over_time range aggregation operators. However, rate, which normally counts log lines per second, can also be applied to an unwrapped range vector to compute the per-second rate of a sample value taken from the log line. This is useful if your log line contains a counter value. Be careful to use rate( ... | unwrap) only on values that always increase, never on values that can go up and down.
For example, the query below computes the rate of the order count per namespace:
sum by (namespace) (
rate({cluster="dev-us-central-1"}
| json | unwrap order_count[$__interval])
)
Unlike the rate for log counts, the unwrapped rate accounts for counter resets by extrapolating values, just as Prometheus does for counters.
2. You can count the length of extracted values.
text/template is the template engine that LogQL uses in combination with line_format and label_format. It includes many useful functions that are easy to forget, including len, which returns the length of a value and can be handy when querying your logs.
If you want to find the length of the longest query made in Cortex, try this:
max_over_time(
{namespace="cortex-prod", container="query-frontend"}
| logfmt
| label_format param_query_count=`{{ .param_query | len }}`
| unwrap param_query_count [$__interval]
) by (cluster)
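The computed length behaves like any other label, so it should also be usable in a label filter. Here is a minimal sketch that reuses the labels from the query above; the threshold of 1000 is arbitrary and only there for illustration:

```logql
{namespace="cortex-prod", container="query-frontend"}
  | logfmt
  | label_format param_query_count=`{{ .param_query | len }}`
  | param_query_count > 1000      # keep only lines whose query is longer than 1000
```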
3. There’s a new variable for text/template.
We recently added __line__, a new variable for text/template that outputs the actual log line. This is useful when you want to trim, prepend to, or otherwise mutate the whole log line.
To remove every \ from a log line, you would use:
{name="query-frontend", namespace="loki"} | line_format `{{ Replace __line__ "\\" "" -1 }}`
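__line__ is just as handy for prepending. A small sketch, assuming the lines are in logfmt and carry a level key (that key is my assumption about the log format):

```logql
{name="query-frontend", namespace="loki"}
  | logfmt                                      # extracts level, among others
  | line_format `[{{ .level }}] {{ __line__ }}` # original line, prefixed with its level
```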
4. Multiple parsers are an option.
Parsers in LogQL do not filter out log lines that they fail to parse. Instead, the line is kept and tagged with an __error__ label. That means you can include multiple parsers in the same query when you need to extract data from more than one format.
For example, in Loki we have this log line:
level=debug ts=2022-03-15T10:06:21.308662138Z caller=logging.go:67 traceID=7acfb2b833acb3e2 orgID=29 msg="GET /loki/api/v1/query_range?start=1647337422075409296&end=1647337442075409296&query=%7Bstream%3D%22stdout%22%2Cpod%3D%22loki-canary-x672l%22%7D&limit=1000 (200) 27.806027ms"
If you want to parse the request information (method, status, duration), you can combine the logfmt and pattern parsers as follows:
{name="query-frontend", namespace="loki-ops"}
|= "logging.go" != "gRPC"
| logfmt | line_format `{{.msg}}` | pattern `<method> <_> (<status>) <duration>`
As you can see, the trick is to use line_format to replace the log line with the property you want to parse next.
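The same trick works for other parser combinations. As a rough sketch, assume a hypothetical app that writes JSON logs with a logfmt payload in a message field (both the app label and the message field are made up for illustration):

```logql
{app="gateway"}
  | json                          # extracts message (and other fields) as labels
  | line_format "{{.message}}"    # make the payload the new log line
  | logfmt                        # parse the key=value pairs inside it
```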
5. It’s possible to group with range vector aggregations.
In Prometheus, range vector aggregations do not allow grouping with by and without. In Loki, however, they do, because some queries wouldn’t be expressible otherwise.
For example, if you want the 99th percentile latency per method and status, you can’t wrap the quantile in a sum by (method,status), since latency quantiles can’t be summed. However, you can apply the grouping directly to the quantile, as shown below, to get the right result:
quantile_over_time(0.99,
{name="query-frontend", namespace="loki-ops"}
|= "logging.go" | logfmt | line_format `{{.msg}}`
| pattern `<method> <_> (<status>) <duration>`
| unwrap duration(duration) | __error__=""[$__interval]
) by (method,status)
This is also true for most range vector aggregations, except those whose results can be summed, such as rate and count_over_time.
6. There are bytes and duration label filters.
Once new labels are parsed, they can be used for further filtering with label filters. What’s nice about these filters is that they infer the type from the literal operand of the expression.
A query like the one below, for example, keeps requests whose latency is at least five seconds or whose throughput is lower than 4 gigabytes.
{container="query-frontend"} |= "metrics.go" | logfmt | duration >= 5s or throughput < 4GB
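Label filters can be grouped with parentheses when you mix and and or, which makes the evaluation order explicit. A sketch, assuming the parsed line also carries a status field (an assumption on my part, not something the query above relies on):

```logql
{container="query-frontend"} |= "metrics.go" | logfmt
  | (duration >= 5s or throughput < 4GB) and status = "200"
```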
7. You can reformat the log line.
Sometimes logs on screen can be hard to read. Loki’s line_format can not only help you filter content but also reformat the log line in a more readable way.
I use this query to remove newlines from the query, lay out the values in tab-separated columns (using \t), and limit the .query variable to 100 characters.
{name="query-frontend", namespace="loki"}
|= "metrics.go" | logfmt
| label_format query="{{ Replace .query \"\\n\" \"\" -1 }}"
| line_format "{{ .ts}}\t{{.duration}}\ttraceID = {{.traceID}}\t{{ printf \"%.100s\" .query }}"
8. There’s IP filtering.
If you’re working with network devices or simply logging IPs, this is for you.
We recently added IP filtering in Loki. While you could technically already use regular expressions, the IP filters support more use cases, such as IPv6, CIDR matches, and ranges.
The query below will return logs that do not match a given IP range.
{job_name="myapp"} != ip("192.168.4.5-192.168.4.20")
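The ip() matcher also accepts a single address or a CIDR block, and it can be applied to a parsed label instead of the whole line. Below are two separate sketches. The addresses are placeholders, and remote_addr is a hypothetical label name that assumes your lines are in logfmt:

```logql
{job_name="myapp"} |= ip("192.168.4.0/24")

{job_name="myapp"} | logfmt | remote_addr = ip("2001:db8::1")
```

The first is a line filter that keeps lines containing an address in that block. The second only looks at the remote_addr label.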
9. You can put emoji in your logs.
Loki and Grafana support any UTF-8 character. Combined with line_format, this can help you create nice visualizations.
To prepend a status emoji on all your requests, try this:
{name="query-frontend", namespace="loki-ops"}
|= "logging.go" != "gRPC" | logfmt
| line_format `{{.msg}}`
| pattern `<_> <_> (<status>) <_>`
| line_format `{{ if eq .status "500" }} ❌ {{else}} ✅ {{end}} {{__line__}}`
10. Array values can be accessed in JSON.
By default, the JSON parser will automatically extract all possible properties from the JSON document. However, it skips arrays intentionally. If you want to access this data, you can use another variant of the JSON parser taking an argument.
To extract the first server from a JSON array, you’d use this query:
{app="infrastructure"} | json first_server="servers[0]"
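The parser accepts several expressions at once, and an expression can also reach into nested objects. A sketch, assuming a hypothetical document with a servers array and a request.headers object:

```logql
{app="infrastructure"}
  | json first_server="servers[0]", ua="request.headers[\"User-Agent\"]"
```

When the parser is given expressions like this, only the listed fields are extracted as labels, here first_server and ua.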
Bonus: You can comment in LogQL.
Last but not least, you can use # and /**/ to comment out one or more lines in your query, respectively. This is super useful when troubleshooting a query.
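For instance, here is a small sketch that uses # to switch off one filter temporarily while keeping the rest of the query intact (the labels are borrowed from the earlier examples):

```logql
{name="query-frontend", namespace="loki-ops"}
  |= "logging.go"
  # != "gRPC"
  | logfmt          # parse key=value pairs into labels
```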
I hope you find these tips useful! For more information about LogQL, head to our LogQL documentation page.