Learn Splunk in 10 Days

Day 5: Statistics and Aggregation

What You Will Learn Today

  • The stats command
  • The chart and timechart commands
  • The top and rare commands
  • Statistical functions (count, avg, sum, max, min, dc)
  • eventstats and streamstats

The stats Command

The most fundamental command for aggregating data.

# Count by status code
index=main sourcetype=access_combined
| stats count by status

# Average response time per host
index=main sourcetype=access_combined
| stats avg(response_time) AS avg_response by host

# Multiple aggregations at once
index=main sourcetype=access_combined
| stats count, avg(response_time) AS avg_rt, max(response_time) AS max_rt, min(response_time) AS min_rt by host
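
As a mental model, the grouping that stats performs can be sketched in plain Python. The events below are hypothetical stand-ins for access_combined results:

```python
from collections import defaultdict

# Hypothetical events standing in for access_combined results
events = [
    {"host": "web-01", "response_time": 0.25},
    {"host": "web-01", "response_time": 0.75},
    {"host": "web-02", "response_time": 1.0},
]

# Bucket response times by host, like the `by host` clause
groups = defaultdict(list)
for e in events:
    groups[e["host"]].append(e["response_time"])

# One output row per group, mirroring
# `stats count, avg(...) AS avg_rt, max(...) AS max_rt, min(...) AS min_rt by host`
rows = {
    host: {
        "count": len(rts),
        "avg_rt": sum(rts) / len(rts),
        "max_rt": max(rts),
        "min_rt": min(rts),
    }
    for host, rts in groups.items()
}
print(rows["web-01"])  # → {'count': 2, 'avg_rt': 0.5, 'max_rt': 0.75, 'min_rt': 0.25}
```

Note that, like stats, this produces one row per group and discards the original events.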

Key Statistical Functions

Function | Description | Example
--- | --- | ---
count | Number of events | count
count(field) | Count where field exists | count(user)
dc(field) | Distinct count | dc(clientip)
avg(field) | Average | avg(response_time)
sum(field) | Sum | sum(bytes)
max(field) | Maximum | max(response_time)
min(field) | Minimum | min(response_time)
median(field) | Median | median(response_time)
mode(field) | Most frequent value | mode(status)
stdev(field) | Standard deviation | stdev(response_time)
perc95(field) | 95th percentile | perc95(response_time)
values(field) | List of unique values | values(user)
list(field) | List of values (with duplicates) | list(status)
latest(field) | Most recent value | latest(status)
earliest(field) | Oldest value | earliest(status)
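
A few of these functions are worth pinning down precisely: count(field) and dc(field) skip events where the field is missing. The Python sketch below uses hypothetical field values and a simple nearest-rank rule for the 95th percentile (Splunk's percentile interpolation may differ slightly):

```python
import math

# Hypothetical field values; None marks an event where `user` is missing
data = [0.1, 0.2, 0.2, 0.3, 0.9]                 # response_time values
users = ["alice", "bob", "alice", None, "carol"]  # user values

count_user = sum(1 for u in users if u is not None)  # count(user): only events where the field exists
dc_user = len({u for u in users if u is not None})   # dc(user): distinct count
vals = sorted({u for u in users if u is not None})   # values(user): unique values

# perc95: nearest-rank 95th percentile (Splunk's interpolation may differ slightly)
ranked = sorted(data)
p95 = ranked[math.ceil(0.95 * len(ranked)) - 1]

print(count_user, dc_user, vals, p95)  # → 4 3 ['alice', 'bob', 'carol'] 0.9
```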

Grouping by Multiple Fields

# Group by host and status
index=main sourcetype=access_combined
| stats count by host, status
| sort host, -count

The chart Command

Creates a two-dimensional cross-tabulation.

# Count by status and host
index=main sourcetype=access_combined
| chart count by status, host

Example output:

status | web-01 | web-02 | web-03
--- | --- | --- | ---
200 | 1500 | 1200 | 1800
404 | 50 | 30 | 45
500 | 10 | 15 | 5

# Using over and by
index=main sourcetype=access_combined
| chart count over status by host

# Top 5 hosts only
index=main sourcetype=access_combined
| chart count by host
| sort -count
| head 5

The difference in output shape: stats count by A yields one row per value of A (columns A and count), while chart count by A, B pivots the second field into columns -- one row per value of A, one column per value of B, with a count in each cell.
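
The pivot that chart performs can be approximated with a Counter over (status, host) pairs. The pairs below are hypothetical:

```python
from collections import Counter

# Hypothetical (status, host) pairs standing in for web access events
pairs = [("200", "web-01"), ("200", "web-02"), ("404", "web-01"), ("200", "web-01")]

counts = Counter(pairs)  # count per (status, host) cell
statuses = sorted({s for s, _ in pairs})
hosts = sorted({h for _, h in pairs})

# One row per status, one column per host -- the shape
# `chart count over status by host` produces; missing cells default to 0
table = {s: {h: counts.get((s, h), 0) for h in hosts} for s in statuses}
print(table)  # → {'200': {'web-01': 2, 'web-02': 1}, '404': {'web-01': 1, 'web-02': 0}}
```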

The timechart Command

Aggregates data into time-series buckets for charting.

# Event count per hour
index=main sourcetype=access_combined
| timechart span=1h count

# Time-series count by status
index=main sourcetype=access_combined
| timechart span=1h count by status

# Average response time over time
index=main sourcetype=access_combined
| timechart span=15m avg(response_time) AS avg_response

span Options

span | Interval
--- | ---
span=1m | 1 minute
span=5m | 5 minutes
span=15m | 15 minutes
span=1h | 1 hour
span=1d | 1 day

Key point: timechart always uses _time as the X-axis. That fixed time axis is the main difference from chart, which can pivot over any field.
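Under the hood, span bucketing just snaps each timestamp down to the start of its interval. A minimal Python sketch of that floor operation (the timestamps are hypothetical epoch seconds):

```python
# Snap epoch timestamps to span boundaries, the way `timechart span=1h` buckets _time
span = 3600  # 1 hour in seconds

def bucket(t, span):
    # Floor the timestamp to the start of its span interval
    return t - (t % span)

times = [1700000000, 1700000100, 1700003700]  # hypothetical event times
buckets = [bucket(t, span) for t in times]
print(buckets)  # → [1699999200, 1699999200, 1700002800]
```

The first two events fall into the same hourly bucket; the third lands in the next one.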

chart vs timechart

Comparison | chart | timechart
--- | --- | ---
X-axis | Any field | _time (fixed)
Time bucketing | Manual (e.g. bin _time) | Automatic (span)
Use case | Category analysis | Time-series analysis

The top and rare Commands

top

Displays the most frequent values.

# Top 10 URIs
index=main sourcetype=access_combined
| top limit=10 uri

# Top 5 URIs per host
index=main sourcetype=access_combined
| top limit=5 uri by host

# Without percentage
index=main sourcetype=access_combined
| top limit=10 uri showperc=false

Output fields from top:

Field | Description
--- | ---
count | Number of occurrences
percent | Percentage of total
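
top's ranking is essentially a frequency count plus a percentage of the total. A Python sketch over hypothetical uri values:

```python
from collections import Counter

# Hypothetical uri field values
uris = ["/home", "/home", "/login", "/home", "/api", "/login"]

total = len(uris)
# Counter.most_common gives the same ranking `top limit=3 uri` would;
# percent = count / total * 100, rounded as Splunk displays it
top3 = [
    {"uri": uri, "count": c, "percent": round(c / total * 100, 6)}
    for uri, c in Counter(uris).most_common(3)
]
print(top3)
```

rare is the same idea with the ranking reversed (least frequent first).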

rare

Displays the least frequent values (the inverse of top).

# Least common status codes
index=main sourcetype=access_combined
| rare limit=5 status

The eventstats Command

Works like stats, but appends the aggregated values to each original event instead of replacing them.

# Add the overall average to each event
index=main sourcetype=access_combined
| eventstats avg(response_time) AS overall_avg
| eval is_slow = if(response_time > overall_avg * 2, "Yes", "No")
| where is_slow="Yes"
| table _time, uri, response_time, overall_avg

Where stats replaces the events with one aggregated row per group, eventstats keeps every original event and adds the aggregate (for example, a shared total) as a new field on each one.

Command | Original Events | Aggregated Results
--- | --- | ---
stats | Removed | One row per group
eventstats | Preserved | Appended to each event
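
In Python terms, eventstats computes the aggregate once, then writes it back onto every event so each event can be compared against it. The data below is hypothetical and mirrors the is_slow example above:

```python
# Hypothetical events with a response_time field
events = [
    {"uri": "/home", "response_time": 0.25},
    {"uri": "/search", "response_time": 2.5},
    {"uri": "/about", "response_time": 0.25},
]

# eventstats: compute the aggregate once...
overall_avg = sum(e["response_time"] for e in events) / len(events)

# ...then append it to every original event (events are preserved, not replaced)
for e in events:
    e["overall_avg"] = overall_avg
    e["is_slow"] = "Yes" if e["response_time"] > overall_avg * 2 else "No"

slow = [e for e in events if e["is_slow"] == "Yes"]
print(overall_avg, [e["uri"] for e in slow])  # → 1.0 ['/search']
```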

The streamstats Command

Processes events sequentially and computes running (cumulative) statistics.

# Running count
index=main sourcetype=access_combined
| streamstats count AS running_count
| table _time, uri, running_count

# Moving average over the last 5 events
index=main sourcetype=access_combined
| streamstats avg(response_time) AS moving_avg window=5
| table _time, response_time, moving_avg

# Cumulative error count per host
index=main sourcetype=access_combined status>=400
| streamstats count AS error_count by host
| table _time, host, status, error_count

Parameter | Description
--- | ---
window=N | Compute over the last N events
time_window=span | Compute over a time window
current=false | Exclude the current event from the calculation
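
streamstats semantics can be sketched with a running counter and a fixed-size window; a deque with maxlen behaves like window=5. The values below are hypothetical response times in milliseconds:

```python
from collections import deque

# Hypothetical stream of response times (ms), in time order
stream = [200, 400, 600, 800, 1000, 1200]

window = deque(maxlen=5)  # like `streamstats ... window=5`
rows = []
for i, rt in enumerate(stream, start=1):
    window.append(rt)  # oldest value drops out once the window is full
    rows.append({
        "running_count": i,                       # streamstats count
        "moving_avg": sum(window) / len(window),  # streamstats avg(...) window=5
    })
print(rows[-1])  # → {'running_count': 6, 'moving_avg': 800.0}
```

Early rows average over fewer than 5 values, matching how streamstats warms up before the window fills.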

Hands-On: Performance Analysis Queries

# 1. Overall KPIs
index=main sourcetype=access_combined
| stats
    count AS total_requests,
    dc(clientip) AS unique_visitors,
    avg(response_time) AS avg_response_time,
    perc95(response_time) AS p95_response_time,
    sum(eval(if(status>=400,1,0))) AS error_count
| eval error_rate = round(error_count / total_requests * 100, 2)
| eval avg_response_time = round(avg_response_time, 3)
| eval p95_response_time = round(p95_response_time, 3)

# 2. Hourly traffic
index=main sourcetype=access_combined
| timechart span=1h count AS requests, avg(response_time) AS avg_rt

# 3. Per-endpoint performance
index=main sourcetype=access_combined
| stats count, avg(response_time) AS avg_rt, perc95(response_time) AS p95_rt by uri
| sort -count
| head 20
| eval avg_rt = round(avg_rt, 3)
| eval p95_rt = round(p95_rt, 3)

# 4. Error rate over time
index=main sourcetype=access_combined
| timechart span=15m count(eval(status>=400)) AS errors, count AS total
| eval error_rate = round(errors / total * 100, 2)
| fields _time, error_rate

# 5. Response time distribution (numeric prefixes keep the buckets in order)
index=main sourcetype=access_combined
| eval rt_bucket = case(
    response_time < 0.1, "1: < 100ms",
    response_time < 0.5, "2: 100-500ms",
    response_time < 1.0, "3: 500ms-1s",
    response_time < 5.0, "4: 1-5s",
    1=1, "5: > 5s"
)
| stats count by rt_bucket
| sort rt_bucket
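
SPL's case() evaluates its condition/value pairs top to bottom and returns the value for the first condition that matches, with 1=1 acting as the catch-all. A hypothetical Python equivalent of that first-match chain (labels abbreviated):

```python
# A case()-style chain: the FIRST matching condition wins
def rt_bucket(response_time):
    if response_time < 0.1:
        return "< 100ms"
    if response_time < 0.5:
        return "100-500ms"
    if response_time < 1.0:
        return "500ms-1s"
    if response_time < 5.0:
        return "1-5s"
    return "> 5s"  # catch-all, like 1=1 in case()

samples = [0.05, 0.3, 0.7, 2.0, 9.0]  # hypothetical response times
print([rt_bucket(s) for s in samples])
# → ['< 100ms', '100-500ms', '500ms-1s', '1-5s', '> 5s']
```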

Summary

Concept | Description
--- | ---
stats | Group-by aggregation
chart | Two-dimensional cross-tabulation
timechart | Time-series aggregation
top / rare | Most / least frequent values
eventstats | Append aggregated values to original events
streamstats | Running and moving statistics

Key Takeaways

  1. stats is the most fundamental aggregation command
  2. timechart is essential for time-series analysis
  3. Use eventstats when you need to compare individual events against aggregate values
  4. Use streamstats for moving averages and cumulative totals

Exercises

Exercise 1: Basic

Use stats to display the event count, number of unique hosts, and the latest timestamp for each sourcetype.

Exercise 2: Applied

Use timechart to aggregate request counts by status code in 15-minute intervals, suitable for a time-series chart.

Challenge

Combine eventstats and where to find requests that took more than twice the average response time for their host, then count them by URI.


Next up: In Day 6, you will learn about data visualization -- creating dashboards and panels to present your data visually.