Splunk is one of the most widely used log analysis platforms in the world. At its core is SPL (Search Processing Language). Your ability to leverage Splunk effectively depends heavily on how well you understand SPL.
This article covers SPL fundamentals through practical queries with concrete examples.
What is SPL?
SPL (Search Processing Language) is the language used to search and analyze data indexed by Splunk. It combines concepts from Unix pipelines and SQL, allowing you to intuitively filter, transform, and aggregate data.
flowchart LR
subgraph Pipeline["SPL Pipeline"]
A["Search"] --> B["Filter"]
B --> C["Transform"]
C --> D["Aggregate"]
D --> E["Display"]
end
style Pipeline fill:#3b82f6,color:#fff
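To make the diagram concrete, here is a rough sketch of one query that touches every stage, using the web_logs index and fields that appear throughout this article: the base search pulls and filters events, eval transforms them, stats aggregates, and sort orders the output for display.
# search + filter → transform → aggregate → display
index=web_logs status>=400
| eval response_sec = response_time / 1000
| stats avg(response_sec) as avg_sec by host
| sort -avg_sec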
Key characteristics of SPL:
- Pipeline processing: Chain commands with `|` to process data sequentially
- Time-series optimized: Fast timestamp-based searching
- 140+ commands: Comprehensive coverage for statistics, transformation, and visualization
Basic Syntax
Search Structure
SPL queries start with search terms and chain commands using pipes (|).
index=web_logs status=500 | stats count by uri | sort -count | head 10
This query:
- Searches the `web_logs` index for events where `status=500`
- Counts events grouped by `uri`
- Sorts by count in descending order
- Returns the top 10 results
Search Terms
The first part of your search narrows down the target data.
# Index specification
index=main
# Keyword search
error OR failed
# Field value matching
status=404
host="web-server-01"
# Wildcards
source="/var/log/*.log"
# Negation
NOT status=200
# Time range (relative)
earliest=-24h latest=now
Important: `index=*` searches all indexes and is extremely slow. Always specify a concrete index.
Time Range Specification
Splunk is optimized for time-series data, so time range specification significantly impacts search performance.
# Relative time
earliest=-1h # From 1 hour ago
earliest=-7d@d # From midnight 7 days ago
earliest=@d # From midnight today
# Absolute time (default format: %m/%d/%Y:%H:%M:%S)
earliest="02/01/2026:00:00:00"
latest="02/05/2026:23:59:59"
# Snap operator (@)
earliest=-1d@d # Yesterday at midnight
earliest=-1w@w # Start of last week
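Snap operators pair naturally with latest to isolate a complete prior period. For example (by default, `@w` snaps to Sunday):
# Exactly yesterday, midnight to midnight
earliest=-1d@d latest=@d
# The last complete week
earliest=-1w@w latest=@w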
Five Essential Commands
1. stats - Statistical Aggregation
stats is the most important command in SPL. It groups data and calculates statistics.
# Basic count
index=web_logs | stats count
# Group by field
index=web_logs | stats count by status
# Multiple statistical functions
index=web_logs
| stats count, avg(response_time) as avg_time, max(response_time) as max_time by uri
# Common statistical functions
# count - Number of events
# sum - Total
# avg - Average
# min/max - Minimum/Maximum
# dc - Distinct count
# values - List of unique values
# latest - Most recent value
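As a quick sketch, several of these functions can run in a single stats call; the user_id field here is the same one used in the user behavior examples later in this article:
# Per-host profile: distinct users, observed status codes, most recent URI
index=web_logs
| stats dc(user_id) as unique_users,
        values(status) as status_codes,
        latest(uri) as last_uri
  by host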
2. eval - Field Calculation
eval creates new fields or transforms existing ones.
# Create new field
index=web_logs
| eval response_sec = response_time / 1000
# Conditional logic
index=web_logs
| eval status_category = case(
status < 300, "success",
status < 400, "redirect",
status < 500, "client_error",
true(), "server_error"
)
# String manipulation
index=web_logs
| eval domain = lower(host)
| eval short_uri = substr(uri, 1, 50)
# DateTime operations
index=web_logs
| eval hour = strftime(_time, "%H")
| eval day_of_week = strftime(_time, "%A")
Useful eval functions:
| Function | Description | Example |
|---|---|---|
| `if(cond, true, false)` | Conditional | `if(status=200, "OK", "Error")` |
| `case(cond1, val1, ...)` | Multiple conditions | See above |
| `coalesce(a, b, ...)` | First non-null value | `coalesce(user, "anonymous")` |
| `len(str)` | String length | `len(message)` |
| `replace(str, regex, new)` | Replace matches | `replace(uri, "\d+", "N")` |
| `mvcount(field)` | Multivalue field count | `mvcount(tags)` |
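A short sketch combining two of these: coalesce fills in a missing user field, and replace collapses numeric IDs so that URIs like /api/users/123 and /api/users/456 aggregate under one pattern:
# Normalize missing users and generalize numeric URI segments
index=web_logs
| eval user = coalesce(user, "anonymous")
| eval uri_pattern = replace(uri, "\d+", "N")
| stats count by uri_pattern, user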
3. timechart - Time Series Charts
timechart aggregates data along a time axis, outputting data suitable for graphing.
# Events over time
index=web_logs | timechart count
# Hourly count by status code
index=web_logs | timechart span=1h count by status
# Average response time (5-minute intervals)
index=web_logs | timechart span=5m avg(response_time) as avg_response
# Multiple metrics
index=web_logs
| timechart span=1h count as requests, avg(response_time) as avg_time
flowchart TB
subgraph timechart["How timechart Works"]
A["Raw Logs"] --> B["Split into Time Buckets"]
B --> C["Aggregate per Bucket"]
C --> D["Output Time Series Table"]
end
style timechart fill:#8b5cf6,color:#fff
4. table / fields - Output Field Control
table displays only specified fields in table format.
# Display specific fields only
index=web_logs
| table _time, host, uri, status, response_time
# Remove unwanted fields with fields
index=web_logs
| fields - _raw, _cd, _indextime
# Rename fields
index=web_logs
| rename response_time as "Response Time (ms)", status as "HTTP Status"
| table _time, host, uri, "HTTP Status", "Response Time (ms)"
5. where / search - Filtering
Filter data mid-pipeline.
# where evaluates expressions (works with calculated fields)
index=web_logs
| eval response_sec = response_time / 1000
| where response_sec > 5
# search uses keyword matching
index=web_logs
| stats count by uri, status
| search status=500
# where comparison operators
| where response_time > 1000
| where status >= 400 AND status < 500
| where like(uri, "/api/%")
| where match(user_agent, "(?i)bot")
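One practical difference worth a sketch: where can compare a field against another field or a computed value, while search treats the right-hand side as a literal. The slow_threshold field below is hypothetical, created only for illustration:
# Field-to-field comparison - only where can do this
index=web_logs
| eval slow_threshold = 1000
| where response_time > slow_threshold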
Practical Examples
Example 1: Error Analysis Dashboard
Query set for analyzing web server errors.
# HTTP status code distribution
index=web_logs earliest=-24h
| eval status_group = case(
status < 300, "2xx Success",
status < 400, "3xx Redirect",
status < 500, "4xx Client Error",
true(), "5xx Server Error"
)
| stats count by status_group
| sort status_group
# Top 10 endpoints with most errors
index=web_logs status>=400 earliest=-24h
| stats count as errors by uri
| sort -errors
| head 10
# Hourly error rate
index=web_logs earliest=-24h
| timechart span=1h
count(eval(status>=400)) as errors,
count as total
| eval error_rate = round(errors / total * 100, 2)
| fields _time, errors, total, error_rate
Example 2: Performance Analysis
Response time analysis.
# Percentile analysis
index=web_logs earliest=-1h
| stats
avg(response_time) as avg,
median(response_time) as p50,
perc95(response_time) as p95,
perc99(response_time) as p99,
max(response_time) as max
| eval avg = round(avg, 2)
# Identify slow requests
index=web_logs earliest=-1h
| where response_time > 3000
| table _time, host, uri, response_time, status
| sort -response_time
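Rather than rounding each percentile column with its own eval, the standard foreach command can apply a templated eval across all of them. A sketch of the same percentile query, where `<<FIELD>>` expands to each listed field name:
# Round every statistics column in one pass
index=web_logs earliest=-1h
| stats
    avg(response_time) as avg,
    median(response_time) as p50,
    perc95(response_time) as p95,
    perc99(response_time) as p99,
    max(response_time) as max
| foreach avg p50 p95 p99 max
    [ eval <<FIELD>> = round(<<FIELD>>, 2) ]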
Example 3: User Behavior Analysis
# Daily active users
index=web_logs earliest=-7d
| timechart span=1d dc(user_id) as unique_users
# Per-user session analysis
index=web_logs user_id=* earliest=-24h
| stats
count as page_views,
dc(uri) as unique_pages,
min(_time) as first_access,
max(_time) as last_access
by user_id
| eval session_duration = last_access - first_access
| eval session_minutes = round(session_duration / 60, 1)
| table user_id, page_views, unique_pages, session_minutes
| sort -page_views
Performance Best Practices
Key points for optimizing SPL queries.
1. Narrow the Time Range
# Bad: Searching all time
index=web_logs status=500
# Good: Specify time range
index=web_logs status=500 earliest=-24h latest=now
2. Filter as Early as Possible
# Bad: Filter after aggregation
index=web_logs | stats count by status | search status=500
# Good: Filter at search time
index=web_logs status=500 | stats count
3. Remove Unnecessary Fields with fields
# For logs with many fields
index=web_logs earliest=-1h
| fields _time, host, uri, status, response_time
| stats avg(response_time) by host
4. stats is Faster than transaction
# Slow: transaction
index=web_logs | transaction session_id | stats count
# Fast: stats
index=web_logs | stats count, values(uri) as pages by session_id | stats count
Summary
| Command | Purpose | Example |
|---|---|---|
| `stats` | Statistical aggregation | `stats count, avg(field) by group` |
| `eval` | Field calculation | `eval new_field = field1 + field2` |
| `timechart` | Time series aggregation | `timechart span=1h count by status` |
| `table` | Field selection | `table _time, host, status` |
| `where` | Conditional filter | `where response_time > 1000` |
| `sort` | Sort | `sort -count` (descending) |
| `head` / `tail` | Limit results | `head 10` |
| `rename` | Rename fields | `rename field as "New Name"` |
| `dedup` | Remove duplicates | `dedup host, uri` |
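Of these, dedup is the only command not demonstrated above; a minimal sketch that keeps one event per host and URI combination (the first seen, which is the most recent in default reverse-chronological order):
# One event per host+uri pair
index=web_logs
| dedup host, uri
| table _time, host, uri, status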
SPL runs deep, with over 140 commands, but mastering the five covered in this article (stats, eval, timechart, table, where) will handle the majority of day-to-day use cases.