Day 7: Advanced Search Techniques
What You Will Learn Today
- Subsearches
- The `lookup` command
- The `join` command
- The `transaction` command
- `append` and `appendpipe`
Subsearches
A subsearch lets you embed one search inside another. The inner search runs first, and its results feed into the outer search.
# Show logs from the host with the most errors
index=main
[search index=main status>=500
| stats count by host
| sort -count
| head 1
| fields host]
flowchart TB
Outer["Main Search<br>index=main"]
Inner["Subsearch<br>[search ... | head 1 | fields host]"]
Result["Result<br>Events for host=web-01"]
Inner -->|"host=web-01"| Outer --> Result
style Outer fill:#3b82f6,color:#fff
style Inner fill:#22c55e,color:#fff
style Result fill:#f59e0b,color:#fff
How Subsearches Work
- The search inside `[...]` runs first
- Its results are expanded as OR conditions into the main search
- Example: `[search ... | fields host]` becomes `( host="web-01" )` (see the expansion sketch below)
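To make the expansion concrete, here is a hypothetical case where the subsearch returns two hosts (the host names are illustrative):
# Subsearch that returns the top two error-producing hosts
index=main
    [search index=main status>=500
    | stats count by host
    | sort -count
    | head 2
    | fields host]

# Splunk expands the subsearch results into the outer search as OR conditions:
index=main ( ( host="web-01" ) OR ( host="web-02" ) )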
Practical Examples
# Show all access from IPs that generated many errors in the past hour
index=main sourcetype=access_combined
[search index=main sourcetype=access_combined status>=400 earliest=-1h
| stats count by clientip
| where count > 10
| fields clientip]
| table _time, clientip, uri, status
# Find activity within a specific user's session
index=main sourcetype=app_log
[search index=main sourcetype=auth_log action=login user=alice
| head 1
| fields session_id]
Note: Subsearches have a time limit (60 seconds by default) and a result limit (10,000 results by default). For large datasets, consider using `join` instead.
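Both limits come from the [subsearch] stanza in limits.conf; a sketch of the relevant settings, assuming the documented defaults:
# limits.conf -- subsearch limits (defaults shown; raise with caution)
[subsearch]
maxout = 10000   # maximum number of results a subsearch can return
maxtime = 60     # maximum runtime of a subsearch, in seconds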
The lookup Command
Lookups enrich events with data from external CSV files or KV stores.
Creating a CSV Lookup
- Prepare a CSV file:
status_code,status_description,severity
200,OK,info
301,Moved Permanently,info
400,Bad Request,warning
401,Unauthorized,warning
403,Forbidden,warning
404,Not Found,warning
500,Internal Server Error,critical
502,Bad Gateway,critical
503,Service Unavailable,critical
- Upload to Splunk: Settings > Lookups > Lookup table files > Add new
- Create a lookup definition: Settings > Lookups > Lookup definitions > Add new
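Once the file is uploaded and the definition exists (http_status is the definition name assumed in the examples below), a quick sanity check from the search bar confirms the table loads:
# Verify the lookup contents before using it in a search
| inputlookup http_status.csv
| head 5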
Using the lookup Command
# Add status descriptions to access logs
index=main sourcetype=access_combined
| lookup http_status status_code AS status OUTPUT status_description, severity
| table _time, uri, status, status_description, severity
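A useful variation: OUTPUTNEW only adds the lookup fields when they are not already present on the event, so existing values are never overwritten (a sketch using the same lookup):
# Only fill in status_description / severity where they are missing
index=main sourcetype=access_combined
| lookup http_status status_code AS status OUTPUTNEW status_description, severity
| table _time, uri, status, status_description, severity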
Automatic Lookups
Configure transforms.conf and props.conf to apply lookups automatically at search time.
# transforms.conf
[http_status_lookup]
filename = http_status.csv
# props.conf
[access_combined]
LOOKUP-http_status = http_status_lookup status_code AS status OUTPUT status_description severity
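With the automatic lookup configured, the enriched fields are available on every access_combined event without an explicit lookup command, so you can filter and report on them directly (a sketch assuming the configuration above):
# severity and status_description are added automatically at search time
index=main sourcetype=access_combined severity=critical
| stats count by status, status_description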
inputlookup / outputlookup
# Display the contents of a lookup table
| inputlookup http_status.csv
# Save search results to a lookup
index=main sourcetype=access_combined
| stats count by clientip
| outputlookup ip_activity.csv
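By default, outputlookup overwrites the file on every run; to accumulate results across scheduled runs instead, append=true is the usual option (sketch):
# Append to the existing lookup instead of overwriting it
index=main sourcetype=access_combined earliest=-24h
| stats count by clientip
| outputlookup append=true ip_activity.csv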
The join Command
The join command merges two result sets on a shared field, similar to a SQL join.
# Join auth logs with access logs
index=main sourcetype=access_combined
| join type=left clientip
[search index=main sourcetype=auth_log
| stats latest(user) AS user, latest(action) AS last_action by src_ip
| rename src_ip AS clientip]
| table _time, clientip, user, uri, status
Join Types
| Type | Description |
|---|---|
| `inner` (default) | Only events that match on both sides |
| `left` | All events from the main search plus matching right-side data |
| `outer` | Synonym for `left` in Splunk (not a SQL-style full outer join) |
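For comparison with the left join above, requesting the default inner join explicitly keeps only events whose clientip also appears in the subsearch results (a sketch reusing the same field names):
# Inner join: access events without a matching auth record are dropped
index=main sourcetype=access_combined
| join type=inner clientip
    [search index=main sourcetype=auth_log
    | stats latest(user) AS user by src_ip
    | rename src_ip AS clientip]
| table _time, clientip, user, uri, status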
Note: `join` is memory-intensive and does not scale well with large datasets. Prefer `stats` or `lookup` when possible.
Alternatives to join
# Use stats instead of join
index=main (sourcetype=access_combined OR sourcetype=auth_log)
| stats values(uri) AS uris, values(user) AS users, latest(status) AS status by clientip
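If you only want client IPs that appear in both sourcetypes (the stats equivalent of an inner join), add a null check on the fields contributed by each side (sketch):
# Keep only client IPs seen in both access and auth logs
index=main (sourcetype=access_combined OR sourcetype=auth_log)
| stats values(uri) AS uris, values(user) AS users, latest(status) AS status by clientip
| where isnotnull(uris) AND isnotnull(users)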
The transaction Command
The transaction command groups related events together. It is especially useful for session analysis and workflow tracking.
# Group events by client IP into sessions
index=main sourcetype=access_combined
| transaction clientip maxspan=30m maxpause=5m
| table clientip, duration, eventcount, _time
# Define session boundaries explicitly
index=main sourcetype=app_log
| transaction session_id startswith="login" endswith="logout"
| table session_id, user, duration, eventcount
Transaction Parameters
| Parameter | Description | Example |
|---|---|---|
| `maxspan` | Maximum duration of a transaction | `maxspan=1h` |
| `maxpause` | Maximum gap between events | `maxpause=5m` |
| `startswith` | Condition for the first event | `startswith="login"` |
| `endswith` | Condition for the last event | `endswith="logout"` |
| `maxevents` | Maximum number of events | `maxevents=100` |
Fields Added by transaction
| Field | Description |
|---|---|
| `duration` | Duration of the transaction in seconds |
| `eventcount` | Number of events in the transaction |
| `closed_txn` | Whether the transaction closed normally |
flowchart LR
E1["10:00<br>login"]
E2["10:05<br>browse"]
E3["10:10<br>purchase"]
E4["10:15<br>logout"]
subgraph TX["transaction session_id"]
E1 --> E2 --> E3 --> E4
end
TX -->|"duration=900s<br>eventcount=4"| Result["Result"]
style TX fill:#3b82f6,color:#fff
style Result fill:#22c55e,color:#fff
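The fields above behave like ordinary fields once transaction has run, so they can drive filters and reports; for example, keeping only sessions that closed normally and lasted more than five minutes (a sketch building on the earlier session search):
# Filter on the fields added by transaction
index=main sourcetype=app_log
| transaction session_id startswith="login" endswith="logout"
| where closed_txn=1 AND duration > 300
| table session_id, user, duration, eventcount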
Performance tip: `transaction` is a resource-heavy command. Where possible, use `stats` as a lighter alternative.
# Replicate transaction behavior with stats
index=main sourcetype=access_combined
| stats min(_time) AS start, max(_time) AS end, count AS eventcount, values(uri) AS pages by clientip
| eval duration = end - start
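Since _time values are epoch seconds, the start and end columns from the stats version can be made human-readable with strftime (a small follow-on sketch):
# Same stats-based sessions, with readable start/end times
index=main sourcetype=access_combined
| stats min(_time) AS start, max(_time) AS end, count AS eventcount by clientip
| eval duration = end - start
| eval start = strftime(start, "%Y-%m-%d %H:%M:%S"), end = strftime(end, "%Y-%m-%d %H:%M:%S")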
append and appendpipe
append
The append command appends the results of a second search to the current result set.
# Regular results + a total row
index=main sourcetype=access_combined
| stats count by host
| append
[search index=main sourcetype=access_combined
| stats count
| eval host="TOTAL"]
| sort -count
appendpipe
The appendpipe command runs a secondary aggregation on the current result set and appends it as additional rows.
# Per-host counts + a total row
index=main sourcetype=access_combined
| stats count by host
| appendpipe [stats sum(count) AS count | eval host="TOTAL"]
| sort -count
Recommendation: For adding summary rows, `appendpipe` is simpler and more efficient than `append`.
Hands-On: Security Analysis Queries
# 1. Brute force detection
# Multiple login failures from the same IP in a short period
index=main sourcetype=auth_log action=failed
| stats count by src_ip
| where count > 5
| lookup geo_ip ip AS src_ip OUTPUT country, city
| sort -count
# 2. Unusual access patterns
# Sessions with an abnormally high page count
index=main sourcetype=access_combined
| transaction clientip maxspan=10m
| where eventcount > 50
| table clientip, eventcount, duration
# 3. Error chain analysis
# Inspect events surrounding a specific error
index=main sourcetype=app_log
| transaction host maxspan=5m maxpause=1m
| search "OutOfMemoryError"
| table _time, host, eventcount, _raw
# 4. User behavior tracking
# Track all activity from IPs that received multiple 403s
index=main sourcetype=access_combined
[search index=main sourcetype=access_combined status=403
| stats count by clientip
| where count > 3
| fields clientip]
| transaction clientip maxspan=1h
| table clientip, duration, eventcount, _raw
Summary
| Concept | Description |
|---|---|
| Subsearch | Embed a search inside another with `[search ...]` |
| `lookup` | Enrich events with external data |
| `join` | Merge two result sets on a shared field |
| `transaction` | Group related events into logical units |
| `append` | Add results from a separate search |
| `appendpipe` | Add summary rows from the current result set |
Key Takeaways
- Subsearches are subject to time and result count limits
- `lookup` is the best tool for enriching events with external data
- `join` is memory-intensive -- use it sparingly on large datasets
- `transaction` is powerful but heavy -- prefer `stats` when possible
Exercises
Exercise 1: Basic
Create an HTTP status code CSV lookup and use it to add descriptions to your access logs.
Exercise 2: Applied
Use a subsearch to display all logs from the host that had the most errors in the past hour.
Exercise 3: Challenge
Use transaction to analyze user sessions and calculate average session duration, average page views per session, and bounce rate (percentage of single-page sessions).
Coming up next: In Day 8, you will learn about alerts and reports -- how to automate monitoring with scheduled searches, trigger notifications, and build reusable search macros.