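Day 4: Fields and Filtering

What you'll learn today

Field types and extraction
Creating fields with the eval command
Filtering with the where command
Regular-expression extraction with the rex command
Field aliases and calculated fields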

Field types

Splunk fields fall into three groups: default fields, extracted fields, and calculated fields.

flowchart TB
    subgraph Fields["Field types"]
        Default["Default fields<br>_time, host, source,<br>sourcetype, _raw"]
        Extracted["Extracted fields<br>Automatic: status, user<br>Manual: rex, erex"]
        Calculated["Calculated fields<br>created with eval"]
    end
    style Default fill:#3b82f6,color:#fff
    style Extracted fill:#22c55e,color:#fff
    style Calculated fill:#f59e0b,color:#fff

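Default fields

Field	Description
`_time`	Event timestamp
`_raw`	Raw text of the event
`host`	Host the data came from
`source`	Where the data was read from (a file path, for example)
`sourcetype`	Format or type of the data
`index`	Index the event is stored in
`_indextime`	Time the event was indexed
`linecount`	Number of lines in the event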
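
To look at these fields on real events, a query along the following lines may help. It is only a sketch: it assumes your data lives in `index=main` (as in the other examples in this article), and the names `indexed_at` and `index_lag_sec` are made up for illustration. Fields whose names start with an underscore are internal and are typically not listed in the fields sidebar, so `_indextime` is copied into ordinary fields before display.

```spl
index=main
| head 5
| eval indexed_at = strftime(_indextime, "%Y-%m-%d %H:%M:%S")
| eval index_lag_sec = _indextime - _time
| table _time, indexed_at, index_lag_sec, host, source, sourcetype, index, linecount
```

`index_lag_sec` is the number of seconds between the event's own timestamp and the moment it was indexed, which is a quick way to spot delayed ingestion.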

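Automatically extracted fields

At search time, Splunk automatically extracts fields from key=value pairs in the event text.

2026-01-30 10:00:01 status=200 user=alice duration=0.023

From the log line above, status, user, and duration are extracted as fields automatically.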
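
If you have no suitable data indexed yet, the same extraction can be tried on a synthetic event. The sketch below builds one result with `makeresults`, puts the sample line into `_raw`, and runs the `extract` command (also available as `kv`) to apply key=value extraction explicitly. The event text is the sample from above; nothing here depends on an index.

```spl
| makeresults
| eval _raw = "2026-01-30 10:00:01 status=200 user=alice duration=0.023"
| extract
| table _raw, status, user, duration
```

The resulting table should show status, user, and duration as separate columns next to the raw text.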

The eval command

eval creates new fields or transforms existing ones by evaluating an expression for each event.

Basic usage

index=main
| eval response_time_ms = duration * 1000
| eval status_category = if(status >= 400, "Error", "OK")
| table _time, status, status_category, duration, response_time_ms

eval functions

String functions

| eval lower_user = lower(user)
| eval upper_host = upper(host)
# String concatenation uses the . operator
| eval greeting = "Hello, " . user
| eval domain = replace(email, ".*@", "")
| eval length = len(message)
| eval first3 = substr(uri, 1, 3)

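Function	Description	Example
`lower(x)`	Convert to lowercase	`lower("ABC")` → `"abc"`
`upper(x)`	Convert to uppercase	`upper("abc")` → `"ABC"`
`len(x)`	String length	`len("hello")` → `5`
`substr(x,start,len)`	Substring of `len` characters beginning at the 1-based position `start` (to the end of the string if `len` is omitted)	`substr("hello",1,3)` → `"hel"`
`replace(x,r,n)`	Regular-expression replacement	`replace(ip, "\.\d+$", ".0")`
`trim(x)`	Strip leading and trailing whitespace	`trim(" abc ")` → `"abc"`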
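
The functions in the table can be tried without any indexed data. The following is a minimal sketch using `makeresults`; the field values (`/api/v2/users`, `alice@example.com`) are invented sample data. It is mainly meant to show that the third argument of `substr` is a length, not an end position.

```spl
| makeresults
| eval uri = "/api/v2/users", email = "alice@example.com"
| eval first4 = substr(uri, 1, 4)
| eval from6 = substr(uri, 6)
| eval domain = replace(email, ".*@", "")
| eval trimmed = trim("  abc  ")
| table uri, first4, from6, domain, trimmed
```

Here `first4` is `/api` (four characters starting at position 1), `from6` is `v2/users` (everything from position 6 onward), and `domain` is `example.com`.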

Numeric functions

| eval rounded = round(duration, 2)
| eval absolute = abs(difference)
| eval power = pow(2, 10)
| eval log_value = log(count, 10)

Conditional functions

# if
| eval severity = if(status >= 500, "Critical", "Normal")

# case
| eval severity = case(
    status >= 500, "Critical",
    status >= 400, "Warning",
    status >= 300, "Redirect",
    1=1, "OK"
)

# coalesce (returns the first argument that is not null)
| eval display_name = coalesce(full_name, username, "Unknown")

# null check
| eval has_error = if(isnull(error_message), "No", "Yes")
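
`case` returns the value paired with the first condition that is true, so the order of the conditions matters. The sketch below generates four synthetic status codes with `makeresults` and `streamstats` (the helper field `n` and the field `wrong_order` are made up for this illustration) and classifies them twice: once from the most specific condition down, and once in the reverse order.

```spl
| makeresults count=4
| streamstats count AS n
| eval status = case(n=1, 200, n=2, 301, n=3, 404, n=4, 503)
| eval severity = case(
    status >= 500, "Critical",
    status >= 400, "Warning",
    status >= 300, "Redirect",
    true(), "OK")
| eval wrong_order = case(
    status >= 300, "Redirect",
    status >= 400, "Warning",
    status >= 500, "Critical",
    true(), "OK")
| table status, severity, wrong_order
```

In `wrong_order`, 404 and 503 both come out as "Redirect", because `status >= 300` is already true for them and the later conditions are never reached. `true()` plays the same catch-all role as `1=1`.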

Date and time functions

| eval event_date = strftime(_time, "%Y-%m-%d")
| eval event_hour = strftime(_time, "%H")
| eval day_of_week = strftime(_time, "%A")
| eval epoch = strptime("2026-01-30", "%Y-%m-%d")
| eval elapsed = now() - _time

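Function	Description
`strftime(_time, fmt)`	Format an epoch timestamp as a string
`strptime(str, fmt)`	Parse a string into an epoch timestamp
`now()`	Epoch time at which the search started
`relative_time(t, offset)`	Apply a relative time modifier such as `-1d@d` to the epoch time `t`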
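
`relative_time` is listed in the table but not used in the examples above, so here is a small sketch of it. The modifiers follow the same syntax as the time range picker: `-1h` subtracts an hour, and `@d` snaps to the start of the day. The output field names are illustrative only.

```spl
| makeresults
| eval start_of_today = relative_time(now(), "@d")
| eval one_hour_ago = relative_time(now(), "-1h")
| eval start_of_yesterday = relative_time(now(), "-1d@d")
| eval start_of_today_str = strftime(start_of_today, "%Y-%m-%d %H:%M:%S")
| eval start_of_yesterday_str = strftime(start_of_yesterday, "%Y-%m-%d %H:%M:%S")
| table start_of_today_str, start_of_yesterday_str, one_hour_ago
```

`relative_time` returns an epoch value, so it can be compared directly with `_time`, for example `| where _time >= relative_time(now(), "-1d@d")`.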

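The where command

where filters results with an eval expression, so it can compare fields with each other and call eval functions.

# Basic comparison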
index=main
| where status >= 400

# String comparison (where is case-sensitive; string literals must be in double quotes)
index=main
| where user = "alice"

# Filtering with functions
index=main
| where like(uri, "/api/%")
| where len(user) > 3
| where isnotnull(error_message)

# Regular-expression match
index=main
| where match(uri, "^/api/v[0-9]+/")

search vs where

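Aspect	`search`	`where`
Position	At the start of the search or after a pipe	After a pipe only
Case sensitivity of values	Case-insensitive	Case-sensitive
Wildcards	`*` supported	Use `like()` or `match()`
eval functions	Not available	Available
Speed	Fast	Somewhat slower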

# search (keyword search, case-insensitive)
index=main
| search user=alice status>400

# where (eval expression, case-sensitive)
index=main
| where user="alice" AND status > 400
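
One more difference is easy to miss: in `where`, an unquoted word on the right-hand side is read as a field name, while `search` treats it as a literal value. The sketch below uses three synthetic rows (the fields `n`, `user`, and `owner` are invented for illustration) and keeps only the rows in which the two fields hold the same value.

```spl
| makeresults count=3
| streamstats count AS n
| eval user = case(n=1, "alice", n=2, "Alice", n=3, "bob")
| eval owner = case(n=1, "alice", n=2, "carol", n=3, "bob")
| where user = owner
| table user, owner
```

This keeps the alice/alice and bob/bob rows. Replacing the `where` line with `| search user=alice` would keep both "alice" and "Alice", whereas `| where user="alice"` keeps only the lowercase one. For a case-insensitive comparison in `where`, normalizing first with `| where lower(user)="alice"` is one option.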

The rex command

rex extracts fields with a regular expression; each named capture group becomes a field.

# Extract an IP address
index=main
| rex field=_raw "(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| table _time, ip_address

# Extract an email address
index=main
| rex field=_raw "(?P<email>[\w.+-]+@[\w-]+\.[\w.]+)"
| table _time, email

# Extract path parameters from the URI
index=main
| rex field=uri "/api/(?P<api_version>v\d+)/(?P<resource>\w+)"
| table _time, api_version, resource

# Extract several groups at once (user= must appear before status= in the event)
index=main
| rex field=_raw "user=(?P<user>\w+).*status=(?P<status>\d+)"
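
Because a single pattern is matched from left to right, the order of the groups has to follow the order of the text in the event. The sketch below applies two patterns to the sample log line from earlier in this article, where status= comes before user=. The field names `status_code`, `user_name`, `u2`, and `s2` are made up so that the results of the two patterns stay apart.

```spl
| makeresults
| eval _raw = "2026-01-30 10:00:01 status=200 user=alice duration=0.023"
| rex field=_raw "status=(?P<status_code>\d+).*user=(?P<user_name>\w+)"
| rex field=_raw "user=(?P<u2>\w+).*status=(?P<s2>\d+)"
| table status_code, user_name, u2, s2
```

The first pattern fills `status_code` and `user_name`. The second one expects user= first, does not match this event, and leaves `u2` and `s2` empty. When the order of the fields varies between events, running one `rex` per field is a simple way around this.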

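Regular-expression syntax for rex

Pattern	Description	Matches
`\d`	Digit	`0-9`
`\w`	Letter, digit, or underscore	`a-z, A-Z, 0-9, _`
`\s`	Whitespace	Space, tab, newline
`.`	Any single character (except a newline by default)	Anything
`+`	One or more	`\d+` → `123`
`*`	Zero or more	`\d*` → `""` or `123`
`?`	Zero or one	`\d?` → `""` or `1`
`(?P<name>...)`	Named group	Extracted as a field called `name`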

rex mode=sed

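With mode=sed, rex rewrites the field value using sed-style substitution instead of extracting fields. The change applies to the search results only; the indexed data is not modified.

# Mask IP addresses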
index=main
| rex mode=sed field=_raw "s/\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/xxx.xxx.xxx.xxx/g"
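
To see the effect of the substitution side by side, the following sketch runs the same sed expression on a synthetic event. The event text and the field name `original` are invented for illustration; `original` keeps a copy of `_raw` from before the rewrite.

```spl
| makeresults
| eval _raw = "client 192.168.10.25 connected to 10.0.0.7"
| eval original = _raw
| rex mode=sed field=_raw "s/\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}/xxx.xxx.xxx.xxx/g"
| table original, _raw
```

The trailing `g` flag replaces every match in the field, so both addresses are masked. Without it, only the first match would be replaced.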

Field aliases and calculated fields

Configuration in props.conf

# props.conf
[my_sourcetype]
# Field aliases
FIELDALIAS-src = src_ip AS source_ip
FIELDALIAS-dst = dst_ip AS dest_ip

# Calculated fields
EVAL-response_time_ms = response_time * 1000
EVAL-severity = case(status >= 500, "critical", status >= 400, "warning", 1=1, "info")

Note: Field aliases and calculated fields are applied automatically at search time, so you no longer have to repeat the same eval in every search.
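
For comparison, this is roughly what each search would have to spell out by hand if the configuration above did not exist. It is a sketch on a synthetic event with invented values, not a replacement for the props.conf settings. An alias makes the same value available under a second name while keeping the original field, which `eval source_ip = src_ip` approximates here.

```spl
| makeresults
| eval src_ip = "10.0.0.5", response_time = 0.25, status = 503
| eval source_ip = src_ip
| eval response_time_ms = response_time * 1000
| eval severity = case(status >= 500, "critical", status >= 400, "warning", 1=1, "info")
| table src_ip, source_ip, response_time_ms, severity
```

With the configuration in place, a search on `sourcetype=my_sourcetype` can reference `source_ip`, `response_time_ms`, and `severity` directly.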

Hands-on: log analysis queries

# 1. Classify response times
index=main sourcetype=access_combined
| eval response_class = case(
    response_time < 0.1, "Fast",
    response_time < 1.0, "Normal",
    response_time < 5.0, "Slow",
    1=1, "Critical"
)
| stats count by response_class
| sort response_class

# 2. Error rate by hour of day
index=main sourcetype=access_combined
| eval hour = strftime(_time, "%H")
| eval is_error = if(status >= 400, 1, 0)
| stats avg(is_error) AS error_rate by hour
| eval error_rate_pct = round(error_rate * 100, 2)
| table hour, error_rate_pct
| sort hour

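# 3. Extract the browser from the user agent (coarse: Chromium-based Edge and Opera user agents also contain "Chrome" and are counted as Chrome)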
index=main sourcetype=access_combined
| rex field=useragent "(?P<browser>Chrome|Firefox|Safari|Edge|Opera)"
| stats count by browser
| sort -count

# 4. Requests by /24 subnet
index=main sourcetype=access_combined
| rex field=clientip "(?P<subnet>\d+\.\d+\.\d+)\.\d+"
| eval subnet = subnet . ".0/24"
| stats count by subnet
| sort -count
| head 10

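Summary

Concept	Description
Default fields	`_time`, `host`, `source`, `sourcetype`
`eval`	Creating and transforming fields
`where`	Filtering with eval expressions
`rex`	Field extraction with regular expressions
Field aliases	Alternative names for existing fields
Calculated fields	eval expressions applied automatically at search time

Key points

**eval** is the most basic way to create a new field
**where** filters with eval expressions and functions
**rex** extracts fields from unstructured data
Know the difference between search and where, case sensitivity in particular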

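Exercises

Exercise 1: Basic

Use eval's case function to classify HTTP status codes as "Success" (2xx), "Redirect" (3xx), "Client Error" (4xx), and "Server Error" (5xx).

Exercise 2: Applied

Use rex to extract the process name and PID from syslog messages, then show the number of events per process.

Challenge

Combine eval and where to keep only the error events that occurred outside business hours (9:00-18:00), and write SPL that counts them by hour of day.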

Coming up next: Day 5 covers statistics and aggregation. You'll learn to summarize data with the stats, chart, and timechart commands.