Splunk Query – Freezing Bucket Reasoning

Explanation of Frozen Buckets

Splunk indexes the data that sources send to a standalone indexer or an indexer cluster. When data is ingested it is written into “BUCKETS”, which are directories that hold the data. An index consists of many buckets organized by the age of the data. Each bucket moves through a series of stages:
HOT  –>>  WARM  –>>  COLD  –>>  FROZEN
The “INDEXES.CONF” configuration file controls when, where, and how buckets move, and how long they are kept, for each index.
If “INDEXES.CONF” is properly configured, the buckets will traverse the stages until they reach COLD. At this point, depending on how you configured your indexers, the data will either:

  • Sit in COLD until the indexers run out of space
  • Be pushed to FROZEN once the configured retention period has elapsed

If the data is sent to FROZEN, it is deleted when no FROZEN directory is configured; otherwise it is moved into the configured FROZEN directory.
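
As a rough illustration of the settings described above, a minimal stanza might look like the following. The index name, paths, and values are hypothetical and chosen only for the example; the setting names are the standard indexes.conf ones, but check them against the documentation for your Splunk version before copying anything.

```ini
# indexes.conf (illustrative only; index name, paths, and values are assumptions)
[my_index]
homePath   = $SPLUNK_DB/my_index/db
coldPath   = $SPLUNK_DB/my_index/colddb
thawedPath = $SPLUNK_DB/my_index/thaweddb

# Time-based rule: freeze a bucket once its newest event is older than 90 days
# (90 * 86400 = 7776000 seconds)
frozenTimePeriodInSecs = 7776000

# Size-based rule: freeze the oldest buckets once the index exceeds this size (MB)
maxTotalDataSizeMB = 500000

# Archive frozen buckets here; if neither this nor coldToFrozenScript is set,
# frozen buckets are deleted
coldToFrozenDir = /opt/frozen/my_index
```

Whichever of the two rules is hit first triggers the freeze, which is why the reason reported by Splunk is worth checking.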

Explanation of Query

The query below is run on a Splunk Search Head by a user who is allowed to search the “_internal” index.

This query identifies whether Splunk is rolling buckets from COLD to FROZEN, and why. This is important to know if you are maintaining a Splunk infrastructure:

  • If the frozen time period is set and no buckets are rolling at all, it points to a configuration issue
  • If buckets are still not rolling to FROZEN after the designated time period has passed, it points to a configuration issue
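
Replace {HOST_NAME} with the indexer to check and {INDEX_LOCATION} with the path under /opt that holds the index directories. The index_bucket_cold field is the index directory name taken from the bucket path, reason is the text after "because" in the log message, and time_now and time_latest are the now= and latest= values from the same message, converted to readable timestamps.

index=_internal host={HOST_NAME} sourcetype=splunkd component=BucketMover "will attempt to freeze"
| rex field=candidate "'\/opt\/{INDEX_LOCATION}\/(?<index_bucket_cold>\w+)\/"
| rex field=_raw "because (?<reason>[\w\s=]+)"
| rex field=_raw "now=(?<time_now>\d+)"
| rex field=_raw "latest=(?<time_latest>\d+)"
| search index_bucket_cold!=_*
| table index_bucket_cold, time_now, time_latest, reason
| convert ctime(time_now) ctime(time_latest)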
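
If the per-bucket table is too long to scan, one possible variation is to summarize the same events per index. This is only a sketch built on the same placeholders ({HOST_NAME}, {INDEX_LOCATION}); the field names freeze_attempts and last_freeze are made up for this example.

```spl
index=_internal host={HOST_NAME} sourcetype=splunkd component=BucketMover "will attempt to freeze"
| rex field=candidate "'\/opt\/{INDEX_LOCATION}\/(?<index_bucket_cold>\w+)\/"
| search index_bucket_cold!=_*
| stats count AS freeze_attempts, max(_time) AS last_freeze BY index_bucket_cold
| convert ctime(last_freeze)
```

An index that has a frozen time period configured but never appears in this summary over a long enough search window is a candidate for the configuration issues listed above.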

