Mastering Splunk

Splunk bucketing

The Splunk bucketing option allows you to group events into discrete buckets of information for better analysis. For example, the number of events returned from the indexed data might be overwhelming, so it makes more sense to group, or bucket, them by a span of time (subseconds, seconds, minutes, hours, days, or months).

We can use the following example to illustrate this point:

tm1* error | stats count(_raw) by _time source

Notice the generated output:

Here is the same search with the events first bucketed into five-day spans:

tm1* error | bucket _time span=5d | stats count(_raw) by _time source

The output obtained is as follows:
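To make the effect of span=5d concrete, here is a minimal Python sketch of the idea behind bucketing. It is not Splunk code. It assumes buckets are aligned to the Unix epoch (Splunk's own alignment rules may differ), and the event timestamps and source names are hypothetical.

```python
from collections import Counter

SECONDS_PER_DAY = 86_400


def bucket_time(epoch_seconds, span_seconds):
    """Snap a timestamp down to the start of its span-sized bucket."""
    return epoch_seconds - (epoch_seconds % span_seconds)


def count_by_bucket_and_source(events, span_seconds):
    """Count events per (bucket start, source) pair."""
    counts = Counter()
    for epoch_seconds, source in events:
        key = (bucket_time(epoch_seconds, span_seconds), source)
        counts[key] += 1
    return counts


# Hypothetical events as (epoch seconds, source) pairs.
events = [
    (0, "tm1server.log"),
    (3 * SECONDS_PER_DAY, "tm1server.log"),
    (4 * SECONDS_PER_DAY + 100, "tm1server.log"),
    (6 * SECONDS_PER_DAY, "tm1server.log"),
    (6 * SECONDS_PER_DAY + 5, "tm1top.log"),
]

# Five events with five distinct timestamps collapse into three rows.
counts = count_by_bucket_and_source(events, 5 * SECONDS_PER_DAY)
for (bucket_start, source), n in sorted(counts.items()):
    print(bucket_start, source, n)
```

Without the bucketing step, every distinct timestamp would produce its own row. With it, all events that fall within the same five-day window share one _time value, so the count per source is far easier to read.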

Reporting using the timechart command

Similar to the chart command, timechart is a reporting command that creates time series charts with a corresponding table of statistics. As discussed earlier, timechart always uses _time as the x-axis, whereas chart lets you set your own x-axis for the visualization. This difference matters: the following searches look almost identical (only the reporting command differs) but yield very different results:

tm1* rule | chart count(date_hour) by date_wday
tm1* rule | timechart count(date_hour) by date_wday

The chart command displays the following visualization:

The timechart command displays the following version of the visualization:
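The difference in the shape of the two results can be sketched in Python. This is only an illustrative model, not Splunk's implementation. It assumes every event carries the counted field, that weekdays are derived from UTC timestamps, and that timechart uses a one-day span. The timestamps are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
SECONDS_PER_DAY = 86_400


def weekday_name(epoch_seconds):
    """Return the lowercase weekday name of a UTC timestamp."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return WEEKDAYS[moment.weekday()]


def chart_count_by_wday(timestamps):
    """chart-like result: one row per weekday, the weekday is the x-axis."""
    return Counter(weekday_name(t) for t in timestamps)


def timechart_count_by_wday(timestamps, span_seconds):
    """timechart-like result: one row per time bucket, one series per weekday."""
    rows = defaultdict(Counter)
    for t in timestamps:
        rows[t - t % span_seconds][weekday_name(t)] += 1
    return rows


# Hypothetical event times: two on Thursday 1 Jan 1970, one on the
# following Friday, and one on the Thursday a week later.
timestamps = [0, 3_600, SECONDS_PER_DAY, 7 * SECONDS_PER_DAY]

chart_rows = chart_count_by_wday(timestamps)
timechart_rows = timechart_count_by_wday(timestamps, SECONDS_PER_DAY)

print(dict(chart_rows))
print({start: dict(series) for start, series in sorted(timechart_rows.items())})
```

In the chart-like result both Thursdays are merged into a single row, because the weekday is the x-axis. In the timechart-like result they stay apart, because the x-axis is time and the weekday only labels the series.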

Arguments required by the timechart command

When you use the Splunk timechart command, you must supply either a single aggregation or an eval expression:

  • Single aggregation: This is an aggregation applied to a single field
  • Eval expression: This is a combination of literals, fields, operators, and functions that represent the value of your destination field

Bucket time spans versus per_* functions

The per_day(), per_hour(), per_minute(), and per_second() functions are aggregator functions used with timechart to get a consistent scale for your data when an explicit span (a time range) is not provided. The functions are described as follows:

  • per_day(): This function returns the values of the field X per day
  • per_hour(): This function returns the values of the field X per hour
  • per_minute(): This function returns the values of the field X per minute
  • per_second(): This function returns the values of the field X per second
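One way to read these functions is as a bucket sum rescaled to the named unit of time. The following Python sketch models that reading. The rescaling formula is an assumption made for illustration rather than a statement of Splunk's internals, and the helper name per_unit is hypothetical.

```python
# Length of each unit in seconds.
UNIT_SECONDS = {
    "day": 86_400,
    "hour": 3_600,
    "minute": 60,
    "second": 1,
}


def per_unit(bucket_sum, span_seconds, unit):
    """Rescale the sum of a field over one bucket to a per-unit rate.

    Assumed model: rate = sum * (unit length / bucket length).
    """
    return bucket_sum * UNIT_SECONDS[unit] / span_seconds


# A field that sums to 120 within a 12-hour bucket is 240 "per day".
print(per_unit(120, 12 * 3_600, "day"))

# With a one-day bucket the per-day value equals the plain sum.
print(per_unit(120, 86_400, "day"))
```

Under this model the scale stays comparable whatever span Splunk picks for the chart, and a per-day value over one-day buckets reduces to an ordinary sum.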

In the following example, the per_day function is used with timechart to calculate the per-day total of the field named other:

sourcetype=access_* action=purchase | timechart per_day(other) by file usenull=f

The preceding code generates the following output:

The same search, written using span and sum, is shown as follows:

sourcetype=access_* action=purchase | timechart span=1d sum(other) by file usenull=f

This search generates the following chart: