Commit 9d02cb3

adding stats aggregation docs (#10251) (#10328)
1 parent ce4fcd2 commit 9d02cb3

1 file changed

+237
-16
lines changed

_aggregations/metric/stats.md

Lines changed: 237 additions & 16 deletions
@@ -9,37 +9,258 @@ redirect_from:
99

1010
# Stats aggregations
1111

12-
The `stats` metric is a multi-value metric aggregation that returns all basic metrics such as `min`, `max`, `sum`, `avg`, and `value_count` in one aggregation query.
12+
The `stats` aggregation is a multi-value metric aggregation that computes a summary of numeric data. This aggregation is useful for quickly understanding the distribution of numeric fields. It can operate directly on a field, apply a script to derive the values, or handle documents with missing fields. The `stats` aggregation returns five values:
1313

14-
The following example returns the basic stats for the `taxful_total_price` field:
14+
* `count`: The number of values collected
15+
* `min`: The lowest value
16+
* `max`: The highest value
17+
* `sum`: The total of all values
18+
* `avg`: The average of the values (sum divided by count)
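
As a rough illustration of how the five values relate, the following Python sketch computes them for a plain list of numbers. The function name `summarize` is hypothetical and is not part of OpenSearch; it only mirrors the definitions in the list above, and its handling of an empty list is an assumption made for this sketch.

```python
def summarize(values):
    """Return count, min, max, sum, and avg for a list of numbers."""
    count = len(values)
    if count == 0:
        # Assumption for this sketch: with no values there is no min, max, or avg.
        return {"count": 0, "min": None, "max": None, "sum": 0.0, "avg": None}
    total = sum(values)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "sum": total,
        "avg": total / count,  # avg is sum divided by count
    }


stats = summarize([1.2, 0.7, 1.5])
print(stats)
```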
19+
20+
## Parameters
21+
22+
The `stats` aggregation takes the following optional parameters.
23+
24+
| Parameter | Data type | Description |
25+
| --------- | --------- | ------------------------------------------------------------------------------------------ |
26+
| `field` | String | The field to aggregate on. Must be a numeric field. |
27+
| `script` | Object | The script used to calculate custom values for aggregation. Can be used instead of or with `field`. |
28+
| `missing` | Number | The default value used for documents missing the target field. |
29+
30+
## Example
31+
32+
The following example computes a `stats` aggregation for electricity usage.
33+
34+
Create an index named `power_usage` and add documents containing the number of kilowatt-hours (kWh) consumed during a given hour:
1535

1636
```json
17-
GET opensearch_dashboards_sample_data_ecommerce/_search
37+
PUT /power_usage/_bulk?refresh=true
38+
{"index": {}}
39+
{"device_id": "A1", "kwh": 1.2}
40+
{"index": {}}
41+
{"device_id": "A2", "kwh": 0.7}
42+
{"index": {}}
43+
{"device_id": "A3", "kwh": 1.5}
44+
```
45+
{% include copy-curl.html %}
46+
47+
To compute statistics on the `kwh` field across all documents, use a `stats` aggregation named `consumption_stats` over the `kwh` field. Setting `size` to `0` specifies that document hits should not be returned:
48+
49+
```json
50+
GET /power_usage/_search
1851
{
1952
"size": 0,
2053
"aggs": {
21-
"stats_taxful_total_price": {
54+
"consumption_stats": {
2255
"stats": {
23-
"field": "taxful_total_price"
56+
"field": "kwh"
2457
}
2558
}
2659
}
2760
}
2861
```
2962
{% include copy-curl.html %}
3063

31-
#### Example response
64+
The response includes `count`, `min`, `max`, `avg`, and `sum` values for the three documents in the index:
3265

3366
```json
34-
...
35-
"aggregations" : {
36-
"stats_taxful_total_price" : {
37-
"count" : 4675,
38-
"min" : 6.98828125,
39-
"max" : 2250.0,
40-
"avg" : 75.05542864304813,
41-
"sum" : 350884.12890625
67+
{
68+
...
69+
"hits": {
70+
"total": {
71+
"value": 3,
72+
"relation": "eq"
73+
},
74+
"max_score": null,
75+
"hits": []
76+
},
77+
"aggregations": {
78+
"consumption_stats": {
79+
"count": 3,
80+
"min": 0.699999988079071,
81+
"max": 1.5,
82+
"avg": 1.1333333452542622,
83+
"sum": 3.400000035762787
84+
}
4285
}
43-
}
4486
}
45-
```
87+
```
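
The `min` value of `0.699999988079071` rather than `0.7` is consistent with `kwh` being stored as a single-precision `float`, which is what dynamic mapping typically assigns to JSON decimal numbers. That mapping is an assumption here because the example does not define one explicitly. The following Python sketch, using a hypothetical helper named `to_float32`, round-trips each reading through 32-bit precision to show the effect:

```python
import struct


def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


readings = [1.2, 0.7, 1.5]
stored = [to_float32(v) for v in readings]

# 1.5 is exactly representable in binary, so it survives unchanged;
# 1.2 and 0.7 are not, so their stored values differ slightly.
for original, value in zip(readings, stored):
    print(original, "->", value)
```

If exact decimal values matter, defining an explicit mapping with a wider numeric type such as `double` is one option to consider.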
88+
89+
### Running a stats aggregation per bucket
90+
91+
You can compute separate statistics for each device by nesting a `stats` aggregation inside a `terms` aggregation on the `device_id.keyword` field. The `terms` aggregation groups documents into buckets based on unique `device_id` values, and the `stats` aggregation computes summary statistics within each bucket:
92+
93+
```json
94+
GET /power_usage/_search
95+
{
96+
"size": 0,
97+
"aggs": {
98+
"per_device": {
99+
"terms": {
100+
"field": "device_id.keyword"
101+
},
102+
"aggs": {
103+
"device_usage_stats": {
104+
"stats": {
105+
"field": "kwh"
106+
}
107+
}
108+
}
109+
}
110+
}
111+
}
112+
```
113+
{% include copy-curl.html %}
114+
115+
The response returns one bucket per `device_id`, with computed `count`, `min`, `max`, `avg`, and `sum` fields within each bucket:
116+
117+
```json
118+
{
119+
...
120+
"hits": {
121+
"total": {
122+
"value": 3,
123+
"relation": "eq"
124+
},
125+
"max_score": null,
126+
"hits": []
127+
},
128+
"aggregations": {
129+
"per_device": {
130+
"doc_count_error_upper_bound": 0,
131+
"sum_other_doc_count": 0,
132+
"buckets": [
133+
{
134+
"key": "A1",
135+
"doc_count": 1,
136+
"device_usage_stats": {
137+
"count": 1,
138+
"min": 1.2000000476837158,
139+
"max": 1.2000000476837158,
140+
"avg": 1.2000000476837158,
141+
"sum": 1.2000000476837158
142+
}
143+
},
144+
{
145+
"key": "A2",
146+
"doc_count": 1,
147+
"device_usage_stats": {
148+
"count": 1,
149+
"min": 0.699999988079071,
150+
"max": 0.699999988079071,
151+
"avg": 0.699999988079071,
152+
"sum": 0.699999988079071
153+
}
154+
},
155+
{
156+
"key": "A3",
157+
"doc_count": 1,
158+
"device_usage_stats": {
159+
"count": 1,
160+
"min": 1.5,
161+
"max": 1.5,
162+
"avg": 1.5,
163+
"sum": 1.5
164+
}
165+
}
166+
]
167+
}
168+
}
169+
}
170+
```
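
The bucketing logic can be sketched locally in Python. This is only an illustration of the grouping idea, not how OpenSearch implements it; the variable names are hypothetical, and sorting buckets by key is a choice made here for deterministic output (a `terms` aggregation orders buckets by document count by default).

```python
from collections import defaultdict

docs = [
    {"device_id": "A1", "kwh": 1.2},
    {"device_id": "A2", "kwh": 0.7},
    {"device_id": "A3", "kwh": 1.5},
]

# Group the kwh values by device_id, similar to how terms creates buckets.
buckets = defaultdict(list)
for doc in docs:
    buckets[doc["device_id"]].append(doc["kwh"])

# Compute the five stats values inside each bucket.
per_device = {}
for key, values in sorted(buckets.items()):
    total = sum(values)
    per_device[key] = {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sum": total,
        "avg": total / len(values),
    }

for key, stats in per_device.items():
    print(key, stats)
```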
171+
172+
This allows you to compare usage statistics across devices with a single query.
173+
174+
### Using a script to compute derived values
175+
176+
You can also use a script to compute the values used in the `stats` aggregation. This is useful when the metric is derived from document fields or requires transformation.
177+
178+
For example, because 1 kWh equals 1,000 Wh, you can convert kilowatt-hours (kWh) to watt-hours (Wh) before running the `stats` aggregation by multiplying each value by `1000`. The following request uses the script `doc['kwh'].value * 1000` to derive the input value for each document:
179+
180+
```json
181+
GET /power_usage/_search
182+
{
183+
"size": 0,
184+
"aggs": {
185+
"usage_wh_stats": {
186+
"stats": {
187+
"script": {
188+
"source": "doc['kwh'].value * 1000"
189+
}
190+
}
191+
}
192+
}
193+
}
194+
```
195+
{% include copy-curl.html %}
196+
197+
The `stats` aggregation returned in the response reflects the converted values of approximately `1200`, `700`, and `1500` Wh:
198+
199+
```json
200+
{
201+
...
202+
"hits": {
203+
"total": {
204+
"value": 3,
205+
"relation": "eq"
206+
},
207+
"max_score": null,
208+
"hits": []
209+
},
210+
"aggregations": {
211+
"usage_wh_stats": {
212+
"count": 3,
213+
"min": 699.999988079071,
214+
"max": 1500,
215+
"avg": 1133.3333452542622,
216+
"sum": 3400.000035762787
217+
}
218+
}
219+
}
220+
```
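
The `min` of `699.999988079071` instead of `700` follows from the same single-precision storage effect: the script multiplies the stored value, not the decimal that was originally indexed. The following Python sketch reproduces the arithmetic under the assumption that `kwh` is stored as a 32-bit `float`; the helper `to_float32` is hypothetical.

```python
import struct


def to_float32(x):
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Stored kWh values (assumed 32-bit floats), then the script's conversion to Wh.
stored_kwh = [to_float32(v) for v in [1.2, 0.7, 1.5]]
usage_wh = [v * 1000 for v in stored_kwh]

print("min:", min(usage_wh))
print("max:", max(usage_wh))
print("sum:", sum(usage_wh))
print("avg:", sum(usage_wh) / len(usage_wh))
```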
221+
222+
### Using a value script with a field
223+
224+
To apply a transformation to a field's values, you can specify both `field` and `script`. The script can then reference each field value through the `_value` variable.
225+
226+
The following example increases each energy reading by 5% before computing the `stats` aggregation:
227+
228+
```json
229+
GET /power_usage/_search
230+
{
231+
"size": 0,
232+
"aggs": {
233+
"adjusted_usage": {
234+
"stats": {
235+
"field": "kwh",
236+
"script": {
237+
"source": "_value * 1.05"
238+
}
239+
}
240+
}
241+
}
242+
}
243+
```
244+
{% include copy-curl.html %}
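
The source does not show a response for this request. As a rough guide to what to expect, the following Python sketch applies the same 5% increase to the three indexed readings. The numbers are computed locally from the decimal inputs, so an actual response may differ in the trailing digits because of how the field values are stored.

```python
readings = [1.2, 0.7, 1.5]

# Mirror the value script: each field value (_value) is multiplied by 1.05.
adjusted = [value * 1.05 for value in readings]

adjusted_stats = {
    "count": len(adjusted),
    "min": min(adjusted),
    "max": max(adjusted),
    "sum": sum(adjusted),
    "avg": sum(adjusted) / len(adjusted),
}
print(adjusted_stats)
```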
245+
246+
### Missing values
247+
248+
By default, documents that do not contain the target field are excluded from the aggregation. To include them, specify a default value using the `missing` parameter.
249+
250+
The following request treats missing `kwh` values as `0.0`:
251+
252+
```json
253+
GET /power_usage/_search
254+
{
255+
"size": 0,
256+
"aggs": {
257+
"consumption_with_default": {
258+
"stats": {
259+
"field": "kwh",
260+
"missing": 0.0
261+
}
262+
}
263+
}
264+
}
265+
```
266+
{% include copy-curl.html %}
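
To see why the choice of default matters, the following Python sketch compares the collected values with and without a default. The fourth document (`A4`, with no `kwh` field) and the `collect` helper are hypothetical additions for illustration; they are not part of the example index above.

```python
docs = [
    {"device_id": "A1", "kwh": 1.2},
    {"device_id": "A2", "kwh": 0.7},
    {"device_id": "A3", "kwh": 1.5},
    {"device_id": "A4"},  # hypothetical document with no kwh field
]


def collect(docs, field, missing=None):
    """Gather field values, substituting `missing` when the field is absent."""
    values = []
    for doc in docs:
        if field in doc:
            values.append(doc[field])
        elif missing is not None:
            values.append(missing)  # include the document using the default
        # otherwise the document is skipped, matching the default behavior
    return values


without_default = collect(docs, "kwh")
with_default = collect(docs, "kwh", missing=0.0)

print(len(without_default), min(without_default), sum(without_default) / len(without_default))
print(len(with_default), min(with_default), sum(with_default) / len(with_default))
```

In this sketch, a default of `0.0` raises `count` and lowers both `min` and `avg`, so a default should be chosen with its effect on the summary in mind.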
