|
1 |
| -## Placeholder |
| 1 | +In this tutorial, you'll learn how to use the t-digest data structure in a bike shop use case. |
2 | 2 |
|
3 |
| -This is placeholder content. |
| 3 | +A t-digest provides a mechanism to estimate percentiles from a data stream or large dataset using a compact sketch. |
| 4 | +It can answer questions such as: |
| 5 | + |
| 6 | +- Which fraction of the values in the data stream are smaller than a given value? |
| 7 | +- How many values in the data stream are smaller than a given value? |
| 8 | +- What is the value below which *p* percent of the values in the data stream fall? That is, what is the *p*-th percentile? |
| 9 | + |
| 10 | +t-digests are created using the `TDIGEST.CREATE` command. This command takes a key name and an optional `COMPRESSION` value, which defaults to `100`. |
| 11 | +A higher compression value gives more accurate estimates at the cost of more memory. A full discussion of compression is beyond the scope of this tutorial. See the **Learn more** section below for an academic article that discusses compression in detail. |
| 12 | + |
| 13 | +Once a t-digest has been created, you can use the `TDIGEST.ADD` command to add observations to it. |
| 14 | + |
| 15 | +```redis Create a t-digest |
| 16 | +TDIGEST.CREATE bike:sales COMPRESSION 1000 // 1000 provides for very accurate estimations |
| 17 | +``` |
| 18 | + |
| 19 | +```redis Add bike sales data to the t-digest |
| 20 | +TDIGEST.ADD bike:sales 0 |
| 21 | +TDIGEST.ADD bike:sales 1028.24 |
| 22 | +TDIGEST.ADD bike:sales 1034.63 |
| 23 | +TDIGEST.ADD bike:sales 1087.93 |
| 24 | +TDIGEST.ADD bike:sales 1088.78 |
| 25 | +TDIGEST.ADD bike:sales 1093.82 |
| 26 | +TDIGEST.ADD bike:sales 1135.37 |
| 27 | +TDIGEST.ADD bike:sales 1146.33 |
| 28 | +TDIGEST.ADD bike:sales 1147.99 |
| 29 | +TDIGEST.ADD bike:sales 1151.2 |
| 30 | +TDIGEST.ADD bike:sales 1155.28 |
| 31 | +TDIGEST.ADD bike:sales 1172.82 |
| 32 | +TDIGEST.ADD bike:sales 1181.07 |
| 33 | +TDIGEST.ADD bike:sales 1206.48 |
| 34 | +TDIGEST.ADD bike:sales 1219.61 |
| 35 | +TDIGEST.ADD bike:sales 1234.81 |
| 36 | +TDIGEST.ADD bike:sales 1239.53 |
| 37 | +TDIGEST.ADD bike:sales 1261.35 |
| 38 | +TDIGEST.ADD bike:sales 1275.97 |
| 39 | +TDIGEST.ADD bike:sales 1276.34 |
| 40 | +TDIGEST.ADD bike:sales 1284.55 |
| 41 | +TDIGEST.ADD bike:sales 1289.01 |
| 42 | +TDIGEST.ADD bike:sales 1309.74 |
| 43 | +TDIGEST.ADD bike:sales 1317.87 |
| 44 | +TDIGEST.ADD bike:sales 1318.97 |
| 45 | +TDIGEST.ADD bike:sales 1355.34 |
| 46 | +TDIGEST.ADD bike:sales 1372.14 |
| 47 | +TDIGEST.ADD bike:sales 1393.12 |
| 48 | +TDIGEST.ADD bike:sales 1395.69 |
| 49 | +TDIGEST.ADD bike:sales 141.99 |
| 50 | +TDIGEST.ADD bike:sales 1435.79 |
| 51 | +TDIGEST.ADD bike:sales 144.36 |
| 52 | +TDIGEST.ADD bike:sales 1464.74 |
| 53 | +TDIGEST.ADD bike:sales 1469.08 |
| 54 | +TDIGEST.ADD bike:sales 147.98 |
| 55 | +TDIGEST.ADD bike:sales 1494.88 |
| 56 | +TDIGEST.ADD bike:sales 1499.2 |
| 57 | +TDIGEST.ADD bike:sales 1507.2 |
| 58 | +TDIGEST.ADD bike:sales 1516.66 |
| 59 | +TDIGEST.ADD bike:sales 1521.42 |
| 60 | +TDIGEST.ADD bike:sales 1587.48 |
| 61 | +TDIGEST.ADD bike:sales 1590.98 |
| 62 | +TDIGEST.ADD bike:sales 1596.35 |
| 63 | +TDIGEST.ADD bike:sales 1609.0 |
| 64 | +TDIGEST.ADD bike:sales 1618.29 |
| 65 | +TDIGEST.ADD bike:sales 1619.03 |
| 66 | +TDIGEST.ADD bike:sales 1619.09 |
| 67 | +TDIGEST.ADD bike:sales 1624.51 |
| 68 | +TDIGEST.ADD bike:sales 1640.86 |
| 69 | +TDIGEST.ADD bike:sales 1652.79 |
| 70 | +TDIGEST.ADD bike:sales 1653.44 |
| 71 | +TDIGEST.ADD bike:sales 1666.23 |
| 72 | +TDIGEST.ADD bike:sales 1684.47 |
| 73 | +TDIGEST.ADD bike:sales 1695.86 |
| 74 | +TDIGEST.ADD bike:sales 1697.72 |
| 75 | +TDIGEST.ADD bike:sales 1711.07 |
| 76 | +TDIGEST.ADD bike:sales 1733.52 |
| 77 | +TDIGEST.ADD bike:sales 1749.9 |
| 78 | +TDIGEST.ADD bike:sales 1752.62 |
| 79 | +TDIGEST.ADD bike:sales 1769.37 |
| 80 | +TDIGEST.ADD bike:sales 1785.55 |
| 81 | +TDIGEST.ADD bike:sales 1791.65 |
| 82 | +TDIGEST.ADD bike:sales 1801.86 |
| 83 | +TDIGEST.ADD bike:sales 1802.61 |
| 84 | +TDIGEST.ADD bike:sales 1822.93 |
| 85 | +TDIGEST.ADD bike:sales 1840.15 |
| 86 | +TDIGEST.ADD bike:sales 1862.83 |
| 87 | +TDIGEST.ADD bike:sales 1865.0 |
| 88 | +TDIGEST.ADD bike:sales 1879.4 |
| 89 | +TDIGEST.ADD bike:sales 1909.59 |
| 90 | +TDIGEST.ADD bike:sales 1927.15 |
| 91 | +TDIGEST.ADD bike:sales 1950.88 |
| 92 | +TDIGEST.ADD bike:sales 1950.91 |
| 93 | +TDIGEST.ADD bike:sales 1950.95 |
| 94 | +TDIGEST.ADD bike:sales 1971.25 |
| 95 | +TDIGEST.ADD bike:sales 1974.42 |
| 96 | +TDIGEST.ADD bike:sales 1981.45 |
| 97 | +TDIGEST.ADD bike:sales 1984.7 |
| 98 | +TDIGEST.ADD bike:sales 1986.63 |
| 99 | +TDIGEST.ADD bike:sales 2009.3 |
| 100 | +TDIGEST.ADD bike:sales 202.68 |
| 101 | +TDIGEST.ADD bike:sales 2031.86 |
| 102 | +TDIGEST.ADD bike:sales 2040.06 |
| 103 | +TDIGEST.ADD bike:sales 2051.99 |
| 104 | +TDIGEST.ADD bike:sales 2065.05 |
| 105 | +TDIGEST.ADD bike:sales 2070.4 |
| 106 | +TDIGEST.ADD bike:sales 2076.08 |
| 107 | +TDIGEST.ADD bike:sales 2098.87 |
| 108 | +TDIGEST.ADD bike:sales 2113.58 |
| 109 | +TDIGEST.ADD bike:sales 2120.37 |
| 110 | +TDIGEST.ADD bike:sales 2127.65 |
| 111 | +TDIGEST.ADD bike:sales 2131.1 |
| 112 | +TDIGEST.ADD bike:sales 2148.47 |
| 113 | +TDIGEST.ADD bike:sales 2195.21 |
| 114 | +TDIGEST.ADD bike:sales 2204.21 |
| 115 | +TDIGEST.ADD bike:sales 2205.5 |
| 116 | +TDIGEST.ADD bike:sales 2214.18 |
| 117 | +TDIGEST.ADD bike:sales 2225.34 |
| 118 | +TDIGEST.ADD bike:sales 2226.55 |
| 119 | +TDIGEST.ADD bike:sales 2249.47 |
| 120 | +TDIGEST.ADD bike:sales 2254.49 |
| 121 | +TDIGEST.ADD bike:sales 2257.08 |
| 122 | +TDIGEST.ADD bike:sales 2262.15 |
| 123 | +TDIGEST.ADD bike:sales 2262.57 |
| 124 | +TDIGEST.ADD bike:sales 2289.55 |
| 125 | +TDIGEST.ADD bike:sales 2289.75 |
| 126 | +TDIGEST.ADD bike:sales 2301.67 |
| 127 | +TDIGEST.ADD bike:sales 2325.75 |
| 128 | +TDIGEST.ADD bike:sales 234.75 |
| 129 | +TDIGEST.ADD bike:sales 2350.04 |
| 130 | +TDIGEST.ADD bike:sales 2354.71 |
| 131 | +TDIGEST.ADD bike:sales 2365.72 |
| 132 | +TDIGEST.ADD bike:sales 2371.73 |
| 133 | +TDIGEST.ADD bike:sales 2372.96 |
| 134 | +TDIGEST.ADD bike:sales 2374.72 |
| 135 | +TDIGEST.ADD bike:sales 2376.35 |
| 136 | +TDIGEST.ADD bike:sales 238.17 |
| 137 | +TDIGEST.ADD bike:sales 2396.72 |
| 138 | +TDIGEST.ADD bike:sales 2447.65 |
| 139 | +TDIGEST.ADD bike:sales 2449.07 |
| 140 | +TDIGEST.ADD bike:sales 245.27 |
| 141 | +TDIGEST.ADD bike:sales 2460.1 |
| 142 | +TDIGEST.ADD bike:sales 2479.99 |
| 143 | +TDIGEST.ADD bike:sales 2528.12 |
| 144 | +TDIGEST.ADD bike:sales 2535.28 |
| 145 | +TDIGEST.ADD bike:sales 2543.31 |
| 146 | +TDIGEST.ADD bike:sales 2559.02 |
| 147 | +TDIGEST.ADD bike:sales 2581.71 |
| 148 | +TDIGEST.ADD bike:sales 2597.52 |
| 149 | +TDIGEST.ADD bike:sales 2602.24 |
| 150 | +TDIGEST.ADD bike:sales 2611.96 |
| 151 | +TDIGEST.ADD bike:sales 2614.96 |
| 152 | +TDIGEST.ADD bike:sales 2619.53 |
| 153 | +TDIGEST.ADD bike:sales 2636.62 |
| 154 | +TDIGEST.ADD bike:sales 2655.43 |
| 155 | +TDIGEST.ADD bike:sales 2660.26 |
| 156 | +TDIGEST.ADD bike:sales 2671.49 |
| 157 | +TDIGEST.ADD bike:sales 2695.12 |
| 158 | +TDIGEST.ADD bike:sales 270.43 |
| 159 | +TDIGEST.ADD bike:sales 2722.73 |
| 160 | +TDIGEST.ADD bike:sales 2732.6 |
| 161 | +TDIGEST.ADD bike:sales 2754.18 |
| 162 | +TDIGEST.ADD bike:sales 2754.61 |
| 163 | +TDIGEST.ADD bike:sales 2787.63 |
| 164 | +TDIGEST.ADD bike:sales 2795.85 |
| 165 | +TDIGEST.ADD bike:sales 2796.3 |
| 166 | +TDIGEST.ADD bike:sales 2800.65 |
| 167 | +TDIGEST.ADD bike:sales 2805.15 |
| 168 | +TDIGEST.ADD bike:sales 2810.31 |
| 169 | +TDIGEST.ADD bike:sales 2825.98 |
| 170 | +TDIGEST.ADD bike:sales 2829.23 |
| 171 | +TDIGEST.ADD bike:sales 2832.54 |
| 172 | +TDIGEST.ADD bike:sales 2855.58 |
| 173 | +TDIGEST.ADD bike:sales 2861.72 |
| 174 | +TDIGEST.ADD bike:sales 2864.62 |
| 175 | +TDIGEST.ADD bike:sales 2865.15 |
| 176 | +TDIGEST.ADD bike:sales 2868.9 |
| 177 | +TDIGEST.ADD bike:sales 2870.47 |
| 178 | +TDIGEST.ADD bike:sales 2906.22 |
| 179 | +TDIGEST.ADD bike:sales 2919.5 |
| 180 | +TDIGEST.ADD bike:sales 2936.76 |
| 181 | +TDIGEST.ADD bike:sales 2938.93 |
| 182 | +TDIGEST.ADD bike:sales 2942.94 |
| 183 | +TDIGEST.ADD bike:sales 2946.26 |
| 184 | +TDIGEST.ADD bike:sales 2974.71 |
| 185 | +TDIGEST.ADD bike:sales 2985.74 |
| 186 | +TDIGEST.ADD bike:sales 2986.28 |
| 187 | +TDIGEST.ADD bike:sales 2999.32 |
| 188 | +TDIGEST.ADD bike:sales 305.04 |
| 189 | +TDIGEST.ADD bike:sales 3062.72 |
| 190 | +TDIGEST.ADD bike:sales 3080.67 |
| 191 | +TDIGEST.ADD bike:sales 3087.52 |
| 192 | +TDIGEST.ADD bike:sales 3124.01 |
| 193 | +TDIGEST.ADD bike:sales 3139.33 |
| 194 | +TDIGEST.ADD bike:sales 3161.07 |
| 195 | +TDIGEST.ADD bike:sales 3189.64 |
| 196 | +TDIGEST.ADD bike:sales 3192.7 |
| 197 | +TDIGEST.ADD bike:sales 326.04 |
| 198 | +TDIGEST.ADD bike:sales 3336.23 |
| 199 | +TDIGEST.ADD bike:sales 3337.66 |
| 200 | +TDIGEST.ADD bike:sales 346.05 |
| 201 | +TDIGEST.ADD bike:sales 3500.79 |
| 202 | +TDIGEST.ADD bike:sales 3506.64 |
| 203 | +TDIGEST.ADD bike:sales 3510.92 |
| 204 | +TDIGEST.ADD bike:sales 354.1 |
| 205 | +TDIGEST.ADD bike:sales 3563.32 |
| 206 | +TDIGEST.ADD bike:sales 3573.39 |
| 207 | +TDIGEST.ADD bike:sales 3618.76 |
| 208 | +TDIGEST.ADD bike:sales 3638.42 |
| 209 | +TDIGEST.ADD bike:sales 364.33 |
| 210 | +TDIGEST.ADD bike:sales 3653.41 |
| 211 | +TDIGEST.ADD bike:sales 3654.49 |
| 212 | +TDIGEST.ADD bike:sales 3658.54 |
| 213 | +TDIGEST.ADD bike:sales 370.26 |
| 214 | +TDIGEST.ADD bike:sales 3734.39 |
| 215 | +TDIGEST.ADD bike:sales 3736.82 |
| 216 | +TDIGEST.ADD bike:sales 378.39 |
| 217 | +TDIGEST.ADD bike:sales 3843.6 |
| 218 | +TDIGEST.ADD bike:sales 3905.58 |
| 219 | +TDIGEST.ADD bike:sales 3924.99 |
| 220 | +TDIGEST.ADD bike:sales 3974.14 |
| 221 | +TDIGEST.ADD bike:sales 3975.63 |
| 222 | +TDIGEST.ADD bike:sales 4020.62 |
| 223 | +TDIGEST.ADD bike:sales 403.36 |
| 224 | +TDIGEST.ADD bike:sales 404.85 |
| 225 | +TDIGEST.ADD bike:sales 405.28 |
| 226 | +TDIGEST.ADD bike:sales 4166.95 |
| 227 | +TDIGEST.ADD bike:sales 4171.72 |
| 228 | +TDIGEST.ADD bike:sales 4205.09 |
| 229 | +TDIGEST.ADD bike:sales 421.44 |
| 230 | +TDIGEST.ADD bike:sales 4231.58 |
| 231 | +TDIGEST.ADD bike:sales 4238.72 |
| 232 | +TDIGEST.ADD bike:sales 425.03 |
| 233 | +TDIGEST.ADD bike:sales 4486.56 |
| 234 | +TDIGEST.ADD bike:sales 4553.39 |
| 235 | +TDIGEST.ADD bike:sales 4554.14 |
| 236 | +TDIGEST.ADD bike:sales 4599.81 |
| 237 | +TDIGEST.ADD bike:sales 460.03 |
| 238 | +TDIGEST.ADD bike:sales 4612.71 |
| 239 | +TDIGEST.ADD bike:sales 4644.86 |
| 240 | +TDIGEST.ADD bike:sales 470.23 |
| 241 | +TDIGEST.ADD bike:sales 470.64 |
| 242 | +TDIGEST.ADD bike:sales 4739.8 |
| 243 | +TDIGEST.ADD bike:sales 4753.21 |
| 244 | +TDIGEST.ADD bike:sales 4755.21 |
| 245 | +TDIGEST.ADD bike:sales 4888.02 |
| 246 | +TDIGEST.ADD bike:sales 497.4 |
| 247 | +TDIGEST.ADD bike:sales 500.78 |
| 248 | +TDIGEST.ADD bike:sales 522.78 |
| 249 | +TDIGEST.ADD bike:sales 528.19 |
| 250 | +TDIGEST.ADD bike:sales 530.2 |
| 251 | +TDIGEST.ADD bike:sales 553.26 |
| 252 | +TDIGEST.ADD bike:sales 570.97 |
| 253 | +TDIGEST.ADD bike:sales 582.7 |
| 254 | +TDIGEST.ADD bike:sales 587.48 |
| 255 | +TDIGEST.ADD bike:sales 619.62 |
| 256 | +TDIGEST.ADD bike:sales 663.51 |
| 257 | +TDIGEST.ADD bike:sales 669.97 |
| 258 | +TDIGEST.ADD bike:sales 676.86 |
| 259 | +TDIGEST.ADD bike:sales 685.65 |
| 260 | +TDIGEST.ADD bike:sales 717.76 |
| 261 | +TDIGEST.ADD bike:sales 721.44 |
| 262 | +TDIGEST.ADD bike:sales 759.14 |
| 263 | +TDIGEST.ADD bike:sales 776.84 |
| 264 | +TDIGEST.ADD bike:sales 784.85 |
| 265 | +TDIGEST.ADD bike:sales 784.96 |
| 266 | +TDIGEST.ADD bike:sales 803.99 |
| 267 | +TDIGEST.ADD bike:sales 808.3 |
| 268 | +TDIGEST.ADD bike:sales 825.79 |
| 269 | +TDIGEST.ADD bike:sales 838.53 |
| 270 | +TDIGEST.ADD bike:sales 863.63 |
| 271 | +TDIGEST.ADD bike:sales 908.73 |
| 272 | +TDIGEST.ADD bike:sales 927.37 |
| 273 | +TDIGEST.ADD bike:sales 956.84 |
| 274 | +TDIGEST.ADD bike:sales 985.09 |
| 275 | +``` |
| 276 | + |
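| | +`TDIGEST.ADD` also accepts several observations in a single call, which avoids one round trip per value. The following is a minimal sketch; the key `bike:sales:demo` and its three values are made up for illustration and are not part of the tutorial's dataset. |
| | + |
| | +```redis Add several observations at once (illustrative) |
| | +TDIGEST.CREATE bike:sales:demo // hypothetical key; uses the default compression of 100 |
| | +TDIGEST.ADD bike:sales:demo 1028.24 1034.63 1087.93 |
| | +``` |
| | + |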
| 277 | +Use `TDIGEST.INFO` to retrieve information about your t-digest sketch. |
| 278 | + |
| 279 | +```redis TDIGEST.INFO usage |
| 280 | +TDIGEST.INFO bike:sales |
| 281 | +``` |
| 282 | + |
| 283 | +`TDIGEST.MIN` and `TDIGEST.MAX` return the minimum and maximum values of the t-digest. |
| 284 | + |
| 285 | +```redis TDIGEST.MIN and TDIGEST.MAX usage |
| 286 | +TDIGEST.MIN bike:sales |
| 287 | +TDIGEST.MAX bike:sales |
| 288 | +``` |
| 289 | + |
| 290 | +The `TDIGEST.BYRANK` and `TDIGEST.BYREVRANK` commands return, for each input rank, a floating-point estimate of the value with that rank. |
| 291 | + |
| 292 | +**Note**: |
| 293 | +> For `BYRANK`, rank 0 returns the smallest observation (0 in this example), and a rank equal to the number of observations minus one returns the largest observation. The results for these two ranks are always exact; the results for any other rank are estimates. When the rank is equal to or larger than the number of observations, the result is `inf`. `BYREVRANK` works in the opposite direction: rank 0 returns the largest observation, and a rank equal to or larger than the number of observations returns `-inf`. |
| 294 | + |
| 295 | +```redis TDIGEST.BYRANK usage |
| 296 | +TDIGEST.BYRANK bike:sales 0 254 // there are 255 observations in the t-digest |
| 297 | +``` |
| 298 | + |
| 299 | +```redis TDIGEST.BYREVRANK usage |
| 300 | +TDIGEST.BYREVRANK bike:sales 0 254 |
| 301 | +``` |
| 302 | + |
| 303 | +`TDIGEST.QUANTILE` allows you to calculate percentiles for your t-digest. For example, to get the 25th, 50th, and 75th percentiles you would run: |
| 304 | + |
| 305 | +```redis TDIGEST.QUANTILE usage |
| 306 | +TDIGEST.QUANTILE bike:sales 0.25 0.50 0.75 |
| 307 | +``` |
| 308 | + |
| 309 | +The `TDIGEST.CDF` command returns, for each input value, an estimate of the fraction of observations that are smaller than the given value, plus half the fraction of observations that are equal to it. CDF stands for cumulative distribution function. |
| 310 | + |
| 311 | +```redis TDIGEST.CDF usage |
| 312 | +TDIGEST.CDF bike:sales 1000.0 3000.0 |
| 313 | +``` |
| 314 | + |
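| | +The second question from the introduction, how many values are smaller than a given value, maps to `TDIGEST.RANK`, which returns an estimated count rather than a fraction. As a rough sketch, reusing the inputs from the `TDIGEST.CDF` example above: dividing each rank by the number of observations (255 at this point) should give a figure close to the corresponding CDF result. `TDIGEST.REVRANK` counts from the other end and estimates how many observations are larger than the given value. |
| | + |
| | +```redis TDIGEST.RANK and TDIGEST.REVRANK usage (illustrative) |
| | +TDIGEST.RANK bike:sales 1000.0 3000.0 // estimated number of observations smaller than each value |
| | +TDIGEST.REVRANK bike:sales 1000.0 3000.0 // estimated number of observations larger than each value |
| | +``` |
| | + |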
| 315 | +Calculating the average of a list of values is a common operation. However, measurements are sometimes noisy or contain invalid values. For example, consider an absurd `bike:sales` entry of 999999.0. A common practice is to calculate the average of the observations while ignoring outliers. For example, you might want the average of the values that fall between the 20th and 80th percentiles. You can use the `TDIGEST.TRIMMED_MEAN` command to do this: |
| 316 | + |
| 317 | +```redis TDIGEST.TRIMMED_MEAN usage |
| 318 | +TDIGEST.ADD bike:sales 999999.0 |
| 319 | +TDIGEST.TRIMMED_MEAN bike:sales 0.2 0.8 |
| 320 | +``` |
| 321 | + |
| 322 | +There may be occasions when you want to merge two t-digests together. You can use the `TDIGEST.MERGE` command to accomplish this. |
| 323 | + |
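| | +As a minimal sketch of a merge, suppose sales are tracked in one sketch per store. The keys `bike:sales:store1`, `bike:sales:store2`, and `bike:sales:all`, along with the values added to them, are hypothetical. `TDIGEST.MERGE` takes the destination key, the number of source keys, and then the source keys themselves. |
| | + |
| | +```redis TDIGEST.MERGE usage (illustrative) |
| | +TDIGEST.CREATE bike:sales:store1 |
| | +TDIGEST.CREATE bike:sales:store2 |
| | +TDIGEST.ADD bike:sales:store1 1028.24 1034.63 |
| | +TDIGEST.ADD bike:sales:store2 2009.3 2031.86 |
| | +TDIGEST.MERGE bike:sales:all 2 bike:sales:store1 bike:sales:store2 // creates bike:sales:all if it doesn't exist |
| | +TDIGEST.QUANTILE bike:sales:all 0.5 |
| | +``` |
| | + |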
| 324 | +Finally, you can reset a t-digest sketch using the `TDIGEST.RESET` command. |
| 325 | + |
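| | +A reset empties the sketch but keeps the key, so you can start adding observations again without calling `TDIGEST.CREATE`. A short sketch against the tutorial's `bike:sales` key: |
| | + |
| | +```redis TDIGEST.RESET usage (illustrative) |
| | +TDIGEST.RESET bike:sales |
| | +TDIGEST.INFO bike:sales // the observation count should now be zero |
| | +``` |
| | + |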
| 326 | +## Learn more |
| 327 | + |
| 328 | +[Academic resource](https://www.sciencedirect.com/science/article/pii/S2665963820300403) |
| 329 | + |
| 330 | +[Redis blog post](https://redis.com/blog/t-digest-in-redis-stack/) |