* Store timestamp in state value
* Decouple state operations from aggregate functions
* Check max expired window start instead of min valid window end
* Pass watermark to expire_windows
* Decouple window deletion from window expiration
* Feature: Sliding Windows
* Do not emit right windows via .final()
* Do not emit right windows via .current()
* Move SlidingWindow to a separate module
* Don't use deque
* Replace watermark with max_start_time
* Set expiration watermark only once
* Remove timeit tests
These were complicated. Custom performance tests are better.
* Refactoring tests and fixing problems
* Test presence of exact windows in the state
* Log late_by_ms from FixedTimeWindow
* Rename watermark variables
* Log expired windows from SlidingWindow
* Correct existing windowing docs
Corrected timestamp keys and changed temperature readings to numbers
that are easier to calculate without a calculator and that look
familiar on both the Celsius and Fahrenheit scales.
* Add sliding windows docs
* Create docstring describing sliding window algorithm
* Fix imports
* Fix TestWindowedRocksDBPartitionTransaction after rebase
* Latest deleted window timestamps per key
* Create helper TimestampsCache class
Sliding windows are overlapping time-based windows that advance with each incoming message, rather than at fixed time intervals like hopping windows. They have a fixed 1 ms resolution, yet they perform better and use fewer resources than hopping windows with a 1 ms step. Sliding windows do not produce redundant windows; every interval has a distinct aggregation.
Sliding windows provide optimal performance for tasks requiring high-precision real-time monitoring. However, if the task is not time-critical or the data stream is extremely dense, tumbling or hopping windows may perform better.
For example, a sliding window of 1 hour will generate the following intervals as messages A, B, C, and D arrive:
```
Sliding Windows

Time
[00:00:00.000, 01:00:00.000]: ......A
[00:00:00.001, 01:00:00.001]: ......B
[00:00:00.003, 01:00:00.003]: ......C
[00:00:00.006, 01:00:00.006]: ......D

```
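The interval arithmetic behind the diagram above can be sketched in plain Python. This is a library-independent illustration, not the implementation introduced by this change: the helper names `sliding_window_bounds` and `fmt` are hypothetical, and the arrival times of A, B, C, and D are assumptions chosen to match the diagram.

```python
from datetime import timedelta


def sliding_window_bounds(timestamp_ms: int, duration_ms: int) -> tuple[int, int]:
    """Return the inclusive (start, end) of the window closed by a message."""
    # The window ends at the message timestamp and reaches back by the duration.
    return timestamp_ms - duration_ms, timestamp_ms


def fmt(ms: int) -> str:
    """Format milliseconds since 00:00:00.000 as HH:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


HOUR_MS = int(timedelta(hours=1).total_seconds() * 1000)

# Assumed arrival times for messages A, B, C, and D (ms since 00:00:00.000).
messages = {"A": HOUR_MS, "B": HOUR_MS + 1, "C": HOUR_MS + 3, "D": HOUR_MS + 6}

windows = {name: sliding_window_bounds(ts, HOUR_MS) for name, ts in messages.items()}
for name, (start, end) in windows.items():
    print(f"[{fmt(start)}, {fmt(end)}]: {name}")
```

Each message closes exactly one window, so the number of emitted intervals follows the number of messages rather than the number of 1 ms steps in the stream.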
Note that both the start and the end of the interval are inclusive.
In sliding windows, each timestamp can be assigned to multiple intervals because these intervals overlap.
For example, the timestamp `01:33:13.000` will match the intervals for all messages arriving between `01:33:13.000` and `02:33:13.000`. The boundary windows that include this timestamp are:
- `00:33:13.000 - 01:33:13.000`
- `01:33:13.000 - 02:33:13.000`
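
The two boundary windows can be checked with a short membership test. This is only a sketch under the inclusive-bounds rule stated above; `parse_ms` and `window_contains` are hypothetical helpers, not part of any library API.

```python
def parse_ms(hhmmss: str) -> int:
    """Convert 'HH:MM:SS.mmm' to milliseconds since 00:00:00.000."""
    clock, millis = hhmmss.split(".")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


def window_contains(window_end_ms: int, duration_ms: int, timestamp_ms: int) -> bool:
    """True if the window ending at window_end_ms includes timestamp_ms."""
    # Both the start and the end of the interval are inclusive.
    return window_end_ms - duration_ms <= timestamp_ms <= window_end_ms


HOUR_MS = 60 * 60 * 1000
ts = parse_ms("01:33:13.000")

# Earliest and latest window ends that still cover the timestamp.
earliest_end = ts
latest_end = ts + HOUR_MS

inside = [window_contains(end, HOUR_MS, ts) for end in (earliest_end, latest_end)]
# Moving either boundary by 1 ms pushes the timestamp out of the window.
outside = [window_contains(end, HOUR_MS, ts) for end in (earliest_end - 1, latest_end + 1)]
print(inside, outside)
```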
**Example:**
Imagine you receive temperature readings from sensors, and you need to calculate the average temperature for the last hour, producing updates for each incoming message. The message key is a sensor ID, so the aggregations will be grouped by each sensor.
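Before looking at the data, the aggregation itself can be sketched without any streaming library. The snippet below is a simplified illustration, not the algorithm from this change: `sliding_means` is a hypothetical function, the readings are invented for the sketch, and it assumes messages for each key arrive in timestamp order with no late data.

```python
from collections import defaultdict

HOUR_MS = 60 * 60 * 1000


def sliding_means(messages, duration_ms=HOUR_MS):
    """Return (key, window_start, window_end, mean) for every incoming message.

    Assumes messages for each key arrive in timestamp order (no late data).
    """
    history = defaultdict(list)  # key -> [(timestamp_ms, value), ...]
    results = []
    for key, timestamp_ms, value in messages:
        start = timestamp_ms - duration_ms
        readings = history[key]
        readings.append((timestamp_ms, value))
        # Keep only readings inside the inclusive window [start, timestamp_ms].
        readings[:] = [(ts, v) for ts, v in readings if ts >= start]
        mean = sum(v for _, v in readings) / len(readings)
        results.append((key, start, timestamp_ms, mean))
    return results


# Invented readings: (sensor ID as message key, timestamp in ms, temperature).
BASE = HOUR_MS
readings = [
    ("sensor-1", BASE, 20.0),
    ("sensor-2", BASE + 10 * 60_000, 30.0),
    ("sensor-1", BASE + 30 * 60_000, 22.0),
    ("sensor-1", BASE + 60 * 60_000, 24.0),      # window still includes the first reading
    ("sensor-1", BASE + 60 * 60_000 + 1, 26.0),  # first reading is now 1 ms too old
]

results = sliding_means(readings)
for key, start, end, mean in results:
    print(key, start, end, mean)
```

Keeping a separate history per key mirrors the grouping by sensor ID described above; a production implementation would also need to handle out-of-order messages and state expiration.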
Input:
(Here the `"timestamp"` column illustrates Kafka message timestamps)