-
Notifications
You must be signed in to change notification settings - Fork 477
Description
Bug Summary
Searching "query": "time:>=T"
returns documents with time < T
, where:
T
is a numeric literal containing a valid integer nanosecond unix timestamp whereT % 1000 > 0
- Indexed document(s) exist with nanoseconds timestamp
x
whereT-N < x < T
for "small N" (*) - Field
time
in the index is configured withfast_precision: nanoseconds
andoutput_format: unix_timestamp_nanos
.
(*) Black-box behavior probing gives the impression that timestamps are being erroneously rounded down to lower precisions internally at some point during query execution, but further probing shows that the "boundary" points (the smallest U>T
for which time:>=U
stops returning documents at time T
) are very unusual and don't seem to obviously correspond to either the integer microsecond or floating point nanosecond rounding error boundaries we'd expect. See "Example Repro" section below for more details.
Repro Steps
- Ingest some data with nanosecond precision timestamps (reference index configuration is provided below)
- Select some sample timestamp
T
from any document, as long as the last 3 decimal digits ofT
are not999
. It's fine if the last 3 digits are all000
. - Use the REST API to search
"query": "time:>={T+1}", "sort_by":"-time", "max_hits":1"
whereT
the sample timestamp selected above. - You will find that the document at time
T
is incorrectly included in the result! - Optional: Experiment with different time offsets (incorrect behavior repros with offsets much larger than 1 nano, though exactly what offset ahead of
T
stops incorrectly returning the document at timeT
seems unusual and doesn't align with most normal precision rounding boundaries one would expect).
Example Repro
❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
-H "Content-Type: application/json" \
-d '{"max_hits":1,"query":"time:>=1751128201918247001","sort_by":"-time"}'
{
"num_hits": 7354717336,
"hits": [
{
"time": 1751128201918247000
/* ... */
}
],
"elapsed_time_micros": 291577,
"errors": []
}
Bisecting this "time:>=T+N"
query to find when Quickwit stops returning the document at time T
found this:
❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
-H "Content-Type: application/json" \
-d '{"max_hits":1,"query":"time:>=1751128201918247595","sort_by":"-time"}'
...
"time": 1751128201918247000
❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
-H "Content-Type: application/json" \
-d '{"max_hits":1,"query":"time:>=1751128201918247596","sort_by":"-time"}'
...
"time": 1751128201918249000
Notice how the query for time:>= (T + 595)
still incorrectly returns the document at time T
, while querying time:>=(T + 596)
finally got it to move on to the (presumably) next document >= T
.
Expected behavior
Quickwit should NOT have returned a "hit" document with "time": 1751128201918247000
in response to a query for time:>=1751128201918247595
, since 1751128201918247000 < 1751128201918247595
(equivalently 000 < 595
).
In general:
- Querying
time:>=T
should never return documents withtime < T
. - Querying
field:>=X
should never return documents withfield < X
.
Some black-box brainstorming
This odd boundary at least suggests that whatever the issue is internally, it's not as simple as integer nanoseconds being rounded down to integer microseconds, and probably not even nanoseconds being simply rounded to a IEEE-753 64-bit float!
If the timestamps were just being rounded into f64
, then we'd expect the boundary point to be N > 40
, rather than the observed boundary of N > 595
(since (1751128201918247000.0 >= 1751128201918247040.0) == true
and (1751128201918247000.0 >= 1751128201918247041.0) == false
).
Configuration
> quickwit --version
Quickwit 0.8.2 (x86_64-unknown-linux-gnu 2024-09-03T11:26:51Z 0f28194)
index.yaml
version: 0.8
index_id: fluentbit-logs
doc_mapping:
mode: lenient
field_mappings:
- name: time
type: datetime
input_formats:
- iso8601
output_format: unix_timestamp_nanos
fast_precision: nanoseconds
fast: true
# ...
timestamp_field: time
# ...