Skip to content

Incorrect range filter results: Querying time:>=T yields documents with time < T! #5825

@judnich

Description

@judnich

Bug Summary

Searching "query": "time:>=T" returns documents with time < T, where:

  • T is a numeric literal containing a valid integer nanosecond unix timestamp where T % 1000 > 0
  • Indexed document(s) exist with nanoseconds timestamp x where T-N < x < T for "small N" (*)
  • Field time in the index is configured with fast_precision: nanoseconds and output_format: unix_timestamp_nanos.

(*) Black-box behavior probing gives the impression that timestamps are being erroneously rounded down to lower precisions internally at some point during query execution, but further probing shows that the "boundary" points (the smallest U>T for which time:>=U stops returning documents at time T) are very unusual and don't seem to obviously correspond to either the integer microsecond or floating point nanosecond rounding error boundaries we'd expect. See "Example Repro" section below for more details.

Repro Steps

  1. Ingest some data with nanosecond precision timestamps (reference index configuration is provided below)
  2. Select some sample timestamp T from any document, as long as the last 3 decimal digits of T are not 999. It's fine if the last 3 digits are all 000.
  3. Use the REST API to search "query": "time:>={T+1}", "sort_by":"-time", "max_hits":1" where T the sample timestamp selected above.
  4. You will find that the document at time T is incorrectly included in the result!
  5. Optional: Experiment with different time offsets (incorrect behavior repros with offsets much larger than 1 nano, though exactly what offset ahead of T stops incorrectly returning the document at time T seems unusual and doesn't align with most normal precision rounding boundaries one would expect).

Example Repro

❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
    -H "Content-Type: application/json" \
    -d '{"max_hits":1,"query":"time:>=1751128201918247001","sort_by":"-time"}'

{
  "num_hits": 7354717336,
  "hits": [
    {
      "time": 1751128201918247000
      /* ... */
    }
  ],
  "elapsed_time_micros": 291577,
  "errors": []
}

Bisecting this "time:>=T+N" query to find when Quickwit stops returning the document at time T found this:

❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
    -H "Content-Type: application/json" \
    -d '{"max_hits":1,"query":"time:>=1751128201918247595","sort_by":"-time"}'
...
      "time": 1751128201918247000
❯ curl -X POST "http://127.0.0.1:7280/api/v1/fluentbit-logs/search" \
    -H "Content-Type: application/json" \
      -d '{"max_hits":1,"query":"time:>=1751128201918247596","sort_by":"-time"}'
...
      "time": 1751128201918249000

Notice how the query for time:>= (T + 595) still incorrectly returns the document at time T, while querying time:>=(T + 596) finally got it to move on to the (presumably) next document >= T.

Expected behavior

Quickwit should NOT have returned a "hit" document with "time": 1751128201918247000 in response to a query for time:>=1751128201918247595, since 1751128201918247000 < 1751128201918247595 (equivalently 000 < 595).

In general:

  • Querying time:>=T should never return documents with time < T.
  • Querying field:>=X should never return documents with field < X.

Some black-box brainstorming

This odd boundary at least suggests that whatever the issue is internally, it's not as simple as integer nanoseconds being rounded down to integer microseconds, and probably not even nanoseconds being simply rounded to a IEEE-753 64-bit float!

If the timestamps were just being rounded into f64, then we'd expect the boundary point to be N > 40, rather than the observed boundary of N > 595 (since (1751128201918247000.0 >= 1751128201918247040.0) == true and (1751128201918247000.0 >= 1751128201918247041.0) == false).

Configuration

> quickwit --version
Quickwit 0.8.2 (x86_64-unknown-linux-gnu 2024-09-03T11:26:51Z 0f28194)

index.yaml

version: 0.8

index_id: fluentbit-logs

doc_mapping:
  mode: lenient

  field_mappings:
    - name: time
      type: datetime
      input_formats:
        - iso8601
      output_format: unix_timestamp_nanos
      fast_precision: nanoseconds
      fast: true
    # ...

  timestamp_field: time

# ...

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions