The implementation of json inverted index #41624

@YolandaLyj A comprehensive design document will be added to the doc directory of the Milvus repository later. Below is a summary of the differences between the two.

Two-Level JSON Indexing Strategy for Query Optimization

  1. Key Presence Index (jsonKeyStats)
     Structure: Maintains inverted lists tracking:
       • Which rows contain each JSON key
       • Byte positions (start, length) of the corresponding values in the raw JSON strings
     Optimization Benefits:
       • Scan Reduction: For sparse keys (e.g., a key appearing in 1% of 1M rows), reduces scanned rows from 1M to ~10K
       • Partial Parsing: Extracts target values directly from the recorded byte ranges, without full JSON unmarshaling
       • Ideal For: Sparse key queries …
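
To make the key presence idea concrete, here is a minimal Python sketch. It is not the Milvus implementation: the function names (`top_level_spans`, `build_key_index`, `query_equals`) are hypothetical, only top-level keys are indexed, and offsets are character offsets into a Python `str`, which match byte offsets only for ASCII data. It illustrates the two benefits above: only rows listed for the key are visited, and only the recorded slice of each row is parsed.

```python
import json
from collections import defaultdict

_decoder = json.JSONDecoder()
_WS = " \t\r\n"


def _skip_ws(s, i):
    """Advance i past JSON whitespace."""
    while i < len(s) and s[i] in _WS:
        i += 1
    return i


def top_level_spans(raw):
    """Yield (key, start, length) for each top-level key of a JSON object string."""
    i = _skip_ws(raw, 0)
    if i >= len(raw) or raw[i] != "{":
        return  # not an object: nothing to index
    i = _skip_ws(raw, i + 1)
    while i < len(raw) and raw[i] != "}":
        key, i = _decoder.raw_decode(raw, i)      # parse the key string
        i = _skip_ws(raw, i)
        if i >= len(raw) or raw[i] != ":":
            raise ValueError("malformed JSON object")
        start = _skip_ws(raw, i + 1)
        _, end = _decoder.raw_decode(raw, start)  # find where the value ends
        yield key, start, end - start
        i = _skip_ws(raw, end)
        if i < len(raw) and raw[i] == ",":
            i = _skip_ws(raw, i + 1)


def build_key_index(rows):
    """Inverted lists: key -> [(row_id, start, length), ...]."""
    index = defaultdict(list)
    for row_id, raw in enumerate(rows):
        for key, start, length in top_level_spans(raw):
            index[key].append((row_id, start, length))
    return index


def query_equals(index, rows, key, target):
    """Return ids of rows where `key` equals `target`, touching only indexed rows."""
    hits = []
    for row_id, start, length in index.get(key, []):
        # Partial parsing: decode only the value's slice, not the whole row.
        value = json.loads(rows[row_id][start:start + length])
        if value == target:
            hits.append(row_id)
    return hits


rows = [
    '{"a": 1, "color": "red"}',
    '{"a": 2}',
    '{"b": {"x": 1}, "color": "blue"}',
    '{}',
]
index = build_key_index(rows)
print(query_equals(index, rows, "color", "blue"))
```

In this toy data the key `color` appears in two of the four rows, so a filter on `color` visits two inverted-list entries instead of scanning and fully parsing every row.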

yhmo Apr 30, 2025
Collaborator

Answer selected by YolandaLyj