| 1 | +--- |
| 2 | +Title: Best practices for Redis Query Engine performance |
| 3 | +alwaysopen: false |
| 4 | +categories: |
| 5 | +- docs |
| 6 | +- develop |
| 7 | +- stack |
| 8 | +- oss |
| 9 | +- kubernetes |
| 10 | +- clients |
| 11 | +linkTitle: RQE performance |
| 12 | +weight: 1 |
| 13 | +--- |
| 14 | + |
| 15 | +{{< note >}} |
| 16 | +If you're using Redis Software or Redis Cloud, see the [best practices for scalable Redis Query Engine]({{< relref "/operate/oss_and_stack/stack-with-enterprise/search/scalable-query-best-practices" >}}) page. |
| 17 | +{{< /note >}} |
| 18 | + |
| 19 | +## Checklist |
| 20 | +Below are some basic steps to ensure good performance of the Redis Query Engine (RQE). |
| 21 | + |
| 22 | +* Create a Redis data model with your query patterns in mind. |
| 23 | +* Ensure the Redis architecture has been sized for the expected load using the [sizing calculator](https://redis.io/redisearch-sizing-calculator/). |
| 24 | +* Provision Redis nodes with sufficient resources (RAM, CPU, network) to support the expected maximum load. |
| 25 | +* Review [`FT.INFO`]({{< baseurl >}}/commands/ft.info) and [`FT.PROFILE`]({{< baseurl >}}/commands/ft.profile) outputs for anomalies or errors. |
| 26 | +* Conduct load testing in a test environment with real-world queries and a load generated by either [memtier_benchmark](https://github.com/redislabs/memtier_benchmark) or a custom load application. |
| 27 | + |
| 28 | +## Indexing considerations |
| 29 | + |
| 30 | +### General |
| 31 | +- Favor [`TAG`]({{< relref "/develop/interact/search-and-query/basic-constructs/field-and-type-options#tag-fields" >}}) over [`NUMERIC`]({{< relref "/develop/interact/search-and-query/basic-constructs/field-and-type-options#numeric-fields" >}}) for use cases that only require matching. |
| 32 | +- Favor [`TAG`]({{< relref "/develop/interact/search-and-query/basic-constructs/field-and-type-options#tag-fields" >}}) over [`TEXT`]({{< relref "/develop/interact/search-and-query/basic-constructs/field-and-type-options#text-fields" >}}) for use cases that don’t require full-text capabilities (pure match). |
| 33 | + |
| 34 | +### Non-threaded search |
| 35 | +- Put only those fields used in your queries in the index. |
| 36 | +- Only make fields [`SORTABLE`]({{< relref "/develop/interact/search-and-query/advanced-concepts/sorting" >}}) if they are used in [`SORTBY`]({{< relref "/develop/interact/search-and-query/advanced-concepts/sorting#specifying-sortby" >}}) |
| 37 | +queries. |
| 38 | +- Use [`DIALECT 4`]({{< relref "/develop/interact/search-and-query/advanced-concepts/dialects#dialect-4" >}}). |
| 39 | + |
| 40 | +### Threaded (query performance factor or QPF) search |
| 41 | +- Put both query fields and any projected fields (`RETURN` or `LOAD`) in the index. |
| 42 | +- Set all fields to `SORTABLE`. |
| 43 | +- Set `TAG` fields to [`UNF`]({{< relref "/develop/interact/search-and-query/advanced-concepts/sorting#normalization-unf-option" >}}). |
| 44 | +- Optional: Set `TEXT` fields to `NOSTEM` if the use case will support it. |
| 45 | +- Use [`DIALECT 4`]({{< relref "/develop/interact/search-and-query/advanced-concepts/dialects#dialect-4" >}}). |
| 46 | + |
| 47 | +## Query optimization |
| 48 | + |
| 49 | +- Avoid returning large result sets. Use `CURSOR` or `LIMIT`. |
| 50 | +- Avoid wildcard searches. |
| 51 | +- Avoid projecting all fields (e.g., `LOAD *`). Project only those fields that are part of the index schema. |
| 52 | +- If queries are long-running, enable threading (query performance factor) to reduce contention for the main Redis thread. |
| 53 | + |
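As a rough illustration of the `CURSOR` and `LIMIT` advice above, the following Python sketch builds the argument lists for a cursor-based `FT.AGGREGATE` call, the follow-up `FT.CURSOR READ` call, and a paged `LIMIT` clause. The helper names are hypothetical, and the sketch only assembles arguments; it assumes you pass them to whatever client you use (for example, redis-py's `execute_command(*args)`).

```python
from typing import List


def aggregate_with_cursor(index: str, query: str, page_size: int) -> List[str]:
    """Build an FT.AGGREGATE call that opens a cursor instead of returning all rows."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return ["FT.AGGREGATE", index, query, "WITHCURSOR", "COUNT", str(page_size)]


def cursor_read(index: str, cursor_id: int, page_size: int) -> List[str]:
    """Build the follow-up FT.CURSOR READ call that fetches the next batch."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return ["FT.CURSOR", "READ", index, str(cursor_id), "COUNT", str(page_size)]


def limit_clause(page: int, page_size: int) -> List[str]:
    """Build a LIMIT <offset> <num> clause for a zero-based page number."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be positive")
    return ["LIMIT", str(page * page_size), str(page_size)]
```

The server returns a cursor ID with each batch; a caller would keep issuing `FT.CURSOR READ` with that ID until the returned ID is `0`.
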
| 54 | +## Validate performance (`FT.PROFILE`) |
| 55 | + |
| 56 | +You can analyze [`FT.PROFILE`]({{< baseurl >}}/commands/ft.profile) output to gain insights about query execution. |
| 57 | +The following informational items are available for analysis: |
| 58 | + |
| 59 | +- Total execution time |
| 60 | +- Execution time per shard |
| 61 | +- Coordination time (for multi-sharded environments) |
| 62 | +- Breakdown of the query into fundamental components, such as `UNION` and `INTERSECT` |
| 63 | +- Warnings, such as `TIMEOUT` |
| 64 | + |
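To profile a query you already run, you can wrap it in `FT.PROFILE` without rewriting it. The sketch below is a minimal, hypothetical helper (`profile_command` is not part of any client library) that converts an `FT.SEARCH` or `FT.AGGREGATE` argument list into the matching `FT.PROFILE <index> SEARCH|AGGREGATE [LIMITED] QUERY ...` form. It only builds the arguments and assumes you send them with your own client.

```python
from typing import List, Sequence


def profile_command(command: Sequence[str], limited: bool = False) -> List[str]:
    """Wrap an FT.SEARCH or FT.AGGREGATE argument list in FT.PROFILE."""
    if len(command) < 3:
        raise ValueError("expected at least: command name, index, query")
    name, index, *rest = command
    kinds = {"FT.SEARCH": "SEARCH", "FT.AGGREGATE": "AGGREGATE"}
    kind = kinds.get(name.upper())
    if kind is None:
        raise ValueError(f"cannot profile {name!r}")
    args = ["FT.PROFILE", index, kind]
    if limited:
        # LIMITED trims the per-iterator details in the profile output.
        args.append("LIMITED")
    args.append("QUERY")
    args.extend(rest)  # the original query string and all of its options
    return args
```
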
| 65 | +## Anti-patterns |
| 66 | + |
| 67 | +When designing and querying indexes in RQE, certain practices can hinder performance, scalability, and maintainability. Below are some common anti-patterns to avoid: |
| 68 | + |
| 69 | +- **Large documents**: storing excessively large documents in Redis makes data retrieval slower and increases memory usage. Break data into smaller, focused records whenever possible. |
| 70 | +- **Deeply-nested fields**: retrieving or indexing deeply-nested JSON fields is computationally expensive. Use a flatter schema for better performance. |
| 71 | +- **Large result sets**: fetching unnecessarily large result sets puts a strain on memory and network resources. Limit results to only what is needed. |
| 72 | +- **Wildcarding**: using wildcard patterns indiscriminately in queries can lead to large and inefficient scans, especially if the index size is significant. |
| 73 | +- **Large projections**: including excessive fields in query results increases memory overhead and slows down query execution. Limit projections to essential fields. |
| 74 | + |
| 75 | +The following examples depict an anti-pattern index schema and query, followed by corrected versions designed for scalability with RQE. |
| 76 | + |
| 77 | +### Anti-pattern index schema |
| 78 | + |
| 79 | +The following schema introduces challenges for scalability and performance: |
| 80 | + |
| 81 | +```sh |
| 82 | +FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles: |
| 83 | + SCHEMA $.tags.* as t NUMERIC SORTABLE |
| 84 | + $.firstName as name TEXT |
| 85 | + $.location as loc GEO |
| 86 | +``` |
| 87 | + |
| 88 | +Issues: |
| 89 | + |
| 90 | +- Minimal schema definition: the schema is sparse and lacks fields like `lastName`, `id`, and `ver` that might be frequently queried. This results in additional operations to fetch these fields separately, reducing efficiency. |
| 91 | +- Missing `SORTABLE` flag for text fields: sorting operations on unsortable fields require full-text processing, which is slow. |
| 92 | +- Wildcard indexing: `$.tags.*` creates a broad index that can lead to excessive memory usage and reduced query performance. |
| 93 | + |
| 94 | +### Anti-pattern query |
| 95 | + |
| 96 | +The following query is inefficient and not optimized for vertical scaling: |
| 97 | + |
| 98 | +```sh |
| 99 | +FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' LOAD * LIMIT 0 10 |
| 100 | +``` |
| 101 | +Issues: |
| 102 | + |
| 103 | +- Wildcard projection (`LOAD *`): retrieving all fields in the result set is inefficient and increases memory usage, especially if the documents are large. |
| 104 | +- Unnecessary fields: fields that aren't required for the current operation are still fetched, slowing down execution. |
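| 105 | +- Lack of advanced query syntax: the query doesn't specify a `DIALECT`, so it runs with the server's default dialect, and it matches a single value with a `NUMERIC` range where a `TAG` field might be sufficient. Both can lead to unnecessary computation. |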
| 106 | + |
| 107 | +### Improved index schema |
| 108 | + |
| 109 | +Here’s an optimized schema that adheres to best practices for vertical scaling: |
| 110 | + |
| 111 | +```sh |
| 112 | +FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles: |
| 113 | + SCHEMA $.tags.* as t NUMERIC SORTABLE |
| 114 | + $.firstName as name TEXT NOSTEM SORTABLE |
| 115 | + $.lastName as lastname TEXT NOSTEM SORTABLE |
| 116 | + $.location as loc GEO SORTABLE |
| 117 | + $.id as id TAG SORTABLE UNF |
| 118 | + $.ver as ver TAG SORTABLE UNF |
| 119 | +``` |
| 120 | + |
| 121 | +Improvements: |
| 122 | + |
| 123 | +- `NOSTEM` for text fields: prevents stemming on fields like `firstName` and `lastName` to allow for exact matches (e.g., "Smith" stays "Smith"). |
| 124 | +- Expanded schema: adds commonly queried fields like `lastName`, `id`, and `ver`, making queries more efficient by reducing the need for post-query data retrieval. |
| 125 | +- `TAG` fields: `id` and `ver` are defined as `TAG` fields to support fast filtering with exact matches. |
| 126 | +- `SORTABLE` for all relevant fields: ensures that sorting operations are efficient without requiring full-text scanning. |
| 127 | + |
| 128 | +You might be wondering why `$.tags.* as t NUMERIC SORTABLE` is acceptable in the improved schema when it was flagged as an issue in the anti-pattern schema. |
| 129 | +The inclusion of `$.tags.*` is acceptable when: |
| 130 | + |
| 131 | +- It has a clear purpose: it is actively used in queries, such as filtering on numeric ranges or matching specific values. |
| 132 | +- Other fields in the schema complement it: these fields reduce over-reliance on `$.tags.*` for all query operations, distributing the load more evenly. |
| 133 | +- Projections and limits are managed carefully: queries that use `$.tags.*` should avoid loading unnecessary fields or returning excessively large result sets. |
| 134 | + |
| 135 | +### Improved query |
| 136 | + |
| 137 | +The following query is better suited for vertical scaling: |
| 138 | + |
| 139 | +```sh |
| 140 | +FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' |
| 141 | + LOAD 6 id t name lastname loc ver |
| 142 | + LIMIT 0 10 |
| 143 | + DIALECT 4 |
| 144 | +``` |
| 145 | + |
| 146 | +Improvements: |
| 147 | + |
| 148 | +- Targeted projection: the `LOAD` clause specifies only essential fields (`id`, `t`, `name`, `lastname`, `loc`, `ver`), reducing memory and network overhead. |
| 149 | +- Limited results: the `LIMIT` clause ensures the query retrieves only the first 10 results, avoiding large result sets. |
| 150 | +- [`DIALECT 4`]({{< relref "/develop/interact/search-and-query/advanced-concepts/dialects#dialect-4" >}}): sets the query dialect explicitly, matching the dialect recommended in the indexing considerations above. |
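
If you build queries like this in application code, a small helper can keep the `LOAD` count in sync with the field list and make the projection, limit, and dialect explicit on every call. The following Python sketch is illustrative only: `build_aggregate` is a hypothetical name, and the function just assembles the argument list, assuming you pass it to your client (for example, redis-py's `execute_command(*args)`).

```python
from typing import List, Sequence


def build_aggregate(
    index: str,
    query: str,
    fields: Sequence[str],
    dialect: int,
    offset: int = 0,
    num: int = 10,
) -> List[str]:
    """Build FT.AGGREGATE arguments with a targeted LOAD, a LIMIT, and a DIALECT."""
    if not fields:
        raise ValueError("list the fields to project instead of using LOAD *")
    if offset < 0 or num <= 0:
        raise ValueError("offset must be >= 0 and num must be positive")
    return [
        "FT.AGGREGATE", index, query,
        # The LOAD count is derived from the list, so the two cannot drift apart.
        "LOAD", str(len(fields)), *fields,
        "LIMIT", str(offset), str(num),
        "DIALECT", str(dialect),
    ]
```

Requiring the `dialect` argument means the query never depends on the server's default dialect; pass the dialect you have standardized on.
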