|
| 1 | +--- |
| 2 | +layout: default |
| 3 | +title: Comparing query sets |
| 4 | +nav_order: 12 |
| 5 | +parent: Using Search Relevance Workbench |
| 6 | +grand_parent: Search relevance |
| 7 | +has_children: false |
| 8 | +has_toc: false |
| 9 | +--- |
| 10 | + |
| 11 | +# Comparing query sets |
| 12 | + |
| 13 | +To compare the results of two different search configurations, you can run a pairwise experiment. To do so, you need two search configurations and a query set to run against both of them. |
| 14 | + |
| 15 | + |
| 16 | +For more information about creating a query set, see [Query Sets]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/query-sets/). |
| 17 | + |
| 18 | +For more information about creating search configurations, see [Search Configurations]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/search-configurations/). |
| 19 | + |
| 20 | +## Creating a pairwise experiment |
| 21 | + |
| 22 | +An experiment is used to compare the metrics between two different search configurations. An experiment shows you the top N results for every query based on the specified search configurations. In the dashboard, you can view the returned documents from any of the queries in the query set and determine which search configuration returns more relevant results. Additionally, you can measure the similarity between the two returned search result lists using the provided similarity metrics. |
| 23 | + |
| 24 | +### Example |
| 25 | + |
| 26 | +To create a pairwise comparison experiment for the specified query set and search configurations, send the following request: |
| 27 | + |
| 28 | +```json |
| 29 | +PUT _plugins/_search_relevance/experiments |
| 30 | +{ |
| 31 | + "querySetId": "8368a359-146b-4690-b756-40591b2fcddb", |
| 32 | + "searchConfigurationList": ["a5acc9f3-6ad7-43f4-9651-fe118c499bc6", "26c7255c-c36e-42fb-b5b2-633dbf8e53b6"], |
| 33 | + "size": 10, |
| 34 | + "type": "PAIRWISE_COMPARISON" |
| 35 | +} |
| 36 | +``` |
| 37 | +{% include copy-curl.html %} |
| 38 | + |
| 39 | +### Request body fields |
| 40 | + |
| 41 | +The following table lists the available input parameters. |
| 42 | + |
| 43 | +Field | Data type | Description |
| 44 | +:--- | :--- | :--- |
| 45 | +`querySetId` | String | The query set ID. |
| 46 | +`searchConfigurationList` | List | A list of search configuration IDs to use for comparison. |
| 47 | +`size` | Integer | The number of documents to return in the results. |
| 48 | +`type` | String | Defines the type of experiment to run. Valid values are `PAIRWISE_COMPARISON`, `HYBRID_OPTIMIZER`, or `POINTWISE_EVALUATION`. Depending on the experiment type, you must provide different body fields in the request. `PAIRWISE_COMPARISON` is for comparing two search configurations against a query set and is used [here]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/compare-query-sets/). `HYBRID_OPTIMIZER` is for combining results and is used [here]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/optimize-hybrid-search/). `POINTWISE_EVALUATION` is for evaluating a search configuration against judgments and is used [here]({{site.url}}{{site.baseurl}}/search-plugins/search-relevance/evaluate-search-quality/). |
| 49 | + |
| 50 | +The response contains the experiment ID of the created experiment: |
| 51 | + |
| 52 | +```json |
| 53 | +{ |
| 54 | + "experiment_id": "cbd2c209-96d1-4012-aa73-e524b7a1b11a", |
| 55 | + "experiment_result": "CREATED" |
| 56 | +} |
| 57 | +``` |
| 58 | +## Interpreting the experiment results |
| 59 | +To interpret the experiment results, use the following operations. |
| 60 | + |
| 61 | +### Retrieving the experiment results |
| 62 | + |
| 63 | +Use the following API to retrieve the results of a specific experiment or, if you omit the experiment ID, of all experiments. |
| 64 | + |
| 65 | +#### Endpoints |
| 66 | + |
| 67 | +```json |
| 68 | +GET _plugins/_search_relevance/experiments |
| 69 | +GET _plugins/_search_relevance/experiments/<experiment_id> |
| 70 | +``` |
| 71 | + |
| 72 | +#### Path parameters |
| 73 | + |
| 74 | +The following table lists the available path parameters. |
| 75 | + |
| 76 | +| Parameter | Data type | Description | |
| 77 | +| :--- | :--- | :--- | |
| 78 | +| `experiment_id` | String | The ID of the experiment to retrieve. Retrieves all experiments when empty. | |
| 79 | + |
| 80 | +#### Example request |
| 81 | + |
| 82 | +```json |
| 83 | +GET _plugins/_search_relevance/experiments/cbd2c209-96d1-4012-aa73-e524b7a1b11a |
| 84 | +``` |
| 85 | + |
| 86 | +#### Example response |
| 87 | + |
| 88 | +```json |
| 89 | +{ |
| 90 | + "took": 2, |
| 91 | + "timed_out": false, |
| 92 | + "_shards": { |
| 93 | + "total": 1, |
| 94 | + "successful": 1, |
| 95 | + "skipped": 0, |
| 96 | + "failed": 0 |
| 97 | + }, |
| 98 | + "hits": { |
| 99 | + "total": { |
| 100 | + "value": 1, |
| 101 | + "relation": "eq" |
| 102 | + }, |
| 103 | + "max_score": 1, |
| 104 | + "hits": [ |
| 105 | + { |
| 106 | + "_index": ".plugins-search-relevance-experiment", |
| 107 | + "_id": "cbd2c209-96d1-4012-aa73-e524b7a1b11a", |
| 108 | + "_score": 1, |
| 109 | + "_source": { |
| 110 | + "id": "cbd2c209-96d1-4012-aa73-e524b7a1b11a", |
| 111 | + "timestamp": "2025-06-11T23:24:26.792Z", |
| 112 | + "type": "PAIRWISE_COMPARISON", |
| 113 | + "status": "PROCESSING", |
| 114 | + "querySetId": "8368a359-146b-4690-b756-40591b2fcddb", |
| 115 | + "searchConfigurationList": [ |
| 116 | + "a5acc9f3-6ad7-43f4-9651-fe118c499bc6", |
| 117 | + "26c7255c-c36e-42fb-b5b2-633dbf8e53b6" |
| 118 | + ], |
| 119 | + "judgmentList": [], |
| 120 | + "size": 10, |
| 121 | + "results": {} |
| 122 | + } |
| 123 | + } |
| 124 | + ] |
| 125 | + } |
| 126 | +} |
| 127 | +``` |
| 128 | + |
| 129 | +Once the experiment finishes running, the results are available: |
| 130 | + |
| 131 | +<details open markdown="block"> |
| 132 | + <summary> |
| 133 | + Response |
| 134 | + </summary> |
| 135 | + |
| 136 | +```json |
| 137 | +{ |
| 138 | + "took": 34, |
| 139 | + "timed_out": false, |
| 140 | + "_shards": { |
| 141 | + "total": 1, |
| 142 | + "successful": 1, |
| 143 | + "skipped": 0, |
| 144 | + "failed": 0 |
| 145 | + }, |
| 146 | + "hits": { |
| 147 | + "total": { |
| 148 | + "value": 1, |
| 149 | + "relation": "eq" |
| 150 | + }, |
| 151 | + "max_score": 1.0, |
| 152 | + "hits": [ |
| 153 | + { |
| 154 | + "_index": ".plugins-search-relevance-experiment", |
| 155 | + "_id": "cbd2c209-96d1-4012-aa73-e524b7a1b11a", |
| 156 | + "_score": 1.0, |
| 157 | + "_source": { |
| 158 | + "id": "cbd2c209-96d1-4012-aa73-e524b7a1b11a", |
| 159 | + "timestamp": "2025-06-12T04:18:37.284Z", |
| 160 | + "type": "PAIRWISE_COMPARISON", |
| 161 | + "status": "COMPLETED", |
| 162 | + "querySetId": "7889ffe9-835e-4f48-a9cd-53905bb967d3", |
| 163 | + "searchConfigurationList": [ |
| 164 | + "a5acc9f3-6ad7-43f4-9651-fe118c499bc6", |
| 165 | + "26c7255c-c36e-42fb-b5b2-633dbf8e53b6" |
| 166 | + ], |
| 167 | + "judgmentList": [], |
| 168 | + "size": 10, |
| 169 | + "results": { |
| 170 | + "tv": { |
| 171 | + "26c7255c-c36e-42fb-b5b2-633dbf8e53b6": [ |
| 172 | + "B07X3S9RTZ", |
| 173 | + "B07WVZFKLQ", |
| 174 | + "B00GXD4NWE", |
| 175 | + "B07ZKCV5K5", |
| 176 | + "B07ZKDVHFB", |
| 177 | + "B086VKT9R8", |
| 178 | + "B08XLM8YK1", |
| 179 | + "B07FPP6TB5", |
| 180 | + "B07N1TMNHB", |
| 181 | + "B09CDHM8W7" |
| 182 | + ], |
| 183 | + "pairwiseComparison": { |
| 184 | + "jaccard": 0.11, |
| 185 | + "rbo90": 0.16, |
| 186 | + "frequencyWeighted": 0.2, |
| 187 | + "rbo50": 0.07 |
| 188 | + }, |
| 189 | + "a5acc9f3-6ad7-43f4-9651-fe118c499bc6": [ |
| 190 | + "B07Q7VGW4Q", |
| 191 | + "B00GXD4NWE", |
| 192 | + "B07VML1CY1", |
| 193 | + "B07THVCJK3", |
| 194 | + "B07RKSV7SW", |
| 195 | + "B010EAW8UK", |
| 196 | + "B07FPP6TB5", |
| 197 | + "B073G9ZD33", |
| 198 | + "B07VXRXRJX", |
| 199 | + "B07Q45SP9P" |
| 200 | + ] |
| 201 | + }, |
| 202 | + "led tv": { |
| 203 | + "26c7255c-c36e-42fb-b5b2-633dbf8e53b6": [ |
| 204 | + "B01M1D0KL1", |
| 205 | + "B07YSMD3Z9", |
| 206 | + "B07V4CY9GZ", |
| 207 | + "B074KFP426", |
| 208 | + "B07S8XNWWF", |
| 209 | + "B07XBJR7GY", |
| 210 | + "B075FDWSHT", |
| 211 | + "B01N2Z17MS", |
| 212 | + "B07F1T4JFB", |
| 213 | + "B07S658ZLH" |
| 214 | + ], |
| 215 | + "pairwiseComparison": { |
| 216 | + "jaccard": 0.11, |
| 217 | + "rbo90": 0.13, |
| 218 | + "frequencyWeighted": 0.2, |
| 219 | + "rbo50": 0.03 |
| 220 | + }, |
| 221 | + "a5acc9f3-6ad7-43f4-9651-fe118c499bc6": [ |
| 222 | + "B07Q45SP9P", |
| 223 | + "B074KFP426", |
| 224 | + "B07JKVKZX8", |
| 225 | + "B07THVCJK3", |
| 226 | + "B0874XJYW8", |
| 227 | + "B08LVPWQQP", |
| 228 | + "B07V4CY9GZ", |
| 229 | + "B07X3BS3DF", |
| 230 | + "B074PDYLCZ", |
| 231 | + "B08CD9MKLZ" |
| 232 | + ] |
| 233 | + } |
| 234 | + } |
| 235 | + } |
| 236 | + } |
| 237 | + ] |
| 238 | + } |
| 239 | +} |
| 240 | +``` |
| 241 | + |
| 242 | +</details> |
| 243 | + |
| 244 | +### Interpreting the results |
| 245 | + |
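| 246 | +As shown in the preceding response, the results list the top N documents that each search configuration returns for every query in the query set, where N is the `size` specified when the experiment was created (10 in this example). For each query, the response also includes a `pairwiseComparison` object containing the similarity metrics for the two result lists. |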
| 247 | + |
| 248 | +### Response body fields |
| 249 | + |
| 250 | +Field | Description |
| 251 | +:--- | :--- |
| 252 | +`jaccard` | The Jaccard similarity score: the number of documents returned by both search configurations (the intersection) divided by the number of distinct documents returned by either configuration (the union). |
| 253 | +`rbo50`, `rbo90` | The Rank-Biased Overlap (RBO) metric, reported in the response as `rbo50` and `rbo90`. RBO compares the returned result lists at each ranking depth (for example, the top 1 document, the top 2 documents, and so on) and gives more weight to higher-ranked positions in the list. |
| 254 | +`frequencyWeighted` | Similar to the Jaccard metric, the frequency-weighted metric calculates the ratio of the weighted intersection to the weighted union of two sets. However, unlike standard Jaccard, it gives more weight to documents with higher frequencies, skewing the result toward more frequently occurring items. |
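
The `jaccard` value can be checked by hand against the example response. The following Python sketch is not the plugin's implementation; it only applies the intersection-over-union definition from the table to the two document lists returned for the `tv` query. The function name `jaccard` and the choice to return `0.0` when both lists are empty are assumptions made for illustration.

```python
def jaccard(results_a, results_b):
    """Return |A ∩ B| / |A ∪ B| for two lists of document IDs."""
    set_a, set_b = set(results_a), set(results_b)
    union = set_a | set_b
    if not union:
        # Assumption: treat two empty result lists as having no similarity.
        return 0.0
    return len(set_a & set_b) / len(union)


# Document IDs copied from the "tv" query in the example response.
config_26c7 = [
    "B07X3S9RTZ", "B07WVZFKLQ", "B00GXD4NWE", "B07ZKCV5K5", "B07ZKDVHFB",
    "B086VKT9R8", "B08XLM8YK1", "B07FPP6TB5", "B07N1TMNHB", "B09CDHM8W7",
]
config_a5ac = [
    "B07Q7VGW4Q", "B00GXD4NWE", "B07VML1CY1", "B07THVCJK3", "B07RKSV7SW",
    "B010EAW8UK", "B07FPP6TB5", "B073G9ZD33", "B07VXRXRJX", "B07Q45SP9P",
]

# Two shared documents out of 18 distinct documents: 2 / 18 ≈ 0.11.
tv_score = jaccard(config_26c7, config_a5ac)
print(round(tv_score, 2))  # 0.11
```

The two lists share 2 documents (`B00GXD4NWE` and `B07FPP6TB5`) out of 18 distinct documents, which rounds to the `0.11` reported in `pairwiseComparison.jaccard` for this query. Because Jaccard ignores rank, it does not distinguish a shared document at position 1 from one at position 10; the RBO values cover that aspect.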