ES|QL index resolution on planning is broken

Currently, ES|QL index resolution works like this:

1. Take the index pattern, and remove from it any clusters already known to be unavailable (e.g. those discovered during enrich resolution)
2. If the resulting pattern is empty, then resolution is empty
3. If we have something to resolve, add the filter, if present, to it and send it all to field-caps, with `ignore_unavailable=true` (which means unknown indices and other such errors are ignored)
4. Collect field-caps response. If it's empty, the resolution is invalid (not found).
5. If resolution is valid:
6. Find clusters that were in the original pattern but not in the response. For any such clusters, if any concrete index had been requested, produce failure with `Verification` error.
7. Check if there are any unavailable clusters - if those are skippable, mark them as skipped, otherwise it’s a failure. 
8. If it was a CCS search and no clusters are left to search, fail with `NoClustersToSearchException`
9. Try running the analysis step. For relations, this checks, for each index pattern, that a resolution exists for that pattern. It does not check individual indices in the pattern, only that the whole pattern resolves to something.
10. If the analysis step succeeded, we're done. If not, we run the field-caps resolution step again, but this time without the filter. 
11. Then we try to run the analysis step again, using the non-filtered resolution now. 
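
The core of the flow above (lenient resolution, pattern-level analysis, unfiltered retry) can be condensed into a toy model. This is a minimal Python sketch, not the Elasticsearch implementation: `INDICES`, `field_caps`, `analyze`, and `resolve_and_analyze` are hypothetical stand-ins, and the cluster handling in steps 5-8 is left out.

```python
from fnmatch import fnmatch

# Hypothetical in-memory stand-in for the cluster state: index name -> field mapping.
INDICES = {
    "logs-1": {"host": "text", "status": "keyword"},
    "logs-2": {"host": "text", "bytes": "long"},
}


def field_caps(pattern, index_filter=None):
    """Lenient resolution: names that match nothing are silently ignored,
    mimicking ignore_unavailable=true."""
    resolved = {}
    for part in pattern.split(","):
        for name, fields in INDICES.items():
            if fnmatch(name, part) and (index_filter is None or index_filter(name)):
                resolved[name] = fields
    return resolved


def analyze(pattern, resolution):
    """Only checks that the whole pattern resolved to something,
    not that every index named in it exists."""
    if not resolution:
        raise LookupError(f"Unknown index [{pattern}]")
    merged = {}
    for fields in resolution.values():
        merged.update(fields)
    return merged


def resolve_and_analyze(pattern, index_filter=None):
    try:
        return analyze(pattern, field_caps(pattern, index_filter))
    except LookupError:
        # Steps 10-11: retry with an unfiltered resolution.
        return analyze(pattern, field_caps(pattern))
```

In this model, `resolve_and_analyze("logs-1,missing")` succeeds and returns only the fields of `logs-1`: the missing index goes unnoticed at planning time because the pattern as a whole resolved to something.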


This causes various issues:

1. Indices are treated as a single blob, not as individual ones, which forces us to defer the real index check to runtime and leads to various issues with partial results (https://github.com/elastic/elasticsearch/issues/126275) and LIMIT 0 (https://github.com/elastic/elasticsearch/issues/114495).
2. Dual field-caps resolution leads to inconsistencies in the field set when filters are applied: sometimes filtered-out fields are added, sometimes they are not.
3. Errors on a missing index are inconsistent: sometimes a 400 from a `Verification` error, sometimes a 404.
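
One way the field-set inconsistency in item 2 can arise is sketched below. This is a simplified Python model under a stated assumption: the unfiltered retry is triggered only when the filtered resolution comes back empty. The index names, field names, and the `fields_for` helper are all hypothetical.

```python
# Hypothetical data: index name -> set of fields it maps.
INDICES = {
    "logs-old": {"host", "legacy_code"},
    "logs-new": {"host", "trace_id"},
}


def fields_for(index_filter):
    """Field set the analyzer sees under a filtered-then-unfiltered retry."""
    filtered = [name for name in INDICES if index_filter(name)]
    # Assumption: an empty filtered result triggers the unfiltered fallback.
    chosen = filtered or list(INDICES)
    return set().union(*(INDICES[name] for name in chosen))
```

With a filter that matches only `logs-new`, `legacy_code` is absent from the result; with a filter that matches nothing, the fallback brings `legacy_code` back. The same query shape therefore sees different fields depending on what the filter happens to match.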

Further comment from @dnhatn:

I took a look at the issue and I think there are several problems:

1. There is a disparity in the `indices` option between the field-caps API (planning time) and the search-shards API (runtime). We use `ALLOW_UNAVAILABLE_TARGETS` for field-caps and `ERROR_WHEN_UNAVAILABLE_TARGETS` for search-shards. This leads to cases where field-caps does not return failures, but the runtime does. With `allow_partial_results`, we then ignore the runtime failures and return partial results instead of failing the request.

2. We do not strictly check the index failures returned by the field-caps API.

3. Another issue is related to security exceptions. Since we use `ALLOW_UNAVAILABLE_TARGETS` in the field-caps API, it returns `unknown index` if users lack the privilege to access it. However, if multiple index patterns are specified, we return an `unauthorized` error from the runtime instead (see `EsqlSecurityIT`).

4. There are cases where we return a 400 error, and others where we return a 404.
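
The mismatch described in item 1 can be illustrated with a small model. This is a hedged Python sketch, not Elasticsearch code: `resolve`, `run_query`, `IndexNotFound`, and the `KNOWN` set are hypothetical, and the boolean `allow_unavailable` flag only loosely mirrors the difference between `ALLOW_UNAVAILABLE_TARGETS` and `ERROR_WHEN_UNAVAILABLE_TARGETS`.

```python
class IndexNotFound(Exception):
    """Raised by strict resolution when a requested index does not exist."""


# Hypothetical set of indices that actually exist.
KNOWN = {"logs-1"}


def resolve(names, allow_unavailable):
    missing = [n for n in names if n not in KNOWN]
    if missing and not allow_unavailable:
        raise IndexNotFound(missing[0])
    return [n for n in names if n in KNOWN]


def run_query(names, allow_partial_results):
    # Planning time: lenient, so a missing index produces no failure here.
    planned = resolve(names, allow_unavailable=True)
    try:
        # Runtime: strict, so the same request can now fail.
        searched = resolve(names, allow_unavailable=False)
        return {"indices": searched, "is_partial": False, "failures": []}
    except IndexNotFound as e:
        if not allow_partial_results:
            raise
        # The runtime failure is swallowed and reported as a partial result.
        return {"indices": planned, "is_partial": True, "failures": [str(e)]}
```

In this model, `run_query(["logs-1", "missing"], allow_partial_results=True)` returns a partial result rather than failing, even though the request names an index that does not exist. Using the same strictness at both stages would surface the error during planning.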


ES|QL index resolution on planning is broken #127347

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

ES|QL index resolution on planning is broken #127347

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions