bug: Loss of a Query Pod causes queries to fail with indeterminate state

### Search before asking

- [x] I have searched the [issues](https://github.com/databendlabs/databend/issues) and found no similar issues.


### Version

v1.2.776-nightly

### What's Wrong?

Databend is a distributed system that executes queries across multiple nodes, so the loss of a node mid-query should be handled gracefully and should not leave the database in an indeterminate state.

Currently, when a query node is lost during query execution, a "Broken Pipe" error is raised to the client. The client has no way of knowing what state the affected tables are in when that happens.

A client could try to work out the state after a failure, but that logic would be complex and probably inaccurate, because the client cannot see the internal state of Databend objects at the time of failure. In any sufficiently large distributed system, node failures are routine, so the loss of a node should be expected and handled transparently to outside systems and clients. We see "Broken Pipe" errors on a daily basis.

Less importantly, this bug also prevents auto-scaling of the query pods. Scaling out works without error, but scaling back in is very likely to cause a "Broken Pipe" error whenever the system is in use, because query pods are terminated without regard to the queries they are actively running.
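To illustrate the scale-in case, here is a minimal sketch of the kind of drain step that could run before a pod is terminated: stop accepting new queries, then wait for the active ones to finish. It is only an illustration of the idea. `DrainableWorker` and all of its methods are hypothetical names and are not part of Databend.

```python
import threading


class DrainableWorker:
    """Hypothetical stand-in for a query pod; not a Databend API."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._active = 0          # number of queries currently running
        self._draining = False    # set once scale-in has been requested

    def try_begin_query(self) -> bool:
        """Register a new query; refuse it if the worker is draining."""
        with self._cond:
            if self._draining:
                return False
            self._active += 1
            return True

    def end_query(self) -> None:
        """Mark one query as finished and wake up any waiting drain call."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def drain(self, timeout: float) -> bool:
        """Stop accepting queries and wait for active ones to finish.

        Returns True if the worker became idle within `timeout` seconds,
        False if queries were still running when the timeout expired.
        """
        with self._cond:
            self._draining = True
            return self._cond.wait_for(lambda: self._active == 0, timeout=timeout)
```

A scale-in controller would call `drain()` and terminate the pod only once it returns `True`. What should happen on timeout (wait longer, cancel, or retry the query elsewhere) is a separate design question.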

### How to Reproduce?

1. Create a Databend cluster with at least two query pods running on different nodes.
2. Execute a long-running query.
3. While the query is running, terminate one or more of the nodes hosting the query pods.
4. Observe the query error, most likely a "Broken Pipe".
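Step 3 can be approximated without taking down a whole node by force-deleting a single query pod with `kubectl`. The sketch below assumes `kubectl` is on the `PATH` and already pointed at the cluster. The pod name and namespace are placeholders to fill in, and force-deleting a pod is an approximation of a node failure, not the same thing.

```python
import subprocess
from typing import List


def build_force_delete_command(pod: str, namespace: str) -> List[str]:
    """Build a kubectl command that deletes a pod with no grace period."""
    return [
        "kubectl", "delete", "pod", pod,
        "--namespace", namespace,
        "--grace-period=0", "--force",
    ]


def kill_query_pod(pod: str, namespace: str) -> None:
    """Force-delete one query pod while a long-running query is in flight."""
    subprocess.run(build_force_delete_command(pod, namespace), check=True)
```

Start the long-running query from a separate session, call `kill_query_pod(...)` with the name of one query pod, and then check the error returned to the client.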



### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!
