
SOLR-17711 index fetcher doesn't need timeout #3356


Open: wants to merge 3 commits into main

Conversation

kotman12 (Contributor)

https://issues.apache.org/jira/browse/SOLR-17711

Description

A default total request timeout is now applied in IndexFetcher as a result of the Jetty HTTP/2 client migration. We have already run into problems with this when replicating a large number of shards on a single host. Because shards are downloaded in parallel, we sometimes hit the machine's bandwidth limit, so the requests for each shard's next segment compete with one another and can take more than 120 seconds. That number matters because 120 seconds is the default idle timeout, and ever since this change the idle timeout doubles as the total request timeout in the HTTP/2 client. I don't think that is a good idea in the case of IndexFetcher, hence this change.
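To illustrate the failure mode with a small, hypothetical simulation (plain Java on a virtual clock; this is not Solr or Jetty code, and all names are invented for illustration): a transfer that keeps receiving segment chunks never trips the idle timeout, but a total request timeout set equal to that idle timeout kills it once the whole download exceeds 120 seconds.

```java
// Hypothetical simulation of the two timeout kinds, on a virtual clock in seconds.
public class TimeoutDemo {

    /**
     * @param totalSeconds         how long the whole transfer takes
     * @param chunkIntervalSeconds how often a chunk of data arrives
     * @param totalTimeout         overall request deadline, or null for none
     * @param idleTimeout          maximum gap allowed between chunks
     * @return true if the transfer completes without either timeout firing
     */
    public static boolean transferSucceeds(int totalSeconds, int chunkIntervalSeconds,
                                           Integer totalTimeout, int idleTimeout) {
        int lastChunkAt = 0;
        for (int now = chunkIntervalSeconds; now <= totalSeconds; now += chunkIntervalSeconds) {
            if (now - lastChunkAt > idleTimeout) {
                return false; // connection went idle for too long
            }
            if (totalTimeout != null && now > totalTimeout) {
                return false; // overall deadline exceeded, even though data kept flowing
            }
            lastChunkAt = now; // a chunk arrived, resetting the idle timer
        }
        return true;
    }

    public static void main(String[] args) {
        // 300 s transfer, one chunk every 10 s, 120 s idle timeout, no total timeout:
        System.out.println(transferSucceeds(300, 10, null, 120)); // true
        // Same transfer once the 120 s idle timeout also acts as the total timeout:
        System.out.println(transferSucceeds(300, 10, 120, 120)); // false
    }
}
```

The first case is the pre-migration behavior this PR restores; the second is the post-migration behavior described above.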

Solution

Revert to the pre-Http2SolrClient behavior of no timeout in IndexFetcher.

Tests

I'm not aware of an easy way to test this reasonably; it would effectively require accelerating the clock.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide.

@dsmiley dsmiley self-requested a review May 18, 2025 00:29

@dsmiley dsmiley left a comment


Looks good, thanks.
Can you please add to CHANGES.txt in the improvement section?

@kotman12

@dsmiley Just saw your message to the mailing list, thanks. So it looks like what I propose here is incidentally a no-op as it currently stands? Should I just wait for your PR, then? It looks like I'd have to make this change to the recoveryOnlyClient, which may or may not be the right thing to do, but the question might be moot if we apply your planned changes.

kotman12 commented May 27, 2025

> Looks good, thanks. Can you please add to CHANGES.txt in the improvement section?

@dsmiley I updated CHANGES.txt under 10, but I'm wondering whether this should go out in 9.9. I'd argue this is somewhere between a bug fix and an optimization, since the request timeouts were very likely added by accident.

I've also moved the override to the recovery client in UpdateShardHandler. Maybe that is overkill, but, again, I want to stress that until we moved to Jetty's HttpClient there was no request timeout on this at all.


dsmiley commented May 27, 2025

Until the underlying timeout propagation is fixed, this PR is in limbo. I view this as a bug, not an optimization, since the user experience is exactly that of a bug: for users with large shards, Solr worked and then breaks after an upgrade. Let's try to get this fixed in 9.9 somehow. It's tempting to switch back to Apache HttpClient there, in the interest of stability and confidence. WDYT? Then, in parallel, we can improve the Jetty-based Http2SolrClient without feeling rushed for 9.9.

CC @iamsanjay


kotman12 commented May 27, 2025

@dsmiley Is the request timeout even at the HttpClient level? I didn't think it was, so I don't think we have any propagation to worry about in this particular case (I believe that applies to connect and idle timeouts). Also, I moved the setting of the request timeout to the "root" HttpClient (recoveryOnlyClient) anyway.

Also, I realized I meant improvement (as you originally suggested), not optimization, but the point about this being a bug probably still stands.


dsmiley commented May 29, 2025

I took a quick look at Jetty HttpClient and there's no request timeout there; it's on the Jetty Request.
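For context, a hedged sketch of where these knobs live in the Jetty client API (class and method names per Jetty 9.x/10.x; the URL and wiring are invented for illustration, this is not Solr's actual code): the idle timeout belongs to the shared HttpClient, while the total timeout is set, if at all, on each Request.

```java
// Jetty 9.x/10.x client API sketch (imports from org.eclipse.jetty.client omitted).
HttpClient client = new HttpClient();
client.setIdleTimeout(120_000);  // idle timeout in ms, shared by all requests on this client
client.start();

Request request = client.newRequest("http://leader.example:8983/solr");  // URL is illustrative
// request.timeout(120, TimeUnit.SECONDS);  // total deadline; leaving it unset means no overall timeout
ContentResponse response = request.send();
```

So a per-request total timeout only exists where the caller explicitly sets one on the Request, which is why the fix can be localized to how IndexFetcher's client builds its requests.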
