Skip to content

SOLR-17776: Harmonize SolrJ timeouts #3357

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jun 19, 2025
Merged

Conversation

dsmiley
Copy link
Contributor

@dsmiley dsmiley commented May 19, 2025

@@ -129,6 +126,12 @@ protected Http2SolrClient(String serverBaseUrl, Builder builder) {
super(serverBaseUrl, builder);

if (builder.httpClient != null) {
if (builder.followRedirects != null
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOCOMMIT; it remains to be seen if we're going to retain HttpClient reuse but reusing it means we can't be customizing timeouts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't customize connection timeout specifically, and other non-timeout things.

@@ -261,8 +266,6 @@ private HttpClient createHttpClient(Builder builder) {
this.authenticationStore = new AuthenticationStoreHolder();
httpClient.setAuthenticationStore(this.authenticationStore);

httpClient.setConnectTimeout(builder.connectionTimeoutMillis);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just reorganized this upwards

Comment on lines 258 to 259
httpClient.setConnectTimeout(builder.getConnectionTimeoutMillis());
httpClient.setIdleTimeout(builder.getIdleTimeoutMillis());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: these getters now resolve a default value, a primitive; do not return null.

Comment on lines -94 to +99
HttpClient.Builder b = HttpClient.newBuilder().followRedirects(followRedirects);
b.followRedirects(followRedirects);

b.connectTimeout(Duration.of(builder.getConnectionTimeoutMillis(), ChronoUnit.MILLIS));
// note: idle timeout isn't used for the JDK client
// note: request timeout is set per request
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • organizational change
  • setting the connection timeout (was forgotten!)

Comment on lines -422 to +428
if (requestTimeoutMillis > 0) {
reqb.timeout(Duration.of(requestTimeoutMillis, ChronoUnit.MILLIS));
} else if (idleTimeoutMillis > 0) {
reqb.timeout(Duration.of(idleTimeoutMillis, ChronoUnit.MILLIS));
}
reqb.timeout(Duration.of(requestTimeoutMillis, ChronoUnit.MILLIS));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as for Http2SolrClient: we now keep this simply and do defaulting logic in one common place earlier

Comment on lines -556 to -561
if (idleTimeoutMillis == null || idleTimeoutMillis <= 0) {
idleTimeoutMillis = (long) HttpClientUtil.DEFAULT_SO_TIMEOUT;
}
if (connectionTimeoutMillis == null) {
connectionTimeoutMillis = (long) HttpClientUtil.DEFAULT_CONNECT_TIMEOUT;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as for Http2SolrClient: these settings were redundant/duplicative and thus misleading. Only need to be inside the underlying HttpClient.

@@ -57,8 +57,6 @@ public abstract class HttpSolrClientBase extends SolrClient {
/** The URL of the Solr server. */
protected final String serverBaseUrl;

protected final long idleTimeoutMillis;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant with HttpClient; removed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and isn't supported by the Jdk HttpClient

Copy link
Contributor Author

@dsmiley dsmiley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR is now in good shape.

Summary in JIRA:

If Http2SolrClient.Builder.withHttpClient is used, (and it's used more in 9.8, shifted from deprecated Apache HttpClient based clients), settings affecting HttpClient construction cannot be customized; an IllegalStateException will now be thrown if you try.  The connection timeout cannot be customized, but idle & request timeouts can be.  Callers (various places in Solr) should no longer customize the connection timeout.
The HttpJdkSolrClient wasn't setting the connection timeout as per the builder configuration.

@@ -862,7 +862,6 @@ private void doCompatCheck(BiConsumer<String, Object> consumer) {
new Http2SolrClient.Builder()
.withHttpClient(getCoreContainer().getDefaultHttpSolrClient())
.withIdleTimeout(30000, TimeUnit.MILLISECONDS)
.withConnectionTimeout(15000, TimeUnit.MILLISECONDS)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not supported anymore if we use an existing HttpClient. No big deal.

@@ -360,7 +360,6 @@ private void requestRecovery(
try (SolrClient client =
new Http2SolrClient.Builder(baseUrl)
.withHttpClient(solrClient)
.withConnectionTimeout(30000, TimeUnit.MILLISECONDS)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not supported anymore if we use an existing HttpClient. No big deal.

@@ -287,7 +284,6 @@ public IndexFetcher(
String compress = (String) initArgs.get(COMPRESSION);
useInternalCompression = ReplicationHandler.INTERNAL.equals(compress);
useExternalCompression = ReplicationHandler.EXTERNAL.equals(compress);
connTimeout = getParameter(initArgs, HttpClientUtil.PROP_CONNECTION_TIMEOUT, 30000, null);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not supported anymore if we use an existing HttpClient. No big deal.

Here this setting comes from ReplicationHandler config in solrconfig.xml but I don't think anyone specifies this (I searched for it and didn't see in all of Solr configs/tests).

@@ -205,19 +205,13 @@ private static Http2SolrClient.Builder newHttp2SolrClientBuilder(
.withDefaultCollection(URLUtil.extractCoreFromCoreUrl(url));
if (http2SolrClient != null) {
builder.withHttpClient(http2SolrClient);
// cannot set connection timeout
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay? I suppose.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we validate that the http2SolrClient connectionTimeout is greater than or equal to the minConnTimeout? Maybe this validation should be in the constructor? There seems to be an expectation from the comments that this is the "floor" for some reason... But if this is actually irrelevant then does it make more sense to just rip out minConnTimeout altogether to avoid confusion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dug into this... the timeouts came from https://issues.apache.org/jira/browse/SOLR-14672 which is more about making them inheritable instead of hard-coded. We have that already -- they are inherited from the passed in client, which in turn has values that are configurable via HttpSolrClientProvider via UpdateShardHandlerConfig via solr.xml.
I think we're fine.

@@ -57,8 +57,6 @@ public abstract class HttpSolrClientBase extends SolrClient {
/** The URL of the Solr server. */
protected final String serverBaseUrl;

protected final long idleTimeoutMillis;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and isn't supported by the Jdk HttpClient

@dsmiley dsmiley changed the title Harmonize SolrJ timeouts SOLR-17776: Harmonize SolrJ timeouts Jun 4, 2025
@dsmiley dsmiley marked this pull request as ready for review June 4, 2025 05:19
@SuppressWarnings("unchecked")
public B withRequestTimeout(long requestTimeout, TimeUnit unit) {
this.requestTimeoutMillis = TimeUnit.MILLISECONDS.convert(requestTimeout, unit);
return (B) this;
}

public long getRequestTimeoutMillis() {
return requestTimeoutMillis != null && requestTimeoutMillis >= 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why setting a negative value defaults to the idle timeout but 0 means no timeout? I guess this is to preserve the existing behavior in Solr? The underlying Jetty http request API has <=0 signify no timeout.

Copy link
Contributor

@kotman12 kotman12 Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually if I am understanding the original correctly:

    if (requestTimeoutMillis > 0) {
      req.timeout(requestTimeoutMillis, TimeUnit.MILLISECONDS);
    } else {
      req.timeout(idleTimeoutMillis, TimeUnit.MILLISECONDS);
    }

+

    if (builder.requestTimeoutMillis != null) {
      this.requestTimeoutMillis = builder.requestTimeoutMillis;
    } else {
      this.requestTimeoutMillis = -1;
    }

Then I'd expect the following to preserve existing behavior. If we want to change existing behavior we should be very explicit about it and IMO we shouldn't change it to something so complicated, i.e. we either stick with the idle timeout fallback or we go to a "forever" fallback. I don't know if a three-state default solution is really beneficial. At that point you'd want to disambiguate by explicitly setting to "forever" or idleTimeout or whatever else...

return requestTimeoutMillis != null && requestTimeoutMillis > 0 ? requestTimeoutMillis : getIdleTimeoutMillis()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eye. I'll change this back for consistency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't intend to change existing behavior, but I wanted the getXXXTimeout methods to return a >0 value, so that it's it's clear/obvious how to interpret it and so that callers could do min/max against some other value sensibly. For example the SolrClientCache could then use max on and it'd behave sensibly when compared to another default. If 0 means forever, it's semantically inverted compared to the other values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT @kotman12 ?
9.9 is coming next week...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsmiley were you going to change this? Regarding:

For example the SolrClientCache could then use max on and it'd behave sensibly when compared to another default. If 0 means forever, it's semantically inverted compared to the other values.

I think the existing logic does ensure that this returns greater than 0 because it defaults to idleTimeout in the case of <=0. Idle timeout's original logic effectively was:

return idleTimeoutMillis != null && idleTimeoutMillis > 0 ? idleTimeoutMillis : HttpClientUtil.DEFAULT_SO_TIMEOUT

so again, unless HttpClientUtil.DEFAULT_SO_TIMEOUT returns a value <=0 (which should never be the case) I don't see how you would ever be inverting the meanings of these values and so I think it would be safe to call max when comparing them.

The current approach in this PR is more complicated because it does double-defaulting:

{
    timeout < 0: idleTimeout
    timeout = 0: Integer.MAX_VALUE
    timeout > 0: timeout
}

I don't think this is really useful? Maybe I'm still missing something.

@@ -205,19 +205,13 @@ private static Http2SolrClient.Builder newHttp2SolrClientBuilder(
.withDefaultCollection(URLUtil.extractCoreFromCoreUrl(url));
if (http2SolrClient != null) {
builder.withHttpClient(http2SolrClient);
// cannot set connection timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we validate that the http2SolrClient connectionTimeout is greater than or equal to the minConnTimeout? Maybe this validation should be in the constructor? There seems to be an expectation from the comments that this is the "floor" for some reason... But if this is actually irrelevant then does it make more sense to just rip out minConnTimeout altogether to avoid confusion?

public Long getIdleTimeoutMillis() {
return idleTimeoutMillis;
public long getIdleTimeoutMillis() {
return idleTimeoutMillis != null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't the original logic, effectively:
return idleTimeoutMillis != null && idleTimeoutMillis > 0 ? idleTimeoutMillis : HttpClientUtil.DEFAULT_SO_TIMEOUT
?

In general I think we should be very careful about default-setting the idle timeout to "forever". I believe that in a majority of cases you don't want this default.

* About 24 days in milliseconds -- basically forever to wait for something. But not so large to
* cause overflow.
*/
protected static final long FOREVER_MILLIS = Integer.MAX_VALUE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be useful to expose this in some way. Either by making this public or having a function in the builder which does this for you, i.e.:

  public B withNoIdleTimeout() {
    this.idleTimeoutMillis = FOREVER_MILLIS;
    return (B) this;
  }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dsmiley if we are really worried about overflow here, should we bite the bullet and make this an int? This fallback value doesn't really prevent overflows since I could just do:

.withIdleTimeout(Long.MAX_VALUE, TimeUnit.MILLISECONDS)

which I incidentally did here. To a future reviewer a change like this may seem reasonable, in fact you approved it and you are reasonable 😄 . But if this will overflow in some low-level lossy cast to int then this is still dangerous and the FOREVER_MILLIS, although well-intentioned, just adds noise IMO without really providing much protection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not add yet another method for the ~infinite/forever case. I did see an issue with Long.MAX_VALUE and a test failed clearly related to overflow... if I recall it was with the JDK HttpClient. Any way, we could ignore the matter if it doesn't show up again. If it does show up again... I could see either switching to integer or reducing the value on setXXX(...)

Don't need "FOREVER"
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Jun 17, 2025
(a test reused the same builder for 2 clients).
So make build() trivial and do defaulting/checks in the constructor.
@dsmiley dsmiley merged commit dd89cd6 into apache:main Jun 19, 2025
3 of 4 checks passed
@dsmiley dsmiley deleted the solrjTimeoutHarmonize branch June 19, 2025 00:06
dsmiley added a commit that referenced this pull request Jun 19, 2025
SolrJ: If Http2SolrClient.Builder.withHttpClient is used, (and it's used more in 9.8, shifted from deprecated Apache HttpClient  based clients), settings affecting HttpClient construction cannot be customized; an IllegalStateException will now be thrown if you try.
The connection timeout cannot be customized, but idle & request timeouts can be.  Callers (various places in Solr) no longer customize the connection timeout.

The HttpJdkSolrClient wasn't setting the connection timeout as per the builder configuration.

Co-authored-by: Luke Kot-Zaniewski <lkotzaniewsk@bloomberg.net>

(cherry picked from commit dd89cd6)
if (builder.followRedirects != null
|| builder.connectionTimeoutMillis != null
|| builder.maxConnectionsPerHost != null
|| builder.useHttp1_1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Posting on the PR instead of the commit)

useHttp1_1 behaves differently than the rest of these, because it is not a default as Null option. So if someone wants to customize useHttp1, while not changing the sysProp, this will fail.

I'll make a PR so that useHttp1_1 is a Boolean and defaults to null, like the rest of these.

(This was breaking some of the benchmark suites)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for noticing / fixing! This was gnawing me a little but I didn't see the bug. I was tempted to use a Boolean and to resolve the correct choice in a getter, like I did for the timeouts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cat:cloud client:solrj documentation Improvements or additions to documentation tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants