Informer: suspended watcher #4007

@anidzgor

Hi,
I am trying to configure an Informer in a simple way:

final SharedIndexInformer<V1Pod> podInformer = informerFactory.sharedIndexInformerFor(
        (CallGeneratorParams params) -> {
            final CoreV1Api.APIlistPodForAllNamespacesRequest request = coreV1Api.listPodForAllNamespaces();
            request.allowWatchBookmarks(true);
            request.watch(params.watch);
            request.resourceVersion(params.resourceVersion);
            request.timeoutSeconds(params.timeoutSeconds);
            return request.buildCall(null);
        },
        V1Pod.class,
        V1PodList.class,
        0L,
        (arg1, arg2) -> {
            log.error("-------exception here--------" + arg1 + " " + arg2);
        }
);

Before that, I created an OkHttpClient object like this:

final OkHttpClient configuredHttpClient = client.getHttpClient()
        .newBuilder()
        .protocols(Arrays.asList(Protocol.HTTP_1_1))
        .connectTimeout(30, TimeUnit.SECONDS)  // Connection establishment timeout
        .writeTimeout(10, TimeUnit.SECONDS)    // Timeout for sending data
        .callTimeout(0, TimeUnit.SECONDS)      // 0 = no overall timeout for the entire call
        .readTimeout(0, TimeUnit.SECONDS)      // 0 = no read timeout
        .retryOnConnectionFailure(false)
        .addInterceptor(loggingInterceptor)
        .connectionPool(new ConnectionPool(0, 1, TimeUnit.SECONDS))
        .socketFactory(new CustomSocketFactory())
        .build();

In CustomSocketFactory I set the two socket options below:

    private void configureSocket(Socket socket) {
        try {
            socket.setKeepAlive(true);      // Enable TCP keep-alive
            socket.setTcpNoDelay(true);     // Disable Nagle's algorithm
        } catch (SocketException e) {
            throw new RuntimeException("Failed to configure socket", e);
        }
    }
}
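
The rest of CustomSocketFactory is not shown here, so the following is only a guess at its shape: a minimal sketch of a `javax.net.SocketFactory` wrapper that applies the same two options to every socket it hands out. The class name `KeepAliveSocketFactory` is hypothetical. Note that `SO_KEEPALIVE` only enables keep-alive probes; their timing comes from the operating system (on Linux the default idle time before the first probe is two hours), so on its own it may not detect a dead connection within minutes.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.SocketFactory;

// Hypothetical stand-in for the CustomSocketFactory mentioned above.
final class KeepAliveSocketFactory extends SocketFactory {
    // Delegate socket creation to the JVM default factory.
    private final SocketFactory delegate = SocketFactory.getDefault();

    // Unconnected socket; the caller connects it later.
    @Override
    public Socket createSocket() throws IOException {
        return configure(delegate.createSocket());
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        return configure(delegate.createSocket(host, port));
    }

    @Override
    public Socket createSocket(String host, int port, InetAddress localHost, int localPort)
            throws IOException {
        return configure(delegate.createSocket(host, port, localHost, localPort));
    }

    @Override
    public Socket createSocket(InetAddress host, int port) throws IOException {
        return configure(delegate.createSocket(host, port));
    }

    @Override
    public Socket createSocket(InetAddress address, int port, InetAddress localAddress, int localPort)
            throws IOException {
        return configure(delegate.createSocket(address, port, localAddress, localPort));
    }

    // Apply the two options from configureSocket to every socket.
    private static Socket configure(Socket socket) throws IOException {
        socket.setKeepAlive(true);   // enable TCP keep-alive probes (timing is OS-defined)
        socket.setTcpNoDelay(true);  // disable Nagle's algorithm
        return socket;
    }
}
```

It would be passed to the builder the same way as above, for example `.socketFactory(new KeepAliveSocketFactory())`.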

In short network-disconnection tests the watcher is able to continue its work, which is fine. After a longer disconnection (I think above 5 minutes), however, the application hangs: I can't see any new events, logs, etc.
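
To tell a hung watch apart from a merely quiet cluster, one option is to track when the last event arrived. The sketch below is an assumption about how that could be done, not something the client library provides: `WatchStallDetector` is a hypothetical helper whose `recordEvent()` would be called from the ResourceEventHandler callbacks, and `isStalled()` would be polled from a separate thread. The threshold is arbitrary, and a long silence is only a hint, since a cluster with no pod changes also produces no events.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper: flags a watch as stalled after a period of silence.
final class WatchStallDetector {
    private final LongSupplier nanoClock;   // injectable clock, e.g. System::nanoTime
    private final long maxSilenceNanos;     // allowed time without any event
    private final AtomicLong lastEventNanos;

    WatchStallDetector(Duration maxSilence, LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.maxSilenceNanos = maxSilence.toNanos();
        this.lastEventNanos = new AtomicLong(nanoClock.getAsLong());
    }

    // Call from onAdd / onUpdate / onDelete to mark the watch as alive.
    void recordEvent() {
        lastEventNanos.set(nanoClock.getAsLong());
    }

    // True when no event has been recorded for longer than maxSilence.
    boolean isStalled() {
        return nanoClock.getAsLong() - lastEventNanos.get() > maxSilenceNanos;
    }
}
```

In production the clock would be `System::nanoTime`, for example `new WatchStallDetector(Duration.ofMinutes(5), System::nanoTime)`.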

The rest of the code implements ResourceEventHandler, but I don't think it needs to be shown here.
In another issue I saw that setting pingInterval on the OkHttpClient, with HTTP_2 in the protocols list, solved the problem. When I did that, however, the logs showed reconnect requests with the param watch=false, which results in a list followed by a watch, and I don't think that is appropriate here.
What I want to achieve is for the watcher to reconnect with watch=true, then get a 410 if the resource is gone, and only then do a list and watch. Is that right?
I have tried many variants of the sharedIndexInformerFor call and of the OkHttpClient settings (timeouts etc.), but without any real improvement.
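
The reconnect flow I expect can be written down as a small decision function. This is only a sketch of that expectation (resume the watch from the last seen resourceVersion, fall back to list and watch on 410 Gone), not the informer's actual internals; `WatchResumePolicy` and its names are hypothetical.

```java
// Hypothetical model of the expected reconnect decision, not library code.
final class WatchResumePolicy {
    enum Action { REWATCH_FROM_LAST_VERSION, RELIST_THEN_WATCH }

    static final int HTTP_GONE = 410;

    // httpStatus: status of the last watch attempt (use 0 if the connection just dropped).
    // lastResourceVersion: last resourceVersion seen from an event or bookmark, or null.
    static Action next(int httpStatus, String lastResourceVersion) {
        // 410 Gone: the server no longer holds that resourceVersion, so start over with a list.
        if (httpStatus == HTTP_GONE) {
            return Action.RELIST_THEN_WATCH;
        }
        // Nothing to resume from: a fresh list is the only option.
        if (lastResourceVersion == null || lastResourceVersion.isEmpty()) {
            return Action.RELIST_THEN_WATCH;
        }
        // Otherwise reconnect with watch=true from the last known version.
        return Action.REWATCH_FROM_LAST_VERSION;
    }
}
```

For example, `WatchResumePolicy.next(0, "12345")` yields `REWATCH_FROM_LAST_VERSION`, while `WatchResumePolicy.next(410, "12345")` yields `RELIST_THEN_WATCH`.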

Does anyone have an idea how to solve this or point out what I am doing wrong?

Labels: lifecycle/stale (denotes an issue or PR that has remained open with no activity and has become stale)
