net: tcp: keep-alive "pings" are no longer acknowledged with v4.2.0 #93350

@lucasdietrich

Describe the bug

Since Zephyr v4.2.0, TCP keep-alive probes sent by the remote peer are no longer acknowledged.

I'm working on an nRF52840 DK. My Zephyr application implements a TCP client that connects to a remote server over a NET-over-USB interface.
The remote TCP server is configured to send TCP keep-alive probes ("pings") to the client. However, the Zephyr client never responds to them, so the server closes the connection.

Regression

  • This is a regression.

I haven't tried a git bisect yet, but the keep-alive packets were properly handled in Zephyr v4.1.0.

Steps to reproduce

I did not test with a specific sample; any Zephyr sample that implements a TCP client with the default configuration could be used to reproduce the issue.
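For anyone without a sample at hand, the client side only needs to connect and then stay idle. The sketch below uses plain BSD sockets; `connect_idle_client` is a hypothetical helper name, not part of any Zephyr sample, and the address and port are simply the ones used by the server below. On Zephyr the equivalent calls depend on the sockets configuration of the application, so treat this as an outline only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to addr:port and return the
 * socket descriptor, or -1 on failure. The caller then just keeps the
 * socket open without sending data, so the server's keep-alive timer fires. */
int connect_idle_client(const char *addr, unsigned short port)
{
    struct sockaddr_in peer = {
        .sin_family = AF_INET,
        .sin_port = htons(port),
    };

    /* Reject anything that is not a dotted-quad IPv4 address. */
    if (inet_pton(AF_INET, addr, &peer.sin_addr) != 1) {
        return -1;
    }

    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0) {
        return -1;
    }

    if (connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0) {
        close(fd);
        return -1;
    }

    return fd;
}
```

Calling `connect_idle_client("192.0.3.1", 4000)` and then sleeping for longer than the server's keep-alive timeout should be enough to trigger the probes.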

On the server side, my implementation looks like this:

server.c
// Build with: gcc -Wall -O2 -o server server.c

#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

#define ADDR "192.0.3.1"
#define PORT 4000
#define KEEP_ALIVE_IDLE  5   // seconds before first probe
#define KEEP_ALIVE_INTVL 1   // seconds between probes
#define KEEP_ALIVE_CNT   1   // probes before giving up

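static void set_keepalive(int fd)
{
    /* Enable keep-alive probing on the accepted socket. */
    int optval = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &optval, sizeof optval) < 0)
        perror("setsockopt(SO_KEEPALIVE)");

    /* Tune idle time, probe interval and probe count (Linux TCP options). */
    optval = KEEP_ALIVE_IDLE;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &optval, sizeof optval) < 0)
        perror("setsockopt(TCP_KEEPIDLE)");

    optval = KEEP_ALIVE_INTVL;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &optval, sizeof optval) < 0)
        perror("setsockopt(TCP_KEEPINTVL)");

    optval = KEEP_ALIVE_CNT;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &optval, sizeof optval) < 0)
        perror("setsockopt(TCP_KEEPCNT)");
}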

int main(void)
{
    int lsock = socket(AF_INET, SOCK_STREAM, 0);
    if (lsock < 0) {
        perror("socket");
        exit(EXIT_FAILURE);
    }

    int optval = 1;
    setsockopt(lsock, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof optval);

    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(PORT),
        .sin_addr.s_addr = inet_addr(ADDR)
    };

    if (bind(lsock, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("bind");
        exit(EXIT_FAILURE);
    }

    if (listen(lsock, 1) < 0) {
        perror("listen");
        exit(EXIT_FAILURE);
    }

    printf("Server listening on %s:%d\n", ADDR, PORT);

    for (;;) {
        int csock = accept(lsock, NULL, NULL);
        if (csock < 0) {
            perror("accept");
            continue;
        }

        printf("Client connected\n");
        set_keepalive(csock);

        char buf[512];
        ssize_t n;
        while ((n = read(csock, buf, sizeof buf)) > 0) {
            printf("Received %zd bytes\n", n);
        }

        if (n == 0) {
            printf("Client closed the connection\n");
        } else {
            fprintf(stderr, "read error: %s\n", strerror(errno));
        }

        close(csock);
    }
}
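As a rough guide to the timing when reproducing: with the values above, a Linux server should give up on a silent client about 6 seconds after the last activity (5 s idle, then one probe with a 1 s wait). The helper below is only an illustration of that arithmetic; `keepalive_timeout_s` is a hypothetical name and the result is approximate, since kernel timers are not exact.

```c
/* Approximate number of seconds of silence after which a Linux peer drops
 * the connection: the idle time before the first probe, plus one interval
 * for each unanswered probe. */
unsigned int keepalive_timeout_s(unsigned int idle_s,
                                 unsigned int intvl_s,
                                 unsigned int probes)
{
    return idle_s + intvl_s * probes;
}
```

With `KEEP_ALIVE_IDLE`, `KEEP_ALIVE_INTVL` and `KEEP_ALIVE_CNT` from server.c this gives 5 + 1 * 1 = 6 seconds.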

Relevant log output

I have .pcapng captures, but I'm not able to upload them to GitHub.

Image

Additional context

Keep-alive probes are ordinary TCP segments with the ACK flag set and no data (or a single garbage byte), sent with a sequence number one below the next one the receiver expects. Naively, I tried modifying the code to send an ACK in response when receiving such packets:

Suspected culprit code:

zephyr/subsys/net/ip/tcp.c

Lines 2808 to 2809 in 413b789

		} else if ((len > 0) || FL(&fl, &, FIN)) {
			tcp_out(conn, ACK);

My patch

diff --git a/subsys/net/ip/tcp.c b/subsys/net/ip/tcp.c
index 1496c3ef9cf..b395c3e6309 100644
--- a/subsys/net/ip/tcp.c
+++ b/subsys/net/ip/tcp.c
@@ -2805,7 +2805,7 @@ static enum net_verdict tcp_in(struct tcp *conn, struct net_pkt *pkt)
 		/* send ACK for non-RST packet */
 		if (FL(&fl, &, RST)) {
 			net_stats_update_tcp_seg_rsterr(net_pkt_iface(pkt));
-		} else if ((len > 0) || FL(&fl, &, FIN)) {
+		} else if ((len > 0) || FL(&fl, &, ACK) || FL(&fl, &, FIN)) {
 			tcp_out(conn, ACK);
 		}
 		k_mutex_unlock(&conn->lock);

After the patch, everything seems to behave as expected:

Image
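For comparison with the patch above, which replies to every segment carrying ACK, a probe can also be recognised more narrowly. RFC 9293 section 3.8.4 describes a keep-alive segment as generally having SEG.SEQ = SND.NXT-1 with zero or one garbage octet, which on the receiving side means a sequence number one below RCV.NXT. The sketch below is standalone C, not Zephyr code: the function name, the flag macros and the parameters are all assumptions made for illustration, and it does not use `FL()`, `tcp_out()` or any other helper from tcp.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* TCP header flag bits (hypothetical macro names for this sketch). */
#define SKETCH_TCP_FIN 0x01u
#define SKETCH_TCP_SYN 0x02u
#define SKETCH_TCP_RST 0x04u
#define SKETCH_TCP_ACK 0x10u

/* Returns true when a received segment looks like a keep-alive probe:
 * ACK set, no SYN/FIN/RST, at most one payload byte, and a sequence
 * number exactly one below the next sequence number we expect. */
bool looks_like_keepalive_probe(uint32_t seg_seq, uint32_t rcv_nxt,
                                size_t payload_len, uint8_t flags)
{
    bool plain_ack = (flags & SKETCH_TCP_ACK) &&
                     !(flags & (SKETCH_TCP_SYN | SKETCH_TCP_FIN | SKETCH_TCP_RST));

    /* Unsigned arithmetic keeps the comparison valid across wrap-around. */
    bool one_behind = seg_seq == (uint32_t)(rcv_nxt - 1u);

    return plain_ack && one_behind && payload_len <= 1;
}
```

Whether a check like this is preferable to the broader condition in the patch is a question for the maintainers; it is shown only to make the shape of a probe concrete.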

Impact

Functional Limitation – Some features not working as expected, but system usable.

Environment

OS: Linux
Toolchain: zephyr-sdk-0.17.0
Commit: Zephyr v4.2.0

EDIT: typo + reorganize

Labels: area: Networking, bug