-
Describe the bug
The wired interface supports two UDP receive sockets, which can optionally (via conditional compilation) be replaced with a single raw Layer 2 socket in a single thread. The raw socket is the method I need, because I need Layer 2 support to forward ARP and ICMP traffic across a network. There is also a separate thread bound to a different UDP port, and a Telnet server is active on this interface to provide status and control.

In my current Zephyr configuration, when data arrives on the raw socket (or the UDP sockets) with CONFIG_NET_PROMISCUOUS_MODE enabled in the kernel configuration, the Zephyr stack appears to leak network buffers, resulting in an application failure. This is confirmed by the output of the "net allocs" shell command, which lists each network buffer allocation and its corresponding free; a missing free shows the buffer leaked during the clone procedure. So it seems that the stack allocations are not being properly de-registered and freed. I am looking for suggestions on where to dig to determine the problem and a solution.

I can delay the failure by increasing the number of network buffers, but eventually it will happen. The leak does not occur if I disable CONFIG_NET_PROMISCUOUS_MODE in the kernel configuration, but then I cannot pass Layer 2 traffic. The problem is that I need promiscuous mode, and Layer 2 support in general, using a raw socket to handle L2 traffic.

I am using the standard POSIX multiplexed I/O "select" call to handle the Ethernet interfaces, similar to the following:
The read from the raw socket uses code as follows:
The receive handler in POSIX is the following:
Where:
My expectation would be that after the POSIX API call to zsock_recv, any clone of the network packet or other dynamic memory would be freed automatically. Using the POSIX APIs, I have no access to the underlying Zephyr-specific buffers that would allow the application to free the cloned packet. Clearly I do not fully understand what the application must do for this combination to co-exist properly (a raw socket in promiscuous mode via the kernel configuration, a separate UDP socket in another thread, and a Telnet server), or the de-referencing process that marks a network packet for deallocation.
To Reproduce
Expected behavior
Impact
Logs and console output
Here is a capture of the call stack at the point where the exception occurs.
Here is a dump of "net allocs" from the Telnet interface menu with a simple 1 pulse per second PING traffic ingress to the primary Native-POSIX Ethernet interface. Note that there is no corresponding free to the packet clone.
Environment (please complete the following information):
Additional context Project Kernel Configuration is as follows:
Replies: 8 comments 2 replies
-
Are you familiar with the promiscuous mode sample app at |
-
I understand, but this implies that using the standard POSIX APIs at the network layer is incompatible with promiscuous mode on the network interface; that is, Zephyr is not compatible with that standard. Is there any way that the underlying pointer to be freed could be exposed through, say, the zsock_recv() API so that the application could free it? Or would it make sense for Zephyr to perform the free itself as a result of that call? As soon as the content is copied to the user-space buffer, there is no reason to keep the allocation around. I think this approach would be the most elegant: it keeps Zephyr compatible with the POSIX standard and frees developers who stick to the standard POSIX APIs from having to know about deallocations. Using the POSIX APIs makes porting existing applications more efficient, and adding promiscuous support would make Zephyr compatible with the Linux APIs.
-
I think we've discussed this already in #36748. TLDR the conclusion was that the module enabled with |
-
Yes, that past discussion is the same one. Over time I have found that I really need that raw socket, which is not uncommon for a Layer 2 networking device, and the changes to the Zephyr code base since my original posting have not addressed the issue. Is having the zsock_recv API simply free the local allocation complex to implement? I really don't know the scope of what I am asking; you are the subject matter experts and know the full scope of the design.
-
You could have a somewhat hackish workaround by having a "dummy" promiscuous mode handler that just discards the received packets, and then your raw socket should also receive all the same packets. This is not optimal of course but should work until we get support for proper promiscuous mode sockets.
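A rough sketch of what such a dummy handler might look like, modelled on the promiscuous mode sample. The function names `discard_one_promisc_pkt` and `promisc_discard_thread` are made up for illustration, the `zephyr/...` header paths assume a recent tree (older trees omit the `zephyr/` prefix), and the `#else` branch contains host-only stand-ins so the snippet can be compiled outside Zephyr. Untested on a real target.

```c
#include <stddef.h>

#ifdef __ZEPHYR__
#include <zephyr/kernel.h>
#include <zephyr/net/net_if.h>
#include <zephyr/net/net_pkt.h>
#include <zephyr/net/promiscuous.h>
#else
/* Host-only stand-ins (hypothetical) so the sketch compiles off-target. */
struct net_if { int unused; };
struct net_pkt { int refs; };
typedef int k_timeout_t;
#define K_NO_WAIT 0
#define K_FOREVER (-1)
static struct net_if host_iface;
static struct net_pkt host_pkt = { .refs = 1 };
static int host_pending = 1;
static struct net_if *net_if_get_default(void) { return &host_iface; }
static int net_promisc_mode_on(struct net_if *iface) { (void)iface; return 0; }
static struct net_pkt *net_promisc_mode_wait_data(k_timeout_t timeout)
{
	(void)timeout;
	if (host_pending) {
		host_pending = 0;
		return &host_pkt;
	}
	return NULL;
}
static void net_pkt_unref(struct net_pkt *pkt) { pkt->refs--; }
#endif

/* Take one packet from the promiscuous queue and drop it.
 * Returns 1 if a packet was discarded, 0 if none arrived in time. */
int discard_one_promisc_pkt(k_timeout_t timeout)
{
	struct net_pkt *pkt = net_promisc_mode_wait_data(timeout);

	if (pkt == NULL) {
		return 0;
	}

	/* Release the reference held by the promiscuous queue. */
	net_pkt_unref(pkt);
	return 1;
}

/* Thread entry: enable promiscuous mode, then discard forever. */
void promisc_discard_thread(void *p1, void *p2, void *p3)
{
	struct net_if *iface = net_if_get_default();

	(void)p1;
	(void)p2;
	(void)p3;

	if (iface == NULL || net_promisc_mode_on(iface) < 0) {
		return;
	}

	for (;;) {
		(void)discard_one_promisc_pkt(K_FOREVER);
	}
}
```

On target, the entry function could be started with `K_THREAD_DEFINE` or `k_thread_create`; stack size and priority are left to the application.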
-
A workaround is just fine, I am not a purist. How would this other handler synchronize packet removal and freeing relative to the raw socket implementation based on the zsock_select() multiplexed i/o and POSIX APIs? Meaning, would this simply be a separate thread implementing the logic similar to the promiscuous_mode sample similar to the following, where it waits on a packet, then frees it? Would it be freeing a copy that would be used in the application thread implementing the POSIX communications APIs, i.e. would there need to be some form of thread synchronization?
-
Yeah, I was thinking a thread that would set the interface to promiscuous mode and have a while loop like you have above.
After some code digging there seems to be a simpler way. The promiscuous packets are processed in zephyr/subsys/net/ip/net_if.c (line 3900 in bb85759), and if net_promisc_mode_on() is not called, then the packet should be unreffed properly and no leaks should happen.
So if you want to receive packets in promiscuous mode, but do not want to use the promiscuous mode API, you can enable promiscuous mode in the config file and then call net_if_set_promisc(iface); to set the interface to promiscuous mode. You should then be able to receive all the network traffic in your device without any buffer leaks and without any dummy handlers etc. Not tested but should work ok.
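As a minimal sketch of that simpler route, assuming CONFIG_NET_PROMISCUOUS_MODE=y is set in prj.conf and that the default interface is the one carrying the raw socket: the helper name `enable_promisc_for_raw_socket` is hypothetical, the header path assumes a recent tree, and the `#else` branch is a host-only stand-in so the snippet compiles outside Zephyr. Like the suggestion itself, this is untested on hardware.

```c
#include <stddef.h>

#ifdef __ZEPHYR__
#include <zephyr/net/net_if.h>
#else
/* Host-only stand-ins (hypothetical) so the sketch compiles off-target. */
struct net_if { int promisc; };
static struct net_if host_iface;
static struct net_if *net_if_get_default(void) { return &host_iface; }
static int net_if_set_promisc(struct net_if *iface)
{
	iface->promisc = 1;
	return 0;
}
#endif

/* Put the default interface into promiscuous mode WITHOUT calling
 * net_promisc_mode_on(), so the promiscuous mode API is never armed
 * and the raw socket is the only consumer of the traffic.
 * Returns 0 on success, a negative value on failure. */
int enable_promisc_for_raw_socket(void)
{
	struct net_if *iface = net_if_get_default();

	if (iface == NULL) {
		return -1;
	}

	return net_if_set_promisc(iface);
}
```

Calling this once before entering the select loop should be enough; the existing POSIX receive code would stay unchanged.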
-
Given that you are really asking for a feature request here, I will convert this to a discussion and then ask you to open a new issue (marked as enhancement), along the lines of: "Implement promiscuous mode for |