Description
A report on the forum from @lo92fr about OH running out of memory includes a stack trace that reveals a deadlock, in which Teleinfo seems to be involved.
This might be related to #9546 and #9724.
There might be a "deadlock loop" between Teleinfo and the OH discovery service:
"OH-binding-teleinfo:serialcontroller:teleinfoserial" #1165 [2820248] daemon prio=5 os_prio=0 cpu=9998,65ms elapsed=21349,28s tid=0x00007fbb0c114430 nid=2820248 waiting for monitor entry [0x00007fba1b8fe000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.openhab.core.config.discovery.internal.DiscoveryServiceRegistryImpl.thingDiscovered(DiscoveryServiceRegistryImpl.java:257)
- waiting to lock <0x00000000814f9128> (a org.openhab.core.config.discovery.internal.DiscoveryServiceRegistryImpl)
at org.openhab.core.config.discovery.AbstractDiscoveryService.thingDiscovered(AbstractDiscoveryService.java:318)
at org.openhab.binding.teleinfo.internal.TeleinfoDiscoveryService.detectNewElectricityMeterFromReceivedFrame(TeleinfoDiscoveryService.java:132)
at org.openhab.binding.teleinfo.internal.TeleinfoDiscoveryService.onFrameReceived(TeleinfoDiscoveryService.java:111)
at org.openhab.binding.teleinfo.internal.handler.TeleinfoAbstractControllerHandler.lambda$0(TeleinfoAbstractControllerHandler.java:49)
at org.openhab.binding.teleinfo.internal.handler.TeleinfoAbstractControllerHandler$$Lambda/0x0000000101341188.accept(Unknown Source)
at java.util.concurrent.CopyOnWriteArrayList.forEach(java.base@21.0.8/CopyOnWriteArrayList.java:891)
at java.util.concurrent.CopyOnWriteArraySet.forEach(java.base@21.0.8/CopyOnWriteArraySet.java:425)
at org.openhab.binding.teleinfo.internal.handler.TeleinfoAbstractControllerHandler.fireOnFrameReceivedEvent(TeleinfoAbstractControllerHandler.java:49)
at org.openhab.binding.teleinfo.internal.serial.TeleinfoSerialControllerHandler.onFrameReceived(TeleinfoSerialControllerHandler.java:108)
at org.openhab.binding.teleinfo.internal.serial.TeleinfoReceiveThread.run(TeleinfoReceiveThread.java:63)

"JmDNS pool-27-thread-1" #871 [2819883] prio=5 os_prio=0 cpu=2478,52ms elapsed=21361,56s tid=0x00007fbb35624bf0 nid=2819883 waiting for monitor entry [0x00007fba205fe000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.openhab.core.config.discovery.internal.DiscoveryServiceRegistryImpl.thingDiscovered(DiscoveryServiceRegistryImpl.java:257)
- waiting to lock <0x00000000814f9128> (a org.openhab.core.config.discovery.internal.DiscoveryServiceRegistryImpl)
at org.openhab.core.config.discovery.AbstractDiscoveryService.thingDiscovered(AbstractDiscoveryService.java:318)
at org.openhab.core.config.discovery.mdns.internal.MDNSDiscoveryService.createDiscoveryResult(MDNSDiscoveryService.java:227)
at org.openhab.core.config.discovery.mdns.internal.MDNSDiscoveryService.considerService(MDNSDiscoveryService.java:214)
at org.openhab.core.config.discovery.mdns.internal.MDNSDiscoveryService.serviceResolved(MDNSDiscoveryService.java:207)
at javax.jmdns.impl.ListenerStatus$ServiceListenerStatus.serviceResolved(ListenerStatus.java:117)
- locked <0x00000000911c3db0> (a javax.jmdns.impl.ListenerStatus$ServiceListenerStatus)
at javax.jmdns.impl.JmDNSImpl$1.run(JmDNSImpl.java:923)
at java.util.concurrent.Executors$RunnableAdapter.call(java.base@21.0.8/Executors.java:572)
at java.util.concurrent.FutureTask.run(java.base@21.0.8/FutureTask.java:317)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@21.0.8/ThreadPoolExecutor.java:1144)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@21.0.8/ThreadPoolExecutor.java:642)
at java.lang.Thread.runWith(java.base@21.0.8/Thread.java:1596)
at java.lang.Thread.run(java.base@21.0.8/Thread.java:1583)

I'm not familiar with the binding, but after a quick look, it seems that the binding probably should not call thingDiscovered(discoveryResult); from the thread that handles the controller communications. Once a discovery result has been created, a separate thread should probably be used to actually register the result, as registration can take time. An alternative is to change Core so that thingDiscovered returns quickly, as I've seen this problem in other bindings as well.
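To illustrate the first option, here is a minimal sketch of handing results to a dedicated thread so that the receive thread never waits on the registry lock. It is not taken from the binding or from Core: `AsyncDiscoveryPublisher`, its `sink` parameter and the thread name are hypothetical names chosen for this example, and the `sink` stands in for whatever ends up calling `thingDiscovered`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of openHAB): forwards discovery results to a
 * sink on a dedicated daemon thread, so the calling thread returns immediately.
 */
class AsyncDiscoveryPublisher<R> {
    // Single worker thread: results are delivered in the order they were published.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "discovery-publisher");
        t.setDaemon(true);
        return t;
    });

    // Assumption: the sink is the potentially slow call, e.g. result -> thingDiscovered(result).
    private final Consumer<R> sink;

    AsyncDiscoveryPublisher(Consumer<R> sink) {
        this.sink = sink;
    }

    /** Queues the result and returns without waiting for the sink. */
    void publish(R result) {
        executor.execute(() -> sink.accept(result));
    }

    /** Stops accepting new results; results that are already queued are still delivered. */
    void close() {
        executor.shutdown();
    }

    /** Waits for queued results to be delivered after close(); returns false on timeout or interrupt. */
    boolean awaitTermination(long timeout, TimeUnit unit) {
        try {
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With something like this, a frame listener would call `publish(result)` instead of calling `thingDiscovered` directly. Note that this only keeps the communication thread responsive. If the registry itself is deadlocked, the queue simply grows, which is one more argument for addressing it in Core.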
It's unclear to me at this point whether this binding is a part of the problem or just an innocent victim of the discovery service being deadlocked.
So, I guess, the crux of this issue is: How should discovery results be registered, since this process can be slow (and potentially deadlock)? Should bindings handle this, or should Core? As I write this, it becomes increasingly clear to me that the answer probably is: Core.