Description
Describe the bug
When mosquitto persistence is enabled (i.e. the mosquitto.db file is used), the mosquitto bridge stops working: no messages are received from or published to the cloud.
Symptoms
- The mosquitto bridge health topic (te/device/main/service/mosquitto-c8y-bridge/status/health) toggles periodically between 0 and 1
- tedge-mapper-c8y is not able to get a valid token (publishing to c8y/s/uat goes unanswered on c8y/s/dat)
- Cumulocity operations sent to the device are marked with the "delivery" fragment, indicating that the message has been received by the broker; however, the message is not published by the bridge to the local broker
- Restarting the tedge-mapper-c8y service does not have any effect
- Restarting mosquitto does not have any effect
The output below shows the pattern: the bridge health toggles while the mapper repeatedly requests a new Cumulocity JWT.
$ tedge mqtt sub '#'
[c8y/s/uat]
[c8y/s/uat]
[c8y/s/uat]
[c8y/s/uat]
[c8y/s/uat]
[c8y/s/uat]
[te/device/main/service/mosquitto-c8y-bridge/status/health] 0
[c8y/s/us] 102,rmi_cb001:device:main:service:mosquitto-c8y-bridge,service,mosquitto-c8y-bridge,down
[te/device/main/service/mosquitto-c8y-bridge/status/health] 1
[c8y/s/us] 102,rmi_cb001:device:main:service:mosquitto-c8y-bridge,service,mosquitto-c8y-bridge,up
[c8y/s/uat]
[c8y/s/uat]
[c8y/s/uat]
[c8y/s/uat]
[te/device/main/service/mosquitto-c8y-bridge/status/health] 0
[c8y/s/us] 102,rmi_cb001:device:main:service:mosquitto-c8y-bridge,service,mosquitto-c8y-bridge,down
[te/device/main/service/mosquitto-c8y-bridge/status/health] 1
[c8y/s/us] 102,rmi_cb001:device:main:service:mosquitto-c8y-bridge,service,mosquitto-c8y-bridge,up
[c8y/s/uat]
[c8y/s/uat]
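To quantify the toggling rather than eyeballing it, the captured output can be piped through a small filter. The following is a minimal sketch, not part of thin-edge.io: the function name count_bridge_toggles is hypothetical, and it assumes the `[topic] payload` line format shown above.

```shell
# count_bridge_toggles (hypothetical helper): reads `tedge mqtt sub` output
# on stdin and prints how many times the bridge health value changed.
count_bridge_toggles() {
    awk -v topic='[te/device/main/service/mosquitto-c8y-bridge/status/health]' '
        $1 == topic {
            # Count a toggle only when the payload differs from the previous one
            if (seen && $2 != last) changes++
            last = $2
            seen = 1
        }
        END { print changes + 0 }
    '
}

# Example (assumes `timeout` is available, e.g. from coreutils or BusyBox):
#   timeout 300 tedge mqtt sub '#' | count_bridge_toggles
```

A healthy bridge should report few or no changes over the sampling window.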
Workaround
The cloud connection can be restored using the following steps:
- Disconnect tedge (which stops mosquitto and tedge-mapper-c8y)
  tedge disconnect c8y
- Remove the mosquitto.db file
  rm /var/lib/mosquitto/mosquitto.db
- Connect tedge (which starts mosquitto and tedge-mapper-c8y)
  tedge connect c8y
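The workaround steps can be wrapped in a small function. This is only a sketch: the name reset_bridge_persistence is hypothetical, and it deviates from the steps above by moving the file to a .bak copy instead of deleting it, so that the database stays available for later analysis.

```shell
# reset_bridge_persistence [db_file] (hypothetical helper)
# Disconnects from Cumulocity, moves the persistence file aside, reconnects.
reset_bridge_persistence() {
    db_file=${1:-/var/lib/mosquitto/mosquitto.db}

    # Stops mosquitto and tedge-mapper-c8y
    tedge disconnect c8y || return 1

    # Keep a copy instead of `rm` (a deviation from the original workaround)
    if [ -f "$db_file" ]; then
        mv "$db_file" "$db_file.bak" || return 1
    fi

    # Starts mosquitto and tedge-mapper-c8y
    tedge connect c8y
}
```

It would typically need to be run as root, since the file is owned by the mosquitto user.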
To Reproduce
It is currently unknown how to reproduce the problem. It might be possible to use the attached mosquitto.db to reproduce the mosquitto bridge deadlock.
This procedure is not verified but you could try:
- Stop mosquitto
  systemctl stop mosquitto
- Copy the mosquitto.db.tgz file, and decompress it to /var/lib/mosquitto/mosquitto.db
- Set the ownership and permissions of the directory and the file
  mkdir -p /var/lib/mosquitto/
  chmod 755 /var/lib/mosquitto/
  chown mosquitto:mosquitto /var/lib/mosquitto/
  chmod 600 /var/lib/mosquitto/mosquitto.db
  chown mosquitto:mosquitto /var/lib/mosquitto/mosquitto.db
- Enable the mosquitto persistence (assuming you haven't already added this setting to the mosquitto configuration)
  persistence true
  persistence_location /var/lib/mosquitto/
- Start mosquitto
  systemctl start mosquitto
- Monitor the local MQTT broker, looking at the mosquitto bridge health etc.
  tedge mqtt sub '#'
Expected behavior
Screenshots
Environment (please complete the following information):
Property | Value |
---|---|
OS [incl. version] | Alpine Linux v3.18 |
Hardware [incl. revision] | docker |
System-Architecture | Linux 9bba511280c7 6.8.0-39-generic #39-Ubuntu SMP PREEMPT_DYNAMIC Sat Jul 6 02:50:39 UTC 2024 aarch64 Linux |
thin-edge.io version | tedge 1.3.0 |
mosquitto version | 2.0.18 |
Additional context
Logs from mosquitto and tedge-mapper-c8y
tedge-mapper-c8y | 2024-09-26T15:01:38.444085498Z INFO mqtt_channel::connection: MQTT connection established
mosquitto | 1727362898: osujqlfosa 1 c8y/s/dat
tedge-mapper-c8y | 2024-09-26T15:01:38.468649108Z INFO c8y_api::http_proxy: JWT token requested
tedge-mapper-c8y | 2024-09-26T15:01:58.471328117Z INFO c8y_api::http_proxy: JWT token requested
tedge-mapper-c8y | 2024-09-26T15:02:18.474034123Z INFO c8y_api::http_proxy: JWT token requested
tedge-mapper-c8y | 2024-09-26T15:02:38.476886867Z ERROR c8y_api::http_proxy: Fail to retrieve JWT token after 3 attempts
tedge-mapper-c8y | 2024-09-26T15:02:38.477218784Z INFO mqtt_channel::connection: MQTT connection closed
mosquitto | 1727362958: Client osujqlfosa disconnected.
tedge-mapper-c8y | 2024-09-26T15:02:38.477961577Z ERROR c8y_http_proxy::actor: An error occurred while retrieving internal Id, operation will retry in 20 seconds
tedge-mapper-c8y | Error: CustomError("JWT token not available")
tedge-mapper-c8y | 2024-09-26T15:02:58.479105094Z INFO mqtt_channel::connection: MQTT connecting to broker: host=127.0.0.1:1883, session_name=None
mosquitto | 1727362978: New connection from 127.0.0.1:53426 on port 1883.
mosquitto | 1727362978: New client connected from 127.0.0.1:53426 as ueidzucazb (p2, c1, k60).
tedge-mapper-c8y | 2024-09-26T15:02:58.47954322Z INFO mqtt_channel::connection: MQTT connection established
mosquitto | 1727362978: ueidzucazb 1 c8y/s/dat
tedge-mapper-c8y | 2024-09-26T15:02:58.50466554Z INFO c8y_api::http_proxy: JWT token requested
tedge-mapper-c8y | 2024-09-26T15:03:18.507521871Z INFO c8y_api::http_proxy: JWT token requested
mosquitto | 1727363006: Connecting bridge edge_to_c8y (thin-edge-io.eu-latest.cumulocity.com:8883)
tedge-mapper-c8y | 2024-09-26T15:03:38.509243964Z INFO c8y_api::http_proxy: JWT token requested
tedge-mapper-c8y | 2024-09-26T15:03:58.513521985Z ERROR c8y_api::http_proxy: Fail to retrieve JWT token after 3 attempts
tedge-mapper-c8y | 2024-09-26T15:03:58.514573903Z ERROR c8y_http_proxy::actor: An error occurred while retrieving internal Id, operation will retry in 20 seconds
tedge-mapper-c8y | Error: CustomError("JWT token not available")
tedge-mapper-c8y | 2024-09-26T15:03:58.514723736Z INFO mqtt_channel::connection: MQTT connection closed
mosquitto | 1727363038: Client ueidzucazb disconnected.
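The mosquitto lines above carry Unix epoch timestamps while the mapper logs use ISO 8601, which makes the two harder to correlate (for example, 1727362898 corresponds to 2024-09-26T15:01:38Z). A minimal sketch for converting raw mosquitto log lines follows. The function name mosquitto_log_to_utc is hypothetical, it expects lines starting with the epoch (without the "mosquitto |" prefix added by the container log viewer), and it assumes a `date` that accepts `-d @<epoch>` (GNU coreutils or BusyBox).

```shell
# mosquitto_log_to_utc (hypothetical helper): rewrites the leading
# "<epoch>:" of each mosquitto log line on stdin as a UTC ISO 8601 timestamp.
mosquitto_log_to_utc() {
    while IFS= read -r line; do
        epoch=${line%%:*}
        case $epoch in
            # No purely numeric prefix: pass the line through unchanged
            ''|*[!0-9]*) printf '%s\n' "$line" ;;
            # Numeric prefix: replace it with the formatted timestamp
            *) printf '%s%s\n' \
                   "$(date -u -d "@$epoch" +%Y-%m-%dT%H:%M:%SZ)" \
                   "${line#"$epoch"}" ;;
        esac
    done
}
```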
Output from mosquitto_db_dump tool
Using the mosquitto_db_dump tool (built from https://github.com/eclipse/mosquitto/tree/master/apps/db_dump), the following output was obtained:
% ./mosquitto_db_dump --stats mosquitto.db.bak
DB_CHUNK_CFG: 1
DB_CHUNK_MSG_STORE: 132
DB_CHUNK_CLIENT_MSG: 112
DB_CHUNK_RETAIN: 24
DB_CHUNK_SUB: 38
DB_CHUNK_CLIENT: 4
% ./mosquitto_db_dump --client-stats mosquitto.db.bak
SC: 0 SS: 0 MC: 108 MS: 33219 Cumulocity
SC: 28 SS: 1536 MC: 3 MS: 459 tedge-mapper-c8y
SC: 1 SS: 89 MC: 1 MS: 136 last_will_c8y_mapper
SC: 9 SS: 567 MC: 0 MS: 0 tedge-agent#te/device/main//
The following one-liner filters on the MQTT client ID of the Cumulocity IoT bridge and counts the messages in the bridge queue by direction (i.e. how many are inbound and how many are outbound):
./mosquitto_db_dump mosquitto.db.bak | grep "Client ID: Cumulocity" -A 7 -B 2 | grep Direction | sort | uniq -c
1 Direction: 0
107 Direction: 1
The above shows that there is one inbound message (assuming that Direction: 0 means inbound and Direction: 1 means outbound). Below is the metadata of the pending inbound message:
DB_CHUNK_CLIENT_MSG:
Length: 26
Client ID: Cumulocity
Store ID: 9241
MID: 1
QoS: 2
Retain: 0
Direction: 0
State: 7
Dup: 0
The message states can also be aggregated to see how many messages are in each state:
./mosquitto_db_dump mosquitto.db.bak | grep "Client ID: Cumulocity" -A 7 -B 2 | grep State | sort | uniq -c
87 State: 11
20 State: 3
1 State: 7
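The two grep pipelines can be combined into one pass that groups by direction and state and attaches readable names. The sketch below is hedged on two points: the function name summarize_client_msgs is hypothetical, and the state and direction names are an assumption based on the mosquitto_msg_state and mosquitto_msg_direction enums in the mosquitto 2.0.x sources, which should be checked against the version in use. It also assumes the record layout shown above (Direction listed before State) and a client ID without spaces.

```shell
# summarize_client_msgs <client_id> (hypothetical helper): reads the full
# mosquitto_db_dump output on stdin and counts that client's messages
# per direction and state.
summarize_client_msgs() {
    awk -v client="$1" '
        BEGIN {
            # ASSUMPTION: names follow enum mosquitto_msg_state (0..11) in 2.0.x
            split("invalid publish_qos0 publish_qos1 wait_for_puback " \
                  "publish_qos2 wait_for_pubrec resend_pubrel wait_for_pubrel " \
                  "resend_pubcomp wait_for_pubcomp send_pubrec queued", names, " ")
            # ASSUMPTION: 0 = inbound, 1 = outbound
            dir[0] = "in"; dir[1] = "out"
        }
        # A new chunk header starts a new record
        /^DB_CHUNK_/ { in_msg = ($1 == "DB_CHUNK_CLIENT_MSG:"); wanted = 0; d = "?"; next }
        in_msg && $1 == "Client" && $2 == "ID:" { wanted = ($3 == client) }
        in_msg && wanted && $1 == "Direction:"  { d = $2 }
        in_msg && wanted && $1 == "State:"      { count[d " " $2]++ }
        END {
            for (k in count) {
                split(k, p, " ")
                printf "%d direction=%s state=%s (%s)\n", count[k], dir[p[1]], p[2], names[p[2] + 1]
            }
        }
    ' | LC_ALL=C sort
}

# Example:
#   ./mosquitto_db_dump mosquitto.db.bak | summarize_client_msgs Cumulocity
```

Under these assumptions, State 11 would read as queued, State 3 as wait_for_puback, and State 7 (the single inbound QoS 2 message) as wait_for_pubrel.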