unacknowledged local command requests are difficult to clear when MQTT broker persistence is not configured #2862

@reubenmiller

Describe the bug

If the local MQTT broker is not configured for persistence and the user creates local commands (i.e. commands not issued from the cloud), then after the broker is restarted the tedge-agent rejects any new operation request of the same type while an earlier operation is still uncleared/unacknowledged. Because the broker loses its retained messages on restart, it is difficult for the user to see which operations still need to be cleared before the tedge-agent will process new requests.

The only way to see which command topics are still in progress is either to inspect the tedge-agent logs and look for the following output:

2024-05-08T06:24:35.332521Z ERROR tedge_agent::tedge_operation_converter::actor: software_list operation request cannot be processed: Two concurrent requests are under execution on the same topic: te/device/main///cmd/software_list/local-123

Or to inspect the internal state file (/opt/homebrew/etc/tedge/.agent/workflows) that the tedge-agent uses to track command state (which is NOT RECOMMENDED, of course).

To Reproduce

The situation can be reproduced using the following steps:

  1. Start mosquitto (without persistence configured)

  2. Start tedge-agent

  3. Create a local software_list operation

    tedge mqtt pub -r 'te/device/main///cmd/software_list/local-1234' '{"status":"init"}'

  4. Restart mosquitto

  5. Subscribe to the commands

    tedge mqtt sub 'te/device/main///cmd/+/+'

    No results should be shown (as mosquitto has not been configured to persist messages)

  6. Try submitting a new command request (with a different command id)

    tedge mqtt pub -r 'te/device/main///cmd/software_list/local-2222' '{"status":"init"}'

    The operation will not be processed and an error message will appear in the tedge-agent logs similar to:

    2024-05-08T06:24:35.332521Z ERROR tedge_agent::tedge_operation_converter::actor: software_list operation request cannot be processed: Two concurrent requests are under execution on the same topic: te/device/main///cmd/software_list/local-1234
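
Once the blocking topic is known (here it is named in the error message), the stale command can usually be cleared by publishing an empty retained message to that topic. This is the usual MQTT way of removing a retained message, and it matches how thin-edge.io commands are normally cleared. The sketch below assumes that the agent accepts this as a clear request in this situation. `clear_command` and the `TEDGE_BIN` override are hypothetical names introduced only for illustration.

```sh
# Sketch only: clear one stale command by publishing an empty retained payload.
# TEDGE_BIN is a hypothetical override (e.g. for dry runs); it defaults to the tedge CLI.
clear_command() {
    # $1 is the full command topic, e.g. te/device/main///cmd/software_list/local-1234
    "${TEDGE_BIN:-tedge}" mqtt pub -r "$1" ''
}

# Usage (assumes tedge is on PATH and the broker is reachable):
#   clear_command 'te/device/main///cmd/software_list/local-1234'
```

After clearing, repeating step 6 should show whether the agent accepts new software_list requests again.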
    

Expected behavior

The expected behaviour is not yet defined, but could be defined by answering the following question:

How does a user know which operations need to be cleared so that the tedge-agent will continue to process new requests?

Environment (please complete the following information):

  • OS [incl. version]: Any
  • Hardware [incl. revision]: Any
  • System-Architecture [e.g. result of "uname -a"]: Any
  • thin-edge.io version [e.g. 0.1.0]: 1.0.2~273+gd05e9b6

Additional context

In the above situation, how does the user know there are unacknowledged command requests?

The only chance the user has to clear the operation is to inspect the tedge-agent logs and view the topic name.

2024-05-08T06:24:35.30569Z  INFO tedge_agent::tedge_operation_converter::actor: Waiting failed restart operation to be cleared
2024-05-08T06:24:35.321863Z  INFO tedge_agent::tedge_operation_converter::actor: Waiting successful software_list operation to be cleared
2024-05-08T06:24:35.327175Z  INFO tedge_agent::tedge_operation_converter::actor: Waiting successful software_list operation to be cleared
2024-05-08T06:24:35.332453Z  INFO tedge_agent::tedge_operation_converter::actor: Waiting successful software_list operation to be cleared
2024-05-08T06:24:35.332521Z ERROR tedge_agent::tedge_operation_converter::actor: software_list operation request cannot be processed: Two concurrent requests are under execution on the same topic: te/device/main///cmd/software_list/local-123
2024-05-08T06:24:35.364497Z ERROR tedge_agent::tedge_operation_converter::actor: software_list operation request cannot be processed: Two concurrent requests are under execution on the same topic: te/device/main///cmd/software_list/local-1234
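
As a stopgap, the blocked topics can be pulled out of log output like the above with standard text tools. The following is a minimal sketch that assumes the error message format shown in this report stays unchanged. `blocked_topics` is a hypothetical helper name, and where the agent logs are read from (journald, a file, the console) depends on the installation.

```sh
# Sketch only: print the unique command topics named in
# "Two concurrent requests" errors. Reads tedge-agent log lines on stdin.
blocked_topics() {
    sed -n 's|.*Two concurrent requests are under execution on the same topic: \(te/[^ ]*\).*|\1|p' \
        | sort -u
}

# Usage (assumption: the agent runs as a systemd unit named tedge-agent):
#   journalctl -u tedge-agent --no-pager | blocked_topics
```

Each printed line is a topic that the agent still considers in progress.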

Inspecting the internal files that the tedge-agent uses to persist workflow state also yields a list of the commands known to the agent, but this list is not otherwise visible to the user.

file: /opt/homebrew/etc/tedge/.agent/workflows

{
    "version": "V1",
    "commands": {
        "te/device/main///cmd/restart/local-1234": {
            "unix_timestamp": 1712050067,
            "status": "failed",
            "payload": {
                "reason": "Fail to trigger a restart: Command returned non 0 exit code: Command { std: \"/usr/bin/sudo\" \"sync\", kill_on_drop: false }",
                "status": "failed"
            }
        },
        "te/device/main///cmd/software_list/local-1234": {
            "unix_timestamp": 1709384026,
            "status": "successful",
            "payload": {
                "currentSoftwareList": [
                    {
                        "modules": [],
                        "type": ""
                    }
                ],
                "status": "successful"
            }
        },
        "te/device/main///cmd/software_list/local-123": {
            "unix_timestamp": 1715148092,
            "status": "successful",
            "payload": {
                "currentSoftwareList": [
                    {
                        "modules": [
                            {
                                "name": "tedge",
                                "version": "1.0.2-rc273+gd05e9b6"
                            }
                        ],
                        "type": "brew"
                    }
                ],
                "status": "successful"
            }
        },
        "te/device/main///cmd/software_list/local-222": {
            "unix_timestamp": 1709384026,
            "status": "successful",
            "payload": {
                "currentSoftwareList": [
                    {
                        "modules": [],
                        "type": ""
                    }
                ],
                "status": "successful"
            }
        }
    }
}
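
For diagnosis only, the command topics tracked in a state file like the one above can be listed without modifying it. This is a rough, read-only sketch that assumes the file keeps the layout shown here (topics as JSON keys starting with `te/`). The file is internal to the agent, so its format may change without notice. `list_tracked_commands` is a hypothetical helper name.

```sh
# Sketch only: list the command topics that appear as JSON keys in the
# agent's internal state file. Read-only; relies on the layout shown above.
list_tracked_commands() {
    # $1 is the path to the state file (installation-specific)
    grep -o '"te/[^"]*/cmd/[^"]*":' "$1" | sed 's/^"//; s/":$//'
}

# Usage (path taken from this report; it differs per installation):
#   list_tracked_commands /opt/homebrew/etc/tedge/.agent/workflows
```

The output includes every command the agent remembers, whatever its status, so it shows which topics would still need to be cleared.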
