Choose the right broker to push arbiter broks #1562

@andyxning

I recently ran into a problem with two brokers, one master and one slave. The arbiter daemon's broks about all the daemons in the HA environment are all pushed to the slave broker, which is still waiting for its initial configuration and can do nothing with the system broks it receives (the ones behind the WebUI's System tab). These broks should go to the master broker while it is up, and to the slave broker only when the master broker is down.

The related code is below (the comments between the "#" lines were added by me); all of the code can be found here:

    # We must push our broks to the broker
    # because it's stupid to make a crossing connection
    # so we find the broker responsible for our broks,
    # and we send it to him
    # TODO: better find the broker, here it can be dead?
    # or not the good one?
    def push_broks_to_broker(self):
        for brk in self.conf.brokers:
            # Send only if alive of course
            ######################################################
            # Here we cannot use only the **alive** attribute to decide
            # which broker to use, because slave brokers are also
            # **alive**. This is the main problem.
            ######################################################
            if brk.manage_arbiters and brk.alive:
                is_send = brk.push_broks(self.broks)
                if is_send:
                    # They are gone, we keep none!
                    self.broks.clear()

If it is possible, I think I can make a PR.
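
A minimal sketch of the behaviour described above: prefer the master broker and fall back to the slave only when no master is reachable. The `Broker` class is a hypothetical stand-in rather than the real broker link class, and using a `spare` flag to tell master from slave is an assumption made for illustration.

```python
class Broker:
    """Hypothetical stand-in for a broker link (not the real class)."""

    def __init__(self, name, spare=False, alive=True, manage_arbiters=True):
        self.name = name
        self.spare = spare              # assumed flag: True for the slave broker
        self.alive = alive
        self.manage_arbiters = manage_arbiters
        self.received = []

    def push_broks(self, broks):
        # Pretend to send; report failure if the broker is down.
        if not self.alive:
            return False
        self.received.extend(broks.values())
        return True


def pick_on_duty_broker(brokers):
    """Return the alive broker that should get arbiter broks, or None."""
    candidates = [b for b in brokers if b.manage_arbiters and b.alive]
    # Masters (spare=False) sort before slaves (spare=True).
    candidates.sort(key=lambda b: b.spare)
    return candidates[0] if candidates else None


def push_broks_to_broker(brokers, broks):
    """Push broks to exactly one broker; return its name, or None."""
    target = pick_on_duty_broker(brokers)
    if target is None:
        return None
    if target.push_broks(broks):
        # They are gone, we keep none.
        broks.clear()
        return target.name
    return None
```

With one master and one slave both alive, only the master receives the broks; the slave is used only once the master is marked as not alive. If nobody is reachable, the broks stay queued for the next attempt.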


There are several situations that should be considered:

  • If the master broker simply goes down, all the initial info about the daemons is lost. Even after you fix the problem, start the master broker and stop the slave broker, there is no initial daemon info, because according to the logic in the arbiter daemon (the push_broks_to_broker function) the initial daemon info is sent only once. If a broker goes down and later restarts, it has lost all the initial daemon info, and the daemon update info that follows is not handled properly. For example, a master broker that went down and restarted knows nothing about the daemons. When a new update_broker_status brok arrives, the broker cannot process it because self.brokers is empty, and the failed lookup is silently ignored:
    def manage_update_broker_status_brok(self, b):
        data = b.data
        broker_name = data['broker_name']
        try:
            s = self.brokers[broker_name]
            self.update_element(s, data)
        except Exception:
            pass
  • The second problem is the one I stated at the beginning: we should send daemon info only to the "on-duty" broker. After reading the source in dispatcher.py, I think we can use the logic in to_satellites_managed_by to find the right on-duty broker.
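
One possible way to address the first situation is for the arbiter to keep the latest initial status of each daemon and replay it whenever the on-duty broker changes or reports a restart. The sketch below is only an illustration of that idea: `StatusBrokCache`, its methods, and the tuple shape used for broks are all hypothetical names, not part of the existing code.

```python
class StatusBrokCache:
    """Hypothetical arbiter-side cache of initial daemon status broks."""

    def __init__(self):
        self.initial = {}        # daemon name -> initial status data
        self.last_target = None  # name of the broker we last sent to

    def remember(self, daemon_name, data):
        # Keep the most recent initial status for this daemon.
        self.initial[daemon_name] = data

    def broks_for(self, target_name, pending, restarted=False):
        """Return the broks to send to target_name.

        The cached initial status is prepended when the target differs
        from the previous one, or when the caller says it restarted.
        """
        out = []
        if restarted or target_name != self.last_target:
            for name in sorted(self.initial):
                out.append(("initial", name, self.initial[name]))
            self.last_target = target_name
        out.extend(pending)
        return out
```

A broker that comes back empty would then receive the initial daemon info again before any update broks, so a handler such as manage_update_broker_status_brok would find the entry it looks up. How the arbiter would detect a restart is left open here.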
