2.2: Daemon log level issue #1525

@GuusH

Hi,

I tried to upgrade our distributed production environment (5 monitoring hosts monitoring 1500 instances and 50.000+ services) from 2.0.3 to 2.2 and ran into serious trouble. We ran our daemons with loglevel INFO and hit issues (I'll get to those below). As a workaround we tried setting the loglevel to WARNING, but after stopping and starting the services they kept logging at INFO level. So the loglevel setting does not seem to take effect.

The bigger issue is the following. As stated, we have a pretty large config. The arbiter outputs a line for each host and service at INFO level, plus a WARNING line when something is off, such as a service having no contact. For our config this generates between 75.000 and 100.000 lines of output. That is in itself not so bad, but for some reason these lines are sent to the broker. The broker looks at each line, concludes that it is not a valid line for storing in Mongo, and drops it. This takes forever, and while the broker is busy doing it, it does not respond to requests on its livestatus module, so Thruk is unhappy. That is why we tried setting the loglevel to WARNING, as mentioned earlier, but that didn't work.
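To make the expected behaviour concrete, here is a minimal sketch using only Python's standard `logging` module. It is not Shinken's code: `ForwardingHandler` and `build_logger` are hypothetical names, and the handler merely stands in for whatever mechanism ships daemon log lines to the broker. It shows what I would expect a WARNING threshold to do, namely that per-object INFO lines never reach the forwarding step.

```python
import logging


class ForwardingHandler(logging.Handler):
    """Hypothetical stand-in for the code path that sends log lines to the broker."""

    def __init__(self, level=logging.NOTSET):
        logging.Handler.__init__(self, level)
        self.forwarded = []  # lines that would have been sent on

    def emit(self, record):
        self.forwarded.append(self.format(record))


def build_logger(name, level):
    """Create an isolated logger whose only handler is the forwarding stand-in."""
    logger = logging.getLogger(name)
    logger.handlers = []       # start clean if the name was used before
    logger.propagate = False   # keep records away from the root logger
    logger.setLevel(level)
    handler = ForwardingHandler()
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger, handler


logger, handler = build_logger("arbiter-sketch", logging.WARNING)
logger.info("Processing object config file 'example.cfg'")  # below threshold, dropped
logger.warning("service has no contact")                    # at threshold, forwarded
print(handler.forwarded)
```

With the threshold at WARNING only the second message ends up in `handler.forwarded`. What we observed in 2.2 looks as if the threshold stayed at INFO no matter what the configuration said.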

Example of entries I got in brokerd.log:

[1424865789] INFO: [broker-ops-shinken01.example.com] [LogStoreMongoDB] This line is invalid: [1424865069] INFO: [Shinken] Processing object config file '/etc/shinken/monitoring_config/manual/hosts/gsp-lt-ndb-app01.example.com/gsp-lt-ndb-app01.example.com-FILESYSTEM__.cfg'
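The entry above carries two timestamps: the broker's own and the one embedded in the rejected arbiter line. A small sketch for pulling both out of such an entry follows. The regular expression and the `parse` helper are my own assumptions based purely on the line format shown here, not on Shinken's actual parser.

```python
import re

# Assumed layout, taken from the sample entry: "[<epoch>] <LEVEL>: <rest>"
LINE_RE = re.compile(r"^\[(\d+)\] (\w+): (.*)$")
MARKER = "This line is invalid: "


def parse(line):
    """Return (epoch, level, rest) for a log line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)


entry = (
    "[1424865789] INFO: [broker-ops-shinken01.example.com] [LogStoreMongoDB] "
    "This line is invalid: [1424865069] INFO: [Shinken] Processing object config file "
    "'/etc/shinken/monitoring_config/manual/hosts/gsp-lt-ndb-app01.example.com/"
    "gsp-lt-ndb-app01.example.com-FILESYSTEM__.cfg'"
)

outer = parse(entry)                            # the broker's own log entry
inner = parse(outer[2].split(MARKER, 1)[1])     # the arbiter line it rejected
delay = outer[0] - inner[0]                     # seconds between emit and rejection
print(inner[1], delay)
```

For this one entry the embedded line is at INFO level and the two timestamps are 720 seconds apart. That is a single sample, but it fits the picture of the broker working through a long backlog.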

So why is the output from the config test of the arbiter sent to the broker in the first place?

I also got exciting stuff like this, but I suspect that's because the broker is busy plowing through those lines:

[1424865789] ERROR: [broker-ops-shinken01.example.com] LiveStatusClientError: Could not send response: [Errno 32] Broken pipe

I haven't had time to look through the code to find where this is done, but you can maybe pinpoint this much quicker than I can. If you need any info from me to help debug this, please let me know!

Thanks!

Guus
