
Spare arbiter memory consumption #1506

@dkanbier

Description


When running in a high-availability setup with a spare arbiter, the spare arbiter consumes a lot of memory and consumes even more with every restart you do on the primary arbiter.

In this graph we see the memory usage of the arbiter processes. The yellow/green lines are for the main arbiter, the orange/blue ones for the spare arbiter:

[Graph: arbiter memory usage, screenshot taken 2015-02-10 11:56:40]

Every jump in memory on the spare arbiter is caused by a reload of the master arbiter, which makes it send the new configuration to the spare arbiter. The drops in memory usage are from us restarting the spare arbiter to free up memory.

While testing I discovered there is actually a maximum number of jumps the spare arbiter takes: it's equal to the number of http threads (8 by default) the spare arbiter spawns at startup. Each time the master arbiter sends its new configuration to the spare arbiter, the request is handled by a different http thread (round-robin). Once all http threads have handled a POST request from the master, memory consumption becomes stable. No more jumps.

Now I'm not a Python programmer, but here is what I think happens:

The cPickle.loads call in the put_conf method of IForArbiter allocates the amount of memory we see per jump in the graph (let's say 1 GB). Because the http thread called this method, it holds a reference to this data, and as long as the thread exists the data can't be reclaimed by the garbage collector.

Since every POST is handled by a different http thread, you can potentially have 8 (number of threads) * 1 GB (memory needed by cPickle.loads on a large configuration) = 8 GB in use.

Once all threads have been used they get reused, releasing the pointer and reusing the claimed memory from the previous run. This causes the memory usage to stabilise.
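To make the hypothesis concrete, here is a minimal sketch of how I understand the current flow. The class and attribute names follow my reading of the code, but this is simplified illustration, not the actual implementation:

```python
import cPickle

class IForArbiter(object):
    """Simplified sketch: the HTTP-facing interface object of the arbiter."""

    def __init__(self, app):
        self.app = app  # the ArbiterDaemon instance

    def put_conf(self, conf):
        # This runs inside one of the 8 HTTP worker threads. The unpickled
        # configuration (potentially ~1 GB) is built right here, so the
        # memory ends up tied to whichever thread handled this POST.
        self.app.new_conf = cPickle.loads(conf)
```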

To counter this issue I've moved the cPickle.loads call from the put_conf method in IForArbiter to the setup_new_conf method in ArbiterDaemon. So put_conf in IForArbiter only passes its data to the ArbiterDaemon object, and the ArbiterDaemon is now responsible for calling cPickle.loads on the data.

This way I think the http thread holds no reference to the memory claimed by cPickle.loads. The unpickling now happens in the ArbiterDaemon object, to which the http threads have no connection. I've done some basic testing and this seems to have decreased memory consumption quite a bit, getting rid of the high jumps we see in the graph.
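A sketch of the proposed change, again with simplified names: put_conf only stores the raw pickled string, and the ArbiterDaemon unpickles it later in its own main loop:

```python
import cPickle

class IForArbiter(object):
    def __init__(self, app):
        self.app = app

    def put_conf(self, conf):
        # The HTTP thread only hands the raw pickled string over to the
        # daemon; no large objects are created in this thread anymore.
        self.app.new_conf = conf

class ArbiterDaemon(object):
    def setup_new_conf(self):
        # Unpickling now happens in the daemon itself, outside any HTTP
        # worker thread.
        conf = cPickle.loads(self.new_conf)
        self.new_conf = None  # drop the raw pickled string as soon as we can
        # ... continue applying the new configuration as before ...
```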

I'll need to do some more tests, but I'd like to hear what you think. Thanks!
