traviscrawford edited this page Sep 14, 2010 · 3 revisions

Building scribe with zookeeper support

  1. Download and build Zookeeper 3.2.2. This is straightforward; the build instructions are in /path/to/zookeeper-3.2.2/src/c.
  2. When building Scribe, pass the following flags to bootstrap.sh:
       LDFLAGS="-L/path/to/zookeeper-3.2.2/src/c/.libs -lzookeeper_mt"
       CPPFLAGS="-I/path/to/zookeeper-3.2.2/src/c/include -I/path/to/zookeeper-3.2.2/src/c/generated"
  3. When running Scribe’s ./configure, pass the above LDFLAGS and CPPFLAGS, as well as the --enable-zookeeper flag.

Configuration

Global options

zk_server: Zookeeper host:port. For example, 'zk.foo.com'.
zk_registration_prefix: Zookeeper znode representing the group this scribe is a member of. For example, with '/scribe/aggregator' the two prefix components are created as regular znodes, and this scribe then creates an ephemeral child znode beneath them, named in 'host:port' form.
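
To make the registration layout concrete, here is a minimal Python sketch of the znodes that would be created for a given prefix. It only computes the paths and does not talk to Zookeeper. The function name, the host 'agg1.foo.com', and the port 1463 are illustrative assumptions, not taken from the Scribe source.

```python
def registration_znodes(prefix, host, port):
    """Return (path, kind) pairs, in creation order, for one scribe instance.

    Hypothetical helper: each prefix component becomes a regular znode,
    and the final 'host:port' child is ephemeral.
    """
    plan = []
    path = ""
    for part in (p for p in prefix.split("/") if p):
        path += "/" + part
        plan.append((path, "regular"))
    plan.append((f"{path}/{host}:{port}", "ephemeral"))
    return plan


# Illustrative values only.
plan = registration_znodes("/scribe/aggregator", "agg1.foo.com", 1463)
for znode_path, kind in plan:
    print(kind, znode_path)
```

Because the child znode is ephemeral, it should disappear when the registering scribe's Zookeeper session ends, which is what lets the prefix act as a live membership list.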

Network store

remote_host: The existing remote_host configuration option now also accepts a Zookeeper znode, written as 'zk:///path/to/znode'. For example, to send messages to the members registered at the above zk_registration_prefix, set 'remote_host=zk:///scribe/aggregator'. This randomly selects one of the scribes registered at that prefix.
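
The discovery step described above can be sketched roughly as follows. This is a simplified Python illustration, not Scribe's actual C++ code: the helper names and the member list are assumptions, and the list of children stands in for what a real Zookeeper client would return for the znode.

```python
import random

ZK_SCHEME = "zk://"


def parse_remote_host(value):
    """Return the znode path if value is a zk:// reference, else None."""
    if value.startswith(ZK_SCHEME):
        return value[len(ZK_SCHEME):]
    return None


def pick_member(children, rng=random):
    """Randomly choose one 'host:port' child and split it into its parts."""
    host, _, port = rng.choice(children).rpartition(":")
    return host, int(port)


znode = parse_remote_host("zk:///scribe/aggregator")

# Hypothetical children of the znode, as registered scribes would appear.
members = ["agg1.foo.com:1463", "agg2.foo.com:1463"]
host, port = pick_member(members)
print(znode, host, port)
```

A plain hostname is left alone by parse_remote_host, so existing remote_host settings would behave as before under this sketch.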

Failure modes

Zookeeper support has two distinct parts, registration and discovery.

Registration

Registration is currently best effort, so Zookeeper failures do not cause scribe issues. If Zookeeper goes down after a session has been established, the multi-threaded Zookeeper client we use reconnects automatically.

Discovery

When discovering a remote host in the network store, we simply override the existing remoteHost and remotePort variables and inherit all existing failure modes. Should the remote scribe go down, we later retry the same host, just as today.

Future work

Going forward I’d like to improve error reporting around registration, to better detect when registration has failed. Regarding discovery, I’d like to explore moving the discovery code into the connect path, so that we discover the best remote host at connect time, since the best host likely changes over time.
