Skip to content

periodic timout of 60-110 seconds when starting opsworks-agent #6

@muddman

Description

@muddman

Thanks so much for making opsworks-vm available -- we love it!!

This probably isn't a bug with opsworks-vm but it is behavior I only see when using opsworks-vm so I thought I'd post it. Feel free to close it if you don't think it is appropriate to open here.

Most of the time during a vagrant up/provision we have a delay of 60-110 seconds right as the opsworks-agent is starting. The log shows 6-11 "sleep 10 sec" messages (see below). We have tracked it down to the /opt/aws/opsworks/current/bin/opsworks-agent-updater process which is kicked off at the beginning of each run. If we have a vagrant ssh up and manually kill the opsworks-agent-updater process then the run is not delayed.

Has anyone seen this or know what we can do to keep it from happening?

Here is the log with the sleeps that happen during a vagrant up/provision:

default: Running: inline script
==> default: Checking dependencies...
==> default: Reading input...
==> default: Processing 'recipes.json'...
==> default: Processing 'ops/opsworks/[REDACTED].json'...
==> default: Processing 'ops/dna/[REDACTED].json'...
==> default: Parsing deployments...
==> default: Parsing custom cookbooks...
==> default: Purging local cookbooks cache from '/opt/aws/opsworks/current/site-cookbooks'...
==> default: Generating dna.json...
==> default: Running opsworks agent...
==> default: [2015-10-16 21:54:33]  INFO [opsworks-agent(7456)]: About to run custom custom command from: /tmp/vagrant-opsworks20151016-7438-3rkgmo/dna.json
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: (locked: sleeping 10 secs)
==> default: [Fri Oct 16 21:56:33 2015] About to execute: RUBYOPT="-E utf-8" /opt/aws/opsworks/current/bin/chef-client -j /var/lib/aws/opsworks/chef/2015-10-16-21-54-33-01.json -c /var/lib/aws/opsworks/client.stage1.rb -o opsworks_custom_cookbooks::load,opsworks_custom_cookbooks::execute 2>&1
==> default: [2015-10-16T21:56:34+00:00] INFO: Starting chef-zero on port 8889 with repository at repository at /opt/aws/opsworks/current
==> default:   One version per cookbook
==> default:   data_bags at /var/lib/aws/opsworks/data/data_bags
==> default:   nodes at /var/lib/aws/opsworks/data/nodes
==> default: [2015-10-16T21:56:34+00:00] INFO: Forking chef instance to converge...
==> default: [2015-10-16T21:56:34+00:00] INFO: *** Chef 11.10.4 ***
==> default: [2015-10-16T21:56:34+00:00] INFO: Chef-client pid: 7674

Here are the processes we see running if we have a vagrant ssh up during the run:

root      6419  6418  0 22:09 ?        00:00:00 /bin/sh -c cd /opt/aws/opsworks/current && bin/lockrun --verbose --lockfile=/var/lib/aws/opsworks/lockrun.lock -- /opt/aws/opsworks/current/bin/opsworks-agent-updater
root      6421  6419  0 22:09 ?        00:00:00 bin/lockrun --verbose --lockfile /var/lib/aws/opsworks/lockrun.lock -- /opt/aws/opsworks/current/bin/opsworks-agent-updater
root      6422  6421  7 22:09 ?        00:00:00 /opt/aws/opsworks/local/bin/ruby /opt/aws/opsworks/current/bin/opsworks-agent-updater

If we manually ssh in and kill -9 6419 then everything continues without delay.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions