Skip to content

Deployment Scratchpad

RayBB edited this page Jun 13, 2025 · 6 revisions

Deployment Scratchpad

Deployment Guide

2021-04-21 -- Deployment

  • Open a terminal tab and log into ol-home0
  • Open https://openlibrary.org/admin?stats so that you can monitor server status
  • Warn Slack channels openlibrary and openlibrary-g of imminent downtime!
  • On ol-home0 run cd /opt/openlibrary && sudo git pull origin master
  • On ol-home0 run /opt/openlibrary/scripts/deployment/deploy.sh
  • Run ./scripts/deployment/are_servers_in_sync.sh to ensure the three servers are in sync.
    • They were not in sync :/ Had to git pull on ol-web2. Shouldn't be an issue after PR, I believe.
  • Copy static files out Docker image and put them on ol-www1 GO LINE BY LINE
# Make a backup of static assets
ssh -A ol-www1 'sudo cp -r /opt/openlibrary/openlibrary/static /opt/openlibrary/openlibrary/static_backup'
STATIC_DIR=/tmp/ol-static-$(date '+%Y-%m-%d')
docker cp $(docker create --rm oldev:latest):/openlibrary/static $STATIC_DIR
rsync -rvz $STATIC_DIR/ ol-www1:$STATIC_DIR
# TODO: There's another static dir!
ssh -A ol-www1 "sudo mkdir -p /opt/openlibrary/openlibrary/static-new && sudo cp -r $STATIC_DIR/* /opt/openlibrary/openlibrary/static-new"
ssh -A ol-www1 'sudo chown -R openlibrary:openlibrary /opt/openlibrary/openlibrary/static-new'
ssh -A ol-www1 'sudo rm -r /opt/openlibrary/openlibrary/static && sudo mv /opt/openlibrary/openlibrary/static-new /opt/openlibrary/openlibrary/static'
  • Run /opt/openlibrary/scripts/deployment/restart_servers.sh ol-web1 ol-covers0
  • Run /opt/openlibrary/scripts/deployment/restart_servers.sh ol-home0 && docker restart openlibrary_infobase_nginx_1
  • Run /opt/openlibrary/scripts/deployment/restart_servers.sh ol-web2

2021-04-15 -- Deployment

  • Open a terminal tab and log into ol-home0
  • Open a terminal tab and log into ol-covers0
  • Open a terminal tab and log into ol-web1
  • Open https://openlibrary.org/admin?stats so that you can monitor server status
  • Warn Slack channels openlibrary and openlibrary-g of imminent downtime!
  • On ol-home0
    • cd /opt/openlibrary
    • sudo git checkout master && sudo git pull origin master
    • sudo make git
    • cd /opt/olsystem
    • sudo git checkout master && sudo git pull origin master # -- Enter GitHub userid & token
    • cd /opt/booklending_utils
    • sudo git checkout master && sudo git pull origin master # -- Enter git.archive.org userid & password
    • cd /opt/openlibrary
  • Repeat the same steps on ol-covers0 and ol-web1 # covers does not need booklending_utils
  • Run ./scripts/deployment/are_servers_in_sync.sh to ensure the three servers are in sync.
  • On ol-home0 run /opt/openlibrary/scripts/deployment/deploy.sh
    • NOTE: Run sudo (note note: it superflously rsyncs the repos over to the other clients; we might want to rm the step? Or maybe not git pull?)
  • Copy static files out Docker image and put them on ol-www1
# Make a backup of static assets
ssh -A ol-www1 'sudo cp -r /opt/openlibrary/openlibrary/static /opt/openlibrary/openlibrary/static_backup'
STATIC_DIR=/tmp/ol-static-$(date '+%Y-%m-%d')
docker cp $(docker create --rm oldev:latest):/openlibrary/static $STATIC_DIR
rsync -rvz $STATIC_DIR/ ol-www1:$STATIC_DIR
# TODO: There's another static dir!
ssh -A ol-www1 "sudo mkdir -p /opt/openlibrary/openlibrary/static-new && sudo cp -r $STATIC_DIR/* /opt/openlibrary/openlibrary/static-new"
ssh -A ol-www1 'sudo chown -R openlibrary:openlibrary /opt/openlibrary/openlibrary/static-new'
ssh -A ol-www1 'sudo rm -r /opt/openlibrary/openlibrary/static && sudo mv /opt/openlibrary/openlibrary/static-new /opt/openlibrary/openlibrary/static'
  • Run ~/are_servers_in_sync.sh to ensure the three servers have the same Docker latest.
  • Deploy to ol-web2

2020-03-X?

ADD A PIECE ON ROLLBACK WITH AND WITHOUT VOLUME MOUNTS

  1. docker image ls
  2. Write down the Docker IMAGE ID that you want to roll back to.
  3. docker tag oldev:<IMAGE ID> oldev:latest
  4. Run /opt/openlibrary/scripts/deployment/restart_servers.sh
  5. Repeat the above steps as required on ol-covers0, ol-web1, ol-web2

EnvVariable $OLDEV_DOCKER_SHA for which Docker image to use oldev:latest vs oldev:SomeSHA

After deploy.sh finishes successfully,

Browse http://ol-web1.us.archive.org:8080/status :

  • Software version [???]
  • Python version 3.8.6
  • Host ol-web1.us.archive.org
  • Browse other pages of the site looking for issues

Once things look stable and correct...

  • Log out of ol-web1 and into ol-web2
  • Sync the repos
  • Run ~/are_servers_in_sync.sh to ensure the repos and Docker latest match all other servers
  • Run /opt/openlibrary/scripts/deployment/restart_servers.sh on ol-web2
  • https://openlibrary.org/admin?stats ol-web2 goes green --> red --> green

Browse http://ol-web2.us.archive.org:8080/status :

  • Software version [???]
  • Python version 3.8.6
  • Host ol-web2.us.archive.org
  • Browse other pages of the site looking for issues

2021-04-08 -- FAILED

HALT DEPLOYMENT: docker-compose incompatibility discussed in #5007

2021-03-24 -- Deployment -- IMAGE ID b634610574fc -- Software ver f4f9233

  • Lot's of dependency updates after turning on dependabot
  • Follow 2021-03-15 gameplan below

Open an ssh tunnel; i.e. ssh -L 8080:ol-web1:8080 -qAy cclauss@sshgw-sf.us.archive.org Then you'll be able to test at localhost:8080


# Make a backup of static assets
ssh -A ol-www1 'sudo cp -r /opt/openlibrary/openlibrary/static /opt/openlibrary/openlibrary/static_backup'
STATIC_DIR=/tmp/ol-static-$(date '+%Y-%m-%d')
docker cp $(docker create --rm oldev:latest):/openlibrary/static $STATIC_DIR
rsync -rvz $STATIC_DIR/ ol-www1:$STATIC_DIR
# TODO: There's another static dir!
ssh -A ol-www1 "sudo mkdir -p /opt/openlibrary/openlibrary/static-new && sudo cp -r $STATIC_DIR/* /opt/openlibrary/openlibrary/static-new"
ssh -A ol-www1 'sudo chown -R openlibrary:openlibrary /opt/openlibrary/openlibrary/static-new'
ssh -A ol-www1 'sudo rm -r /opt/openlibrary/openlibrary/static && sudo mv /opt/openlibrary/openlibrary/static-new /opt/openlibrary/openlibrary/static'

2021-03-15 -- Deployment

  • Open a terminal tab and log into ol-home0
  • Open a terminal tab and log into ol-covers0
  • Open a terminal tab and log into ol-web1
  • Open https://openlibrary.org/admin?stats so that you can monitor server status
  • Warn Slack channels openlibrary and openlibrary-g of imminent downtime!
  • On ol-home0
    • cd /opt/openlibrary
    • sudo git checkout master && sudo git pull
    • sudo make git
    • cd /opt/olsystem
    • sudo git checkout master && sudo git pull # -- Enter GitHub userid & token
    • cd /opt/booklending_utils
    • sudo git checkout master && sudo git pull # -- Enter git.archive.org userid & password
    • cd /opt/openlibrary
  • Repeat the same steps on ol-covers0 and ol-web1 # covers does not need booklending_utils
  • Run ~/are_servers_in_sync.sh to ensure the three servers are in sync.
  • On ol-home0 run /opt/openlibrary/scripts/deployment/deploy.sh
  • Start an old-style deploy: ssh -A ol-home /olsystem/bin/deploy-code openlibrary
  • Copy files out Docker image and put them on ol-www1
  • Run ~/are_servers_in_sync.sh to ensure the three servers have the same Docker latest.

ADD A PIECE ON ROLLBACK WITH AND WITHOUT VOLUME MOUNTS

  1. docker image ls
  2. Write down the Docker IMAGE ID that you want to roll back to.
  3. docker tag oldev:<IMAGE ID> oldev:latest
  4. Run /opt/openlibrary/scripts/deployment/restart_servers.sh
  5. Repeat the above steps as required on ol-covers0, ol-web1, ol-web2

EnvVariable $OLDEV_DOCKER_SHA for which Docker image to use oldev:latest vs oldev:SomeSHA

After deploy.sh finishes successfully,

Browse http://ol-web1.us.archive.org:8080/status :

  • Software version [???]
  • Python version 3.8.6
  • Host ol-web1.us.archive.org
  • Browse other pages of the site looking for issues

Once things look stable and correct...

  • Log out of ol-web1 and into ol-web2
  • Sync the repos
  • Run ~/are_servers_in_sync.sh to ensure the repos and Docker latest match all other servers
  • Run /opt/openlibrary/scripts/deployment/restart_servers.sh on ol-web2
  • https://openlibrary.org/admin?stats ol-web2 goes green --> red --> green

Browse http://ol-web2.us.archive.org:8080/status :

  • Software version [???]
  • Python version 3.8.6
  • Host ol-web2.us.archive.org
  • Browse other pages of the site looking for issues

2021-03-04 -- Deployment...

Do old-style deploy EARLY

One cover job failed on an error -- Restarting the server cleaned it up.

Troubles on ol-home because of

  1. a lack of diskspace
    • Remove older files from ol-home:/opt/openlibrary/deploys/openlibrary
  2. Problems with one or more PRs in the deployment

2021-02-12 -- Deployment...

Check for red flags in nagios

How do I log in???

Not confident that all repos:

  1. are set to the master branch
  2. have no uncommitted changes
  3. git pull works properly

Created a script in ~cclauss/are_repos_in_sync.sh to ensure repos are all using the same SHA on all hosts

When running deploy you must enter:

  1. GitLab id and password
  2. GitHub id and password

One Docker image download failed so deploy.sh needed to be halted and rerun.

After deploy.sh finishes successfully, run /opt/openlibrary/scripts/deployment/restart_servers.sh on:

  • ol-home0
  • ol-covers0
  • ol-web1
Creating openlibrary_covers_2    ... error
Creating openlibrary_memcached_1 ... done

ERROR: for openlibrary_covers_2  Cannot create container for service covers: failed to mkdir /var/lib/docker/volumes/openlibrary_ol-vendor/_data/infogami/infogami: mkdir /var/lib/docker/volumes/openlibrary_ol-vendor/_data/infogami/infogami: file exists

ERROR: for covers  Cannot create container for service covers: failed to mkdir /var/lib/docker/volumes/openlibrary_ol-vendor/_data/infogami/infogami: mkdir /var/lib/docker/volumes/openlibrary_ol-vendor/_data/infogami/infogami: file exists
Encountered errors while bringing up the project.

Browse ol-web1.us.archive.org:8080/status:

  • Software version [c446875]
  • Python version 3.8.6
  • Host ol-web1.us.archive.org
  • Browse -- No covers
    • Rerun /opt/openlibrary/scripts/deployment/restart_servers.sh on ol-covers0
    • Do Old-style deploy

Old-style deploy: ssh -A ol-home /olsystem/bin/deploy-code openlibrary

Once things look stable and correct...

Browse http://ol-web2.us.archive.org:8080/status :

  • Software version [c446875]
  • Python version 3.8.6
  • Host ol-web2.us.archive.org
  • Browse -- Perfect!

DONE!

2021-02-04 -- Deployment...

DO NOT sudo bash (or rsync will be unhappy)

OL-HOME0 cd /opt/openlibrary scripts/deployment/deploy.sh Provide userid/password for git.archive.org Provide userid/token for github.com Q: Why do we get dumped into emacs? (This was sudo bash) 4 minutes to docker-compose build --pull web 32 seconds to docker-compose run -uroot --rm home make i18n REPOSITORY TAG IMAGE ID CREATED SIZE oldev 7dbe17f2a809697acf53766bd89d60443653260d 9c8a9bb06afc 37 seconds ago 2.87GB 4 minutes to gzip 1 minute per server to rsync image plus repos Need to time docker load

2021-02-02 -- Process to update the infogami submodule...

pwd # openlibrary/vendor/infogami git pull upstream master # Ensure we have the latest infogami git status cd ../.. git status git branch git checkout -b update-infogami git add vendor/infogami git status git commit -m"update infogami" git push --set-upstream origin update-infogami # Create a PR and merge it

2021-01-25

  • Using scripts/deployment
    • 4-minute docker build
    • need to time make i18n
    • 4-minute docker save
    • rsyncs fail :-(

2021-01-21

  • solr-updater
    • Needed /opt/.petabox/dbserver, with perms for openlibrary user +x.
    • Something weird had happened on ol-home! the .offset files had disappeared :/ Seemed like solr-updater was stuck in a restart loop. Also infobase had been running on it for ~10h. Likely at some point the entire ol-home VM was restarted, causing supervisor to restart all the processes.
    • Remove the old processes from /etc/supervisor/conf.d, and ran sudo supervisorctl update
    • solr-updater is running ol-home0 from 2021-01-21 ; likely missed ~10h of edits or something. No clue how long the break was.
    • To check last edit + offset processed by solr-updater on ol-home0: docker-compose exec solr-updater cat '/solr-updater-data/solr-update.offset'
    • To get the last record/datetime: curl -s "ol-home0:7000/openlibrary.org/log/$(docker-compose exec solr-updater cat '/solr-updater-data/solr-update.offset')?limit=1"
  • import-bot
    • It launched ok, but we noticed that on both ol-home and ol-home0 we were getting loads of errors:
l: https://openlibrary.org/api/import/ia. Response:
importbot_1         | 2021-01-21 18:43:06 [178] [openlibrary.importer] [INFO] sleeping for 5 seconds before next attempt.
importbot_1         | 2021-01-21 18:43:11 [178] [openlibrary.api] [INFO] POST /api/import/ia
importbot_1         | 2021-01-21 18:43:11 [178] [openlibrary.importer] [WARNING] Failed to contact OL server. error=403 Client Error: Forbidden for url: https://openlibrary.org/api/import/ia. Response:
importbot_1         | 2021-01-21 18:43:11 [178] [openlibrary.importer] [ERROR] failed with internal error
importbot_1         | 2021-01-21 18:43:11 [178] [openlibrary.imports] [INFO] set-status professioncallgi0000luca - failed internal-error None
importbot_1         | 2021-01-21 18:43:11 [178] [openlibrary.importer] [INFO] importing istanbultocairoo0000hump
importbot_1         | 2021-01-21 18:43:11 [178] [openlibrary.api] [INFO] POST /api/import/ia
importbot_1         | 2021-01-21 18:43:11 [178] [openlibrary.importer] [WARNING] Failed to contact OL server. error=403 Client Error: Forbidden for url: https://openlibrary.org/api/import/ia. Response:
importbot_1         | 2021-01-21 18:43:11 [178] [openlibrary.importer] [INFO] sleeping for 5 seconds before next attempt.
importbot_1         | 2021-01-21 18:43:16 [178] [openlibrary.api] [INFO] POST /api/import/ia
importbot_1         | 2021-01-21 18:43:17 [178] [openlibrary.importer] [WARNING] Failed to contact OL server. error=403 Client Error: Forbidden for url: https://openlibrary.org/api/import/ia. Response:
importbot_1         | 2021-01-21 18:43:17 [178] [openlibrary.importer] [INFO] sleeping for 5 seconds before next attempt.
importbot_1         | 2021-01-21 18:43:22 [178] [openlibrary.api] [INFO] POST /api/import/ia
importbot_1         | 2021-01-21 18:43:22 [178] [openlibrary.importer] [WARNING] Failed to contact OL server. error=403 Client Error: Forbidden for url: https://openlibrary.org/api/import/ia. Response:
Response: None
2021-01-21 18:30:31 [2374] [openlibrary.importer] [ERROR] failed with internal error
2021-01-21 18:30:31 [2374] [openlibrary.imports] [INFO] set-status practicalapproac0000sime_g7a3 - failed internal-error None
0.0 (1): UPDATE import_item SET status = 'failed', error = 'internal-error', import_time = '2021-01-21T18:30:31.230473', ol_key = NULL WHERE id=2424861
2021-01-21 18:30:31 [2374] [openlibrary.importer] [INFO] importing catalogueofmisce00slei_5
2021-01-21 18:30:31 [2374] [openlibrary.api] [INFO] POST /api/import/ia
/opt/openlibrary/venv/local/lib/python2.7/site-packages/urllib3/util/ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
2021-01-21 18:30:31 [2374] [openlibrary.importer] [WARNING] Failed to contact OL server. error=403 Client Error: Forbidden for url: https://openlibrary.org/api/import/ia. Response:
2021-01-21 18:30:31 [2374] [openlibrary.importer] [INFO] sleeping for 5 seconds before next attempt.
2021-01-21 18:30:36 [2374] [openlibrary.api] [INFO] POST /api/import/ia
/opt/openlibrary/venv/local/lib/python2.7/site-packages/urllib3/util/ssl_.py:139: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecurePlatformWarning,
2021-01-21 18:30:36 [2374] [openlibrary.importer] [WARNING] Failed to contact OL server. error=403 Client Error: Forbidden for url: https://openlibrary.org/api/import/ia. Response:

2021-01-15 Deployment steps

Use the scripts from https://github.com/internetarchive/openlibrary/pull/4395

  1. ol-home0: Run scripts/deployment/start_production_deploy.sh
    • NOTE: Login problem with booklending_utils (redo manually)
  2. ol-home0, ol-web1, ol-covers: Run scripts/deployment/continue_production_deploy.sh
    • Run the script only until just before the docker-compose down step.
    • Open https://openlibrary.org/admin?stats
    • Once all three hosts are ready, run commands from docker-compose down to the bottom of the script.
    • Check https://openlibrary.org/admin?stats and http://ol-web1.us.archive.org:8080/status
  3. Repeat step 2 only on the host ol-web2

NOTE: On both ol-web{1, 2}, the command docker-compose run -uroot --rm home make i18n failed with:

ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting
    COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

Manually re-running the command finished in a few seconds with no warnings or errors.

THINGS WE FORGOT: Old-style install on ol-home -- ssh -A ol-home /olsystem/bin/deploy-code openlibrary

2021-01-09 Deployment steps

  1. ol-home0: sudo git pull origin master the four repos
  2. ol-home0: build new Docker image
  3. ol-home0: docker save the image with both tags :SHA and :latest
    • Should we used date-time instead of :SHA?
  4. ol-home0: rsync the :SHA-tagged image to ol-web{1,2}, ol-covers
  5. ol-web{1,2}, ol-covers: docker image prune && docker load the image with both tags :SHA and :latest
  6. ol-home0: rsync the four repos to ol-web{1,2}, ol-covers
  7. all hosts: docker-compose up the appropriate docker services for that host

When start_production_deploy.sh is run on ol-home0 it should achieve steps 1. thru 4. above.

#!/bin/bash

# This script must be run on ol-home0 to start a new deployment.

if [[ ${HOSTNAME:-$HOST} != "ol-home0" ]]; then
    echo "FATAL: Must only be run on ol-home0" ;
    exit 1 ;
fi

# `sudo git pull origin master` the core Open Library repos:
# 1. https://github.com/internetarchive/olsystem
# 2. https://git.archive.org/jake/booklending_utils
# 3. https://github.com/internetarchive/openlibrary
# 4. https://github.com/internetarchive/infogami
REPO_DIRS="/opt/olsystem /opt/booklending_utils /opt/openlibrary /opt/openlibrary/vendor/infogami"
for REPO_DIR in $REPO_DIRS
do
    cd $REPO_DIR
    sudo git pull origin master
done

# These commands were run once and probably do not need to be repeated
sudo mkdir -p /opt/olimages || true
sudo chown root:staff /opt/olimages
sudo chmod g+w /opt/olimages
sudo chmod g+s /opt/olimages
docker image prune

# Build the oldev Docker production image
cd /opt/openlibrary
export COMPOSE_FILE="docker-compose.yml:docker-compose.production.yml"
docker-compose build --pull web
docker-compose run -uroot --rm home make i18n

# Add a timestamp tag to the Docker image to facilitate rapid rollback
echo "FROM oldev:latest" | docker build -t "oldev:$(date +%Y-%m-%d_%H-%M)" -
docker image ls

# Compress the image in a .tar.gz file for transfer to other hosts
cd /opt/olimages
docker save oldev:latest | gzip > oldev_latest.tar.gz \*2

# Transfer the .tar.gz image to other hosts
REMOTE_HOSTS="ol-covers0 ol-web1 ol-web2"
for REMOTE_HOST in $REMOTE_HOSTS
do
    echo "Starting rsync of oldev_latest.tar.gz to $REMOTE_HOST..."
    rsync -a --no-owner --group --verbose oldev_latest.tar.gz "$REMOTE_HOST:/opt/olimages/"
    echo "Finished rsync of oldev_latest.tar.gz to $REMOTE_HOST..."
done

2021-01-07 Deploy

THINGS WE FORGOT:

  1. To down/up infobase:
export COMPOSE_FILE="docker-compose.yml:docker-compose.production.yml"
docker-compose down
docker volume rm openlibrary_ol-vendor openlibrary_ol-build openlibrary_ol-nodemodules
HOSTNAME="$HOSTNAME" docker-compose up -d --no-deps infobase infobase_nginx affiliate-server
  1. To check for new errors on Sentry 😅

  1. @mek Modify fabfile to rsync ol code repos to the new servers in their expected directories
cp ol-home:/deploy/latest/opt/openlibrary ol-web1:/opt/openlibrary
  1. @mek Run old-style deploy from ol-home
ssh -A ol-home
/olsystem/bin/deploy-code openlibrary
  1. @mek Checkout infogami 7595ae7 on ol-home

  1. @cclauss ol-home0 pull latest code of all repo
  2. @cclauss ol-home0 build the image *1
export COMPOSE_FILE="docker-compose.yml:docker-compose.production.yml"
docker-compose build --pull web
docker-compose run -uroot --rm home make i18n
  1. @cclauss Create /opt/olimages for everything
    • ol-web1
    • ol-web2
    • ol-covers0
    • @cdrini ol-home0
sudo mkdir -p /opt/olimages
sudo chown root:staff /opt/olimages
sudo chmod g+w /opt/olimages
sudo chmod g+s /opt/olimages
  1. @cdrini Docker save image
cd /opt/openlibrary
docker image prune # Not necessary; but cleanup
echo "FROM oldev:latest" | docker build -t "oldev:$(git rev-parse HEAD)" -

cd /opt/olimages

time docker save oldev:latest | gzip > oldev_latest.tar.gz \*2
# ~3 min, final file 820 MB ; image in docker container ls was 2.6 GB
  1. Modify fabfile to rsync docker image to all the hosts
cd /opt/olimages
# each takes ~18s
time rsync -a --no-owner --group --verbose oldev_latest.tar.gz "ol-web1:/opt/olimages/"
time rsync -a --no-owner --group --verbose oldev_latest.tar.gz "ol-web2:/opt/olimages/"
time rsync -a --no-owner --group --verbose oldev_latest.tar.gz "ol-covers0:/opt/olimages/"
  1. Do old style deploy using mek's code so that it doesn't try to rsync to ol-web{1,2} ol-covers0 ol-home0 *4
    • Put mek's fabfile.py changes into fab_file_changes.diff
ssh -A ol-home
/olsystem/bin/deploy-code openlibrary
  1. [x web1, x web2, x covers0] @cclauss git pull latest code
cd /opt/olsystem                    && sudo git pull origin master
cd /opt/booklending_utils           && sudo git pull origin master
cd /opt/openlibrary                 && sudo git pull origin master
cd /opt/openlibrary/vendor/infogami && sudo git pull origin master
cd /opt/openlibrary
  1. [x web1, x web2, x covers0] docker load on all hosts *3
    • The new docker image should have label "SHA" as well as "latest"
# ~2min
time docker image prune

# ~4min
time docker load < /opt/olimages/oldev_latest.tar.gz

cd /opt/openlibrary
echo "FROM oldev:latest" | docker build -t "oldev:$(git rev-parse HEAD)" -

for node in ol-web{1,2} ol-covers0

  1. Down / up
export COMPOSE_FILE="docker-compose.yml:docker-compose.production.yml"
# WARNING! Moment of downtime 😬 
docker-compose down
docker volume rm openlibrary_ol-vendor openlibrary_ol-build openlibrary_ol-nodemodules
HOSTNAME="$HOSTNAME" docker-compose up -d --scale covers=2 covers_nginx memcached
export COMPOSE_FILE="docker-compose.yml:docker-compose.production.yml"
docker-compose down
# Remove these because they contain stale copies of the build/vendor/nodemodules files.
# Want to use the files inside the docker image since it's freshly built.
# NOTE: longer term, we likely don't want these volume mounted for production
docker volume rm openlibrary_ol-vendor openlibrary_ol-build openlibrary_ol-nodemodules
docker-compose run -uroot --rm home make i18n
HOSTNAME="$HOSTNAME" docker-compose up --no-deps -d web

Notes

  1. Hit a docker error: ERROR: Couldn't connect to docker daemon . Was not in docker group
# Make sure you're in the docker group, to avoid sudo-ing all the docker commands
sudo usermod -aG docker USER_NAME
# Exit/re-enter to take effect
  1. We might want a date-based instead of SHA-based image label, so that this is more human-readable (maybe date +%Y-%m-%dT%H%M)

  2. It looks like docker load reloads all the layers!

Here is docker image ls before:

drini@ol-web1:/opt/olimages$ docker image ls
REPOSITORY           TAG                 IMAGE ID            CREATED             SIZE
oldev                latest              24e7730de44e        3 weeks ago         2.55GB
openlibrary/olbase   latest              7107cde0123e        4 weeks ago         2.43GB
openlibrary/olbase   <none>              e7ad65c36ade        5 weeks ago         2.45GB
openlibrary/olbase   <none>              5bc304b7af6a        6 weeks ago         2.46GB
openlibrary/olbase   <none>              8b9ba7350023        8 weeks ago         2.46GB
openlibrary/olbase   <none>              089f94fd1626        8 weeks ago         2.45GB

It notes:

The image oldev:latest already exists, renaming the old one with ID sha256:24e7730de44e45356224fd9ec0f58eb39ef104f5e38b28227ff100ed67a1d0f6 to empty string
Loaded image: oldev:latest

After:

drini@ol-web1:/opt/olimages$ docker image ls
REPOSITORY           TAG                 IMAGE ID            CREATED             SIZE
oldev                latest              8b1a1ec051bf        About an hour ago   2.58GB
<none>               <none>              24e7730de44e        3 weeks ago         2.55GB
openlibrary/olbase   latest              7107cde0123e        4 weeks ago         2.43GB
openlibrary/olbase   <none>              e7ad65c36ade        5 weeks ago         2.45GB
openlibrary/olbase   <none>              5bc304b7af6a        6 weeks ago         2.46GB
openlibrary/olbase   <none>              8b9ba7350023        8 weeks ago         2.46GB
openlibrary/olbase   <none>              089f94fd1626        8 weeks ago         2.45GB
  1. TODO for @cdrini: Remove all refs to ol-solr2 in olsystem

2020-12-10 Deploy

THINGS I FORGOT TO DO:

  • docker-compose run -uroot --rm home make i18n (Added after-the-fact)
  1. On web{1,2} chown /booklending_utils
sudo chown 999:staff -R /opt/booklending_utils
sudo chmod g+w -R /opt/booklending_utils
sudo find /opt/booklending_utils -type d -exec chmod g+s {} \;
  1. Do a classic deploy so that ol-www1 has up-to-date static files
    • Confirm web{3,4} not restarted

for node in web{1,2}:

  1. [x, x] Down (down first so code in memory that's currently running is not out-of-sync with what's on the filesystem)
export COMPOSE_FILE="docker-compose.yml:docker-compose.production.yml"
docker-compose down
  1. [x, x] Checkout
cd /opt/olsystem
git pull origin master
cd /opt/openlibrary
git pull origin master
cd /opt/openlibrary/vendor/infogami
git pull origin master
cd /opt/booklending_utils
git pull origin master
  1. [x, x] Build
export COMPOSE_FILE="docker-compose.yml:docker-compose.production.yml"
docker-compose build --pull web
docker-compose run -uroot --rm home make i18n
  1. [x, x] Restart
export COMPOSE_FILE="docker-compose.yml:docker-compose.production.yml"
HOSTNAME=$HOSTNAME PYENV_VERSION=3.8.6 docker-compose up -d --no-deps web

Sort of issue? On web2, got a weird issue; git pull origin master was not pulling that latest infogami. We think this is because of some odd funny business with .git and /openlibrary (which contains .git) being mounted.

  • Seems like I forgot the COMPOSE_FILE

Note: need to build to JS and everything else as well!

What is the status of https://github.com/internetarchive/openlibrary/issues/2032 ?

Option:

  1. Merge your olsystem PR, and test running the old deploy code (should do checkout on all the nodes)
  1. Manually do it

Option:

  1. Try to docker save/load, and only build on ol-home
    • If we didn't mount /openlibrary...
  1. Do it manually

ol-www1 is what serves static files, so we need a way to build the js/css/components (inside docker), and then copy the files over to ol-www1

Creates a tar: docker save oldev:latest

??? Get to the other servers

docker load oldev.tar oldev:14f docker tag oldev:14f latest

Clone this wiki locally