# Infrastructure Overview

*(Originally authored by @mayel [on this cryptpad](https://cryptpad.fr/code/#/2/code/edit/n+GUQRCm4OZWgkfizNTkjjEp/) and discussed on [this loomio thread](https://www.loomio.org/d/V8AZunq0/tech-ops-team-responsibilities-initial-documentation))*

## A. Domain name

1. The social.coop domain is registered under the https://gandi.net account of Enric of FairCoop, who should be notified and sent payment each year before it expires.
2. It would be good to get the domain transferred, or at least administratively delegated, to an account under social.coop control.
3. DNS is managed via https://cloudflare.com, where we have the option of turning on DDoS protection and CDN/caching functionality if necessary.

## B. Infrastructure

1. Back-end server (dedicated, with 8GB RAM and 4x 2.4GHz ARM cores): trunk.social.coop
    * 2x 50GB SSD volumes
    * Docker Swarm manager (main config: `/root/stacks/social.coop/docker-cloud.yml`)
    * 5x Docker containers running Mastodon (config: `/root/stacks/social.coop/.env.production`)
2. Front-end server (VPS with 2GB RAM and a 2GHz ARM core): toot.social.coop
    * 1x 50GB SSD volume
    * Docker Swarm worker
    * Docker container with Nginx proxy serving all web requests
    * Docker container with the Membership Application/Invite app (PHP; uses the same Postgres database server as Mastodon on the back-end; code: https://github.com/socialcooperative/dataverse)
3. I've also been hosting the following free of charge on other servers:
    * https://status.social.coop (powered by https://cachethq.io). This needs to be on different infrastructure so it remains accessible if the main servers go down.
    * https://wiki.social.coop (powered by MediaWiki). This was initially set up as an experiment and should now be migrated to social.coop's servers.
4. Third-party services
    * https://cloudvault.me (Mayel) provides the servers
    * https://cloud.docker.com for managing the Docker swarm and deployments
    * Files uploaded to Mastodon (images, etc.) are stored on https://www.dreamhost.com DreamObjects storage and delivered by the https://www.fastly.com CDN
    * Email delivery by https://www.mailgun.com (10,000 emails free every month)
    * https://www.datadoghq.com and http://pingometer.com for monitoring
    * https://cloudflare.com for DNS

## C. Monitoring

1. Web services
    * Watch for downtime alerts from services like http://pingometer.com
    * https://status.social.coop currently needs to be updated manually, but could be hooked up to a monitoring system
2. Enough free space on volumes
3. Performance / resource usage / container logs (Victor has set up Docker to feed into https://www.datadoghq.com)
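
The free-space check in item 2 could be scripted along these lines. This is only a sketch: the function name `volumes_over_threshold` and the 80% threshold are assumptions for illustration, not part of the current setup. It reads `df -P` style output on stdin, so it can be tried out without touching real disks, and it assumes mount points contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: print mount points whose usage is at or above a threshold.
# Expects POSIX `df -P` output on stdin (one header row, then one row per filesystem).
volumes_over_threshold() {
    threshold="$1"   # percentage, e.g. 80
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header row
            use = $5                  # "Capacity" column, e.g. "83%"
            sub(/%/, "", use)         # strip the percent sign
            if (use + 0 >= limit + 0) print $6 " " use "%"
        }'
}
```

On a host it might be used as `df -P | volumes_over_threshold 80`, with any output treated as a warning to forward to whichever alerting channel the Ops Team prefers.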

## D. Regular Updates

1. Host systems (Ubuntu LTS package upgrades)
2. Dockerfiles and containers
3. Regular Mastodon upgrades
    * Make sure to make backups first, then check for updates and setup instructions at https://github.com/tootsuite/mastodon/releases
4. Occasional updates of the membership app, MediaWiki, etc.

## E. Security

1. HTTPS / SSL certificates
    * Using https://certbot.eff.org
    * Currently certbot needs to be run manually for each domain (social.coop and members.social.coop) before the certificate expires (every 3 months)
    * Need to set up a better, Docker-compatible way to auto-renew certificates
2. Backups
    * Currently, manual backups are done occasionally and stored offsite by admins
    * Need to create backup & recovery processes
    * Need to choose and set up backup solutions & a storage location
3. Firewalls
    * Review rules
    * Monitor logs
4. DDoS
    * See about enabling Cloudflare's protection

## F. Documentation & Communication

1. Document any new infrastructure / software / service / config
2. Keep all code in shared git version control
3. Keep private keys and passwords separate from configuration, and place all config files in shared git version control
4. Proactively communicate with the Tech WG about the reasons, approach and outcome of every change / update, and then add them to the documentation
5. Let fellow Ops Team members know before any prolonged unavailability (as much as possible)
6. Communicate with the Ops Team during any emergency, or before doing anything that affects live services
7. Create/use individual accounts/passwords for each admin as much as possible
8. Use a secure solution for storing all shared secrets (like passwords)

## G. Fix unexpected issues

YMMV

---

# Some initial documentation

---

All of these commands must be run on the server that is the Docker swarm manager (trunk.social.coop):

To list all Docker swarm services: `docker service ls`

To stop a service: `docker service scale [service name]=0`

For example: `docker service scale mastodon_dataverse=0`

To start a service: `docker service scale [service name]=1`

For example: `docker service scale mastodon_dataverse=1`
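
The stop and start commands above can be combined into a small restart helper. This is a minimal sketch, assuming the service runs a single replica; `restart_service` is a hypothetical name, not something that exists on the server.

```shell
#!/bin/sh
# Hypothetical helper: restart a swarm service by scaling it to 0 and back to 1.
# Assumes the service normally runs exactly one replica.
restart_service() {
    service="$1"
    if [ -z "$service" ]; then
        echo "usage: restart_service <service name>" >&2
        return 1
    fi
    # Only scale back up if scaling down succeeded.
    docker service scale "${service}=0" && docker service scale "${service}=1"
}
```

For example, `restart_service mastodon_dataverse` would run the two example commands above in sequence.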

To re-deploy the whole swarm: `cd /root/mastodon/repo/ ; docker stack deploy -c docker-cloud.yml mastodon`

This command seems to be quite clever, in that it only touches services whose configs or images have changed.

Avoid running commands using docker-compose, because it will start new instances. Instead, run commands against the existing containers (you can use tab to autocomplete the container name): `docker exec -it [container name] [command]`

For example: `docker exec -it mastodon_db[TAB] bash`
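
Tab completion only works interactively, so a script needs another way to find the full container name. The sketch below uses `docker ps` with a name filter; the function name `container_for` is hypothetical, and it assumes the first match is the container you want.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the first running container whose
# name contains the given prefix (e.g. "mastodon_db").
container_for() {
    docker ps --filter "name=$1" --format '{{.Names}}' | head -n 1
}
```

It could then stand in for `[TAB]`, as in `docker exec -it "$(container_for mastodon_db)" bash`. Note that this only sees containers running on the host where it is executed.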

Docker Cloud is set to auto-build a container image when new code is pushed to our GitHub repos (like the members app). This takes a while, but you can then upgrade the container with: `docker service update --image socialcooperative/members-dataverse mastodon_dataverse`

For special social.coop customisations, the Docker config mounts `/var/nfs/data/www/mastodon/` over the container's internal volume. It contains files that override the stock ones provided by Mastodon (homepage, bylaws, custom signup form, logo, stylesheets, etc.).

Location of Postgres database files: `/var/vol2/postgres/mastodon`

### Mastodon upgrades

Here are some **example** steps that you **might** take when upgrading Mastodon. Please note that the process may be different every time, and that issues may arise, so make sure you have a few hours free!

* Put social.coop in maintenance mode:

`cd /var/nfs/data/www/mastodon/public/ ; mv maintenance_off.html maintenance.html`

* Make a backup of the database:

*(Note: `[TAB]` means tab autocompletion)*

    docker exec -it mastodon_db[TAB] bash
    df -h # check that there is more than ~5GB of free space in /var/lib/postgresql/data/!
    pg_dumpall -U postgres -c -v -f /var/lib/postgresql/data/db-backup-$(date +%F_%R).bak
    exit

* Bump up all the Mastodon version numbers in `/root/mastodon/repo/docker-cloud.yml` (for the sidekiq, web & streaming services)

* Re-deploy the whole stack (this will only touch what changed):

`cd /root/mastodon/repo/ ; docker stack deploy -c docker-cloud.yml mastodon`

* Check how things are doing: `docker stack ps mastodon`

* Enter the main Mastodon app container: `docker exec -it mastodon_web[TAB] bash`

* Run all appropriate rake tasks as instructed by the Mastodon release notes (check the notes from all releases between the current version and the new version).
    * You can also view all available rake tasks: `rake -A -T`

* Exit the Mastodon app container: `exit`

* Make social.coop live again and check that everything is working:

`cd /var/nfs/data/www/mastodon/public/ ; mv maintenance.html maintenance_off.html`
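
The maintenance-mode and backup steps above lend themselves to small shell functions. The following is a sketch under stated assumptions, not an established procedure: the function names and the `PUBLIC_DIR` variable are hypothetical, the paths and the `mastodon_db` name come from the steps above, and the timestamp uses `%H%M` rather than `%R` to avoid a colon in the file name.

```shell
#!/bin/sh
# Sketch only: function names and PUBLIC_DIR are illustrative assumptions.
PUBLIC_DIR="${PUBLIC_DIR:-/var/nfs/data/www/mastodon/public}"

# Put the site into maintenance mode by renaming the placeholder page.
maintenance_on() {
    [ -f "$PUBLIC_DIR/maintenance_off.html" ] || {
        echo "maintenance_off.html not found (already in maintenance mode?)" >&2
        return 1
    }
    mv "$PUBLIC_DIR/maintenance_off.html" "$PUBLIC_DIR/maintenance.html"
}

# Take the site out of maintenance mode again.
maintenance_off() {
    [ -f "$PUBLIC_DIR/maintenance.html" ] || {
        echo "maintenance.html not found (not in maintenance mode?)" >&2
        return 1
    }
    mv "$PUBLIC_DIR/maintenance.html" "$PUBLIC_DIR/maintenance_off.html"
}

# Dump all databases from inside the running Postgres container,
# without opening an interactive shell first.
backup_database() {
    container=$(docker ps --filter "name=mastodon_db" --format '{{.Names}}' | head -n 1)
    [ -n "$container" ] || { echo "no mastodon_db container found" >&2; return 1; }
    stamp=$(date +%F_%H%M)
    docker exec "$container" pg_dumpall -U postgres -c \
        -f "/var/lib/postgresql/data/db-backup-$stamp.bak"
}
```

A typical run would be `maintenance_on && backup_database`, followed by the manual upgrade steps and finally `maintenance_off`. The free-space check (`df -h`) is deliberately left as a manual step here.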