Server

The server is hosted in the Lys data center at EPFL. Due to legal limitations on EPFL's side, it is fed by a direct connection to the ISP and is thus not on the school's network. The server is in rack 19, slots 9 to 11. Access can be requested through AGEP's IT manager, who is responsible for the management of the rack.

It can be accessed via SSH as the user clic-admin; a corresponding .ssh/config entry is shown below. Key management is done through the init.yaml playbook in the infra repo.

Host clic
  HostName clic.epfl.ch
  User clic-admin
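
With this entry in place, the server can be reached with:

ssh clic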

Connectivity

The connection is provided by Saitis, along with an IPv4 address (62.220.153.185) and an IPv6 subnet (2001:788:f185::/48) exposed on 2001:788:153:a:185::2/80 (important: the IPv6 subnet is not yet set up on the server). The current netplan configuration, /etc/netplan/static.yaml, is shown below and can be deployed using netplan generate && netplan apply.

network:
  version: 2
  renderer: networkd
  ethernets:
    enp7s4:
      addresses:
        - 62.220.153.185/24
      nameservers:
        addresses:
          - 8.8.8.8
      routes:
        - to: default
          via: 62.220.153.1
          metric: 200
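
Since a broken network configuration can lock you out of a remote machine, netplan try (which rolls back automatically unless confirmed) can be used before making the change permanent; both commands must be run as root:

netplan try                         # applies the config and reverts after a timeout unless confirmed
netplan generate && netplan apply   # makes the change permanent (as noted above)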

Infrastructure

Infrastructure configuration is kept in the Clic Infra repository. Each service is hosted as a Docker stack. This makes it easy to add new services without worrying about dependencies, architecture, etc. Updating the server when the configuration has been modified is done with the Ansible tool (https://www.ansible.com/).

Important note: the reverse-proxy (Caddy) is not in a Docker container, but directly installed on the machine. See explanation below.

Add/Remove/Modify a service

Modifications to the infrastructure must first be made on the infra repository, then applied to the server using Ansible (see below).

Each service is configured via a docker-compose.yaml file (doc), in a role unique to the service. Services generally do not communicate with each other directly, only via their exposed APIs (e.g. the website accesses Directus via its public URL, not via a dedicated virtual network). If a service requires a database, it is instantiated in the same stack as the service and is exclusively dedicated to it.

/!\ Warning /!\ : stacks must be modifiable with a simple docker stack deploy; don't use Docker configs, as these are immutable: the stack would have to be removed and then redeployed, which is sub-optimal. Instead, configure services via their environment variables.

If the service needs to be accessible from the Internet, it must expose one or more ports, which the reverse-proxy will then expose to the outside world.
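
For illustration, here is a hedged sketch of what such a stack's docker-compose.yaml could look like, for a made-up service example-app with its own dedicated PostgreSQL database; the image, port, and variable names are assumptions, not an actual CLIC service:

version: "3.8"
services:
  app:
    image: ghcr.io/clicepfl/example-app:latest  # hypothetical image
    environment:
      DATABASE_URL: postgres://example:${EXAMPLE_DB_PASSWORD}@db:5432/example
    ports:
      - "8080:8080"  # published so the reverse-proxy (Caddy) on the host can reach it
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: example
      POSTGRES_PASSWORD: ${EXAMPLE_DB_PASSWORD}  # injected at deploy time (see Secret management)
      POSTGRES_DB: example
    volumes:
      - db-data:/var/lib/postgresql/data
volumes:
  db-data: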

Once everything is set up and pushed to the clic-infra repository, use ansible-pull -U git@github.com:clicepfl/clic-infra.git -e @/var/secrets.yaml deploy.yaml --extra-vars SERVICE=<new-service>,webhook --extra-vars build=false. The webhook needs to be redeployed in order to allow redeployment requests for the new service.

Secret management

Services may require private (or server-specific) information, such as an admin password or a mailbox password. These values are stored in a secrets.yaml file (located at /var/secrets.yaml) and loaded by Ansible during deployment. They must be referenced in the service role's argument list (roles/<service>/meta/argument_specs.yaml) and passed as environment variables to the stack deployment task (roles/<service>/tasks/main.yaml). They can then be retrieved from each stack's configuration using the syntax ${MY_SECRET}.
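
As a concrete illustration, here is a hedged sketch of how a hypothetical secret for the same made-up example-app service could flow from the secrets file to its stack; the variable names and the exact deployment task are assumptions, not the repository's actual code:

# /var/secrets.yaml (hypothetical entry)
services:
  example_app:
    db_password: "change-me"

# roles/example-app/meta/argument_specs.yaml (declares the expected argument)
argument_specs:
  main:
    options:
      example_db_password:
        type: str
        required: true

# roles/example-app/tasks/main.yaml (passes it to the stack deployment task;
# the actual role may use a different module than a plain command)
- name: Deploy the example-app stack
  ansible.builtin.command: docker stack deploy -c docker-compose.yaml example-app
  environment:
    EXAMPLE_DB_PASSWORD: "{{ example_db_password }}"

# (the deploy playbook is assumed to map services.example_app.db_password to
# example_db_password; the stack's docker-compose.yaml then reads ${EXAMPLE_DB_PASSWORD})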

Deployment

Deployment is carried out using the deploy.yaml playbook, which deploys the stacks referenced in the SERVICE variable (a comma-separated list), or all of them if the variable is empty or undefined.
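
For example (a sketch based on the ansible-pull invocation shown above; the stack names are illustrative):

# Redeploy every stack (SERVICE left undefined):
ansible-pull -U git@github.com:clicepfl/clic-infra.git -e @/var/secrets.yaml deploy.yaml

# Redeploy only the website and webhook stacks:
ansible-pull -U git@github.com:clicepfl/clic-infra.git -e @/var/secrets.yaml deploy.yaml --extra-vars SERVICE=website,webhook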

Backups

Backups are managed by BorgBackup, enabling encrypted, incremental backups. The Borgmatic tool, configured by the deploy.yaml playbook, periodically generates and uploads backups. Its configuration (also in the playbook) must be updated whenever a service is added/removed/modified.

To manually create a backup, you can run sudo borgmatic (ideally in a detached screen: screen sudo borgmatic, since it takes several hours to complete). You can inspect the backup repository using the borg CLI tool.
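
For instance, assuming borgmatic's configuration points at the backup repository, the archives can be listed and inspected with:

sudo borgmatic list   # list the archives in the repository
sudo borgmatic info   # show repository statistics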

Installation

In the event of a complete server reinstallation, here are the steps to follow:

  1. Use the init playbook: create an inventory on your computer (see the docs), then ansible-playbook -i <inventory> <path/to/init.yaml>.
  2. SSH into the server.
  3. Set up the /var/secrets.yaml file (see the example in the infra repo). Move the database dumps to the directories set in this file. If you do not have access to the Directus tokens, use a stub.
  4. Generate an SSH key for the default user and one for root, and save both in the deployment keys of the infra repository.
  5. Launch the stacks with the deploy.yaml playbook: ansible-pull -U git@github.com:clicepfl/clic-infra.git -e @/var/secrets.yaml deploy.yaml.
  6. Check the databases' content, using docker exec -it <container-name> bash then either psql -U <user> <database> or mysql --user <user> -p. The dump is present in /docker-entrypoint-initdb.d/<smth>.sql and should already be applied; if that is not the case, it can be piped into the database connection command shown previously: psql -U <user> <database> < /docker-entrypoint-initdb.d/init.sql.
  7. Stop all the services (docker stack rm <stack list>, using docker stack ls to list the stacks).
  8. Copy/Move the file content into the volumes (stored under /var/lib/docker/volumes/<volume-name>/_data), after having removed the content placed by the first run of each service (see the sketch after this list). Skip database volumes (already initialized).
  9. If needed, regenerate the tokens for Directus and set them in the secrets file.
  10. Restart all the services using the playbook deploy.yaml.
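
As a reference for step 8, here is a hypothetical sketch (the volume name and the source path are placeholders):

# Remove the content created by the service's first run, then copy the saved data in:
sudo rm -rf /var/lib/docker/volumes/<volume-name>/_data/*
sudo cp -a /path/to/saved/data/. /var/lib/docker/volumes/<volume-name>/_data/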

Everything should be set now!

Webhook

The webhook is a systemd service running directly on the server (i.e. not in a container). It allows GitHub to trigger the redeployment of some services, for example when a new version is available. It is exposed on port 4000.
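
To check that it is running, something along these lines can be used (the systemd unit name is an assumption and may differ on the server):

sudo systemctl status webhook.service   # assumed unit name
sudo ss -tlnp | grep ':4000'            # confirm a process is listening on port 4000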

Adding webhook to a repo

  1. On GitHub, open the settings of the repository, then go to "Webhooks" and "Add webhook".
  2. Set the following values:
    • Url: https://clic.epfl.ch:4000/website
    • Content type: application/json
    • Secret: Found in /var/secrets, under services.webhook.github.secret
    • SSL Verification: Enabled
    • Events: Depends on when you want the redeployment to be done, but usually "Packages" if there is a docker build in the CI (in that case, don't forget to uncheck "Pushes").
  3. Save the webhook.

From now on, every time a new version is available, the server will automatically redeploy the service.

Help, I lost access to the server :'(

In the unfortunate case where the server is unreachable, it can be quite hard to regain access, since the Infomaniak and backup credentials are stored in Vaultwarden. Since this has already happened, the new credentials for the necessary services (Infomaniak, Borg) are stored on a sheet at the Desk. This sheet must be kept up to date!