
General

We get our servers from Hetzner. They have servers hosted in the EU, are known to be cheap, have great uptime, and have a focus on being environmentally friendly.

Note: More ethical choices could be cooperatively run organisations, for example FFDN members like Neutrinet. I have less experience with those for VPS, so for now I'd like to start with the “devil I know”. We can still switch later. (Unless someone who asks for an instance explicitly asks me to use a more ethical provider from the start, of course. I'll look into it then, but it may take more time and unforeseen problems may arise.)

I'd like Alpine on both the server and the containers, but due to technical problems I'm now using Ubuntu on the host. For containers we use LXC with LXD. Caddy and PostgreSQL run on the host OS as reverse proxy and database. It may be possible to install and run multiple Akkoma OTP releases on one host by using the correct env variable, but for security reasons a container is advised. The upload folder is currently part of the container, but should probably be mounted from outside the container so that the container only holds the installation and nothing more.

Set Up

Server

At Hetzner we get a new CX11 server with IPv4 and backups enabled.

  • Location: an EU location (at the moment of writing either Nuremberg or Falkenstein)
  • Image: Ubuntu
  • Type:
    • Shared vCPU - x86(Intel/AMD)
    • CX11 (upgraded to CX21)
  • Networking: Public IPv4 and Public IPv6
  • SSH Keys
    • set/use default key
  • Backups: Enable this
  • Name: akkoma-$NUMBER (first server is akkoma-1)

Then we can start the server. Try to ssh, just to be sure, and check IPv4 and IPv6 connections. Then power down the machine and take a snapshot. (Costs money, but it's like less than a cent or so per month, and we only need it for a short while).

We should be able to host multiple instances on one server and we can rescale/expand the server. No need to buy a new server per instance.

There's also an option for an arm64 server in Falkenstein, which has double the specs for the same price. By now arm64 OTP builds are supported by Akkoma.

Note: Intel CPUs (including the CX21 if I understand correctly) will be deprecated, so we may need to switch to the ARM or AMD ones anyhow: https://status.hetzner.com/incident/4b07aa35-b415-491a-99de-3d8f28704775

Firewall

Hetzner allows us to set up a firewall outside of the servers, so do that.

  • ICMP
  • 80 incoming
  • 443 incoming
  • ssh incoming (non-default port)
  • 80 outgoing
  • 443 outgoing

When upgrading, we close incoming 80 and 443. ATM I'm unsure what to do with multiple servers. Each their own fw? Or one shared and switch/add fw's to the specific server where we do upgrades?

Alpine

I actually wanted to have Alpine on the host, but there's a problem with IPv6 that I wasn't able to easily solve. See Technical manuals - Alpine on host. For now I kept the Ubuntu image. In the future I think it would be better to use something lighter, maybe Debian. It still needs something to manage the containers. ATM that's LXD, in the future I want to switch to Incus.

Further host setup

This is what we do on Ubuntu to get everything set up and installed. See Technical manuals - Setup Ubuntu host

Setup container

Issues

See https://codeberg.org/ilja/akkohost-scripts

I'm also rebooting one container on a weekly basis bc it freezes after a while. This is done with a cron job (`crontab -e`): each Sunday night (well, actually Monday already) at 01:00.
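
The weekly restart could be written as a single crontab line. What follows is only a sketch, not the actual entry from the host: the helper name `weekly_restart_cron_line` is made up for illustration, and the `/snap/bin/lxc` path assumes LXD was installed as a snap (cron usually runs with a short `PATH`, so a full path tends to be safer).

```sh
# Print a crontab line that restarts one container every Monday at 01:00.
# Crontab fields: minute hour day-of-month month day-of-week command
# ASSUMPTION: lxc lives in /snap/bin (LXD installed as a snap on Ubuntu).
weekly_restart_cron_line() {
    printf '0 1 * * 1 /snap/bin/lxc restart %s\n' "$1"
}
```

Calling `weekly_restart_cron_line "$container"` prints the line, which can then be pasted into `crontab -e` on the host. For a daily restart the day-of-week field (the `1`) would become `*`.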

I want to see if it's better now and should see if I can find some help.

When did it first start? How long after the last reboot did the problem come back this time? Other things/logging I have: Akkoma logging in the root home folder (the reboot happened at 15h something). `lxc info $container` showed 1.8G, which is a lot, but it should have 2G (or can this already be the problem? If so, where does it come from?). I have a screenshot from top on my laptop. The host was fine, but the container was super slow: typing took tens of seconds (or longer?) before it was visible.

2023-08-01: It's definitely memory. I found a better way to measure it and I'm monitoring at daily intervals. It went up from the normal 500MB to 1000MB and stayed there. After the reboot all was fine, but Tuesday (today) between 15:17 and 16:17 it went back up to ~2G and the container froze again just like last time. It's clearly Akkoma itself and it's clearly memory. The question is why. I found on the Elixir forum that a from-source installation handles memory differently. I don't see this problem on my own instance, so it may be worth doing a from-source installation instead.

Changed to daily reboot now.

Getting an iex shell: `/opt/akkoma/bin/pleroma remote`

Starting garbage collection from iex: `:erlang.garbage_collect()`

Garbage collection immediately gave me `true`, but memory didn't go down. What is also weird is that rss is ~700MB atm and beam is ~400MB.

iex(pleroma@127.0.0.1)3> :erlang.garbage_collect()
true
iex(pleroma@127.0.0.1)4> :erlang.memory()         
[
  total: 397451208,
  processes: 203974848,
  processes_used: 203974184,
  system: 193476360,
  atom: 2113937,
  atom_used: 2097843,
  binary: 24281632,
  code: 72255670,
  ets: 46239664
]
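
As a sanity check on these numbers: per the Erlang documentation, `total` is the sum of `processes` and `system`. The small calculation below redoes that sum with the values from the output above and compares it with the approximate RSS. The 700MB figure is the rough value from the note, not a measurement taken at the same moment.

```sh
# Values copied from the :erlang.memory() output above (bytes)
processes=203974848
system=193476360
total=$((processes + system))
total_mb=$((total / 1000000))   # integer division, decimal megabytes

# Approximate RSS from the note above (MB)
rss_mb=700
gap_mb=$((rss_mb - total_mb))

echo "total: $total bytes (~${total_mb}MB), RSS beyond that: ~${gap_mb}MB"
```

So roughly 300MB of the RSS is not accounted for by `:erlang.memory()`. These numbers alone can't tell where it goes.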

I switched the memory measurement to 5-minute intervals, and we see a rise at times. An upgrade to Alpine 3.18 (from 3.16) and an Akkoma upgrade did not help. ATM only this one instance is affected.
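
These notes don't record how the memory was measured. One possible way to log it from inside the container is sketched below: it reads the resident set size from `/proc`. The function names and the log file location are hypothetical, and `beam.smp` as the process name of the Erlang VM is an assumption to verify with `ps`.

```sh
#!/bin/sh
# Print the resident set size (RSS) of one process in kB, read from /proc.
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Append one timestamped line per Erlang VM process to the given log file.
# ASSUMPTION: the VM shows up as a process named "beam.smp".
log_beam_rss() {
    logfile=$1
    for pid in $(pgrep -x beam.smp); do
        printf '%s pid=%s rss_kb=%s\n' \
            "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$pid" "$(rss_kb "$pid")" >> "$logfile"
    done
}
```

Saved as a script that ends with a call like `log_beam_rss /root/beam-rss.log`, it could be run from cron with a `*/5 * * * *` schedule to get the 5-minute interval.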

2023-09-04: I set up a new container and installed from source. The old container still exists, but it shouldn't restart any more. I can't simply switch between the two bc instance and uploads are in the container. I wanted to mount them in, but it's a bit more involved due to user privileges.

Upgrade

We want a back-up, upgrade, and test if everything works.

  • block incoming ports 80 and 443 on the Hetzner firewall so no new messages come in (Maybe a fw can be defined with a label?)
  • bring the server down
  • make server backup from cloud console. (Choose backup, not snapshot. If really needed you can convert a backup to a snapshot)
  • Restart server and do Ubuntu upgrade. `apt update && apt upgrade -y`
  • restart server
  • check with curl if instances are still running
  • Per Akkoma container, upgrade the container and instance.
  • restart server
  • check with curl if instances are still running
  • unblock 80 and 443

Check with curl:

lxc list
# Then for each IP
curl -s 10.49.166.217:4000 | grep -io akkoma,
curl -s 10.49.166.190:4000 | grep -io akkoma,
# In a one-liner, the following should return a line "Akkoma," for each instance
# for ip in $(lxc list --columns 4 --format csv | grep -o '^[0-9,.]*'); do curl -s "$ip":4000 | grep -io akkoma,; done
# TODO: Check for version in nodeinfo instead of just the default page (esp since the default page depends on the fe that is used).
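
For the TODO above, a version check via nodeinfo could look roughly like this. It is a sketch under assumptions: that the instance serves `/nodeinfo/2.0.json` on port 4000, and that the JSON is compact (no spaces around colons). The function names are made up, and a real JSON parser such as `jq` would be more robust than `sed`.

```sh
# Extract the software version from nodeinfo JSON read on stdin.
# ASSUMPTION: compact JSON containing "software":{..."version":"X"...}
nodeinfo_software_version() {
    tr -d '\n' | sed -n 's/.*"software":{[^}]*"version":"\([^"]*\)".*/\1/p'
}

# Print "<ip> <version>" for one container IP.
check_instance() {
    printf '%s %s\n' "$1" \
        "$(curl -s "$1:4000/nodeinfo/2.0.json" | nodeinfo_software_version)"
}
```

`check_instance` could replace the `curl ... | grep` part inside the `for ip in ...` loop, so each line shows the IP and the running version instead of just "Akkoma,".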

Upgrade container

# Get the container names
lxc list --columns n --format csv

# Then, per container
lxc exec "$container_name" sh
# Then in that shell
apk update && apk upgrade
curl -s localhost:4000 | grep -io akkoma,

# To upgrade to a newer Alpine
nano /etc/apk/repositories
# Then change the version to the required Alpine version
apk update && apk upgrade
curl -s localhost:4000 | grep -io akkoma,
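
The manual edit with nano could also be scripted. The helper below is a hypothetical sketch that assumes the repository lines contain the version as a path segment such as `/v3.16/`; check the file before and after running it.

```sh
# bump_alpine_repos FROM TO [FILE]
# Rewrite "/vFROM/" to "/vTO/" in an apk repositories file (edited in place).
# ASSUMPTION: repository URLs look like https://<mirror>/alpine/v3.16/main
bump_alpine_repos() {
    from_re=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for the regex
    sed -i "s|/v${from_re}/|/v$2/|g" "${3:-/etc/apk/repositories}"
}
```

For example, `bump_alpine_repos 3.16 3.18` followed by `apk update && apk upgrade` would match the upgrade described in these notes.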

Update Akkoma

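# Run inside the container as root. Switch to the akkoma user and go to its home folder
SHELL=/bin/sh
su -s "$SHELL" akkoma
cd
# Download the latest stable release
./bin/pleroma_ctl update --branch stable
# Stop Akkoma before running the database migrations
./bin/pleroma stop
./bin/pleroma_ctl migrate
# Update the front-ends
./bin/pleroma_ctl frontend install pleroma-fe --ref stable
./bin/pleroma_ctl frontend install admin-fe --ref stable
# Back to root: start the service again and reboot the container
exit
rc-service akkoma restart
reboot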

Extra

# To take a snapshot of a container
lxc snapshot $container

# To make a dump of the db
# Check the DB name
# sudo -Hu postgres psql
# \l
# \q
databasename=<database-to-backup>
sudo -Hu postgres pg_dump -d $databasename --format=custom -f /tmp/$databasename.pgdump
mv /tmp/$databasename.pgdump /root/backups/

Shutting instance down

We can take a backup by moving the config and upload folders and doing a database dump.

# Shut down the container and set it to not start automatically
lxc stop $container
lxc config set $container boot.autostart false
lxc config get $container boot.autostart

# Permanently remove the container
# NOTE this also removes snapshots!
lxc delete $container

We can change the Caddy file

We need to update DNS records

Packaging

If it doesn't exist yet, I would like to eventually package Akkoma for Alpine. First I should decide whether I want to run from source or OTP. Then I need to figure out how packaging works.

https://wiki.alpinelinux.org/wiki/Creating_an_Alpine_package

I want to be able to do a complete installation by basically doing `apk add akkoma`. I'm thinking I can provide the instance.gen options with ENV variables.

  • If they are not there ⇒ Just install Akkoma
  • If they are there ⇒ Also do setup
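
The decision above could be sketched as a small shell check, for example in a post-install script. Everything here is illustrative: the variable names `AKKOMA_DOMAIN` and `AKKOMA_ADMIN_EMAIL` and the function name are hypothetical, and which options are actually required for the setup still has to be worked out.

```sh
# Decide whether the install step should also run the instance setup.
# ASSUMPTION: AKKOMA_DOMAIN and AKKOMA_ADMIN_EMAIL are made-up variable names.
setup_mode() {
    if [ -n "${AKKOMA_DOMAIN:-}" ] && [ -n "${AKKOMA_ADMIN_EMAIL:-}" ]; then
        echo "install+setup"    # variables present: install and configure
    else
        echo "install-only"     # variables missing: just install Akkoma
    fi
}
```

A post-install script could branch on the output of `setup_mode` and only generate the instance config in the `install+setup` case.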

I can probably do OpenRC regardless, and Caddy if I have the hostname.

I should also have a URL where I host the package from, and a CI thingy for testing.

Other notes

There are daily backups handled by Hetzner. I also want backups before upgrades. The following are just some notes.

### Backups

1. Take container snapshot before upgrades
    * Check if these can be taken live
    * Only keep a couple of them

2. Backup Akkoma per container on the host system
    * <https://docs.akkoma.dev/stable/administration/backup/>
    * Maybe also the "instance" folder?
      * It includes custom emojis I think, and also maybe other things
      * I don't need the front-ends, those can be installed again
3. Have incremental thingies
    * How?
      * rsync can somewhat do it <https://linuxconfig.org/how-to-create-incremental-backups-using-rsync-on-linux>
    * How long to keep?
4. Use rsync to sync to IS server
    * <https://download.samba.org/pub/rsync/rsync.1>
    * This can be another server in the future
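
For the "incremental thingies" question, one option is rsync's `--link-dest`: files that didn't change are hard-linked against the previous backup, so each run only stores what changed. A minimal sketch, assuming a layout of timestamped folders plus a `latest` symlink (that layout and the function name are my own choice, not something decided in these notes):

```sh
# incremental_backup SRC DEST_ROOT
# Copy SRC into a new timestamped folder under DEST_ROOT (use an absolute path).
# Unchanged files are hard-linked against the previous backup via --link-dest.
incremental_backup() {
    src=$1
    root=$2
    new="$root/$(date -u +%Y%m%dT%H%M%SZ)"
    mkdir -p "$root"
    if [ -e "$root/latest" ]; then
        rsync -a --link-dest="$(readlink -f "$root/latest")" "$src/" "$new/"
    else
        rsync -a "$src/" "$new/"    # first run: plain full copy
    fi
    ln -sfn "$new" "$root/latest"   # point "latest" at the newest backup
}
```

A call could look like `incremental_backup /var/lib/akkoma/uploads /root/backups/uploads`, where both paths are only examples. Deleting an old timestamped folder is safe: hard-linked files stay available in the newer backups.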


### Maintenance

```sh
# Change sources to use 'latest-stable' instead of fixed versions
# or is it safer to keep using versions? :thinking:
nano /etc/apk/repositories

# Upgrading, see <https://wiki.alpinelinux.org/wiki/Upgrading_Alpine>
apk add --update-cache --upgrade apk-tools
apk upgrade --update-cache --available
#sync
#reboot

# Check current os version
cat /etc/os-release
cat /etc/alpine-release
```

Some useful commands

```sh
# Restart Caddy (renews certificates)
rc-service caddy restart

# List containers in a formatted table
lxc list

# List containers formatted as csv
lxc list -f csv

# List containers formatted as csv only showing the info of the "name" column
lxc list -f csv -c n

# Restart/stop/start container
lxc restart <containername>
lxc stop <containername>
lxc start <containername>

# Run a command in a container (to get a shell, do "sh" for command)
lxc exec <containername> <command>

# delete a container
lxc delete <containername>

# Create and launch a new container using the provided image
lxc launch images:debian/12 <containername>
lxc launch images:alpine/3.16 <containername>

# List provided images
lxc image list images:
lxc image list images:alpine
lxc image list images:alpine/3

# Info on the container including resource usage and snapshots
lxc info <containername>

# Stuff for snapshots
lxc restore <containername> snap0
lxc snapshot --help

# Caddy
caddy help
nano /etc/caddy/Caddyfile
# Separate files for separate sites: <https://caddy.community/t/organizing-sites-into-multiple-caddyfiles/5921>
```
wiki/technical_manuals.txt · Last modified: 2023/09/29 09:56 by ilja