Dynamic reverse proxy Pacman cache using NGINX
Arch Linux is a rolling-release distribution. Updates arrive almost every day, including kernel updates, the "nVidia proprietary blob", and so on. These packages are huge, and even with a very fast internet connection they consume a lot of bandwidth and resources from the Arch Linux Tier 2 mirrors.
While doing some system maintenance and reading the Wiki page pacman/Tips and tricks, I found a section called "2.3.5 Dynamic reverse proxy cache using NGINX", but it gives only a very vague description of how to set NGINX up correctly, especially when things go wrong.
So today we are going to set up an NGINX server that acts as a reverse proxy and stores every package downloaded by any PC, so the rest of the network doesn't have to fetch the same packages again from a Tier 2 mirror.
NGINX Configuration
If you don't already have an HTTP(S) server running, you need to add an http block to the NGINX configuration.
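A minimal http block could look like the sketch below. The directives are common NGINX defaults and are an assumption, not necessarily the block used in the original setup, so adjust them to your needs.

```nginx
# Sketch of a minimal http block (assumed defaults, goes in nginx.conf)
http {
    include       mime.types;
    default_type  application/octet-stream;

    sendfile          on;
    keepalive_timeout 65;

    # The cache server and the mirror server blocks go here.
}
```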
Now we have to define the server on which NGINX listens for GET requests. This configuration goes inside the previous http block, together with the upstream Arch Linux mirrors. Configure as many mirrors as you want; this post uses two:
Mirror 1: https://mirror.cloroformo.org/archlinux/$repo/os/$arch
Mirror 2: https://mirror.librelabucm.org/archlinux/$repo/os/$arch
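The server configuration could be sketched as follows, based on the layout of the Arch Wiki example. The cache directory /srv/http/pacman-cache, the local ports 8001 and 8002, and the resolver address 1.1.1.1 are assumptions made for illustration. The proxy_ssl_server_name line is also an assumption about what these HTTPS mirrors require.

```nginx
# Cache server: this is what the pacman clients talk to
server {
    listen      8080;
    server_name yourdomain.example;
    root        /srv/http/pacman-cache;   # assumed cache directory
    autoindex   on;

    # Databases and signatures change often: always ask upstream, never cache
    location ~ \.(db|sig|files)$ {
        proxy_pass http://mirrors$request_uri;
    }

    # Packages: serve from disk if already stored, otherwise go upstream
    location ~ \.tar\.(xz|zst)$ {
        try_files $uri @pkg_mirror;
    }

    # Fetch the package from a mirror and store it under root for next time
    location @pkg_mirror {
        proxy_store         on;
        proxy_redirect      off;
        proxy_store_access  user:rw group:rw all:r;
        proxy_next_upstream error timeout http_404;
        proxy_pass          http://mirrors$request_uri;
    }
}

# Upstream Arch Linux mirrors, configure as many as you want
upstream mirrors {
    server 127.0.0.1:8001;          # assumed local port for Mirror 1
    server 127.0.0.1:8002 backup;   # assumed local port for Mirror 2
}

# Mirror 1
server {
    listen   127.0.0.1:8001;
    resolver 1.1.1.1 ipv6=off;      # assumed DNS server, use your own

    location / {
        proxy_ssl_server_name on;   # send SNI to the HTTPS mirror (assumption)
        proxy_pass https://mirror.cloroformo.org/archlinux$request_uri;
    }
}

# Mirror 2
server {
    listen   127.0.0.1:8002;
    resolver 1.1.1.1 ipv6=off;      # assumed DNS server, use your own

    location / {
        proxy_ssl_server_name on;   # send SNI to the HTTPS mirror (assumption)
        proxy_pass https://mirror.librelabucm.org/archlinux$request_uri;
    }
}
```

Note that $request_uri replaces the /$repo/os/$arch part of each mirror address.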
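This configuration is very similar to the one available in the Arch Wiki, but it solves some "resolver not found" errors. NGINX needs a resolver whenever a proxy_pass address contains variables, because the hostname is then resolved at request time. Instead of putting the resolver directive inside the http block, include it inside each server block that follows the upstream block.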
Server Configuration
Before restarting NGINX, we must create the root folder specified in the server block and give it the correct permissions.
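As a sketch, assuming the root folder is /srv/http/pacman-cache (substitute whatever path your server block uses) and that NGINX runs as the http user, which is the default of the Arch Linux package:

```shell
# Assumed cache directory; it must match the root directive of the cache server
sudo mkdir -p /srv/http/pacman-cache

# Let the NGINX worker user (http on Arch Linux) write the stored packages
sudo chown http:http /srv/http/pacman-cache

# Check the configuration for syntax errors, then restart NGINX
sudo nginx -t && sudo systemctl restart nginx
```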
Once the server is restarted we can check if the service is up and running with a browser: http://yourdomain.example:8080 (Replace yourdomain.example with the server IP or domain name).
Client Configuration
This part is simple. We don't want to break pacman when we are not on the same LAN as the server (e.g. a laptop away from home), so the only change is to add the new mirror at the top of /etc/pacman.d/mirrorlist and keep the existing mirrors below it. That way we take advantage of the package cache on our server, and pacman can still fall back to the regular mirrors.
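For example, the top of /etc/pacman.d/mirrorlist could look like this sketch, where yourdomain.example and port 8080 are placeholders for your own server:

```ini
# Local cache server first (placeholder host and port)
Server = http://yourdomain.example:8080/$repo/os/$arch

# Regular mirrors stay below as a fallback
Server = https://mirror.cloroformo.org/archlinux/$repo/os/$arch
Server = https://mirror.librelabucm.org/archlinux/$repo/os/$arch
```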
If we are connected to another network from which our server is not reachable, each package in an update will trigger a query to our server that will (obviously) fail. This spams the log with errors and can be rather annoying.
An easy solution to this problem is a script hooked to a network event such as interface up or interface down. NetworkManager calls it a dispatcher script, but that is a topic for another post.
Clearing old packages
Each new request increases the size of the cache. If we don't remove old packages, the cache will gradually fill up the disk.
Ideally, we want at most the last 2 versions of a package, in case the most recent one is broken and we have to do a rollback.
We can create a simple service and a trigger (a systemd timer) that runs the service once a week.
To create the trigger: sudo vim /etc/systemd/system/mirror-cache-clean.timer
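A weekly timer could be sketched like this. The description text and the Persistent setting are assumptions; by default a timer activates the service of the same name, here mirror-cache-clean.service.

```ini
[Unit]
Description=Weekly cleanup of the pacman package cache

[Timer]
# Run once a week
OnCalendar=weekly
# Catch up after boot if the machine was off at the scheduled time
Persistent=true

[Install]
WantedBy=timers.target
```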
To create the service: sudo vim /etc/systemd/system/mirror-cache-clean.service
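The service could be sketched as below, assuming the cache lives in /srv/http/pacman-cache and that paccache (from the pacman-contrib package) is installed. It runs paccache in every directory of the cache and keeps the last 2 versions of each package.

```ini
[Unit]
Description=Remove old packages from the pacman package cache

[Service]
Type=oneshot
# Assumed cache path; -r removes old packages, -k 2 keeps the two newest versions
ExecStart=/usr/bin/find /srv/http/pacman-cache/ -type d -exec /usr/bin/paccache -v -r -k 2 -c {} \;
```

To preview what would be deleted, paccache can be run by hand with -d (dry run) in place of -r.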
Finally, we can enable the trigger with: sudo systemctl enable --now mirror-cache-clean.timer (the --now flag also starts it immediately, without waiting for a reboot)
and enjoy the speed that the cache server brings to our LAN!