Improving Pacman cache using Pacoloco
Since I set up the NGINX pacman cache proxy for my local network, I've been enjoying faster downloads overall. However, when an upstream mirror had availability problems, NGINX did not automatically switch to an alternative mirror, causing timeouts during an update. Of course, this could be solved, but I preferred keeping the NGINX config file as simple as possible because its main purpose is hosting this blog. As this setup was lacking some basic functionality, I started looking for an alternative, better tool: pacoloco.
Pacoloco is a simple caching proxy server for Arch Linux packages and repositories. Just like an NGINX proxy, on each request, it serves the file from the cache or, if it is not present, it downloads the file and serves it concurrently. Additionally, it features automated prefetching, one of the most useful features when the local mirrors have bandwidth problems or are temporarily under heavy load.
Pacoloco minimizes Internet traffic, which saves resources on the upstream mirrors, and it keeps an up-to-date package cache, which decreases download times.
With all these upsides, I decided to replace my previous setup.
Installing Pacoloco
Pacoloco is very easy to install.
It is already available in the Arch Linux repositories, so a single command installs it: pacman -Syu pacoloco
The package comes with a systemd service unit, but we should modify its default settings before we start it.
Setting up Pacoloco
This section shows a complete example of the final configuration, which is detailed in the following subsections.
The configuration file shown below contains placeholders wrapped in curly brackets. Replace each one, including the brackets themselves, with a value of your choice.
port:
cache_dir:
download_timeout: 3600
purge_files_after: 2592000
repos:
  archlinux:
    urls:
      - https://{URL_TO_HTTPS_MIRROR_1}/archlinux
      - http://{URL_TO_HTTP_MIRROR_2}/archlinux
  dkp-libs:
    urls:
      - https://pkg.devkitpro.org/packages
  dkp-linux:
    urls:
      - https://pkg.devkitpro.org/packages/linux/x86_64
prefetch:
  cron: 0 0 0 * * * * # each day at 00:00h
  ttl_unaccessed_in_days: 30 # should be higher than the longest stretch of days you go without updating your system
  ttl_unupdated_in_days: 300 # deletes and stops prefetching packages that have been neither updated upstream nor requested for this many days
- port: The port pacoloco listens on; 9129 is the default.
- cache_dir: Point it to where you want to store the package cache and databases. It must be an absolute path.
- download_timeout: Maximum download time (in seconds) for a single package. A download that exceeds it is cancelled.
- purge_files_after: Maximum time (in seconds) that a package is kept in the cache directory.
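Both durations are plain seconds, which are hard to read at a glance. As a quick sanity check, here is a minimal shell sketch that converts between days and seconds. The helper names are my own and are not part of pacoloco.

```shell
# Hypothetical helpers (not part of pacoloco): convert between days and seconds.
days_to_seconds() { echo $(( $1 * 86400 )); }
seconds_to_days() { echo $(( $1 / 86400 )); }

days_to_seconds 30        # prints 2592000, the purge_files_after value used above
seconds_to_days 2592000   # prints 30
```

So the example keeps cached files for 30 days, and the 3600-second download timeout is one hour.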
Repositories and Mirrors
Repositories are added under the repos key.
Each key directly below it is the name of a repository served by pacoloco.
Each repository is available at a fixed path that ends with its name.
The resulting URL follows this format: http://{PACOLOCO_SRV_LOCAL_IP}:{PORT}/repo/{repo-name}
It is also possible to add multiple mirrors for an individual repository.
They are added under the urls key of the corresponding repository.
It doesn't matter if the mirror works with HTTP or HTTPS, as it's handled automatically by pacoloco.
The example config file available on the GitHub page shows additional repositories such as sublime-text and quarry.
Since I don't use those programs, I omitted them.
However, I added two repositories needed by the DevKitPro toolchain since, from time to time, I compile software for very old platforms such as the Nintendo Game Boy Advance.
Prefetching
The prefetching feature is controlled by a cronexpr, which triggers the download of an updated version of a package already present in the cache directory.
The process repeats automatically until the package has not been accessed for more than ttl_unaccessed_in_days days, or has not been updated on the upstream mirror for more than ttl_unupdated_in_days days.
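My reading of that rule can be written as a small shell sketch. This is not pacoloco's actual code: should_prefetch and its two arguments are hypothetical, and the thresholds are simply the ones from the example configuration.

```shell
# Hypothetical sketch of the prefetch rule described above, NOT pacoloco's code.
# $1 = days since the package was last requested by a client
# $2 = days since the package was last updated upstream
should_prefetch() {
    ttl_unaccessed_in_days=30    # value from the example config
    ttl_unupdated_in_days=300    # value from the example config
    [ "$1" -le "$ttl_unaccessed_in_days" ] && [ "$2" -le "$ttl_unupdated_in_days" ]
}

# A package requested 5 days ago and updated upstream 10 days ago stays in rotation.
if should_prefetch 5 10; then echo "keep prefetching"; else echo "stop and delete"; fi
```

Under these assumptions, a package drops out of the rotation as soon as either counter passes its limit.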
Cron expressions
The cron expressions used by pacoloco are an extension of the standard 5-field format. The Go cronexpr library adds a seconds field at the start and a year field at the end.
The expression shown in the example, 0 0 0 * * * *, means: trigger on second 0 of minute 0 of hour 0, on any day of the month, in any month, on any day of the week, in any year.
In essence, every day at midnight.
There are online resources for creating cron expressions, such as crontab.guru, but they only support the 5-field format. Converting their output into the 7-field format is easy, though: prepend a seconds field and append a year field.
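To make the conversion concrete, here is a minimal shell sketch. The to_seven_field helper is my own illustration, and it assumes you want the job to fire at second 0 in every year.

```shell
# Hypothetical helper: turn a 5-field cron expression into the 7-field form
# by prepending a seconds field ("0") and appending a year field ("*").
to_seven_field() {
    expr5=$1
    # Count the fields with awk so the shell never expands the "*" characters.
    count=$(printf '%s\n' "$expr5" | awk '{ print NF }')
    if [ "$count" -ne 5 ]; then
        printf 'expected 5 fields, got %s\n' "$count" >&2
        return 1
    fi
    printf '0 %s *\n' "$expr5"
}

to_seven_field '0 0 * * *'   # prints: 0 0 0 * * * *
```

Always quote the expression when you pass it around in a shell. Otherwise the asterisks are expanded into file names.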
Configuring Pacman
Once all the desired repositories and mirrors have been added to the pacoloco configuration file, we need to tell the package manager to use our local server.
pacman.conf
Pacman stores its configuration in /etc/pacman.conf.
We must edit the file and jump to the repositories section, skipping all the options at the beginning of the file.
The main Arch Linux repositories already contain an "Include" directive that adds all the available servers from the mirrorlist file, which is located at /etc/pacman.d/mirrorlist.
That file is covered in the next section.
For every repository other than the official ones, we must add an individual "Server" directive. The following example adds both of the additional repositories that have been discussed previously.
# Previous Setting -> Server = https://pkg.devkitpro.org/packages
# Previous Setting -> Server = https://pkg.devkitpro.org/packages/linux/$arch/
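Following the URL format from the Repositories and Mirrors section, each stanza only needs a section header and a Server line that points at pacoloco. The sketch below prints such a stanza. It assumes a hypothetical server at 192.168.1.10 on the default port 9129, and it assumes that the pacman section name matches the key used under repos in the pacoloco configuration. pacoloco_repo_stanza is just an illustrative helper name.

```shell
# Hypothetical helper: print a pacman.conf stanza that points a repository at pacoloco.
# $1 = repository name (same as the key under "repos" in the pacoloco config)
# $2 = pacoloco host, $3 = pacoloco port
pacoloco_repo_stanza() {
    printf '[%s]\nServer = http://%s:%s/repo/%s\n' "$1" "$2" "$3" "$1"
}

# 192.168.1.10 is a placeholder address; use your own server's IP.
pacoloco_repo_stanza dkp-libs 192.168.1.10 9129
pacoloco_repo_stanza dkp-linux 192.168.1.10 9129
```

The printed lines can be pasted into /etc/pacman.conf in place of the previous Server directives.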
mirrorlist
We should disable every active mirror by commenting it out (typing # at the beginning of its line).
The pacoloco server should then be added, ideally on the first line.
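Both steps can be scripted. The filter below is a sketch rather than a tested migration tool: rewrite_mirrorlist is a hypothetical name, and the /repo/archlinux/$repo/os/$arch path assumes the repository is called archlinux in the pacoloco configuration and that its upstream URLs end in /archlinux, as in the example above. It reads a mirrorlist on standard input and writes the result to standard output, so it never touches the real file by itself.

```shell
# Hypothetical filter: put the pacoloco server first and comment out every
# active "Server" line of the mirrorlist read from stdin.
# $1 = pacoloco host, $2 = pacoloco port
rewrite_mirrorlist() {
    # $repo and $arch are pacman variables, so they must stay literal here.
    printf 'Server = http://%s:%s/repo/archlinux/$repo/os/$arch\n' "$1" "$2"
    # Lines that are already commented out do not match and pass through unchanged.
    sed 's/^[[:space:]]*Server/#&/'
}

# Example usage (192.168.1.10 is a placeholder address):
#   rewrite_mirrorlist 192.168.1.10 9129 < /etc/pacman.d/mirrorlist > mirrorlist.new
```

Review the generated file before copying it over /etc/pacman.d/mirrorlist.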
Starting Pacoloco
At this point, the configuration is complete and pacoloco is ready to start.
The service can be started (and enabled on every boot) with a single systemctl command:
# systemctl enable pacoloco --now
Update
After a year of use, the only problems I've noticed are related to parallel downloads and timeouts. [1]
If you set the ParallelDownloads setting to a very high value, timeouts will occur. Changing the XferCommand to force pacman to use another download utility with configurable timeouts should fix those issues. [2]
Nonetheless, I now manage to max out the bandwidth of a 1 Gbps LAN when all packages are cached (thanks to prefetching). Before migrating to this tool, effective speeds were within the range of 200–300 Mbps (the maximum bandwidth that my nearest tier-2 mirror allows for each client).
[1] Pacoloco version 1.5-1 (jul-23) is used.
[2] As mentioned in the Arch Linux forums.