Automated blog builds with Sourcehut

This blog is created using the Pelican static site generator. Once a new article is ready, I generate the HTML pages locally using the pelican -lr command to make sure everything looks good, then I use make rsync_upload to publish the changes online. Since the sources of the blog are stored in a git repository, I then need to remember to commit the changes (usually, the markdown source file of the new article).
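For reference, the manual routine boils down to something like this (the article file name is just an example):

# Preview locally: regenerate on change and serve on http://localhost:8000
pelican -lr

# Upload the generated pages to the Web server (the Makefile target mentioned above)
make rsync_upload

# ...and remember to commit the new article!
git add content/my-new-article.md
git commit -m "New article"
git push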

I am a happy Sourcehut user and I wanted to try its build service. Every time changes are pushed to a git repository hosted on Sourcehut, it is possible to trigger a series of actions, described in a manifest, that will be run in a virtual machine on builds.sr.ht. If you've ever used GitHub Actions or GitLab pipelines, you should be pretty familiar with this concept.

According to the documentation:

This is generally used to compile and test patches, deploy websites, build and publish packages, and so on.

Exactly what I'm looking for!

Testing locally with a Linux container

Before digging into Sourcehut's build system, I wanted to see how to achieve this locally. After all, when a new build is triggered on builds.sr.ht, all it basically does is create a virtual machine and run some commands in it.

On Linux, we have access to Linux containers (LXC). Once installed and initialized, it is very easy to fire up a new container running just about any Linux distribution and play with it. Alan Pope wrote a very nice introduction on how to get started with Linux containers.
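For reference, on Ubuntu the initial setup is roughly this (the snap-based LXD install is an assumption; see Alan Pope's article for the details):

# Install the LXD container manager and run its interactive initialization
sudo snap install lxd
sudo lxd init                  # the defaults are fine for local testing

# Let the current user talk to the LXD daemon (log out and back in afterwards)
sudo usermod -aG lxd "$USER"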

I wanted a Linux distribution that is as lean as possible. After all, my requirements are really simple:

  1. Install Python, Git, OpenSSH and rsync
  2. Grab the latest version of my blog's git repository
  3. Create a Python virtual environment and install Pelican in it
  4. Generate the website based on it
  5. Publish it by uploading the files to my Web server

I decided to go with Alpine Linux, which is really tiny: 2.94MB for the base image!

Creating a container based on Alpine is as easy as

lxc launch images:alpine/3.18 mycontainer

Once the container is ready (and it's really a matter of seconds with Alpine), it can be accessed with

lxc shell mycontainer

There, I can do everything I want, including completely messing up the system; it doesn't matter! I can just delete the container when I'm done and create a new one.
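Throwing away a container and starting over is just a matter of (sketch):

# Stop and delete the container, then recreate it from the same image
lxc delete --force mycontainer
lxc launch images:alpine/3.18 mycontainer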

As listed above, the first thing I want to do is to install the required software. Very easy:

apk update
apk add python3 git openssh rsync

I can then clone the blog repository with:

git clone https://git.sr.ht/~pieq/blog
cd blog

Next, I want to create a Python virtual environment, install Pelican (and the markdown package so it can process articles written in Markdown) and run it to generate the HTML pages:

python -m venv venv
venv/bin/python -m pip install "pelican[markdown]"
venv/bin/pelican
Done: Processed 8 articles, 0 drafts, 32 hidden articles, 1 page, 0 hidden pages and 0 draft pages in 0.94 seconds.

The output/ folder now contains all the content generated by Pelican.

Even better: The container has an IP address that can be accessed from the host:

lxc list
+--------+---------+----------------------+-----------------------------------------------+-----------+-----------+
|  NAME  |  STATE  |         IPV4         |                     IPV6                      |   TYPE    | SNAPSHOTS |
+--------+---------+----------------------+-----------------------------------------------+-----------+-----------+
| alpine | RUNNING | 10.146.223.48 (eth0) | fd42:d9eb:b89d:1507:216:3eff:fe92:d01e (eth0) | CONTAINER | 0         |
+--------+---------+----------------------+-----------------------------------------------+-----------+-----------+

So, from within the container, I can ask Pelican to serve the generated files over this IP address:

venv/bin/pelican --listen --bind 10.146.223.48

And now, I can open a Web browser on the host and navigate to http://10.146.223.48:8000/ to see the blog in action!

I have everything needed, but before the website can be published, I need to set up SSH.

Keys and secrets

Once the website is built, I want to send it over to the Web server where it will be hosted, so that people can access it by typing its URL. To do that, I need to

  1. create an SSH key
  2. put its public part on the Web server
  3. host the private part as a "secret" on Sourcehut

To create the SSH key, I use the ssh-keygen command with the Ed25519 algorithm:

ssh-keygen -t ed25519 -f $HOME/.ssh/srht_blog

It will generate the private key (srht_blog) and its public counterpart (srht_blog.pub).

I then connect to the Web server and append the content of srht_blog.pub (the public key) to $HOME/.ssh/authorized_keys.
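This can be done by hand, or with ssh-copy-id (the user and hostname below are placeholders):

# Append the public key to ~/.ssh/authorized_keys on the Web server
ssh-copy-id -i $HOME/.ssh/srht_blog.pub user@my-web-server.example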

Finally, I create a secret in Sourcehut by pasting the content of srht_blog (the private key) in the interface and selecting "SSH Key" as the secret type. This generates a new secret in the form of a UUID.

Now everything can be put together.

Using Sourcehut's build system

Sourcehut's build system is very simple, yet very powerful. If a .build.yml manifest is present in your git repository, every time you push a new commit, Sourcehut follows the rules in it to fire up a virtual machine and run commands in it. It can be used to build software automatically or, in my case, to build and deploy my blog!

I ended up with a manifest like this:

image: alpine/latest
packages:
  - python3
  - openssh
  - rsync
sources:
  - https://git.sr.ht/~pieq/blog
environment:
  deploy: shinezine_pierreequoyfr@ssh-shinezine.alwaysdata.net
secrets:
  - 3f0e1f32-8445-433a-a196-26977bdf2f11
tasks:
  - setup: |
      cd blog
      python -m venv venv
      venv/bin/python -m pip install "pelican[markdown]"
  - build: |
      cd blog
      venv/bin/pelican content/ -o output/ -s publishconf.py
  - deploy: |
      cd blog
      sshopts="ssh -o StrictHostKeyChecking=no"
      rsync -e "$sshopts" -Prvzc --include tags --cvs-exclude --delete output/ "$deploy:/home/shinezine/pierreequoyfr/blog/"

This is more or less a copy/paste of the initial investigation I did on my computer. I use my blog's git repository as the source and define a $deploy environment variable that I use in the deploy task. The secret I created gives the build access to the SSH key required to push the data to the Web server, and the StrictHostKeyChecking=no SSH option avoids having to manually validate anything when connecting to it (since this whole thing is automated!).

Once ready, I commit my article in git, then push to my remote repository, and Sourcehut will provide a link to the build:

Enumerating objects: 6, done.
Counting objects: 100% (6/6), done.
Delta compression using up to 4 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 3.63 KiB | 3.63 MiB/s, done.
Total 4 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Build started:
remote: https://builds.sr.ht/~pieq/job/1023479 [.build.yml]
To git.sr.ht:~pieq/blog
   afe2944..62a81a2  main -> main

Et voilĂ !

One thing I like with Sourcehut is that there is a Web interface where you can submit a manifest directly, without having to commit anything to git. This is very helpful to test a manifest before including it in your repo.

Another cool thing is that if your build fails, Sourcehut allows you to log into the build VM over SSH to investigate what went wrong. I'm not sure if GitLab allows this, but GitHub certainly does not. You can even add shell: true to your manifest to always enable SSH access!
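For reference, the shell option is a single top-level entry in the manifest:

# Keep the build VM alive after the tasks finish so it can be accessed over SSH
shell: true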

Anyway, if you read this, it means that the manifest worked well and my blog was updated after committing this article into the git repository! \o/