A promising mess

Published:

Tags: Linux

For the past few years, the Linux ecosystem has been going through quite a lot of evolution to adapt to new situations. These are exciting times, but this also brings a lot of challenges for both users and developers.

In this article, I will go through a few examples, provide some history, explain why these are challenging, and try to end on a positive note nevertheless ;)

Audio

For a long time, the default audio stack used in most Linux distributions consisted of ALSA (for low-level, direct access to audio hardware) and PulseAudio (for higher-level, user applications). Although it's possible to develop an application that targets ALSA directly, most applications prefer to delegate this to a sound server like PulseAudio to benefit from its features: per-application volume setting, dynamic handling of inputs and outputs, pretty good support for Bluetooth devices, streaming over the network, etc.

However, one of the drawbacks of PulseAudio is that it's not made for low-latency, real-time usage. This is not really a problem for general users, but it is for musicians and sound engineers, who need real-time (or close to real-time) feedback when recording or playing instruments. They had to install and configure another sound server, JACK, which was developed specifically for this use case. Of course, PulseAudio and JACK don't play very well together, so some setup and configuration is required before getting low-latency audio on Linux.

Enter PipeWire.

Although PipeWire was initially developed as a video server (its goal was to improve handling of video on Linux the same way PulseAudio improved handling of audio), it quickly became capable of also handling audio. Its creator, Wim Taymans, saw the potential of PipeWire to handle not only general use cases, but also professional ones (low latency), with the goal of unifying them (he even consulted with Paul Davis, the creator of JACK and main developer of the Ardour digital audio workstation).

PulseAudio is now slowly being replaced by PipeWire in Linux distributions, now that PipeWire has gained the features required for general use (such as Bluetooth support).

Video

For decades, the way to get windows displayed on the screen on Linux has been to use the X protocol. It was designed in the mid-80s and has served the world really well all that time.

However, its design became more and more complex over the years, and some aspects that did not seem so important back in the day (security, performance) have become problematic in recent years. For instance, being able to export an application window over the network used to matter more than having the fastest refresh rate. And since a lot of the communication happened on trusted networks, there was no perceived need for encryption when sharing data between applications.

To address these issues, a new protocol was started back in the late 2000s: Wayland.

Wayland provides a lot of great features for the Linux desktop, but since it came with a security-first design, it was initially quite frustrating to use. For instance, to prevent applications from stealing data (as keyloggers do), it was impossible to copy-paste text from one application to another. For a long time, it was also close to impossible for a user to share their screen in video conferencing software (this was fixed with the help of PipeWire, introduced above).

Because of these early frustrations, many Linux distributions waited a long time before offering Wayland by default. For instance, even though Wayland development started in 2008, Ubuntu only switched to it in 2021, and only for non-Nvidia GPUs (and anyway, it's still possible to use X instead).

And even though Wayland has come a long way, there are still some outstanding issues for some categories of users. For instance, color management is still not on par with X, leading to a lot of artists still relying on X.

Disco Packaging

Packaging an application is the art of turning its source code into something that can be consumed by end users. In other words, it's the mandatory step that will allow users to download, install and launch the application.

For application developers, this is often a dreaded step, because each different platform requires a different package. And I'm not talking Linux-only here: if you want your application to be available on Windows or Mac, you will have to package it for Windows or Mac!

On Linux, things are even more complicated: there is not a single Linux platform, but a multitude of Linux distributions (e.g. Arch Linux, Debian, Fedora, Ubuntu, etc.), each with its own way of packaging applications. Debian and its derivatives (Ubuntu, Pop!_OS, etc.) use the deb format while Fedora uses the RPM format, for instance.

On top of that, Linux distributions evolve over time. Maybe your application uses version 2.0 of a library, but this version is only available on the latest version of, say, Ubuntu. Older versions of Ubuntu may ship v1.8 or even older versions of this library, so if you want to publish your application for older versions of Ubuntu, you're in for a fun ride.

Publishing an application on all the possible versions of all the possible Linux distributions is a Sisyphean task. If your application is open source, the community can help by packaging it for a specific Linux distribution (usually because the packager wanted your application on their Linux distribution of choice, but it was not available, so they rolled up their sleeves and got to work for the good of the community). If it's not, then no one other than yourself can really help.

Over time, two main problems emerged with the current state of packaging on Linux: updates and security.

Applications made available in the Linux distributions' repositories (the “app stores”) are often subject to release schedules. Debian releases, for example, are subject to "freezes", after which new versions of applications are no longer accepted. This means that the version of an application available for installation on a given Linux distribution might not be the latest one made available by the application developer.

Moreover, an application installed as a traditional package has, by default, access to everything your user account can access. This means that a malicious application could, once installed, steal private data from your disk and upload it somewhere. This is mitigated by the fact that repositories are handled by a community of trusted packagers and maintainers, but there are ways to install applications that bypass these repositories (e.g. PPAs on Ubuntu) and that might not receive the same level of scrutiny in terms of security.

In order to address all of these issues (high number of targets to release to, outdated versions in official repositories, security), new packaging methods were invented: Flatpaks and Snaps. They offer some kind of privilege management (similar to Android applications that have to ask user permission to access their camera or their files) and are self-contained (so, in theory, you can build once, deploy everywhere, whenever a new version is available).

The mess

The beauty of free, libre and open source software is its flexibility. If the existing solution doesn't suit your needs, you are free to build upon the existing ecosystem to create your own solution, or to implement your solution from scratch.

However, the well-known xkcd comic about competing standards illustrates the issue with having multiple solutions to the same problem.

Currently, a Linux user might be using X, or might be using Wayland; they might be using ALSA directly, or PulseAudio, or PipeWire; they might install their applications straight from their favorite Linux distribution's repositories, or from a third-party repository, or from Flathub, or from the Snap Store.

For the user, it's a mess. With the new-ish display and sound standards, the old ways of doing things may not work anymore, or they may work differently. I used to make fun of the way to install software on Windows (go to a random website, download a random .exe, double-click on it and hope for the best), but today's situation is not really better on Linux: with several competing ways of installing an application, how do you know which version of an application is the latest one available? How do you install it?

For the application developer, it's a mess. When a user reports a bug, having multiple standards for the display, for the sound and for how the application is installed means spending a lot of time debugging, identifying the root cause and fixing it. Sometimes, it's a real bug in the application, but it might also be a problem with the packaging. Or with the underlying library or framework working with the audio or video protocols.
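To make this a bit more concrete, here is a minimal sketch of the kind of helper an application could use to attach environment details to a bug report. It only guesses the display protocol and the packaging format from environment variables that are commonly set (XDG_SESSION_TYPE, WAYLAND_DISPLAY and DISPLAY for the display, FLATPAK_ID inside a Flatpak sandbox, SNAP inside a Snap). The function name describe_environment and the returned labels are made up for this example, the audio stack is not covered, and the result is a hint rather than a guarantee: for instance, an X application running through XWayland on a Wayland session will still be reported as "wayland".

```python
import os
from typing import Mapping


def describe_environment(env: Mapping[str, str]) -> dict:
    """Guess the display protocol and packaging format from environment variables.

    Hypothetical helper for illustration; it reports a best guess only.
    """
    # XDG_SESSION_TYPE is usually set by the login session ("wayland", "x11", "tty"...).
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session in ("wayland", "x11"):
        display = session
    elif env.get("WAYLAND_DISPLAY"):
        # Fallback: a Wayland compositor socket name is advertised.
        display = "wayland"
    elif env.get("DISPLAY"):
        # Fallback: an X server address is advertised.
        display = "x11"
    else:
        display = "unknown"

    # FLATPAK_ID is set inside a Flatpak sandbox, SNAP inside a Snap.
    if env.get("FLATPAK_ID"):
        packaging = "flatpak"
    elif env.get("SNAP"):
        packaging = "snap"
    else:
        # Could be a distribution package, a tarball, a source build...
        packaging = "native or unknown"

    return {"display": display, "packaging": packaging}


if __name__ == "__main__":
    print(describe_environment(os.environ))
```

Taking the environment as a parameter (instead of reading os.environ directly inside the function) keeps the helper easy to test with fake values, which matters when the whole point is to cover several combinations.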

For the library developer, or people developing underlying systems (e.g. desktop environments such as GNOME or KDE), it's a mess. They have to handle several paths a scenario could take. They have to work hard to enable new protocols while maintaining older protocols at the same time. Sometimes, they have to develop things several times (once per protocol). And more code means more maintenance, more burden, more bugs…

The future

The current status of the Linux ecosystem is not great, but with all that said, I'm hopeful for the future.

As newer generations of audio and video protocols gain maturity, the older protocols will stop being used, which will reduce the number of use cases to care about.

Applications will start adding native support for new protocols and with time, will stop using the older ones. This happened in the past when OSS was replaced by ALSA.

The way to access packaged applications (command line tools, graphical tools such as GNOME Software, etc.) will evolve to take several packaging formats into account, and this will make it easier for the user to get the software they want. Packaging formats will continue to evolve, leading to some formats being better suited for certain usages, instead of competing in the same area.

In other words: it's a mess, but it's a promising mess.