The 'Bring Your Own Container' (BYOC) Model
27 Apr 2024
Introduction
In this article, I will outline a model for managing a shared computer environment and how to create reproducible environments, especially useful within a research group. We have been testing this model for the last two years at the Robot Manipulation Lab at the University of Leeds with our robot manipulator with great success. I refer to this model as the “Bring Your Own Container” (BYOC) model.
At its essence, the BYOC model relies on a set of constraints imposed on the users of the system. Namely, everyone is a user without elevated permissions, and everyone works inside isolated containers. Each container is self-contained and packages the software for a specific task. For example, a pose estimation system like DOPE would live in one container, while a motion planning stack would live in another.
This article aims to encourage people to adopt a similar philosophy and create systems that are versatile and reproducible. The article includes, I hope, insightful notes on how to set up such a system.
Today I will discuss a lesser-known container software called Apptainer/Singularity, and explain why we use it in our robotics research group over other solutions. These notes, however, are not specific to robotics or research. A similar configuration can be adopted in any situation where a computer is shared among users who would otherwise require elevated permissions.
1. Background and why one should consider the BYOC model
During my PhD, while working on shared computers (such as a robot computer or a computer with a powerful GPU for machine learning tasks), I often encountered installation conflicts, also known as dependency hell. In several cases, I had to reformat the computer. When multiple users with elevated permissions (i.e., sudo) work on the same computer and have different requirements, issues can arise. This is due to the difficulty in predicting how one change may impact other environments and users.
I encountered dependency issues even when working alone on my own computer. Installing a package or building one from source often required a specific version of a library, leading to conflicts with other dependencies. This resulted in my system becoming broken over time.
Containers to the rescue
While working as a software developer in industry in 2016, I was introduced to Docker containers by my colleague David. Although I did not use them at that time, I understood their benefits and the issues they addressed. When I began my PhD, the University of Leeds provided HPC (High-Performance Computing) facilities that mandated the use of Singularity containers (now known as Apptainer containers) for code deployment. We also utilised Apptainer containers for teaching robotics at the school, specifically to have ROS installed in an Ubuntu environment.
I started testing the “containerisation” of my research on my personal computer. At the same time, another PhD student, Logan, was also experimenting with these containers on his personal computer, so we gradually became the go-to people in the lab for containerising software. Since then, I have started “containerising” parts of my research to make it easily accessible to the community. It was extremely helpful to have different software running smoothly on my computer without causing any issues, and being able to recreate that environment whenever needed, even on a new machine. Sharing my code and software with others was also easy.
Fast-forward to 2021: when I started my post-doc, Mehmet and I decided to apply this containerisation philosophy to shared computers in order to create a more versatile environment for everyone.
2. The BYOC (Bring Your Own Container) Model
When I started my post-doc, I had the opportunity to oversee and influence the development and integration of the robotic system. I was responsible for architecting the entire system to support our research for the next 5 years.
From the beginning, my goal was to create a versatile shared environment for everyone. We decided to implement certain policies for users and create a system that would not deteriorate over time. The BYOC model involves imposing constraints on users and utilising containers to install and manage software.
2.1 Operating System
The assumption here is that you are using a Linux distribution. The specific distribution and version are not important. Containers allow you to install different operating systems on top of a host OS. For example, you could be running Fedora as the host OS, but have Ubuntu 22.04 installed on top of it in a container.
2.2 Separating OS files from user files (optional)
Although this step is optional, we have chosen to separate the operating system from the user data. Our servers are equipped with two SSDs: one small but very fast SSD for hosting the operating system files, and another, larger SSD (8 TB) for hosting user data in the /home directory.
In Linux, you have the ability to specify where the /home directory is stored, even on a separate disk. This allows you to keep most of the OS files on one drive and host the /home directory on another.
I recommend this approach because if you ever need to format the computer to install a newer version of the operating system, your users’ data will not be affected.
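In practice, this usually comes down to mounting the second drive at /home, for example via an /etc/fstab entry along the following lines (a sketch; the device path is a placeholder, and in practice you would reference the drive by its UUID):
# /etc/fstab: mount the large data SSD at /home
/dev/nvme1n1p1   /home   ext4   defaults   0   2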
2.3 The administrator
The system has one and only one administrator account with elevated permissions (root access). This account is not used on a daily basis by anyone; it is reserved for essential maintenance tasks such as upgrading packages or installing necessary system-wide software like a web browser, text editor, or utility tools.
Access to the administrator account is limited to a few individuals with management responsibilities (such as PIs, Post-Docs, etc.), and they agree not to use the account for daily tasks or to install non-essential software for personal use.
2.4 Users
Everyone else, including the system manager, has a standard account without elevated permissions. Users can have elevated permissions inside a container, but this cannot, in turn, harm the host operating system.
With this policy in place, if the system malfunctions, the blame falls on the administrator. It is essential for those managing the system to adhere to the above constraints and refrain from using the administrator account on a daily basis.
2.5 Container Software
There are multiple container technologies available today, with Docker being the most popular and widely used. While Docker is great for software development, it is not optimised for HPC. For example, running a GPU-intensive simulator with a GUI may not be optimal using Docker. I am aware of Rocker (a Docker derivative aimed at tackling this problem), but I have not had the opportunity to test it yet.
We have found Apptainer (previously called Singularity) to be particularly well-suited for robotics, as it is optimised for High-Performance Computing tasks and the processes running in the container are almost “native” to the host OS. This results in minimal impact on performance. We have had great success using Apptainer/Singularity, and in this article, I will discuss Apptainer specifically.
One of the great features of Apptainer is its compatibility with Docker images. You can bootstrap an Apptainer container from a Docker image, even pulling that image from DockerHub.
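As a quick sketch of this (the image and tag here are only examples), pulling a Docker image from DockerHub and converting it into an Apptainer image file is a one-liner:
# Pull ubuntu:22.04 from DockerHub and convert it into an Apptainer image (SIF)
apptainer pull ubuntu_22.04.sif docker://ubuntu:22.04
# Open a shell inside the resulting container
apptainer shell ubuntu_22.04.sif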
2.6 Documentation
Another important ingredient for success with such a system is to develop comprehensive documentation of your system. We have two such documents, one for regular users and one for administrators.
User Docs
Most new users who join the group may not be familiar with these concepts and will require an “induction”. Having user documentation in place can be very helpful in making it easier for them to access important knowledge and also in preserving that knowledge over time as people join and leave the group. We host our user documentation on GitHub and use Sphinx with markdown. New users can clone the docs and run make html to build the HTML locally and access the documentation.
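A minimal sketch of that workflow (the repository URL is hypothetical, and I assume MyST provides the markdown support for Sphinx):
git clone https://github.com/your-lab/user-docs.git   # hypothetical repository
cd user-docs
pip install sphinx myst-parser    # Sphinx plus markdown support
make html                         # build the HTML documentation
xdg-open _build/html/index.html   # open it in a browser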
Admin Docs
Although not as detailed as the user documentation, admin documentation can be useful to preserve important details about the system design.
2.7 Enabling remote access to multiple concurrent users
The server we use is very powerful and can, most of the time, accommodate multiple users simultaneously. We have ensured that users can work on their accounts and in their containers remotely, and the rule is to work remotely unless there is a need to run things on the real robot. We use VNC to allow remote access to the server.
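As a sketch of how a user might connect (the hostname and display number are examples, and I assume a VNC server is already running for that user on the server), the VNC port can be forwarded over SSH and a VNC viewer pointed at localhost:
# Forward VNC display :1 (port 5901) over SSH, then connect a VNC viewer to localhost:5901
ssh -L 5901:localhost:5901 YOUR_USERNAME@server-1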
3. Case Study: containerising a robotic manipulation system
Please note that your users may initially find this counter-intuitive and experience friction that seems unnecessary at first. However, over time, they will become accustomed to this way of working and will see the benefits for themselves.
The robotic system consists of the following components:
- A Franka Emika Panda robot.
- A ‘Robot Computer’ with Ubuntu and a real-time kernel patch (also known as PREEMPT-RT).
- Several powerful servers running Ubuntu/Linux, accessed by multiple users.
- Robot sensors and end-effectors.
Each computer has a hostname defined in /etc/hosts. Each computer has a single administrator (not used daily) and multiple users (without elevated permissions) for daily use. All users have their own user space and must work within containers in order to install any software. On the host OS, they can only access basic tools such as a browser and code editor.
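For illustration, the /etc/hosts file on each machine contains entries along these lines (the IP addresses and server names below are made up):
# /etc/hosts (example entries)
192.168.1.10   robot-computer
192.168.1.20   server-1
192.168.1.21   server-2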
The Robot Computer
The robot arm is directly connected to the Robot Computer. This PC has the PREEMPT-RT patch and ROS installed directly on the host. There is a single administrator account that is not used on a daily basis, and a number of accounts without elevated permissions, one for each user of the system. The Robot Computer also serves as the ROS Master of the system.
Each user of the system will have their own account (without elevated permissions) and will need to create their ROS workspace using our fork of franka_ros. Our fork includes a modified launch file with the cobot pump end-effector and also integrates our ROS implementation of the pump.
The Servers
The robot computer is not used frequently as it is headless (accessible via SSH) and is only utilised to initiate the ROS drivers for the robot arm. Users will SSH into it from a server with the sole purpose of starting the robot drivers and initiating a roscore.
For the Panda robot specifically, a user needs to access the “Desk” to enable and unlock the robot. This can only be done on the “Robot Computer,” which is headless. We configured the servers such that the desk can be accessed from a server using the following alias:
alias desk="ssh -YC YOUR_USERNAME@robot-computer firefox https://panda-control -no-remote"
Users will spend most of their time on one of the servers. Containerisation happens on the servers. A user’s “daily routine” may involve the following process:
- Logging into the server with their account.
- Opening the Desk via SSH to turn on the robot (i.e., running desk in the terminal to invoke the alias above).
- SSH’ing into the Robot Computer to start the drivers and ROS.
- Starting a container for a specific task on the server they are working on, for example:
- Starting a container to run the ROS drivers for a sensor like RealSense.
- Starting a container to run motion planning libraries.
- Starting a container to run their code.
All these containers can communicate with other containers and even computers on the network (including the ‘Robot Computer’) using ROS out of the box, as long as ROS_MASTER_URI and ROS_IP/ROS_HOSTNAME are correctly set within the .bashrc file of the container.
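For example, the container’s .bashrc might contain something along these lines (a sketch, assuming the ROS master runs on the Robot Computer on the default port and that hostnames are resolvable via /etc/hosts):
# Point ROS at the master running on the Robot Computer (11311 is the default master port)
export ROS_MASTER_URI=http://robot-computer:11311
# Advertise this machine's address so other ROS nodes can reach this one
export ROS_IP=$(hostname -I | awk '{print $1}')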
4. How to build Apptainer containers
I provide a template container repo here. You can use this template repo to start a new container.
You will find the following files:
- home/: This directory becomes the “home” directory of the container. It is optional, as you could have the container access the home directory of the host OS. However, that is not recommended because any user-specific changes would affect your host OS and other containers.
- scripts/: These are the “recipe” or “definition” files used to build the container:
  - post_script.sh: This file contains all the commands that you want to run during the build. These commands should be non-interactive, meaning they should not require user input.
  - run_script.sh: This file contains all the commands that you want to run every time the container is executed. Typically, user-specific commands are included here.
- Singularity: This file specifies the base image to pull from DockerHub (e.g., Ubuntu 20.04) as well as other metadata and the paths to the post and run scripts.
- Makefile: For convenience, this Makefile allows one to build and use the containers. Note that certain commands in this file require elevated permissions; however, you will see a --fakeroot flag to address this and allow a user to build a container without elevated permissions. The targets are the following (see the sketch after this list):
  - make sandbox-build: Builds a sandbox container (a mutable container).
  - make sandbox-run: Runs a sandbox container.
  - make sandbox-write: Runs a sandbox container in write mode, allowing for the installation of new system-wide packages.¹
  - make img-build: Builds an image (an immutable container).
  - make img-run: Runs an image container.
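Under the hood, these targets wrap Apptainer commands roughly like the following (a sketch; the exact flags in the template may differ, and the container name is just an example):
# Build a mutable sandbox container from the Singularity definition file
apptainer build --fakeroot --sandbox my_container Singularity
# Run the sandbox (executes the %runscript)
apptainer run my_container
# Open the sandbox in write mode to install system-wide packages interactively
apptainer shell --writable --fakeroot my_container
# Build and run an immutable image (SIF) instead
apptainer build --fakeroot my_container.sif Singularity
apptainer run my_container.sif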
Here is an example of the Singularity file:
Bootstrap: docker
From: ubuntu:20.04
%labels
AUTHOR Rafael Papallas (rpapallas.com)
%environment
export LANG=C.UTF-8
export LC_ALL=C.UTF-8
%files
scripts /scripts
%post
/scripts/post_script.sh
%runscript
exec /bin/bash "$@" --rcfile /scripts/run_script.sh
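The run script itself is not shown above; a minimal sketch of what /scripts/run_script.sh might contain, assuming ROS Noetic and a catkin workspace inside the container (both assumptions, not part of the template):
# Executed every time the container starts (via the --rcfile option above)
source /opt/ros/noetic/setup.bash     # assuming ROS Noetic is installed in the container
source ~/catkin_ws/devel/setup.bash   # assuming a catkin workspace exists in the container home
# plus the ROS_MASTER_URI / ROS_IP exports discussed in Section 3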
Here is a basic version of the post script:
#!/bin/bash

apt-get update
DEBIAN_FRONTEND=noninteractive apt-get install -y keyboard-configuration
DEBIAN_FRONTEND=noninteractive TZ="Europe/London" apt-get install -y tzdata
apt-get -y upgrade
apt-get install -y \
wget \
tmux \
vim-gtk \
zip \
unzip \
git \
build-essential \
pypy \
cmake \
curl \
software-properties-common \
apt-utils \
python3-pip \
ninja-build \
python-is-python3 \
&& apt-get -y autoremove \
&& apt-get clean
# Let's have a custom PS1 to help people realise in which container they are working.
CUSTOM_ENV=/.singularity.d/env/99-zz_custom_env.sh
cat >$CUSTOM_ENV <<EOF
#!/bin/bash
PS1="[CONTAINER_NAME] Singularity> \w \$ "
EOF
chmod 755 $CUSTOM_ENV
It will install essential software such as wget, tmux, zip, git, etc. Commands to install ROS should also be included here. However, it is important to note that every command in this file must be non-interactive. For example, when using apt-get install, we include -y to make it non-interactive. Additionally, there is no need for sudo, as these commands will be executed by root.
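As an example of the kind of ROS commands that would go in this post script, a non-interactive ROS Noetic installation on the Ubuntu 20.04 base could look roughly like this (following the standard ROS installation steps; adapt to your ROS distribution):
# Add the ROS package repository and key (Ubuntu 20.04, "focal")
sh -c 'echo "deb http://packages.ros.org/ros/ubuntu focal main" > /etc/apt/sources.list.d/ros-latest.list'
curl -s https://raw.githubusercontent.com/ros/rosdistro/master/ros.asc | apt-key add -
# Install ROS Noetic without prompting for input
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get install -y ros-noetic-desktop-full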
Here is a list of notable containers we have built:
- MuJoCo MPC
- OpenRAVE controller I built during my PhD.
- Isaac Orbit Physics Simulator
- … and a number of private containers.
5. Summary and takeaway points
In this article, I outline our philosophy of managing a complex system used by numerous users. The main idea is to isolate each user to their own account, without permissions to modify the system’s state, by requiring them to work within containers in their user space. In detail:
- We use Apptainer (previously known as Singularity) containers because they are excellent for HPC and ML tasks and have minimal virtualisation impact, even with the GPU.
- No user on the system has elevated permissions, not even the “system manager”.
- We have a single administrator account to manage the system, but this account is not used on a daily basis by anyone.
- Everyone works within containers they build, allowing them to create reproducible environments.
- We enable remote access via VNC so people can work with GUI even remotely and accommodate concurrent connections to the system.
This entire process may seem unnecessary and may introduce friction for a group of users; however, it will pay dividends over time for three main reasons:
- It will be extremely difficult to break a system that operates in this manner because every user is isolated. As a result, users can rely on a robust system to do their research.
- It will instil a useful habit of automating the creation of such software environments through the Apptainer definition file, enabling one to replicate the exact same environment in the future.
- It will make it easier for users to share their research in the future by simply sharing their definition files.
1. This could be a trap. Anything you install in this mode will not be transferable because those commands are not in the definition file. An agreement you need to make with yourself is to install packages and configure the container using the write mode, but always remember to update the post_script.sh file to ensure those changes are available for the next build. The write mode is useful for experimentation when you want to try different packages, because changing the definition file and rebuilding is time-consuming.