Ways to package a Python web application

Here I talk about the pros and cons of Docker images, Python wheels and CentOS’ RPMs.

Updated on 2020-09-15 with a special Puppet case.

Docker

To package a system-wide application written in Python, such as a web server, a Docker image is the best solution. The most important advantage is that you can develop and test in a virtual environment (venv) with the latest dependencies, and then build the exact same venv inside the Docker image.
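As a minimal sketch of building "the exact same venv" in the image, assume the dependencies are pinned in a requirements.txt file (for example, the output of pip freeze from the development venv) and a POSIX layout. The function names and paths here are hypothetical, not part of any tool:

```python
from __future__ import annotations

import subprocess
import venv
from pathlib import Path


def pip_install_command(venv_dir: Path, requirements: Path) -> list[str]:
    """Build the pip command that installs pinned requirements into venv_dir."""
    # Assumes a POSIX layout (bin/python); Windows uses Scripts\python.exe.
    python = venv_dir / "bin" / "python"
    return [str(python), "-m", "pip", "install", "--no-cache-dir",
            "-r", str(requirements)]


def build_venv(venv_dir: Path, requirements: Path) -> None:
    """Create a fresh venv and install the pinned dependencies into it."""
    # clear=True wipes any existing directory so the build starts from scratch.
    venv.EnvBuilder(with_pip=True, clear=True).create(venv_dir)
    subprocess.run(pip_install_command(venv_dir, requirements), check=True)
```

During the image build this would be called as a non-root user, for instance build_venv(Path("/opt/app/venv"), Path("requirements.txt")). In practice the same two steps are often written directly as shell commands in the Dockerfile.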

Wheel

Another method for packaging Python applications is a wheel. However, it can’t be used in this case: to install a package system-wide you need to be root, and running pip as root is dangerous because you may end up running arbitrary code from the setup.py file of any dependency that is built from source.

RPM or DEB

If you need to package the Python web server as an RPM, it can be a more difficult task than using Docker. The problem is that dependencies aren’t updated often in official or community repositories, or are not found there at all. This means that you will have to provide the missing dependencies by packaging them as RPMs yourself and uploading them to a community repository (or creating a repository for your software). An RPM can be created from source by running python3 setup.py bdist_rpm. I recommend reviewing the generated spec file first by appending --spec-only.
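If you want to script the "review the spec first, then build" workflow, a small wrapper along these lines might do. It is only a sketch: the function names are hypothetical, and it uses the current interpreter (sys.executable) in place of a literal python3:

```python
from __future__ import annotations

import subprocess
import sys
from pathlib import Path


def bdist_rpm_command(spec_only: bool = True) -> list[str]:
    """Return the setup.py invocation, optionally stopping after the spec file."""
    cmd = [sys.executable, "setup.py", "bdist_rpm"]
    if spec_only:
        # Only generate the spec file so it can be reviewed before building.
        cmd.append("--spec-only")
    return cmd


def run_bdist_rpm(project_dir: Path, spec_only: bool = True) -> None:
    """Run the command from the directory that contains setup.py."""
    subprocess.run(bdist_rpm_command(spec_only), cwd=project_dir, check=True)
```

A first pass would call run_bdist_rpm(project_dir) to produce the spec file for review, and a second pass with spec_only=False would build the actual RPM.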

As this requires a significant amount of work, I was once tempted to run pip in a %pre scriptlet in the RPM spec file. However, this means that pip runs as root and, as I said before, that is dangerous because PyPI is not curated. Although RPMs can also run arbitrary code, we may expect official CentOS repositories to be a bit more curated, though I suspect the guarantee isn’t very strong. Still, it’s probably better than PyPI, although with RPM repositories you often don’t get the latest versions of dependencies, which may contain fixes for security vulnerabilities.

Conclusion

In the end, the main issue is that Python virtual environments weren’t designed to be copied and distributed: they have to be built from scratch where the code will run. I haven’t tried, but based on what I’ve read, copying an existing venv into an RPM and distributing it can be troublesome. There is a tool for Debian packages that can do this, but there isn’t a good alternative for RPMs.
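One way to see why a venv is tied to the machine that built it is to look at its pyvenv.cfg file, whose home entry records the directory of the base interpreter. Scripts that pip installs into the venv similarly embed an absolute interpreter path in their first line. The sketch below, with hypothetical helper names, creates a throwaway venv and reads that entry back:

```python
import tempfile
import venv
from pathlib import Path


def read_pyvenv_cfg(venv_dir: Path) -> dict:
    """Parse the simple 'key = value' lines of a venv's pyvenv.cfg."""
    cfg = {}
    for line in (venv_dir / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            cfg[key.strip()] = value.strip()
    return cfg


def venv_home(venv_dir: Path) -> Path:
    """Return the base interpreter directory the venv was created from."""
    return Path(read_pyvenv_cfg(venv_dir)["home"])


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv"
    # with_pip=False keeps the demo fast and avoids needing ensurepip.
    venv.create(env_dir, with_pip=False)
    print("base interpreter directory:", venv_home(env_dir))
```

If the venv is copied to a host where that directory does not exist, or holds a different Python build, the copied environment no longer points at a valid interpreter.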

A special case I haven’t mentioned yet is creating a virtual environment through a configuration management service. For example, if your infrastructure is managed by Puppet, you can write a manifest to serve an application with all the necessary dependencies via pip. Of course, this is only useful for internal applications – you are not distributing a package.

So in conclusion, Docker images are the best way to distribute a Python web application, because you can install dependencies in the image using pip as a normal user, in an isolated environment, without ever needing root. Every other option has security downsides.
