Simplifying Offline Python Deployments With Docker – Real Python
In cases where a production server does not have access to the Internet or to the internal network, you will need to bundle up the Python interpreter and dependencies (as wheel files) along with the source code.
This post looks at how to package up a Python project for internal distribution on a machine that's cut off from the Internet.
Objectives
By the end of this post, you will be able to…
- Describe the difference between a Python wheel and egg
- Explain why you may want to build Python wheel files within a Docker container
- Spin up a custom environment for building Python wheels using Docker
- Bundle and deploy a Python project to an environment without access to the Internet
- Explain how this deployment setup can be considered immutable
Scenario
The genesis for this post came from a scenario where I had to distribute a legacy Python 2.7 Flask app to a Centos 5 box that, for security reasons, did not have access to the Internet.
Python wheels (rather than eggs) are the way to go here.
Python wheel files are similar to eggs in that both are just zip archives used for distributing code. Wheels differ in that they are installable but not executable. They are also pre-compiled, which saves users from having to build the packages themselves and thus speeds up installation. Think of them as lighter, pre-compiled versions of Python eggs. They're particularly great for packages that need to be compiled, like lxml or NumPy.
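A wheel's compatibility is encoded right in its filename as a set of tags (PEP 425): a Python tag, an ABI tag, and a platform tag. As a quick illustration, here's a minimal sketch that pulls those tags apart; the filename is a made-up example, not one from this project, and real wheel names can also carry an optional build tag that this sketch ignores:

```python
def parse_wheel_filename(filename):
    """Split a simple wheel filename into its PEP 425 compatibility tags.

    Sketch only: assumes the common 5-part form without a build tag.
    """
    stem = filename[: -len(".whl")]
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python": python_tag,      # e.g. cp27 -> CPython 2.7
        "abi": abi_tag,            # e.g. cp27mu -> wide-unicode CPython build
        "platform": platform_tag,  # e.g. manylinux1_x86_64
    }

tags = parse_wheel_filename("lxml-4.2.1-cp27-cp27mu-manylinux1_x86_64.whl")
print(tags["platform"])  # manylinux1_x86_64
```

The platform tag is exactly why the build environment matters: a wheel tagged for one platform won't install on another.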
Because of this, wheels should be built on the same environment in which they will be run, so building them across multiple platforms with multiple versions of Python can be a huge pain.
This is where Docker comes into play.
Bundle
Before beginning, it’s important to note that we will be using Docker simply to spin up an environment for building the wheels. In other words, we’ll be using Docker as a build tool rather than as a deploy environment.
Also, keep in mind that this process is not just for legacy apps - it can be used for any Python application.
Stack:
- OS: Centos 5.11
- Python version: 2.7
- App: Flask
- WSGI: gunicorn
- Web server: Nginx
Want a challenge? Replace one of the pieces from the above stack. Use Python 3.6 or perhaps a different version of Centos, for example.
If you’d like to follow along, clone down the base repo:
```sh
$ git clone git@github.com:testdrivenio/python-docker-wheel.git
$ cd python-docker-wheel
```
Again, we need to bundle the application code along with the Python interpreter and dependency wheel files. cd into the "deploy" directory and then run:
$ sh build_tarball.sh 20180119
Review the deploy/build_tarball.sh script, taking note of the code comments:
```sh
#!/bin/bash

USAGE_STRING="USAGE: build_tarball.sh {VERSION_TAG}"

VERSION=$1
if [ -z "${VERSION}" ]; then
  echo "ERROR: Need a version number!" >&2
  echo "${USAGE_STRING}" >&2
  exit 1
fi

# Variables
WORK_DIRECTORY=app-v"${VERSION}"
TARBALL_FILE="${WORK_DIRECTORY}".tar.gz

# Create working directory
if [ -d "${WORK_DIRECTORY}" ]; then
  rm -rf "${WORK_DIRECTORY}"/
fi
mkdir "${WORK_DIRECTORY}"

# Cleanup tarball file
if [ -f "${TARBALL_FILE}" ]; then
  rm "${TARBALL_FILE}"
fi

# Cleanup wheels
if [ -d "wheels/wheels" ]; then
  rm -rf "wheels/wheels"
fi
mkdir "wheels/wheels"

# Copy app files to the working directory
cp -a ../project/app.py ../project/requirements.txt \
      ../project/run.sh ../project/test.py "${WORK_DIRECTORY}"/

# Remove .DS_Store and .pyc files
find "${WORK_DIRECTORY}" -type f -name '*.pyc' -delete
find "${WORK_DIRECTORY}" -type f -name '*.DS_Store' -delete

# Add wheel files
cp ./"${WORK_DIRECTORY}"/requirements.txt ./wheels/requirements.txt
cd wheels
docker build -t docker-python-wheel .
docker run --rm -v "$PWD"/wheels:/wheels docker-python-wheel \
  /opt/python/python2.7/bin/python -m pip wheel --wheel-dir=/wheels -r requirements.txt
mkdir ../"${WORK_DIRECTORY}"/wheels
cp -a ./wheels/. ../"${WORK_DIRECTORY}"/wheels/
cd ..

# Add python interpreter
cp ./Python-2.7.14.tar.xz ./"${WORK_DIRECTORY}"/
cp ./get-pip.py ./"${WORK_DIRECTORY}"/

# Make tarball
tar -cvzf "${TARBALL_FILE}" "${WORK_DIRECTORY}"/

# Cleanup working directory
rm -rf "${WORK_DIRECTORY}"/
```
Here, we:
- Created a temporary working directory
- Copied over the application files to that directory, removing any .pyc and .DS_Store files
- Built (using Docker) and copied over the wheel files
- Added the Python interpreter
- Created a tarball, ready for deployment
Then, take note of the Dockerfile within the “wheels” directory:
```dockerfile
# base image
FROM centos:5.11

# update centos mirror
RUN sed -i 's/enabled=1/enabled=0/' /etc/yum/pluginconf.d/fastestmirror.conf
RUN sed -i 's/mirrorlist/#mirrorlist/' /etc/yum.repos.d/*.repo
RUN sed -i 's/#\(baseurl.*\)mirror.centos.org\/centos\/$releasever/\1vault.centos.org\/5.11/' /etc/yum.repos.d/*.repo

# update
RUN yum -y update

# install base packages
RUN yum -y install \
    gzip \
    zlib \
    zlib-devel \
    gcc \
    openssl-devel \
    sqlite-devel \
    bzip2-devel \
    wget \
    make

# install python 2.7.14
RUN mkdir -p /opt/python
WORKDIR /opt/python
RUN wget https://www.python.org/ftp/python/2.7.14/Python-2.7.14.tgz
RUN tar xvf Python-2.7.14.tgz
WORKDIR /opt/python/Python-2.7.14
RUN ./configure \
    --prefix=/opt/python/python2.7 \
    --with-zlib-dir=/opt/python/lib
RUN make
RUN make install

# install pip and virtualenv
WORKDIR /opt/python
RUN /opt/python/python2.7/bin/python -m ensurepip
RUN /opt/python/python2.7/bin/python -m pip install virtualenv

# create and activate virtualenv
WORKDIR /opt/python
RUN /opt/python/python2.7/bin/virtualenv venv
RUN source venv/bin/activate

# add wheel package
RUN /opt/python/python2.7/bin/python -m pip install wheel

# set volume
VOLUME /wheels

# add shell script
COPY ./build-wheels.sh ./build-wheels.sh
COPY ./requirements.txt ./requirements.txt
```
After extending from the base Centos 5.11 image, we configured a Python 2.7.14 environment, and then generated the wheel files based on the list of dependencies found in the requirements file.
With that, let’s configure a server for deployment.
Environment Setup
In this section, we'll download and install dependencies over the network to set up the server. In practice, you normally won't need to set up the server yourself; it should already be pre-configured.
Since the wheels were built on a Centos 5.11 environment, they should work on nearly any Linux environment. So, again, if you’d like to follow along, spin up a Digital Ocean droplet with the latest version of Centos.
Review PEP 513 for more information on building broadly compatible Linux wheels (manylinux1).
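The gist of PEP 513 is that a wheel built against an old enough glibc (CentOS 5 ships 2.5) will run on any mainstream distro carrying that glibc version or newer. As a rough illustration, here's a simplified sketch of the version comparison that installers effectively perform; the real manylinux1 check (see PEP 513) also inspects the platform and an optional `_manylinux` module, which this sketch omits:

```python
def glibc_version_ok(version_str, minimum=(2, 5)):
    """Return True if a glibc version string meets the manylinux1 floor.

    Simplified sketch: real compatibility checks do more than compare
    version numbers.
    """
    major, minor = (int(part) for part in version_str.split(".")[:2])
    return (major, minor) >= minimum

print(glibc_version_ok("2.17"))  # True: CentOS 7's glibc is newer than 2.5
print(glibc_version_ok("2.3"))   # False: older than the manylinux1 baseline
```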
SSH into the box as the root user and install the dependencies required for building Python before continuing with this tutorial:
```sh
$ yum -y install \
    gzip \
    zlib \
    zlib-devel \
    gcc \
    openssl-devel \
    sqlite-devel \
    bzip2-devel
```
Next, install and then run Nginx:
```sh
$ yum -y install \
    epel-release \
    nginx
$ sudo /etc/init.d/nginx start
```
Navigate to the server’s IP address in your browser. You should see the default Nginx test page.
Next, update the Nginx config in /etc/nginx/conf.d/default.conf to proxy traffic to the app:
```nginx
server {
    listen 80;
    listen [::]:80;

    location / {
        proxy_pass http://127.0.0.1:1337;
    }
}
```
Restart Nginx:
```sh
$ service nginx restart
```
You should now see a 502 error in the browser.
Create a regular user on the box:
```sh
$ useradd <username>
$ passwd <username>
```
Exit the environment when done.
Deploy
To deploy, first manually secure copy the tarball, along with the setup script, setup.sh, to the remote box:
```sh
$ scp app-v20180119.tar.gz <username>@<host-address>:/home/<username>
$ scp setup.sh <username>@<host-address>:/home/<username>
```
Take a quick look at the setup script:
```sh
#!/bin/bash

USAGE_STRING="USAGE: sh setup.sh {VERSION} {USERNAME}"

VERSION=$1
if [ -z "${VERSION}" ]; then
  echo "ERROR: Need a version number!" >&2
  echo "${USAGE_STRING}" >&2
  exit 1
fi

USERNAME=$2
if [ -z "${USERNAME}" ]; then
  echo "ERROR: Need a username!" >&2
  echo "${USAGE_STRING}" >&2
  exit 1
fi

FILENAME="app-v${VERSION}"
TARBALL="app-v${VERSION}.tar.gz"

# Untar the tarball
tar xvzf ${TARBALL}
cd $FILENAME

# Install python
tar xvJf Python-2.7.14.tar.xz
cd Python-2.7.14
./configure \
  --prefix=/home/$USERNAME/python2.7 \
  --with-zlib-dir=/home/$USERNAME/lib \
  --enable-optimizations
echo "Running MAKE =================================="
make
echo "Running MAKE INSTALL ==================================="
make install
echo "cd USERNAME/FILENAME ==================================="
cd /home/$USERNAME/$FILENAME

# Install pip and virtualenv
echo "python get-pip.py ==================================="
/home/$USERNAME/python2.7/bin/python get-pip.py
echo "python -m pip install virtualenv ==================================="
/home/$USERNAME/python2.7/bin/python -m pip install virtualenv

# Create and activate a new virtualenv
echo "virtualenv venv ==================================="
/home/$USERNAME/python2.7/bin/virtualenv venv
echo "source activate ==================================="
source venv/bin/activate

# Install python dependencies
echo "install wheels ==================================="
pip install wheels/*
```
This should be fairly straightforward: the script sets up a fresh Python environment on the target machine and then installs the dependencies into a new virtual environment.
SSH into the box and run the setup script:
```sh
$ ssh <username>@<host-address>
$ sh setup.sh 20180119 <username>
```
This will take a few minutes. Once done, cd into the app directory and activate the virtual environment:
```sh
$ cd app-v20180119
$ source venv/bin/activate
```
Run the tests:
```sh
$ python test.py
```
Once complete, fire up gunicorn as a daemon:
```sh
$ gunicorn -D -b 0.0.0.0:1337 app:app
```
Feel free to use a process manager, like Supervisor, to manage gunicorn.
Conclusion
In this article, we looked at how to package up a Python project with Docker and Python wheels for deployment to a machine cut off from the Internet.
With this setup, since we're packaging up the code, dependencies, and interpreter, our deployments can be considered immutable. For each new deploy, we'll spin up a new environment and test it before bringing down the old environment. This eliminates errors and issues that could arise from continuing to deploy on top of legacy code. Plus, if you uncover issues with the new deploy, you can easily roll back.
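One common way to get that "bring up new, then cut over" behavior is to keep each versioned release in its own directory and point a `current` symlink at the live one; the flip (and any rollback) is then a single atomic rename. A minimal sketch, assuming this directory layout (it is not part of the scripts above):

```python
import os

def activate_release(releases_dir, version, link_name="current"):
    """Atomically point a `current` symlink at a versioned release directory."""
    target = os.path.join(releases_dir, "app-v%s" % version)
    tmp_link = os.path.join(releases_dir, link_name + ".tmp")
    link = os.path.join(releases_dir, link_name)
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    # rename(2) replaces the old link in one step on POSIX;
    # rolling back is just rerunning this with the previous version
    os.rename(tmp_link, link)
    return link
```

Nginx (or gunicorn) would then serve out of the `current` path, so switching versions never leaves the app pointing at a half-updated directory.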
Looking for some challenges?
- At this point, the Dockerfile and each of the scripts are tied to a Python 2.7.14 environment on Centos 5.11. What if you also had to deploy a Python 3.6.1 version to a different version of Centos? Think about how you could automate this process given a configuration file.
For example:
```json
[
  {
    "os": "centos",
    "version": "5.11",
    "bit": "64",
    "python": ["2.7.14"]
  },
  {
    "os": "centos",
    "version": "7.40",
    "bit": "64",
    "python": ["2.7.14", "3.6.1"]
  }
]
```
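One way to approach this: parse the config and expand it into one build target per OS/Python combination, then drive `docker build` from that list. A hypothetical sketch (the `wheel-builder` tag naming scheme is an assumption, not something from the repo):

```python
import json

def build_matrix(config_json):
    """Expand an OS/Python config into one Docker image tag per combination."""
    targets = []
    for entry in json.loads(config_json):
        for python in entry["python"]:
            # hypothetical tag scheme: wheel-builder:<os><version>-py<python>
            targets.append(
                "wheel-builder:%s%s-py%s" % (entry["os"], entry["version"], python)
            )
    return targets

config = """
[
  {"os": "centos", "version": "5.11", "bit": "64", "python": ["2.7.14"]},
  {"os": "centos", "version": "7.40", "bit": "64", "python": ["2.7.14", "3.6.1"]}
]
"""
print(build_matrix(config))
```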
Alternatively, check out the cibuildwheel project for managing the building of wheel files.
- You probably only need to bundle the Python interpreter for the first deploy. Update the build_tarball.sh script so that it asks the user whether Python is needed before bundling it.
- How about logs? Logging could be handled either locally or at the system level. If locally, how would you handle log rotation? Configure this on your own.
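If you opt for local, app-level rotation, Python's standard library already covers it. A minimal sketch with logging.handlers.RotatingFileHandler (the size and backup-count values are arbitrary choices for illustration):

```python
import logging
from logging.handlers import RotatingFileHandler

def make_logger(path, max_bytes=1000000, backups=5):
    """Log to `path`, rolling over to path.1, path.2, ... at max_bytes."""
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger = logging.getLogger("app")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

Once `backupCount` files exist, the oldest is deleted on each rollover, so disk usage stays bounded without any external cron or logrotate setup.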
Grab the code from the repo. Please leave comments below!