You learned about Docker. It's awesome and you're excited. You go and create a Dockerfile:
    FROM ubuntu:16.04
    RUN apt-get update && apt-get install -y all_my_dependencies
    ADD my_app_files /my_app
    CMD ["/my_app/start.sh"]
Cool, it seems to work. Pretty easy, right?
Not so fast.
You just built a container which contains a minimal operating system, and which only runs your app. But the operating system inside the container is not configured correctly. A proper Unix system should run all kinds of important system services. You're not running them, you're only running your app.
"What do you mean? I'm just using Ubuntu in Docker. Doesn't the OS inside the container take care of everything automatically?"
Not quite. You have Ubuntu installed in Docker. The files are there. But that doesn't mean Ubuntu's running as it should.
When your Docker container starts, only the CMD command is run. The only processes that will be running inside the container are the CMD command and all processes that it spawns. That's why all kinds of important system services are not run automatically: you have to run them yourself.
Furthermore, Ubuntu is not designed to be run inside Docker. Its init system (systemd on Ubuntu 16.04, Upstart on older releases such as 14.04) assumes that it's running on either real hardware or virtualized hardware, not inside a Docker container, which is a locked-down environment with, for example, no direct access to many kernel resources. Normally, that's okay: inside a container you don't want to run a full init system anyway. You don't want a full system, you want a minimal system. But configuring that minimal system for use within a container has many strange corner cases that are hard to get right if you are not intimately familiar with the Unix system model. This can cause a lot of strange problems.
"What important system services am I missing?"
A correct init process
Main article: Docker and the PID 1 zombie reaping problem
Here's how the Unix process model works. When a system is started, the first process in the system is called the init process, with PID 1. The system halts when this process halts. If you specify CMD ["/my_app/start.sh"] in your Dockerfile, then start.sh is your init process.
The init process has an important task: it adopts orphaned child processes (processes whose parent has exited) and reaps them once they terminate. Most likely, your init process is not doing that at all. As a result your container will become filled with zombie processes over time.
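One way to check whether a container is accumulating zombies is to count processes in state Z. The following is a minimal sketch, assuming a procps-style `ps` that accepts `-eo stat=` is installed in the container; the function name `count_zombies` is made up for illustration.

```shell
#!/bin/sh
# count_zombies: read process state codes (one per line, as printed by
# `ps -eo stat=`) on stdin and print how many of them are zombies.
# A zombie's state code starts with "Z".
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Usage inside a running container (assumes procps `ps` is available):
#   ps -eo stat= | count_zombies
```

A count that keeps growing while the container runs suggests that nothing is reaping terminated processes.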
docker stop sends SIGTERM to the init process, which is then supposed to stop all services. If your init process is your app, then it'll probably only shut down itself, not all the other processes in the container. The kernel will then forcefully kill those other processes, not giving them a chance to gracefully shut down, potentially resulting in file corruption, stale temporary files, etc. You really want to shut down all your processes gracefully.
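To illustrate what responding to SIGTERM involves, here is a rough sketch of a shell wrapper that forwards the signal to a single child and waits for it to exit. The helper name `run_and_forward_term` and the `my_app` path in the usage comment are placeholders. It handles one child only and does not reap orphans, so it is not a substitute for a real init process.

```shell
#!/bin/sh
# run_and_forward_term: start a command in the background, forward
# SIGTERM/SIGINT to it, and return the command's exit status.
# (Hypothetical helper: it forwards to ONE child only and does not
# reap orphaned processes.)
run_and_forward_term() {
    "$@" &
    child=$!
    trap 'kill -TERM "$child" 2>/dev/null' TERM INT

    wait "$child"
    status=$?
    # `wait` returns early (status > 128) when a trapped signal arrives.
    # If the child still exists, wait again to collect its real status.
    if [ "$status" -gt 128 ] && kill -0 "$child" 2>/dev/null; then
        wait "$child"
        status=$?
    fi

    trap - TERM INT
    return "$status"
}

# Usage at the end of a start.sh (the path is a placeholder):
#   run_and_forward_term /my_app/bin/my_app
```

Even with such a wrapper, any other processes in the container still receive no graceful shutdown signal, which is the gap a proper init process fills.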
Syslog
Syslog is the standard Unix logging service. A syslog daemon is necessary so that many services - including the kernel itself - can correctly log to /var/log/syslog. If no syslog daemon is running, a lot of important messages are silently swallowed. You don't want warnings and errors to be silently swallowed, do you?
The syslog daemon is not run automatically. You have to start it yourself.
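A quick way to spot the problem is to check for the syslog socket. The sketch below assumes the conventional Linux location /dev/log, which is where glibc's syslog(3) sends messages; the function name is made up for illustration. A socket that exists does not prove that a daemon is listening on it, so treat this as a first check only.

```shell
#!/bin/sh
# syslog_socket_present: succeed if a Unix socket exists at the given
# path (default: /dev/log, the conventional syslog socket on Linux).
syslog_socket_present() {
    [ -S "${1:-/dev/log}" ]
}

# Usage inside a container:
#   syslog_socket_present || echo "no syslog socket found" >&2
```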
Cron daemon
Many apps rely on cron jobs. But cron jobs never run unless the cron daemon is running in your container.
The cron daemon is not run automatically. You have to start it yourself.
SSH daemon (sometimes)
Occasionally, you may want to run a command inside the container for contingency reasons. For example you may want to debug your misbehaving app. docker exec provides a great way of doing this, but unfortunately there are a number of drawbacks. For example, users who run docker exec must have access to the Docker daemon, and that way they essentially have root access over the Docker host.
If that is problematic, then you should use SSH to log into the container instead. SSH has its own issues, like requiring key management, but that way you can prevent people from getting root access on the Docker host.
"Does all this apply too if I'm using CentOS inside the container, or another Linux distribution?"
Yes. The problems exist in those cases too.
"But I thought Docker is about running a single process in a container?"
Absolutely not true. Docker runs fine with multiple processes in a container. In fact, there is no technical reason why you should limit yourself to one process – it only makes things harder for you and breaks all kinds of essential system functionality, e.g. syslog.
We encourage you to use multiple processes.
Managing multiple processes can be painful, but it doesn't have to be. We have a solution for that, so read on.
Getting everything right: baseimage-docker
Baseimage-docker is a special Docker image that is configured for correct use within Docker containers. It is Ubuntu, plus:
- Modifications for Docker-friendliness.
- Administration tools that are especially useful in the context of Docker.
- Mechanisms for easily running multiple processes, without violating the Docker philosophy.
Also, every single one of the aforementioned problems is taken care of for you.
You can use it as a base for your own Docker images. That means it's available for pulling from the Docker registry!
Why use baseimage-docker?
You can configure the stock ubuntu image yourself from your Dockerfile, so why bother using baseimage-docker?
A correct init process
Baseimage-docker comes with an init process, /sbin/my_init, that reaps orphaned child processes correctly and responds to SIGTERM correctly. This way your container won't become filled with zombie processes, and docker stop will work correctly.
Fixes APT incompatibilities with Docker
See Docker issue #1024.
It runs a syslog daemon so that important system messages don't get lost.
It runs a cron daemon so that cronjobs work.
An SSH server allows you to easily log in to your container to inspect or administer things.
SSH is only one of the methods provided by baseimage-docker for this purpose. The other method is through `docker exec`. SSH is also provided as an option because `docker exec` has issues.
Password and challenge-response authentication are disabled by default. Only key authentication is allowed.
The SSH daemon is disabled by default.
Runit is used for service supervision and management. It is much easier to use than SysV init and supports restarting daemons when they crash. It is also much easier to use and more lightweight than Upstart.
Baseimage-docker encourages you to run multiple processes through the use of runit.
You might be familiar with supervisord. Runit (written in C) is much lighter weight than supervisord (written in Python).
setuser is a custom tool for running a command as another user. It is easier to use than su, has a smaller attack surface than sudo, and unlike chpst it sets $HOME correctly. It is invoked as /sbin/setuser in the memcached example below.
Despite all these components, baseimage-docker is extremely lightweight: it only consumes 6 MB of memory.
- Stop reinventing the wheel.
Configuring the base system for Docker-friendliness is no easy task. As stated before, there are many corner cases. By the time that you've gotten all that right, you've reinvented baseimage-docker. Using baseimage-docker will save you from this effort.
- Reduce development time.
It reduces the time needed to write a correct Dockerfile. You won't have to worry about the base system and can focus on your stack and your app.
- Reduce building time.
It reduces the time needed to run docker build, allowing you to iterate on your Dockerfile more quickly.
- Reduce deployment time.
It reduces download time during redeploys. Docker only needs to download the base image once: during the first deploy. On every subsequent deploy, only the changes you make on top of the base image are downloaded.
Getting started
The image is called phusion/baseimage, and is available on the Docker registry.
    # Use phusion/baseimage as base image. To make your builds
    # reproducible, make sure you lock down to a specific version, not
    # to `latest`! See
    # https://github.com/phusion/baseimage-docker/blob/master/Changelog.md
    # for a list of version numbers.
    FROM phusion/baseimage:<VERSION>

    # Use baseimage-docker's init system.
    CMD ["/sbin/my_init"]

    # ...put your own build instructions here...

    # Clean up APT when done.
    RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
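The advice about locking down to a specific version can be checked mechanically. The following sketch flags Dockerfiles whose FROM lines have no tag or use `latest`. The helper name `base_image_is_pinned` is made up for illustration, and the check is deliberately naive: it only understands plain `FROM image:tag` lines and will misjudge registry hosts that include a port.

```shell
#!/bin/sh
# base_image_is_pinned: succeed only if the given Dockerfile has at least
# one FROM line and every FROM line names an explicit tag other than
# `latest`.
# (Hypothetical helper; it does not understand build args or registry
# hosts with ports such as host:5000/image.)
base_image_is_pinned() {
    awk '
        toupper($1) == "FROM" {
            seen = 1
            n = split($2, parts, ":")
            if (n < 2 || parts[n] == "" || parts[n] == "latest") bad = 1
        }
        END { exit !(seen && !bad) }
    ' "$1"
}

# Usage before building:
#   base_image_is_pinned Dockerfile || echo "pin your base image version" >&2
```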
Adding additional daemons
You can add additional daemons (e.g. your own app) to the image by creating runit entries. You only have to write a small shell script which runs your daemon, and runit will keep it up and running for you, restarting it when it crashes, etc.
The shell script must be called run, must be executable, and is to be placed in the service's own directory under /etc/service (/etc/service/memcached in the example below).
Here's an example showing you how a memcached server runit entry can be made.
    ### In memcached.sh (make sure this file is chmod +x):
    #!/bin/sh
    # `/sbin/setuser memcache` runs the given command as the user `memcache`.
    # If you omit that part, the command will be run as root.
    exec /sbin/setuser memcache /usr/bin/memcached >>/var/log/memcached.log 2>&1

    ### In Dockerfile:
    RUN mkdir /etc/service/memcached
    ADD memcached.sh /etc/service/memcached/run
Note that the shell script must run the daemon in the foreground, without letting it daemonize or fork into the background. Usually, daemons provide a command line flag or a config file option for that.
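The steps for creating a runit entry can also be scripted. This is a minimal sketch, not part of baseimage-docker: the helper name `make_runit_entry`, the `app` user, and the `my_app` command with its `--foreground` flag are all placeholders. It writes the command verbatim, so arguments containing spaces or quotes are not escaped.

```shell
#!/bin/sh
# make_runit_entry: create <service_root>/<name>/run, an executable script
# that execs the given command in the foreground.
# (Hypothetical helper; arguments are written verbatim and not escaped.)
make_runit_entry() {
    service_root=$1
    name=$2
    shift 2

    mkdir -p "$service_root/$name" || return 1
    {
        printf '#!/bin/sh\n'
        # exec replaces the shell, so runit supervises the daemon itself.
        printf 'exec %s 2>&1\n' "$*"
    } > "$service_root/$name/run" || return 1
    chmod +x "$service_root/$name/run"
}

# Usage while preparing an image (user, path and flag are placeholders):
#   make_runit_entry /etc/service my_app /sbin/setuser app /my_app/bin/my_app --foreground
```

Using exec in the generated script matters: without it, runit would supervise the wrapper shell instead of the daemon.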
This website only covers the basics. Please refer to the GitHub repository for more documentation. Topics include:
- Running scripts during container startup
- Instructions for logging into the container using SSH
- Disabling SSH
Having problems? Want to participate in development? Please post a message at the discussion forum.