A History of Containers: This Ain’t Your Grandma’s Chroot Environment

What’s all the fuss about? Why has the concept of Linux containers become the current rallying cry of cloud infrastructure and Agile/Extreme developers? In this article, we’ll take a look at Linux container history from both the perspective of the evolution of the technology and its value from a developer’s perspective. In Part 2, we’ll look at the tools that are supporting the new model of micro-services based on container-housed domain-specific applications.

Emulation, then Virtualization

In the infrastructure and data center world, our key focus has always been on providing an infrastructure that supports reliable application function. Initially, the only solution available was bare metal: the actual physical systems and resources that make up our computing environments. That started to change when processors became fast enough to emulate smaller versions of themselves (effectively Pentium hardware emulating older 386 environments), which eventually created a shift toward a supervisory role over operating system interactions, known collectively as host operating system virtualization. Those initial emulation tools, however, also carried another model that, while not apparent at the time, was the precursor to the most recent buzz in providing flexibility to our reliable infrastructure models: the ‘chroot’ environment.

With the appearance of emulated environments and their chroot file systems on the scene, we got our first inkling of a “contained” operating system; in the initial case, it was often in an emulated–and later virtualized–environment. The key value of the chroot environment was that, for the most part, the operating system represented by that environment was completely separate from the underlying and concurrently running host operating system. There remained an overlap, however, one that was later addressed far better by the kernel control group feature (commonly known as cgroups) and the more recent addition of kernel namespaces. With these two features, modern containers became possible, but they saw little use until just a few years ago. With increased interest in the Platform as a Service concept driving the need to provide segregation at the application level, rather than via individual application/operating system pairs, containers surged to the forefront of the data center and operations landscape.
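For readers who have never touched chroot directly, the following minimal Python sketch shows the basic mechanism; it assumes a root-privileged process and a hypothetical directory, /srv/guest, that has been pre-populated with its own /bin, /lib, and so on.

    import os

    # Minimal chroot sketch (requires root). /srv/guest is a hypothetical
    # directory pre-populated with its own /bin, /lib, etc.
    os.chroot("/srv/guest")      # restrict this process's view of the filesystem
    os.chdir("/")                # move the working directory inside the new root
    os.execv("/bin/sh", ["sh"])  # the shell now sees /srv/guest as "/"

The confined process can no longer see the host’s files, but it still shares the host’s kernel, process table, and network stack, which is exactly the gap that cgroups and namespaces later closed.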

The rise of containers – Dev or Ops?

This shift to scale-out service delivery built around an application isn’t just for the sake of running production workloads. In fact, the production aspect is likely to be dwarfed by the needs of an Agile-style development cycle, in which many small code changes are made and need to be tested as quickly as is feasible. The idea, which seems to be borne out by at least anecdotal evidence, is to have every code change tested not only at the unit level, but, assuming those tests pass, to have an acceptance test suite run, possibly a staging environment spun up and further validated, and then finally a migration and upgrade to production, all in as automated a fashion as possible.

Many development managers who have looked at this have at first done the classic math of trying to determine how many physical (or perhaps virtual) machines would be required to do this sort of testing, and the result often makes them discount it out of hand. But that math misses the dramatic gains that the latest shift to containers makes possible.

Containerization or Virtualization: Why should you care?

First, let’s review what a container is today and how it compares to the full host-OS virtualized systems (aka virtual machines, or VMs) that have now become commonplace. In its most basic form, a VM hosts a full operating system, most often a variant of Linux, and in many cases something well beyond the “Just Enough Operating System” class of environments that are the de facto target in most IaaS cloud environments. The classic virtualization environment presents a remotely accessible environment that looks and acts just like a physical server, but is certainly more flexible, given that the virtualization layer often makes things like resource updates (increasing or decreasing the number of CPUs or available system memory) as simple as tweaking a parameter…or, worst case, rebooting the system in question.

Additionally, in the VM environment, it is expected that an application has access to the entirety of the operating system and its resources, virtual though they may be. One of the biggest advantages of this environment (other than the ability to make better use of the underlying physical resources when many small host machines are desired) is that the security models are well understood, and there are very few pathways for a corrupted VM to impact the rest of the environment.

Containers take a perhaps simpler approach to segregation, focusing not on segregation at the whole operating system level, but instead at the process level. The principal segregation mechanism for containers leverages the cgroup feature in Linux (where much of the kernel work in this area has been focused over the past few years), which provides a mechanism by which the Linux kernel can limit a process’s access to resources. At a minimum, those limits can be applied to CPU, network, disk I/O, and memory. In addition, the processes so limited are now also run in namespaces, which provide further segregation of disk and network functions (so that one process space can’t easily see another). Combining these aspects provides a functional separation at the process level which, to an application developer, tends to be exactly the kind of capability they are really looking for, without the overhead of managing an entire operating system!
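As a rough illustration, the Python sketch below shows the kind of work a container runtime does on a process’s behalf: it creates a cgroup (v2) with CPU and memory caps, moves the current process into it, and then detaches the process into its own namespaces. It assumes root privileges, Python 3.12+ (for os.unshare), and a hypothetical cgroup name “demo”; the specific limits are illustrative only.

    import os

    # Create a cgroup (v2) and set hypothetical CPU and memory limits.
    # (Assumes the cpu and memory controllers are enabled in the parent
    # cgroup's cgroup.subtree_control.)
    cg = "/sys/fs/cgroup/demo"
    os.makedirs(cg, exist_ok=True)
    with open(os.path.join(cg, "cpu.max"), "w") as f:
        f.write("50000 100000")          # at most 50ms of CPU per 100ms period
    with open(os.path.join(cg, "memory.max"), "w") as f:
        f.write(str(256 * 1024 * 1024))  # 256 MiB hard memory cap

    # Place the current process under those limits.
    with open(os.path.join(cg, "cgroup.procs"), "w") as f:
        f.write(str(os.getpid()))

    # Namespaces handle the visibility side: detach into private mount,
    # UTS, and network namespaces, so this process no longer shares
    # mounts, hostname, or network interfaces with its neighbors.
    os.unshare(os.CLONE_NEWNS | os.CLONE_NEWUTS | os.CLONE_NEWNET)

A real container runtime does considerably more than this, but the core separation described above comes down to exactly these two kernel features.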

One last aspect of the container model is needed in order to complete the argument for the value of this capability, and that is one of libraries and dependencies. In a virtual OS deployment of an application, the deployment process also requires the installation of its application dependencies and libraries. For example, a library to read/write JSON-formatted objects to interact with a web browser-based front end is a fairly common requirement. If one wants to run multiple applications on a single Operating System, it is important that these additional libraries all be of the same version, or dependency issues will result. This is commonly one of the largest pain points in application deployment and validation testing, and ensuring that the right libraries are available during the testing/acceptance and ultimately deployment phases of an automated development strategy is a major benefit provided by the use of containers.

In the container model, a bundle of the process-specific dependencies can be created, or at least defined, so that they can be installed in the process namespace for that particular environment. This allows multiple versions of an application to be deployed–perhaps multiple releases where each release depends on specific versions of the same libraries–all without impacting each other.
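One way to picture this, continuing the Python sketches above, is a private mount namespace in which a release-specific dependency bundle is bound over the library path the application expects. The paths and version numbers here are purely illustrative; the sketch again assumes root privileges and Python 3.12+.

    import os
    import subprocess

    # Give this process (and its children) a private mount namespace,
    # so mounts made here are invisible to the host and to siblings.
    os.unshare(os.CLONE_NEWNS)
    subprocess.run(["mount", "--make-rprivate", "/"], check=True)

    # Bind a release-specific bundle over the path the application expects.
    # A second copy of the app, started the same way with .../json-lib-1.9,
    # sees version 1.9 at the very same path, with no conflict between them.
    subprocess.run(
        ["mount", "--bind", "/srv/bundles/json-lib-2.1", "/opt/app/libs"],
        check=True,
    )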

Application deployment shifts from Virtual Machines to Containers

The shift to containers for application deployment has been a core part of the Platform as a Service (PaaS) model for a number of years, and in fact may be the reason that the PaaS model is becoming the principal development deployment model even in light of so much talk about Dev/Ops workflows. Consider that a container dramatically reduces how much Ops a Dev needs to understand to get an application running: by defining (and possibly bundling) the specific dependencies their application needs, a developer knows they won’t have issues deploying to a container-capable platform. In fact, one of the nice value-adds of the container model is that the test/staging and production environments can be the same for all practical purposes!

But this is only the first half of the container story. In Part 2, we discuss how containers are being deployed to support scale, application resiliency, and even rolling upgrades of services. Read more in the next installment: “How containers are revolutionizing application deployment.”

Robert Starmer, CTO/Principal, Kumulus Technologies
