In Part 1 of this Container series, we covered the historic shift in how applications are run and how segregation is provided via the Container model. Now we can focus on how we achieve that segregated state, as well as the current set of tools that are making this easier.
Managing Containers – the challenge of diverse Dependencies, Tests and Environments
So just how does an application find itself swimming in the Container environment? The virtualization model at the forefront of Cloud development and deployment is the starting point, as it has served, in one fashion or another, as the basis for bare metal, virtualized, and now Infrastructure-as-a-Service (IaaS) Cloud models for what seems like forever. In this model, you deploy an Operating System (or get someone else to deploy it for you), and then run through a set of steps to get:
- the application code installed (tar or zip bundle, download from git, etc.)
- necessary run-time tools installed (e.g. Apache web server, J2EE server, etc.)
- any application level dependencies installed (e.g. application specific libraries or infrastructure components)
- deployment of any database(s) and installation of necessary schema and initial data
Simple, right? (Ha!) Oh, and if your application needs to scale, how about repeating the above steps again…and again…and AGAIN for each deployment? While that sounds painful, there is admittedly now a world of tools (developed out of necessity by exhausted Ops teams) to help automate these processes to the point where there isn’t that much pain left…but it is still a bit of an ordeal. Additionally, if you know you are going to scale your application, you can build a Golden OS image with all of the code, dependencies, and supporting applications (and their dependencies) pre-installed. Necessity, as always, is the mother of invention. However, even with the creative management described above, there are a couple of areas where this model becomes cumbersome, if not downright difficult to deal with.
The difficult side is perhaps the most straightforward: dependency tracking. For example, tools like Chef, Ansible, Puppet and Salt (CAPS)–and a host of others–are specifically designed to deal with dependencies. That’s great so long as you don’t have a dependency collision on your target system. Luckily, the tools are usually smart enough not to outright break your system (and will normally stop before making changes), but you still have to figure out how to resolve the conflict, often while trying to do a rolling upgrade of your system! Murphy’s Law is alive and well, and this is definitely not when you want to find out about these things.
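As an illustration of where such a collision surfaces, here is a minimal sketch of an Ansible playbook (host group, package, and version are hypothetical); the pinned library version is exactly the kind of requirement that can conflict with another application on the same host:

```yaml
# Hypothetical Ansible playbook: install a web server plus an
# application-specific library at a fixed version. If another role on
# the same host requires a different libexample version, the run stops
# with a dependency conflict rather than breaking the system.
- hosts: webservers
  become: true
  tasks:
    - name: Install Apache
      apt:
        name: apache2
        state: present

    - name: Install application-specific library (pinned version)
      apt:
        name: libexample1=1.2.3-1
        state: present
```

The tool halts safely, but resolving the conflict is still left to you.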
The cumbersome side, which can become a major issue as well, appears when the number of tests and environments in which to run test, acceptance, regression, and similar suites starts to blossom. Even if those test environments are not re-built for every test run (and they frequently are, or should be!), these operations take time, space, and resources. The Container model, by focusing on only the specific components needed to run and/or validate a specific application or service, can ensure that the dependencies are met as part of the targeted bundle. In addition, supporting applications and components can then run in their own adjacent Containers, keeping their dependencies segregated from, and further reducing their impact on, the application under test.
Container Management Services – The New Frontier
There is, unfortunately, a flipside to Container benefits. In deploying more and more components to take advantage of dependency segregation, you generate many more moving parts that must be tracked, kept alive, scaled, load balanced, upgraded, etc. Therefore, in the new Container environment (call it a “Container as a Service” or CaaS), in addition to building the Containers themselves (akin to building a Golden OS image, but often much lighter weight) and storing them somewhere platform-accessible, it is also important to have a service manager, service discovery support, a scale enablement tool, etc. This is why Docker, Kubernetes, and to a lesser extent Mesos have become so prevalent in the discussion.
The Docker toolset is expanding, but its basic function is to simplify and provide a consistent front end for bundling, deploying, and managing the lifecycle of a Container. Initially built on the LXC container project, the Docker environment has since moved to direct management of the cgroups, namespaces, and network connectivity needed to manage Containers. One of the additional functions implemented by the Docker toolset is a language for describing the composition of a Container and its dependent libraries, making the compilation of a Container a simple, efficient, and easily repeatable task. This is a key function, as it allows the deployment to be described as code, both under test and in actual production (which in the end is hopefully the same description!).
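As a sketch of that description language, a minimal Dockerfile might look like the following (the base image, file names, and command are illustrative, not from any particular project):

```dockerfile
# Hypothetical Dockerfile: the application and its run-time
# dependencies are resolved inside the image, not on the target host,
# so every build produces the same repeatable, shippable bundle.
FROM python:3-slim

WORKDIR /app

# Application-level dependencies, installed into the image.
COPY requirements.txt .
RUN pip install -r requirements.txt

# The application code itself.
COPY . .

CMD ["python", "app.py"]
```

Rebuilding from the same description yields the same Container every time, which is what makes the test and production bundles line up.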
Kubernetes enhances the basic capabilities of the Docker container manager by providing consistent lifecycle management for groups of Containers (called Pods), along with network connectivity and scale management. By providing tooling to associate resources, as well as a language for describing the interactions of the service components (and mapping in a service discovery component via the tagging mechanism), Kubernetes provides a stable platform for managing Containers as micro-services that make up a complete application system.
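A minimal sketch of that descriptive language, using current Kubernetes API conventions and hypothetical names, might describe a replicated, labeled group of Containers like so:

```yaml
# Hypothetical Kubernetes Deployment: run three replicas of a web Pod,
# labeled app=web so other components can discover it via the tagging
# mechanism described above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: example/web:1.0
          ports:
            - containerPort: 8080
```

Kubernetes then keeps three copies alive, replacing any Container that dies.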
Another component that Kubernetes has focused on is managing the network interactions between components both within and between Pods. This may not seem like much initially, but it becomes very important when a Container dies and must have its connections re-established, potentially at a new network address, so that the replacement Container’s resources can be added back into the active pool of components. This is also important for rolling upgrades, as the connectivity for replacement services is then managed by the Kubernetes engine rather than having to be manually re-provisioned or managed.
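That connectivity management can be sketched as a Kubernetes Service (reusing a hypothetical `app: web` label): clients address the stable Service name while the Pods behind it come and go:

```yaml
# Hypothetical Kubernetes Service: a stable name and virtual IP in
# front of whatever Pods currently carry the app=web label. A
# replacement Pod at a new network address is picked up automatically,
# with no manual re-provisioning of connections.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```

During a rolling upgrade, new Pods matching the label join the pool and old ones drain out, while clients keep talking to the same Service address.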
Recently, the Docker Swarm project came online with the aim of supporting multi-Container management across multiple Container-capable Linux systems. Swarm leverages the same command line tools and APIs as the core of Docker and, in conjunction with Docker’s new network functionality, dramatically simplifies the process of deploying groups of Containers (aka Swarms) across more than one physical underlying compute node. This functionality is similar to, but perhaps lighter weight than, the more commonly deployed Kubernetes toolset, and it remains to be seen whether one model or the other will be preferred.
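As an illustrative sketch (service names and images are hypothetical), recent Docker releases let a Swarm deployment be described in a Compose-format file and rolled out with `docker stack deploy`:

```yaml
# Hypothetical Compose-format stack file for Swarm: two services,
# with the web tier replicated across the cluster's nodes.
version: "3"
services:
  web:
    image: example/web:1.0
    deploy:
      replicas: 3
    ports:
      - "80:8080"
  cache:
    image: redis:alpine
```

Note how closely this mirrors single-node Docker usage, which is the point of reusing the same tools and APIs.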
Lastly, there’s the potential to incorporate a resource-aware scheduler into the lifecycle management chain, and this is the space filled by Mesos. While not Container-specific, Mesos provides a resource-focused scheduler that enables guaranteed capacity across a distributed, scaled-out infrastructure. Because Mesos is aware of the same cgroup/namespace resources used by all of the Container technologies, it is straightforward to integrate a Container cluster manager such as Kubernetes as a targeted management framework on top of the resources being scheduled. Through its focus on distributed resource availability and consumption modeling, Mesos can provide better scale-out scheduling for large deployments, where this class of scheduling helps make more efficient use of hundreds to thousands of compute resources. At smaller scale, the simpler scheduling tools built into Kubernetes or Docker Swarm are usually adequate.
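For a flavor of what that resource-focused scheduling looks like, Marathon (a commonly used framework run on top of Mesos) describes an application and its guaranteed capacity roughly like this (all values are illustrative):

```json
{
  "id": "/web",
  "cmd": "python3 -m http.server 8080",
  "cpus": 0.5,
  "mem": 128,
  "instances": 3,
  "container": {
    "type": "DOCKER",
    "docker": { "image": "example/web:1.0" }
  }
}
```

Mesos then finds nodes with the requested CPU and memory free and places the three instances accordingly.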
The Takeaway on Containers
The combined benefits of all these components are the reason that so many folks are looking to Containers as their service model of choice. The first thing people notice is that there is no underlying operating system to manage (not to fear: Sysadmins get to keep their jobs, with a new focus on operating Container-aware infrastructure and platform services). Instead, with Containers comes a focus on a lighter weight application dependency resolution tool. This makes the continuous integration lifecycle easier to manage and more inclusive of system-level tests of all commits, rather than just major ones. Implementing Containers also reaps benefits at upgrade time, as deploying new Containers in place of old ones dramatically simplifies the creation of a consistent rolling upgrade model for Continuous Deployment environments. And I, for one, think that is a beautiful thing.
Robert Starmer, CTO/Principal, Kumulus Technologies