Creating Kubernetes distributions

Date: 2019-12-04

Comparing Linux and Kubernetes is often a matter of apples and oranges. There are, however, some similarities, and there is an effort within the Kubernetes community to make Kubernetes more like a Linux distribution. The idea was outlined in a session about Kubernetes release engineering at KubeCon + CloudNativeCon North America 2019. "You might have heard that Kubernetes is the Linux of the cloud and that's like super easy to say, but what does it mean? Cloud is pretty fuzzy on its own," Tim Pepper, the Kubernetes release special interest group (SIG Release) co-chair, said. He proceeded to provide some clarity on how the two projects are similar.

Pepper explained that Kubernetes is a large open-source project with lots of development work around a relatively monolithic core. The core of Kubernetes doesn't work entirely on its own and relies on other components around it to enable a workload to run, in a model that isn't all that dissimilar to a Linux distribution. Likewise, Pepper noted that Linux also has a monolithic core, which is the kernel itself. Alongside the Linux kernel is a whole host of other components that are chosen to work together to form a Linux distribution. Much like a Linux distribution, a Kubernetes distribution is a package of core components, configuration, networking, and storage on which application workloads can be deployed.

Linux has community distributions, such as Debian, where there is a group of people that help to build the distribution, as well as a community of users that can install and run the distribution on their own. Pepper argued that there really isn't a community Kubernetes distribution like Debian, one that uses open-source tools to build a full Kubernetes platform that can then be used by anyone to run their workloads. With Linux, community-led distributions have become the foundation for user adoption and participation, whereas with Kubernetes today, distributions are almost all commercially driven.

Why distributions matter

The real value that comes from Kubernetes and from Linux, in Pepper's view, is not from the core, but rather from the user applications that a full distribution enables. Distributions are purpose-built, opinionated assemblies of configurations and tools. Distributions also serve to align different versions of tooling and subprojects into a working release that is easier for users to install and maintain. "One of the things in open source that is really amazing is you have this multiplier effect and distributions are a key part of that," Pepper said.

A Kubernetes distribution is a bit different than a Linux distribution in several respects. With Kubernetes, the Cloud Native Computing Foundation (CNCF) has developed a Kubernetes conformance program to certify that a given platform is in fact Kubernetes. Pepper noted that Linux makes use of a reciprocal open-source license, which means that any code that is forked and distributed needs to be shared. Kubernetes uses a permissive license (Apache version 2.0), which Pepper warned comes with the risk of divergent forking. "So where Linux didn't necessarily have conformance testing, we need something like that in Kubernetes to make sure that Kubernetes as a word means something, and that we can understand what that means," he said.

Linux has a large stable of community distributions, such as Debian, Arch, and Fedora, as well as commercial enterprise distributions. "Where are our Kubernetes community distributions?" Pepper asked. "Of the hundred conformant offerings, most of them are commercial." The full list of conformant Kubernetes offerings is maintained and regularly updated by the CNCF.

Building a community Kubernetes distribution

Pepper outlined several potential reasons why there isn't a community Kubernetes distribution, including the fact that there are some missing technical components. He started by attempting to define what the base of a community distribution could include. There are the raw Go language binaries and some other code artifacts from the Kubernetes release, but those are only parts of a distribution. There are also several tools needed, including kubeadm, which helps to bootstrap a basic Kubernetes cluster, kops for managing Kubernetes operations, and kubespray, which is used to deploy a production-ready Kubernetes cluster. Pepper emphasized that the existing open-source tools are intended to help build a cluster and not a distribution.

The Kubernetes community is currently lacking build tools for distributions as well as more robust dependency management, he said. "One of the really useful benefits you see from distros is that they kind of grok all of the dependencies and give you that coherent opinionated set of things that are going to work together," Pepper said. "Where is our Kubernetes equivalent of koji or Launchpad?" He also wondered why there was no Kubernetes version of Ubuntu's personal package archives (PPAs).

Release engineering

While Kubernetes is currently missing pieces for enabling a true community distribution, work is ongoing in multiple Kubernetes Special Interest Groups (SIGs), including SIG Release and SIG Testing, that could point the way forward to a future community distribution.

Stephen Augustus, another SIG Release co-chair, explained that a release-managers group that deals with the build process as well as patch and branch management has started to take shape. The idea behind the group is to codify the process by which Kubernetes releases are produced. "There are scripts that you can check out that have copyright dates of 2016 and they are actually the ones that are responsible for releasing Kubernetes," Augustus said. "We want to get to the point where we can start tearing down some of the technical debt that we've built up in the project over time."

Among the Kubernetes release scripts that date back to 2016 is anago, which is an 1,800-line bash script for releasing Kubernetes. Anago imports three separate libraries, each with another 500 lines of shell code. "It's time to not do that anymore," Augustus said.

The group is starting to rewrite some of the release scripts; one of the first targets is branchff, which is a utility that fast-forwards a branch to master. Another tool that is being rewritten is push-build, which is responsible for pushing all of the Kubernetes builds up to Google Cloud.
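For readers unfamiliar with the term, a fast-forward moves a branch pointer to a newer commit without creating a merge commit, which is only possible when the branch's current tip is an ancestor of the target (in Git, this is what `git merge --ff-only` enforces). The following is a minimal sketch of that rule in Python, using a hypothetical in-memory commit graph; it is not how branchff is implemented, and all names in it are invented for illustration.

```python
from typing import Dict, List

# Hypothetical in-memory model of a repository: commit id -> parent commit ids.
Parents = Dict[str, List[str]]


def is_ancestor(parents: Parents, ancestor: str, descendant: str) -> bool:
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def fast_forward(parents: Parents, branches: Dict[str, str],
                 branch: str, target: str) -> bool:
    """Move `branch` to the tip of `target` if no merge commit is needed.

    Returns False and leaves `branch` untouched when the histories have
    diverged, since that case would require a real merge.
    """
    if not is_ancestor(parents, branches[branch], branches[target]):
        return False
    branches[branch] = branches[target]
    return True
```

With a graph such as `{"a": [], "b": ["a"], "c": ["b"], "x": ["a"]}`, a branch sitting at `b` can be fast-forwarded to a master at `c`, while a branch at `x` cannot, because `x` is not part of master's history.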

As part of the overall effort to improve release engineering, there is also the new Kubernetes release toolbox project known as "krel" that Augustus noted is just getting started. The goal is to take all of the various release shell scripts and move them into the toolbox as a set of commands. Another new effort that is getting underway is the kubepkg tool that will enable developers to create deb and RPM packages based on Kubernetes project binaries. "We want there to be a dead simple way to produce debs and RPMs for Kubernetes."
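The "toolbox as a set of commands" idea can be pictured as a single entry point that dispatches to subcommands, each replacing one standalone script. The sketch below uses Python's standard `argparse` module purely for illustration; the program name `toolbox`, the subcommand names, their options, and the example values are all assumptions, and none of this reflects krel's actual interface or implementation.

```python
import argparse
from typing import List, Optional


def cmd_fast_forward(args: argparse.Namespace) -> str:
    # Placeholder for logic that a standalone script once held.
    return f"would fast-forward {args.branch} to {args.target}"


def cmd_push_build(args: argparse.Namespace) -> str:
    # Placeholder for logic that a standalone script once held.
    return f"would push build artifacts to {args.bucket}"


def build_parser() -> argparse.ArgumentParser:
    """Build one parser whose subcommands stand in for separate scripts."""
    parser = argparse.ArgumentParser(prog="toolbox")  # hypothetical name
    sub = parser.add_subparsers(dest="command", required=True)

    ff = sub.add_parser("fast-forward", help="fast-forward a release branch")
    ff.add_argument("branch")
    ff.add_argument("--target", default="master")
    ff.set_defaults(func=cmd_fast_forward)

    pb = sub.add_parser("push-build", help="upload build artifacts")
    pb.add_argument("--bucket", required=True)
    pb.set_defaults(func=cmd_push_build)

    return parser


def main(argv: Optional[List[str]] = None) -> str:
    """Parse arguments and dispatch to the selected subcommand."""
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Calling `main(["fast-forward", "release-1.17"])` dispatches to the first handler, and `main(["push-build", "--bucket", "example-bucket"])` to the second. The appeal of this shape is that shared concerns such as argument parsing, logging, and testing live in one place rather than being duplicated across scripts.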

Augustus commented that many companies have built their own tools for Kubernetes releases because there have not been any great tools in the upstream project, but that's now changing. "We're trying to kind of flip that story, change the narrative, and build tools that are actually useful for not just the community, but for vendors, and for hobbyists to consume as well."

Whether or not a real Kubernetes community distribution will emerge remains to be seen. What is clear is that, as Augustus said, there is a need to remove the technical debt for release engineering, updating complex shell scripts with more modern tools that can help both the project and the broader community to build Kubernetes distributions.

