Creating Kubernetes distributions

December 4, 2019

This article was contributed by Sean Kerner

KubeCon NA

Comparing Linux and Kubernetes is often a matter of apples and oranges. There are, however, some similarities, and there is an effort within the Kubernetes community to make Kubernetes more like a Linux distribution. The idea was outlined in a session about Kubernetes release engineering at KubeCon + CloudNativeCon North America 2019. "You might have heard that Kubernetes is the Linux of the cloud and that's like super easy to say, but what does it mean? Cloud is pretty fuzzy on its own," Tim Pepper, co-chair of the Kubernetes release special interest group (SIG Release), said. He proceeded to provide some clarity on how the two projects are similar.

[Tim Pepper & Stephen Augustus]

Pepper explained that Kubernetes is a large open-source project with lots of development work around a relatively monolithic core. The core of Kubernetes doesn't work entirely on its own and relies on other components around it to enable a workload to run, in a model that isn't all that dissimilar to a Linux distribution. Likewise, Pepper noted that Linux also has a monolithic core, which is the kernel itself. Alongside the Linux kernel is a whole host of other components that are chosen to work together to form a Linux distribution. Much like a Linux distribution, a Kubernetes distribution is a package of core components, configuration, networking, and storage on which application workloads can be deployed.

Linux has community distributions, such as Debian, where there is a group of people that help to build the distribution, as well as a community of users that can install and run the distribution on their own. Pepper argued that there really isn't a community Kubernetes distribution like Debian, one that uses open-source tools to build a full Kubernetes platform that can then be used by anyone to run their workloads. With Linux, community-led distributions have become the foundation for user adoption and participation, whereas with Kubernetes today, distributions are almost all commercially driven.

Why distributions matter

The real value that comes from Kubernetes and from Linux, in Pepper's view, is not from the core, but rather from the user applications that a full distribution enables. Distributions are purpose-built, opinionated assemblies of configurations and tools. Distributions also serve to align different versions of tooling and subprojects into a working release that is easier for users to install and maintain. "One of the things in open source that is really amazing is you have this multiplier effect and distributions are a key part of that," Pepper said.

A Kubernetes distribution is a bit different than a Linux distribution in several respects. With Kubernetes, the Cloud Native Computing Foundation (CNCF) has developed a Kubernetes conformance program to certify that a given platform is in fact Kubernetes. Pepper noted that Linux makes use of a reciprocal open-source license, which means that any code that is forked and distributed needs to be shared. Kubernetes uses a permissive license (Apache version 2.0), which Pepper warned comes with the risk of divergent forking. "So where Linux didn't necessarily have conformance testing, we need something like that in Kubernetes to make sure that Kubernetes as a word means something, and that we can understand what that means," he said.

Linux has a large stable of community distributions, such as Debian, Arch, and Fedora, as well as commercial enterprise distributions. "Where are our Kubernetes community distributions?" Pepper asked. "Of the hundred conformant offerings, most of them are commercial." The full list of conformant Kubernetes offerings is maintained and regularly updated by the CNCF.

Building a community Kubernetes distribution

Pepper outlined several potential reasons why there isn't a community Kubernetes distribution, including the fact that there are some missing technical components. He started by attempting to define what the base of a community distribution could include. There are the raw Go language binaries and some other code artifacts from the Kubernetes release, but those are only parts of a distribution. There are also several tools needed, including kubeadm, which helps to bootstrap a basic Kubernetes cluster, kops for managing Kubernetes operations, and kubespray, which is used to deploy a production-ready Kubernetes cluster. Pepper emphasized that the existing open-source tools are intended to help build a cluster and not a distribution.

The Kubernetes community is currently lacking build tools for distributions as well as more robust dependency management, he said. "One of the really useful benefits you see from distros is that they kind of grok all of the dependencies and give you that coherent opinionated set of things that are going to work together," Pepper said. "Where is our Kubernetes equivalent of koji or Launchpad?" He also wondered why there was no Kubernetes version of Ubuntu's personal package archives (PPAs).

Release engineering

While Kubernetes is currently missing pieces for enabling a true community distribution, work is ongoing in multiple Kubernetes Special Interest Groups (SIGs), including SIG Release and SIG Testing, that could point the way forward to a future community distribution.

Stephen Augustus, another SIG Release co-chair, explained that a release-managers group that deals with the build process as well as patch and branch management has started to take shape. The idea behind the group is to codify the process by which Kubernetes releases are produced. "There are scripts that you can check out that have copyright dates of 2016 and they are actually the ones that are responsible for releasing Kubernetes," Augustus said. "We want to get to the point where we can start tearing down some of the technical debt that we've built up in the project over time."

Among the Kubernetes release scripts that date back to 2016 is anago, which is an 1,800-line bash script for releasing Kubernetes. Anago imports three separate libraries, each with another 500 lines of shell code. "It's time to not do that anymore," Augustus said.

The group is starting to rewrite some of the release scripts. One of the first targets is branchff, which is a utility that fast-forwards a branch to master. Another tool that is being rewritten is push-build, which is responsible for pushing all of the Kubernetes builds up to Google Cloud.

As part of the overall effort to improve release engineering, there is also the new Kubernetes release toolbox project known as "krel" that Augustus noted is just getting started. The goal is to take all of the various release shell scripts and move them into the toolbox as a set of commands. Another new effort that is getting underway is the kubepkg tool that will enable developers to create deb and RPM packages based on Kubernetes project binaries. "We want there to be a dead simple way to produce debs and RPMs for Kubernetes."

Augustus commented that many companies have built their own tools for Kubernetes releases because there have not been any great tools in the upstream project, but that's now changing. "We're trying to kind of flip that story, change the narrative, and build tools that are actually useful for not just the community, but for vendors, and for hobbyists to consume as well."

Whether or not a real Kubernetes community distribution will emerge remains to be seen. What is clear is that, as Augustus said, there is a need to remove the technical debt for release engineering, updating complex shell scripts with more modern tools that can help both the project and the broader community to build Kubernetes distributions.


Posted Dec 5, 2019 15:05 UTC (Thu) by rwmj (guest, #5474) [Link] (6 responses)

Isn't OpenShift supposed to be an "opinionated Kubernetes" distribution?

Creating Kubernetes distributions

Posted Dec 5, 2019 15:13 UTC (Thu) by sml (subscriber, #75391) [Link] (5 responses)

Sure, but it's IBM's _commercial_ distro.

Creating Kubernetes distributions

Posted Dec 5, 2019 17:54 UTC (Thu) by SEJeff (guest, #51588) [Link] (4 responses)

It is also 100% open source.

Creating Kubernetes distributions

Posted Dec 15, 2019 10:31 UTC (Sun) by ofr (guest, #107486) [Link] (3 responses)

It's also not 100% Kubernetes compatible even if it passes the conformance check by disabling security.

Creating Kubernetes distributions

Posted Dec 15, 2019 11:13 UTC (Sun) by rahulsundaram (subscriber, #21946) [Link] (2 responses)

> It's also not 100% Kubernetes compatible even if it passes the conformance check by disabling security.

Can you provide a reference to this?

Creating Kubernetes distributions

Posted Dec 16, 2019 20:09 UTC (Mon) by ofr (guest, #107486) [Link] (1 responses)

Reference for what exactly? It's obvious that it's not 100% compatible because many software packages for Kubernetes don't run unmodified on OpenShift. For the claim about the conformance test see https://github.com/openshift/origin/blob/master/test/extended/conformance-k8s.sh

Creating Kubernetes distributions

Posted Dec 16, 2019 20:14 UTC (Mon) by rahulsundaram (subscriber, #21946) [Link]

Not obvious to me. If a distribution of Kubernetes passes all conformance tests, I would consider it compatible. If not, that would be a failure in the conformance tests by definition.

Creating Kubernetes distributions

Posted Dec 5, 2019 19:14 UTC (Thu) by marcH (subscriber, #57642) [Link] (7 responses)

> Among the Kubernetes release scripts that date back to 2016 is anago, which is an 1,800-line bash script for releasing Kubernetes. Anago imports three separate libraries, each with another 500 lines of shell code. "It's time to not do that anymore," Augustus said.

Was any alternative suggested? Didn't find any in the Powerpoint.

A few thousand lines of shell script is too many but not crazy high IMHO.

Unix shell scripting shows its age but I haven't seen anything coming close for smaller programs (say a few hundred lines) interacting with files and gluing other programs together. Nothing as concise, dynamic and high-level. For instance using functions as parameters is trivial - not too bad!

One major drawback is incompatibility with Windows but hey, who implements release management and QA on that? ;-) Most Windows people I run into look like they haven't even heard of PowerShell yet. Click, click, click... or straight to WSL.

Python's subprocess module has come a long way but still requires boilerplate and relatively complex error handling code; people seem to get that wrong every time (error handling code being of course never tested).
Maybe Perl would have been a good candidate if it hadn't committed suicide by optimizing itself for "write-only" usage?

Is there anything else?

Creating Kubernetes distributions

Posted Dec 6, 2019 23:27 UTC (Fri) by IanKelling (subscriber, #89418) [Link]

I've been on the lookout for alternatives for years now and I don't see any. With shellcheck and proper error handling, bash can go pretty far.

Creating Kubernetes distributions

Posted Dec 7, 2019 18:48 UTC (Sat) by epa (subscriber, #39769) [Link] (5 responses)

> Python's subprocess module has come a long way but still requires boilerplate and relatively complex error handling code

Can you give an example of how to do error checking and handling correctly in a shell script? It seems to require at least as much boilerplate as Python or Perl if you want to write | pipelines or do control flow while at the same time checking the exit status of each subprocess and perhaps checking whether anything was written on standard error too.

Creating Kubernetes distributions

Posted Dec 7, 2019 21:50 UTC (Sat) by Jandar (subscriber, #85683) [Link]

> It seems to require at least as much boilerplate as Python or Perl if you want to write | pipelines or do control flow while at the same time checking the exit status of each subprocess and perhaps checking whether anything was written on standard error too.

If you use #!/bin/bash "set -o pipefail" gives you checking of exit status of every part of a pipe.
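
A minimal sketch of that behavior, assuming bash; the variable names are only for illustration. It also uses PIPESTATUS, another bash-specific feature, which records the exit status of each stage of the most recent pipeline:

```shell
#!/bin/bash
# Default behavior: a pipeline's exit status is that of its last command,
# so the failure of "false" is hidden by the successful "true".
false | true
without=$?

# With pipefail, the pipeline fails if any stage fails.
set -o pipefail
false | true
with=$?

echo "without pipefail: $without"
echo "with pipefail:    $with"

# PIPESTATUS must be copied immediately; the next command overwrites it.
true | false | true
statuses="${PIPESTATUS[*]}"
echo "per-stage statuses: $statuses"
```

Checking whether anything was written to standard error still has to be done by hand, for example by redirecting it to a file and testing whether the file is empty.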

Python as a shell replacement

Posted Dec 7, 2019 23:22 UTC (Sat) by marcH (subscriber, #57642) [Link] (3 responses)

Right, error handling is not easy in shell either. There are a number of things you can do but nothing is bulletproof, agreed. As a start, every single script of mine starts with set -e. I also use a lot of: "do_the_work_func || die ...", "while do_work; do..." etc. (pro tip: you generally don't need ret=$?)
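
A rough sketch of those idioms, assuming bash. The die helper is not a builtin, so a hypothetical one is defined here, and do_the_work_func and the path /nonexistent-example are placeholders:

```shell
#!/bin/bash
set -e

# die: hypothetical helper; prints a message to stderr and exits non-zero.
die() {
    printf 'fatal: %s\n' "$*" >&2
    exit 1
}

# Placeholder for real work: succeeds only if its argument is a directory.
do_the_work_func() {
    [ -d "$1" ]
}

# Abort with a message if the work fails.
do_the_work_func / || die "/ is not a directory"

# Test the command directly rather than saving $? into a variable first.
if do_the_work_func /nonexistent-example; then
    result="found"
else
    result="missing"
fi
echo "$result"
```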

BTW C has similar error handling behaviors, most likely not a coincidence.

I repeat: any shell program longer than a few thousand lines or with some serious data structures is probably a mistake. This being out of the way, let me try to rephrase and clarify what I meant earlier:

1. Python-as-a-shell adds extra code and significant overhead; you can't start prototyping by just throwing your .bash_history into a file anymore.
2. Python has a built-in and pretty good exception system that you generally don't even have to think about.

So why didn't the migration overhead I paid in 1. magically give me 2. for free? Why do I have to think so much about error handling when I use the subprocess module? In _short_ shell scripts good error handling is the only thing I was missing! So where did my migration money go?

The Python people are very smart, so I guess there must be good technical reasons for that, yet these excuses still don't make Python a desirable replacement for short shell scripts (unless you absolutely need to support Windows). Actually, I'm worried these justifications may not be Python specific and may preclude _any_ general purpose language as a shell replacement...

Insightful interview with Steve Bourne: https://www.arnnet.com.au/article/279011/a-z_programming_...

Python as a shell replacement

Posted Dec 13, 2019 17:23 UTC (Fri) by BenHutchings (subscriber, #37955) [Link] (2 responses)

I also use "set -e" by habit, but it doesn't do exactly what you probably want. When you check the result of a command, that completely suppresses its effect inside the command. For example: 
set -e
f() {
    false
    echo "continued"
}
f || echo "failed"
prints: 
continued
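
One possible workaround, sketched here assuming bash, is to stop relying on set -e inside the function and check each step explicitly, so the function returns at the first failure no matter how the caller invokes it:

```shell
#!/bin/bash
set -e

f() {
    false || return   # explicit check: leave f with the failing status
    echo "continued"
}

# f now stops at the failing step, so the caller's fallback runs.
msg=$(f || echo "failed")
echo "$msg"
```

The cost is exactly the per-command boilerplate that set -e was supposed to avoid.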

Python as a shell replacement

Posted Dec 13, 2019 22:59 UTC (Fri) by marcH (subscriber, #57642) [Link] (1 responses)

Yes, set -e is absolutely not a silver bullet. It does not catch all errors, so you must lower your expectations. It does catch many of them, which has saved me and many others a lot of time, routinely, for years.

Some influential and vocal experts seem to have decided that, short of catching "all errors", catching "no error" is better than "many errors". I've read all their essays and I still couldn't make sense of their logic: https://mywiki.wooledge.org/BashFAQ/105

"Works for us".

PS: besides 105 and a couple others, https://mywiki.wooledge.org/BashFAQ is the best.

Python as a shell replacement

Posted Dec 14, 2019 0:08 UTC (Sat) by karkhaz (subscriber, #99844) [Link]

I occasionally use a combined shell script/makefile if I care about catching errors on each command:

#!/bin/sh
# vim:set syntax=make:set ft=make:

MAKEFILE_START_LINE=$(\
  grep -nre makefile_starts_here "$0" \
  | tail -n 1 \
  | awk -F: '{print $1}')

TMP=$(mktemp)
tail -n+${MAKEFILE_START_LINE} "$0" > "${TMP}"

make -f "${TMP}"
SUCCESS=$?

rm -f "$TMP"
exit "$SUCCESS"

makefile_starts_here:
	command-1
	command-2
	command-3

This prints out everything below and including "makefile_starts_here" to a Makefile and then runs make on it, executing the commands one at a time. This is especially nice if I want built-in parallelism etc. It's actually even better than just using the shell (just be sure to print out "MAKEFLAGS=-j" at the top of the file).

Typhoon is a free Kubernetes distribution

Posted Dec 15, 2019 10:29 UTC (Sun) by ofr (guest, #107486) [Link]

There's already a free (as in free software) Kubernetes distro. It is called Typhoon: https://github.com/poseidon/typhoon

"Typhoon distributes upstream Kubernetes, architectural conventions, and cluster addons, much like a GNU/Linux distribution provides the Linux kernel and userspace components."

So, there you go.


