Development
The Rocket containerization system
The field of software-container options for Linux expanded again this week with the launch of the Rocket project by the team behind CoreOS. Rocket is a direct challenger to the popular Docker containerization system. The decision to split from Docker was, evidently, driven by CoreOS developers' dissatisfaction with several recent moves within the Docker project. Primarily, the CoreOS team's concern is Docker's expansion from a standalone container format to a larger platform that includes tools for additional parts of the software-deployment puzzle.
There is no shortage of other Linux containerization projects apart from Docker already, of course—LXC, OpenVZ, lmctfy, and Sandstorm, to name a few. But CoreOS was historically a big proponent of (and contributor to) Docker.
The idea behind CoreOS was to build a lightweight and easy-to-administer server operating system, on which Docker containers can be used to deploy and manage all user applications. In fact, CoreOS strives to be downright minimalist in comparison to standard Linux distributions. The project maintains etcd to synchronize system configuration across a set of machines and fleet to perform system initialization across a cluster, but even that set of tools is austere compared to the offerings of some cloud-computing providers.
Launch
On December 1, the CoreOS team posted an announcement on its blog,
introducing Rocket and explaining the rationale behind it. Chief
among its stated justifications for the new project was that Docker
had begun to grow from its initial concept as "a simple
component, a composable unit
" into a larger and more complex
deployment framework:

Unfortunately, a simple re-usable component is not how things are playing out. Docker now is building tools for launching cloud servers, systems for clustering, and a wide range of functions: building images, running images, uploading, downloading, and eventually even overlay networking, all compiled into one monolithic binary running primarily as root on your server.
The post also highlighted the fact that, early on in its history, the Docker project had published a manifesto that argued in favor of simple container design—and that the manifesto has since been removed.
The announcement then sets out the principles behind Rocket. The various tools will be independent "composable" units, security primitives "for strong trust, image auditing and application identity" will be available, and container images will be easy to discover and retrieve through any available protocol. In addition, the project emphasizes that the Rocket container format will be "well-specified and developed by a community."
To that end, it has published the first draft of the App Container
Image (ACI) specification
on GitHub.
As for Rocket itself, it was launched at version 0.1.0. There is a command-line tool (rkt) for running an ACI image, as well as a draft specification describing the runtime environment and facilities needed to support an ACI container, and the beginnings of a protocol for finding and downloading an ACI image.
Rocket is, for the moment, certainly a lightweight framework in keeping with what one might expect from CoreOS. Running a containerized application with Rocket involves three "stages."
Stage zero is the container-preparation step; the rkt binary generates a manifest for the container, creates the initial filesystem required, then fetches the necessary ACI image file and unpacks it into the new container's directory. Stage one involves setting up the various cgroups, namespaces, and mount points required by the container, then launching the container's systemd process. Stage two consists of actually launching the application inside its container.
What's up with Docker
The Docker project, understandably, did not view the announcement of Rocket in quite the same light as CoreOS. In a December 1 post on the Docker blog, Ben Golub defends the decision to expand the Docker tool set beyond its initial single-container roots:
We think it would be a shame if the clean, open interfaces, anywhere portability, and robust set of ecosystem tools that exist for single Docker container applications were lost when we went to a world of multiple container, distributed applications. As a result, we have been promoting the concept of a more comprehensive set of orchestration services that cover functionality like networking, scheduling, composition, clustering, etc.
But the existence of such higher-level orchestration tools and
multi-container applications, he said, does
not prevent anyone from using the Docker single-container format. He does
acknowledge that "a small number of vendors disagree with this direction", some of whom have "technical or philosophical differences, which appears to be the case with the recent announcement regarding Rocket." The post concludes by noting that "this is all part of a healthy, open source process" and by welcoming competition. It also, however, notes the "questionable rhetoric and timing of the Rocket announcement" and says that a follow-up post addressing some of the technical arguments from the Rocket project is still to come.

Interestingly enough, the CoreOS announcement of Rocket also goes out of its way to reassure users that CoreOS will continue to support Docker containers in the future. Less clear is exactly what that support will look like; the wording says to "expect Docker to continue to be fully integrated with CoreOS as it is today", which might suggest that CoreOS is not interested in supporting Docker's newer orchestration tools.

In any case, at present, Rocket and its corresponding ACI
specification make use of the same underlying Linux facilities
employed by Docker, LXC containers, and most of the other offerings.
One might well ask whether or not a "community specification" is
strictly necessary as an independent entity. But as containerization
continues to make its way into the enterprise market, it is hardly
surprising to see more than one project vie for privilege of defining
what a standard container should look like.
Moving some of Python to GitHub?
Over the years, Python's source repositories have moved a number of times,
from CVS on SourceForge to Subversion at Python.org and, eventually, to
Mercurial (aka hg), still on Python Software Foundation (PSF)
infrastructure. But the new Python.org site code lives at GitHub (thus in
a Git repository) and it looks like more pieces of Python's source may be
moving in that direction. While some are concerned about moving away from a
Python-based DVCS
(i.e. Mercurial)
into a closed-source web service, there is a strong pragmatic streak in the
Python community that may be winning out. For good or ill, GitHub
has won the popularity battle over any of the other alternatives, so new
contributors are more likely to be familiar with that service, which makes
it attractive for Python.
The discussion got started when Nick Coghlan posted some thoughts on his Python Enhancement
Proposal (PEP 474)
from July. It suggested creating a "forge" for hosting some Python
documentation repositories using Kallithea—a Python-based web
application for hosting
Git and Mercurial repositories—once it has a stable
release. More recently, though, Coghlan realized that there may not be a
need to require hosting those types of repositories on PSF
infrastructure as the PEP specified; if that is the case, "then the obvious candidate for Mercurial hosting that supports online editing + pull requests is the PSF's BitBucket account".
But others looked at the same set of facts a bit differently. Donald Stufft
compared the workflow of the current
patch-based system to one that uses GitHub-like pull requests (PRs). Both for
contributors and maintainers (i.e. Python core developers), the time
required to handle a simple patch was something like 10-15 minutes with the
existing system, he said, while a PR-based system would reduce that to less than a
minute—quite possibly much less.
Python benevolent dictator for life (BDFL) Guido van Rossum agreed, noting that GitHub has easily won the
popularity race. He was also skeptical that the PSF should be running
servers:
Moving the CPython code and docs is not a priority, but everything else
(PEPs, HOWTOs etc.) can be moved easily and I am in favor of moving to
GitHub. For PEPs I've noticed that for most PEPs these days (unless the
primary author is a core dev) the author sets up a git repo first anyway,
and the friction of moving between such repos and the "official" repo is a
pain.
GitHub, however, only supports Git, so
those who are currently using
Mercurial and want to continue would be out of luck. Bitbucket supports
both, though, so in Coghlan's opinion, it would
make a better interim solution. But Stufft is concerned that taking the
trouble to move, but choosing the less popular site, makes little sense.
On the other hand, some, with Coghlan prominent among them, are worried about lock-in with GitHub (and other closed-source solutions, including Bitbucket).
The feature set that GitHub provides is what will keep the repositories there, though,
Stufft said: "You probably won’t want to get your
data out because Github’s features are compelling enough that you
don’t want to lose them
". Furthermore, he looked at the Python-affiliated repositories on the two sites
and found that there were half a dozen active repositories on GitHub and
three largely inactive repositories on Bitbucket.
The discussion got a bit testy at times, with Coghlan complaining that choosing GitHub based on its
popularity was anti-community: "I'm very, very disappointed to see folks so willing to
abandon fellow community members for the sake of following the
crowd
". He went on to suggest that perhaps Ruby or JavaScript would
be a better choice for a language to work on since they get better press.
Van Rossum called that "a really low
blow
" and pointed out: "A DVCS repo is a social network, so it matters in a functional way what everyone else is using."
Eventually, Stufft proposed another PEP (481) that would migrate three
documentation repositories (the Development Guide, the development system in a box
(devinabox), and the PEPs) to
GitHub. Unlike the situation with many PEPs, Van Rossum stated that he didn't feel it was his job to accept or reject the
PEP, though he made a strong case for moving to GitHub; he believes that
most of the community is probably already using GitHub in one way or
another, lock-in doesn't really concern him since the most important data
is already stored in multiple places, and, in his mind, Python does not
have an "additional hidden agenda of bringing freedom to all software
".
It turns out that Brett Cannon is the contact for two of the three repositories mentioned in the PEP (devguide and devinabox), so Van Rossum is leaving the decision to Cannon for those two. Coghlan is the largest contributor to the PEPs repository, so the decision on that will be left up to him. He is currently exploring the possibility of using RhodeCode Enterprise (a Python-based, hosted solution with open code, but one that has licensing issues that Coghlan did acknowledge). For his part, Cannon noted his preference for open, Mercurial-and-Python-based solutions, but he is willing to consider other options. There may be a discussion at the Python language summit (which precedes PyCon), but, if so, Van Rossum said he probably won't take part—it's clear he has tired of the discussion at this point.
There are good arguments on both sides of the issue, but it is a little sad to see Python potentially moving away from the DVCS written in the language and into the more popular (and feature-rich, seemingly) DVCS and hosting site (Git and GitHub). While Van Rossum does not plan to propose moving the CPython (main Python language code) repository to GitHub anytime soon, the clear implication is that he would not be surprised if that happens eventually. While it might make pragmatic sense on a number of different levels, and may have all the benefits that have been mentioned, it would certainly be something of a blow to the open-source Python DVCS communities. With luck, those communities will find the time to fill the functionality gaps, but the popularity gap will be much harder to overcome.
Kawa — fast scripting on the Java platform
Kawa is a general-purpose Scheme-based programming language that runs on the Java platform. It aims to combine the strengths of dynamic scripting languages (less boilerplate, fast and easy start-up, a read-eval-print loop or REPL, no required compilation step) with the strengths of traditional compiled languages (fast execution, static error detection, modularity, zero-overhead Java platform integration). I created Kawa in 1996, and have maintained it since. The new 2.0 release has many improvements.
Projects and businesses using Kawa include: MIT App Inventor (formerly Google App Inventor), which uses Kawa to translate its visual blocks language; HypeDyn, which is a hypertext fiction authoring tool; and Nu Echo, which uses Kawa for speech-application development tools. Kawa is flexible: you can run source code on the fly, type it into a REPL, or compile it to .jar files. You can write portably, ignoring anything Java-specific, or write high-performance, statically-typed Java-platform-centric code. You can use it to script mostly-Java applications, or you can write big (modular and efficient) Kawa programs. Kawa has many interesting features; below we'll look at a few of them.
Scheme and standards
Kawa is a dialect of Scheme, which has a long history in programming-language and compiler research, and in teaching. Kawa 2.0 supports almost all of R7RS (the Revised⁷ Report on the Algorithmic Language Scheme), the 2013 language specification. (Full continuations are the major missing feature, though there is a project working on that.) Scheme is part of the Lisp family of languages, which also includes Common Lisp, Dylan, and Clojure.
One of the strengths of Lisp-family languages (and why some consider them weird) is the uniform prefix syntax for calling a function or invoking an operator:
(op arg1 arg2 ... argN)

If op is a function, this evaluates each of arg1 through argN, and then calls op with the resulting values. The same syntax is used for arithmetic:

(+ 3 4 5)

and program structure:

; (This line is a comment - from semi-colon to end-of-line.)
; Define variable 'pi' to have the value 3.14.
(define pi 3.14)
; Define single-argument function 'abs' with parameter 'x'.
(define (abs x)
  ; Standard function 'negative?' returns true if argument is less than zero.
  (if (negative? x) (- x) x))
Having a simple regular core syntax makes it easier to write tools and to extend the language (including new control structures) via macros.
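For instance, a new control construct can be defined portably with standard syntax-rules macros. Here is a minimal sketch (my example, not from the Kawa manual; it is plain R7RS Scheme and works in Kawa):

; swap! exchanges the values of two variables. The macro is
; expanded at compile time, so there is no call overhead.
(define-syntax swap!
  (syntax-rules ()
    ((swap! a b)
     (let ((tmp a))
       (set! a b)
       (set! b tmp)))))

(define x 1)
(define y 2)
(swap! x y) ; now x is 2 and y is 1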
Performance and type specifiers
Kawa gives run-time performance a high priority. The language facilitates compiler analysis and optimization. Flow analysis is helped by lexical scoping and the fact that a variable in a module (source file) can only be assigned to in that module. Most of the time the compiler knows which function is being called, so it can generate code to directly invoke a method. You can also associate a custom handler with a function for inlining, specialization, or type-checking.
To aid with type inference and type checking, Kawa supports optional type specifiers, which are specified using two colons. For example:
(define (find-next-string strings ::vector[string] start ::int) ::string ...)
This defines find-next-string with two parameters: strings is a vector of strings, and start is a native (Java) int; the return type is a string.
Kawa also does a good job of catching errors at compile time.
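As an illustration (a hedged sketch of mine, not taken from the Kawa documentation): with the types declared below, the compiler knows the argument and result types, can emit direct primitive arithmetic and a direct call to the static sqrt method, and can flag a mistyped call when the file is compiled rather than at run time:

; Both parameters and the result are native Java doubles.
(define (hypotenuse a ::double b ::double) ::double
  (java.lang.Math:sqrt (+ (* a a) (* b b))))

(display (hypotenuse 3 4)) ; prints 5.0
; (hypotenuse "three" 4)   ; flagged at compile time: not a number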
The Kawa runtime doesn't need to do a lot of initialization, so start-up is much faster than that of other scripting languages based on the Java virtual machine (JVM). The compiler is fast enough that Kawa doesn't use an interpreter. Each expression you type into the REPL is compiled on the fly to JVM bytecodes, which (if executed frequently) may be compiled to native code by the just-in-time (JIT) compiler.
Function calls and object construction
If the operator op in an expression like (op arg1 ... argN) is a type, then the Kawa compiler looks for a suitable constructor or factory method.
(javax.swing.JButton "click here") ; equivalent to Java's: new javax.swing.JButton("click here")
If the op is a list-like type with a default constructor and has an add method, then an instance is created, and all the arguments are added:
(java.util.ArrayList 11 22 33) ; evaluates to: [11, 22, 33]
Kawa allows keyword arguments, which can be used in an object constructor form to set properties:
(javax.swing.JButton text: "Do it!" tool-tip-text: "do it")
The Kawa manual has more details and examples. There are also examples for other frameworks, such as for Android and for JavaFX.
Other scripting languages also have convenient syntax for constructing nested object structures (for example Groovy builders), but they require custom builder helper objects and/or are much less efficient. Kawa's object constructor does most of the work at compile-time, generating code as good as hand-written Java, but less verbose. Also, you don't need to implement a custom builder if the defaults work, as they do for Swing GUI construction, for example.
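To sketch how these rules compose (my example, built from the constructor rules described above rather than code from the article): javax.swing.JPanel has a default constructor and an add method, so child components can be nested directly inside the constructor form:

; Keyword arguments set bean properties; the remaining
; arguments are passed to the new JPanel's add() method.
(define panel
  (javax.swing.JPanel
    (javax.swing.JButton text: "OK")
    (javax.swing.JButton text: "Cancel")))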
Extended literals
Most programming languages provide convenient literal syntax only for certain built-in types, such as numbers, strings, and lists. Other types of values are encoded by constructing strings, which are susceptible to injection attacks, and which can't be checked at compile-time.
Kawa supports user-defined extended literal types, which have the form:
&tag{text}

The tag is usually an identifier. The text can have escaped sub-expressions:

&tag{some-text&[expression]more-text}

The expression is evaluated and combined with the literal text. The combination is often just string concatenation, but it can be anything, depending on the &tag. As an example, assume:
(define base-uri "http://example.com/")

then the following concatenates base-uri with the literal "index.html" to create a new URI object:
&URI{&[base-uri]index.html}
The above example gets de-sugared into:
($construct$:URI $<<$ base-uri $>>$ "index.html")
The $construct$:URI is a compound name (similar to an XML "qualified name") in the predefined $construct$ namespace. The $<<$ and $>>$ are just special symbols to mark an embedded sub-expression; by default they're bound to unique empty strings. So the user (or library writer) just needs to provide a definition of the compound name $construct$:URI as either a procedure or macro, resolved using standard Scheme name lookup rules; no special parser hooks or other magic is involved. This procedure or macro can do arbitrary processing, such as construct a complex data structure, or search a cache.
Here is a simple-minded definition of $construct$:URI as a function that just concatenates all the arguments (the literal text and the embedded sub-expressions) using the standard string-append function, and passes the result to the URI constructor function:
(define ($construct$:URI . args) (URI (apply string-append args)))
The next section uses extended literals for something more interesting: shell-like process forms.
Shell scripting
Many scripting languages let you invoke system commands (processes). You can send data to the standard input, extract the resulting output, look at the return code, and sometimes even pipe commands together. However, this is rarely as easy as it is using the old Bourne shell; for example command substitution is awkward. Kawa's solution is two-fold:
- A "process expression" (typically a function call) evaluates to a Java Process value, which provides access to a Unix-style (or Windows) process.
- In a context requiring a string, a Process is automatically converted to a string comprising the standard output from the process.
A trivial example:
#|kawa:1|# (define p1 &`{date --utc})
("#|...|#" is the Scheme syntax for nestable comments; the default REPL prompt has that form to aid cutting and pasting code.)
The &`{...} syntax uses the extended-literal syntax from the previous section, where the backtick is the 'tag', so it is syntactic sugar for
($construct$:` "date --utc")

where $construct$:` might be defined as:

(define ($construct$:` . args) (apply run-process args))

This in turn translates into an expression that creates a gnu.kawa.functions.LProcess object, as you can see if you write it:
#|kawa:2|# (write p1)
gnu.kawa.functions.LProcess@377dca04
An LProcess is automatically converted to a string (or bytevector) in a context that requires one:
#|kawa:3|# (define s1 ::string p1) ; Define s1 as a string.
#|kawa:4|# (write s1)
"Wed Nov 1 01:18:21 UTC 2014\n"
#|kawa:5|# (define b1 ::bytevector p1) (write b1)
#u8(87 101 100 32 74 97 110 ... 52 10)
The display procedure prints the LProcess in "human" form, as an unquoted string:

#|kawa:6|# (display p1)
Wed Nov 1 01:18:21 UTC 2014
This is also the default REPL formatting:
#|kawa:7|# &`{date --utc}
Wed Nov 1 01:18:22 UTC 2014
We don't have room here to discuss redirection, here documents, pipelines, adjusting the environment, and flow control based on return codes, though I will briefly touch on argument processing and substitution. See the Kawa manual for details, and the project's documentation for more on text vs. binary files.
Argument processing
Substituting the result of an expression into the argument list is simple using the &[] construct:

(define my-printer (lookup-my-printer))
&`{lpr -P &[my-printer] log.pdf}

Because a process is auto-convertible to a string, no special syntax is needed for command substitution:
&`{echo The directory is: &[&`{pwd}]}

though you'd normally use this short-hand:
&`{echo The directory is: &`{pwd}}
Splitting a command line into arguments follows shell quoting and escaping rules. Dealing with substitution depends on quotation context. The simplest case is when the value is a list (or vector) of strings, and the substitution is not inside quotes. In that case each list element becomes a separate argument:
(define arg-list ["-P" "office" "foo.pdf" "bar.pdf"]) &`{lpr &[arg-list]}
An interesting case is when the value is a string, and we're inside double quotes; in that case newline is an argument separator, but all other characters are literal. This is useful when you have one filename per line, and the filenames may contain spaces, as in the output from find:
&`{ls -l "&`{find . -name '*.pdf'}"}

This solves a problem that is quite painful with traditional shells.
Using an external shell
The sh tag uses an explicit shell, like the C system() function:
&sh{lpr -P office *.pdf}

This is equivalent to:

&`{/bin/sh -c "lpr -P office *.pdf"}
Kawa adds quotation characters in order to pass the same argument values as when not using a shell (assuming no use of shell-specific features such as globbing or redirection). Getting shell quoting right is non-trivial (in single quotes all characters except single quote are literal, including backslash), and not something you want application programmers to have to deal with. Consider:
(define authors ["O'Conner" "de Beauvoir"])
&sh{list-books &[authors]}

The command passed to the shell is the following:
list-books 'O'\''Conner' 'de Beauvoir'
Having quoting be handled by the $construct$:sh
implementation
automatically eliminates common code injection problems.
I intend to implement a &sql
form that would avoid
SQL injection the same way.
In closing
Some (biased) reasons why you might choose Kawa over other languages, concentrating on those that run on the Java platform: Java is verbose and requires a compilation step; Scala is complex, intimidating, and has a slow compiler; Jython, JRuby, Groovy, and Clojure are much slower in both execution and start-up. Kawa is not standing still: plans for the next half-year include a new argument-passing convention (which will enable ML-style patterns); full continuation support (which will help with coroutines and asynchronous event handling); and higher-level optimized sequence/iteration operations. I hope you will try out Kawa, and that you will find it productive and enjoyable.
Brief items
Quotes of the week(s)
Aaaaand what instead happened was:
- We announced and set up a Just Solve The Problem Wiki for the first problem.
- A lot of people worked on the Wiki.
- I got very busy.
- People kept working on the Wiki.
- It’s been two years.
GNU LibreJS 6.0.6 released
Version 6.0.6 of the LibreJS add-on for Firefox and other Mozilla-based browsers has been released. LibreJS is a selective JavaScript blocker that disables non-free JavaScript programs. New in this version are support for private-browsing mode and enhanced support for mailto: links on a page where non-free JavaScript has been blocked.
Firefox 34 released
Mozilla has released Firefox 34. This version changes the default search engine, includes the Firefox Hello real-time communication client, implements HTTP/2 (draft14) and ALPN, disables SSLv3, and more. See the release notes for details.

QEMU Advent Calendar 2014 unveiled
The QEMU project has launched its own "Advent calendar" site. Starting with December 1, each day another new virtual machine disk image appears and can be downloaded for exploration in QEMU. The December 1 offering was a Slackware image of truly historic proportions.
Rocket, a new container runtime from CoreOS
CoreOS has announced that it is moving away from Docker and toward "Rocket," a new container runtime that it has developed. "Unfortunately, a simple re-usable component is not how things are playing out. Docker now is building tools for launching cloud servers, systems for clustering, and a wide range of functions: building images, running images, uploading, downloading, and eventually even overlay networking, all compiled into one monolithic binary running primarily as root on your server. The standard container manifesto was removed. We should stop talking about Docker containers, and start talking about the Docker Platform. It is not becoming the simple composable building block we had envisioned."
Newsletters and articles
Development newsletters from the past two weeks
- What's cooking in git.git (November 26)
- Haskell Weekly News (November 15)
- LLVM Weekly (November 24)
- LLVM Weekly (December 1)
- OCaml Weekly News (November 25)
- OCaml Weekly News (December 2)
- OpenStack Community Weekly Newsletter (November 21)
- OpenStack Community Weekly Newsletter (November 28)
- Perl Weekly (November 24)
- Perl Weekly (December 1)
- PostgreSQL Weekly News (November 23)
- PostgreSQL Weekly News (November 30)
- Python Weekly (November 20)
- Python Weekly (November 27)
- Ruby Weekly (November 20)
- Ruby Weekly (November 27)
- This Week in Rust (November 24)
- This Week in Rust (December 1)
- Tor Weekly News (November 26)
- Tor Weekly News (December 3)
- Wikimedia Tech News (November 24)
Introducing AcousticBrainz
MusicBrainz, the not-for-profit project that maintains an
assortment of "open content" music metadata databases, has announced
a new effort named AcousticBrainz. AcousticBrainz
is designed to be an open, crowd-sourced database cataloging various
"audio features" of music, including "low-level spectral
information such as tempo, and additional high level descriptors for
genres, moods, keys, scales and much more.
" The data collected
is more comprehensive than MusicBrainz's existing AcoustID database,
which deals only with acoustic fingerprinting for song recognition.
The new project is a partnership with the Music Technology Group at
Universitat Pompeu Fabra, and uses that group's free-software toolkit
Essentia to perform its
acoustic analyses. A follow-up
post digs into the AcousticBrainz analysis of the project's initial
650,000-track data set, including examinations of genre, mood, key,
and other factors.
New features in Git 2.2.0
The "Atlassian Developers" site has a summary of interesting features in the recent Git 2.2.0 release, including signed pushes. "This is an important step in preventing man-in-the-middle attacks and any other unauthorized updates to your repository's refs. git push has learnt the --signed flag which applies your GPG signature to a "push certificate" sent over the wire during the push invocation. On the server-side, git receive-pack (the command that handles incoming git pushes) has learnt to verify GPG-signed push certificates. Failed verifications can be used to reject pushes and those that succeed can be logged in a file to provide an audit log of when and who pushed particular ref updates or objects to your git server."