
Development

The Rocket containerization system

By Nathan Willis
December 3, 2014

The field of software-container options for Linux expanded again this week with the launch of the Rocket project by the team behind CoreOS. Rocket is a direct challenger to the popular Docker containerization system. The decision to split from Docker was, evidently, driven by CoreOS developers' dissatisfaction with several recent moves within the Docker project. Primarily, the CoreOS team's concern is Docker's expansion from a standalone container format to a larger platform that includes tools for additional parts of the software-deployment puzzle.

There is no shortage of other Linux containerization projects apart from Docker already, of course—LXC, OpenVZ, lmctfy, and Sandstorm, to name a few. But CoreOS was historically a big proponent of (and contributor to) Docker.

The idea behind CoreOS was to build a lightweight and easy-to-administer server operating system, on which Docker containers can be used to deploy and manage all user applications. In fact, CoreOS strives to be downright minimalist in comparison to standard Linux distributions. The project maintains etcd to synchronize system configuration across a set of machines and fleet to perform system initialization across a cluster, but even that set of tools is austere compared to the offerings of some cloud-computing providers.

Launch

On December 1, the CoreOS team posted an announcement on its blog, introducing Rocket and explaining the rationale behind it. Chief among its stated justifications for the new project was that Docker had begun to grow from its initial concept as "a simple component, a composable unit" into a larger and more complex deployment framework:

Unfortunately, a simple re-usable component is not how things are playing out. Docker now is building tools for launching cloud servers, systems for clustering, and a wide range of functions: building images, running images, uploading, downloading, and eventually even overlay networking, all compiled into one monolithic binary running primarily as root on your server.

The post also highlighted the fact that, early on in its history, the Docker project had published a manifesto that argued in favor of simple container design—and that the manifesto has since been removed.

The announcement then sets out the principles behind Rocket. The various tools will be independent "composable" units, security primitives "for strong trust, image auditing and application identity" will be available, and container images will be easy to discover and retrieve through any available protocol. In addition, the project emphasizes that the Rocket container format will be "well-specified and developed by a community." To that end, it has published the first draft of the App Container Image (ACI) specification on GitHub.

As for Rocket itself, it was launched at version 0.1.0. There is a command-line tool (rkt) for running an ACI image, as well as a draft specification describing the runtime environment and facilities needed to support an ACI container, and the beginnings of a protocol for finding and downloading an ACI image.

Rocket is, for the moment, certainly a lightweight framework in keeping with what one might expect from CoreOS. Running a containerized application with Rocket involves three "stages."

Stage zero is the container-preparation step; the rkt binary generates a manifest for the container, creates the initial filesystem required, then fetches the necessary ACI image file and unpacks it into the new container's directory. Stage one involves setting up the various cgroups, namespaces, and mount points required by the container, then launching the container's systemd process. Stage two consists of actually launching the application inside its container.

What's up with Docker

The Docker project, understandably, did not view the announcement of Rocket in quite the same light as CoreOS. In a December 1 post on the Docker blog, Ben Golub defends the decision to expand the Docker tool set beyond its initial single-container roots:

While Docker continues to define a single container format, it is clear that our users and the vast majority of contributors and vendors want Docker to enable distributed applications consisting of multiple, discrete containers running across multiple hosts.

We think it would be a shame if the clean, open interfaces, anywhere portability, and robust set of ecosystem tools that exist for single Docker container applications were lost when we went to a world of multiple container, distributed applications. As a result, we have been promoting the concept of a more comprehensive set of orchestration services that cover functionality like networking, scheduling, composition, clustering, etc.

But the existence of such higher-level orchestration tools and multi-container applications, he said, does not prevent anyone from using the Docker single-container format. He does acknowledge that "a small number of vendors disagree with this direction", some of whom have "technical or philosophical differences, which appears to be the case with the recent announcement regarding Rocket."

The post concludes by noting that "this is all part of a healthy, open source process" and by welcoming competition. It also, however, notes the "questionable rhetoric and timing of the Rocket announcement" and says that a follow-up post addressing some of the technical arguments from the Rocket project is still to come.

Interestingly enough, the CoreOS announcement of Rocket also goes out of its way to reassure users that CoreOS will continue to support Docker containers in the future. Less clear is exactly what that support will look like; the wording says to "expect Docker to continue to be fully integrated with CoreOS as it is today", which might suggest that CoreOS is not interested in supporting Docker's newer orchestration tools.

In any case, at present, Rocket and its corresponding ACI specification make use of the same underlying Linux facilities employed by Docker, LXC containers, and most of the other offerings. One might well ask whether or not a "community specification" is strictly necessary as an independent entity. But as containerization continues to make its way into the enterprise market, it is hardly surprising to see more than one project vie for the privilege of defining what a standard container should look like.

Comments (15 posted)

Moving some of Python to GitHub?

By Jake Edge
December 3, 2014

Over the years, Python's source repositories have moved a number of times, from CVS on SourceForge to Subversion at Python.org and, eventually, to Mercurial (aka hg), still on Python Software Foundation (PSF) infrastructure. But the new Python.org site code lives at GitHub (thus in a Git repository) and it looks like more pieces of Python's source may be moving in that direction. While some are concerned about moving away from a Python-based DVCS (i.e. Mercurial) into a closed-source web service, there is a strong pragmatic streak in the Python community that may be winning out. For good or ill, GitHub has won the popularity battle over any of the other alternatives, so new contributors are more likely to be familiar with that service, which makes it attractive for Python.

The discussion got started when Nick Coghlan posted some thoughts on his Python Enhancement Proposal (PEP 474) from July. It suggested creating a "forge" for hosting some Python documentation repositories using Kallithea—a Python-based web application for hosting Git and Mercurial repositories—once it has a stable release. More recently, though, Coghlan realized that there may not be a need to require hosting those types of repositories on PSF infrastructure as the PEP specified; if that is the case, "then the obvious candidate for Mercurial hosting that supports online editing + pull requests is the PSF's BitBucket account".

But others looked at the same set of facts a bit differently. Donald Stufft compared the workflow of the current patch-based system to one that uses GitHub-like pull requests (PRs). Both for contributors and maintainers (i.e. Python core developers), the time required to handle a simple patch was something like 10-15 minutes with the existing system, he said, while a PR-based system would reduce that to less than a minute—quite possibly much less.

Python benevolent dictator for life (BDFL) Guido van Rossum agreed, noting that GitHub has easily won the popularity race. He was also skeptical that the PSF should be running servers:

[...] We should move to GitHub, because it is the easiest to use and most contributors already know it (or are eager to learn it). Honestly, the time for core devs (or some other elite corps of dedicated volunteers) to sysadmin their own machines (virtual or not) is over. We've never been particularly good at this, and I don't see us getting better or more efficient.

Moving the CPython code and docs is not a priority, but everything else (PEPs, HOWTOs etc.) can be moved easily and I am in favor of moving to GitHub. For PEPs I've noticed that for most PEPs these days (unless the primary author is a core dev) the author sets up a git repo first anyway, and the friction of moving between such repos and the "official" repo is a pain.

GitHub, however, only supports Git, so those who are currently using Mercurial and want to continue would be out of luck. Bitbucket supports both, though, so in Coghlan's opinion, it would make a better interim solution. But Stufft is concerned that taking the trouble to move, but choosing the less popular site, makes little sense.

On the other hand, some are worried about lock-in with GitHub (and other closed-source solutions, including Bitbucket). As Coghlan put it:

And this is why the "you can still get your data out" argument doesn't make any sense - if you aren't planning to rely on the proprietary APIs, GitHub is just a fairly mundane git hosting service, not significantly different in capabilities from Gitorious, or RhodeCode, or BitBucket, or GitLab, etc. So you may as well go with one of the open source ones, and be *completely* free from vendor lockin.

The feature set that GitHub provides is what will keep the repositories there, though, Stufft said: "You probably won’t want to get your data out because Github’s features are compelling enough that you don’t want to lose them". Furthermore, he looked at the Python-affiliated repositories on the two sites and found that there were half a dozen active repositories on GitHub and three largely inactive repositories on Bitbucket.

The discussion got a bit testy at times, with Coghlan complaining that choosing GitHub based on its popularity was anti-community: "I'm very, very disappointed to see folks so willing to abandon fellow community members for the sake of following the crowd". He went on to suggest that perhaps Ruby or JavaScript would be a better choice for a language to work on since they get better press. Van Rossum called that "a really low blow" and pointed out: "*A DVCS repo is a social network, so it matters in a functional way what everyone else is using.*" He continued:

So I give you that if you want a quick move into the modern world, while keeping the older generation of core devs happy (not counting myself :-), BitBucket has the lowest cost of entry. But I strongly believe that if we want to do the right thing for the long term, we should switch to GitHub. I promise you that once the pain of the switch is over you will feel much better about it. I am also convinced that we'll get more contributions this way.

Eventually, Stufft proposed another PEP (481) that would migrate three documentation repositories (the Development Guide, the development system in a box (devinabox), and the PEPs) to GitHub. Unlike the situation with many PEPs, Van Rossum stated that he didn't feel it was his job to accept or reject the PEP, though he made a strong case for moving to GitHub; he believes that most of the community is probably already using GitHub in one way or another, lock-in doesn't really concern him since the most important data is already stored in multiple places, and, in his mind, Python does not have an "additional hidden agenda of bringing freedom to all software".

It turns out that Brett Cannon is the contact for two of the three repositories mentioned in the PEP (devguide and devinabox), so Van Rossum is leaving the decision to Cannon for those two. Coghlan is the largest contributor to the PEPs repository, so the decision on that will be left up to him. He is currently exploring the possibility of using RhodeCode Enterprise (a Python-based, hosted solution with open code, but one that has licensing issues that Coghlan did acknowledge). For his part, Cannon noted his preference for open, Mercurial-and-Python-based solutions, but he is willing to consider other options. There may be a discussion at the Python language summit (which precedes PyCon), but, if so, Van Rossum said he probably won't take part—it's clear he has tired of the discussion at this point.

There are good arguments on both sides of the issue, but it is a little sad to see Python potentially moving away from the DVCS written in the language and into the more popular (and feature-rich, seemingly) DVCS and hosting site (Git and GitHub). While Van Rossum does not plan to propose moving the CPython (main Python language code) repository to GitHub anytime soon, the clear implication is that he would not be surprised if that happens eventually. While it might make pragmatic sense on a number of different levels, and may have all the benefits that have been mentioned, it would certainly be something of a blow to the open-source Python DVCS communities. With luck, those communities will find the time to fill the functionality gaps, but the popularity gap will be much harder to overcome.

Comments (66 posted)

Kawa — fast scripting on the Java platform

December 3, 2014

This article was contributed by Per Bothner

Kawa is a general-purpose Scheme-based programming language that runs on the Java platform. It aims to combine the strengths of dynamic scripting languages (less boilerplate, fast and easy start-up, a read-eval-print loop or REPL, no required compilation step) with the strengths of traditional compiled languages (fast execution, static error detection, modularity, zero-overhead Java platform integration). I created Kawa in 1996, and have maintained it since. The new 2.0 release has many improvements.

Projects and businesses using Kawa include: MIT App Inventor (formerly Google App Inventor), which uses Kawa to translate its visual blocks language; HypeDyn, which is a hypertext fiction authoring tool; and Nü Echo, which uses Kawa for speech-application development tools. Kawa is flexible: you can run source code on the fly, type it into a REPL, or compile it to .jar files. You can write portably, ignoring anything Java-specific, or write high-performance, statically-typed Java-platform-centric code. You can use it to script mostly-Java applications, or you can write big (modular and efficient) Kawa programs. Kawa has many interesting features; below we'll look at a few of them.

Scheme and standards

Kawa is a dialect of Scheme, which has a long history in programming-language and compiler research, and in teaching. Kawa 2.0 supports almost all of R7RS (Revised7 Report on the Algorithmic Language Scheme), the 2013 language specification. (Full continuation support is the major missing feature, though there is a project working on that.) Scheme is part of the Lisp family of languages, which also includes Common Lisp, Dylan, and Clojure.

One of the strengths of Lisp-family languages (and why some consider them weird) is the uniform prefix syntax for calling a function or invoking an operator:

    (op arg1 arg2 ... argN)
If op is a function, this evaluates each of arg1 through argN, and then calls op with the resulting values. The same syntax is used for arithmetic:
    (+ 3 4 5)
and program structure:
    ; (This line is a comment - from semi-colon to end-of-line.)
    ; Define variable 'pi' to have the value 3.14.
    (define pi 3.14)

    ; Define single-argument function 'abs' with parameter 'x'.
    (define (abs x)
      ; Standard function 'negative?' returns true if argument is less than zero.
      (if (negative? x) (- x) x))

Having a simple regular core syntax makes it easier to write tools and to extend the language (including new control structures) via macros.
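
As a small, purely illustrative example (not taken from the Kawa documentation), a new control structure such as a while loop can be added with an ordinary R7RS syntax-rules macro, with no changes to the compiler or parser:

    ; Hypothetical 'while' macro: repeatedly evaluate the body while the
    ; condition is true. The name and behavior are illustrative only.
    (define-syntax while
      (syntax-rules ()
        ((while condition body ...)
         (let loop ()
           (when condition
             body ...
             (loop))))))

    ; Print the numbers 0 through 4.
    (define i 0)
    (while (< i 5)
      (display i) (newline)
      (set! i (+ i 1)))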

Performance and type specifiers

Kawa gives run-time performance a high priority. The language facilitates compiler analysis and optimization. Flow analysis is helped by lexical scoping and the fact that a variable in a module (source file) can only be assigned to in that module. Most of the time the compiler knows which function is being called, so it can generate code to directly invoke a method. You can also associate a custom handler with a function for inlining, specialization, or type-checking.

To aid with type inference and type checking, Kawa supports optional type specifiers, which are specified using two colons. For example:

    (define (find-next-string strings ::vector[string] start ::int) ::string
      ...)

This defines find-next-string with two parameters: strings is a vector of strings, and start is a native (Java) int; the return type is a string.

Kawa also does a good job of catching errors at compile time.
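
For instance (a hypothetical session, not an example from the Kawa manual), passing an argument of the wrong type to a function declared with type specifiers is flagged when the expression is compiled rather than when it eventually runs:

    (define (double x ::int) ::int (* 2 x))
    (double "eight")   ; the string/int mismatch is reported at compile time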

The Kawa runtime doesn't need to do a lot of initialization, so start-up is much faster than other scripting languages based on the Java virtual machine (JVM). The compiler is fast enough that Kawa doesn't use an interpreter. Each expression you type into the REPL is compiled on-the-fly to JVM bytecodes, which (if executed frequently) may be compiled to native code by the just-in-time (JIT) compiler.

Function calls and object construction

If the operator op in an expression like (op arg1 ... argN) is a type, then the Kawa compiler looks for a suitable constructor or factory method.

    (javax.swing.JButton "click here")
    ; equivalent to Java's: new javax.swing.JButton("click here")

If the op is a list-like type with a default constructor and has an add method, then an instance is created, and all the arguments are added:

    (java.util.ArrayList 11 22 33)
    ; evaluates to: [11, 22, 33]

Kawa allows keyword arguments, which can be used in an object constructor form to set properties:

    (javax.swing.JButton text: "Do it!" tool-tip-text: "do it")

The Kawa manual has more details and examples. There are also examples for other frameworks, such as for Android and for JavaFX.

Other scripting languages also have convenient syntax for constructing nested object structures (for example Groovy builders), but they require custom builder helper objects and/or are much less efficient. Kawa's object constructor does most of the work at compile-time, generating code as good as hand-written Java, but less verbose. Also, you don't need to implement a custom builder if the defaults work, as they do for Swing GUI construction, for example.
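
Putting these rules together (a sketch based on the behavior described above, not an example from the Kawa manual), a nested Swing layout can be written directly, with properties set via keyword arguments and child components passed as arguments to their container:

    ; A panel holding two buttons; the JButton instances are constructed
    ; with keyword-argument property setters and then added to the JPanel.
    (javax.swing.JPanel
      (javax.swing.JButton text: "OK")
      (javax.swing.JButton text: "Cancel" tool-tip-text: "Give up"))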

Extended literals

Most programming languages provide convenient literal syntax only for certain built-in types, such as numbers, strings, and lists. Other types of values are encoded by constructing strings, which are susceptible to injection attacks, and which can't be checked at compile-time.

Kawa supports user-defined extended literal types, which have the form:

    &tag{text}
The tag is usually an identifier. The text can have escaped sub-expressions:
    &tag{some-text&[expression]more-text}
The expression is evaluated and combined with the literal text. The combination is often just string concatenation, but it can be anything, depending on the &tag. As an example, assume:
    (define base-uri "http://example.com/")
then the following concatenates base-uri with the literal "index.html" to create a new URI object:
    &URI{&[base-uri]index.html}

The above example gets de-sugared into:

    ($construct$:URI $<<$ base-uri $>>$ "index.html")

The $construct$:URI is a compound name (similar to an XML "qualified name") in the predefined $construct$ namespace. The $<<$ and $>>$ are just special symbols to mark an embedded sub-expression; by default they're bound to unique empty strings. So the user (or library writer) just needs to provide a definition of the compound name $construct$:URI as either a procedure or macro, resolved using standard Scheme name lookup rules; no special parser hooks or other magic is involved. This procedure or macro can do arbitrary processing, such as construct a complex data structure, or search a cache.

Here is a simple-minded definition of $construct$:URI as a function that just concatenates all the arguments (the literal text and the embedded sub-expressions) using the standard string-append function, and passes the result to the URI constructor function:

    (define ($construct$:URI . args)
      (URI (apply string-append args)))

The next section uses extended literals for something more interesting: shell-like process forms.

Shell scripting

Many scripting languages let you invoke system commands (processes). You can send data to the standard input, extract the resulting output, look at the return code, and sometimes even pipe commands together. However, this is rarely as easy as it is using the old Bourne shell; for example command substitution is awkward. Kawa's solution is two-fold:

  1. A "process expression" (typically a function call) evaluates to a Java Process value, which provides access to a Unix-style (or Windows) process.
  2. In a context requiring a string, a Process is automatically converted to a string comprising the standard output from the process.

A trivial example:

   #|kawa:1|# (define p1 &`{date --utc})

("#|...|#" is the Scheme syntax for nestable comments; the default REPL prompt has that form to aid cutting and pasting code.)

The &`{...} syntax uses the extended-literal syntax from the previous section, where the backtick is the 'tag', so it is syntactic sugar for

    ($construct$:` "date --utc")
where $construct$:` might be defined as:
    (define ($construct$:` . args) (apply run-process args))
This in turn translates into an expression that creates a gnu.kawa.functions.LProcess object, as you can see if you write it:
    #|kawa:2|# (write p1)
    gnu.kawa.functions.LProcess@377dca04

An LProcess is automatically converted to a string (or bytevector) in a context that requires one, such as a definition with a ::string or ::bytevector type specifier:

    #|kawa:3|# (define s1 ::string p1) ; Define s1 as a string.
    #|kawa:4|# (write s1)
    "Wed Nov  1 01:18:21 UTC 2014\n"
    #|kawa:5|# (define b1 ::bytevector p1)
    (write b1)
    #u8(87 101 100 32 74 97 110 ... 52 10)

The display procedure prints the LProcess in "human" form, as an unquoted string:

    #|kawa:6|# (display p1)
    Wed Jan  1 01:18:21 UTC 2014

This is also the default REPL formatting:

    #|kawa:7|# &`{date --utc}
    Wed Jan  1 01:18:22 UTC 2014

We don't have room here to discuss redirection, here documents, pipelines, adjusting the environment, and flow control based on return codes, though I will briefly touch on argument processing and substitution. See the Kawa manual for details, and here for more on text vs. binary files.

Argument processing

Substituting the result of an expression into the argument list is simple with the &[] construct:

    (define my-printer (lookup-my-printer))
    &`{lpr -P &[my-printer] log.pdf}
Because a process is auto-convertible to a string, no special syntax is needed for command substitution:
    &`{echo The directory is: &[&`{pwd}]}
though you'd normally use this short-hand:
    &`{echo The directory is: &`{pwd}}

Splitting a command line into arguments follows shell quoting and escaping rules. Dealing with substitution depends on quotation context. The simplest case is when the value is a list (or vector) of strings, and the substitution is not inside quotes. In that case each list element becomes a separate argument:

    (define arg-list ["-P" "office" "foo.pdf" "bar.pdf"])
    &`{lpr &[arg-list]}

An interesting case is when the value is a string, and we're inside double quotes; in that case newline is an argument separator, but all other characters are literal. This is useful when you have one filename per line, and the filenames may contain spaces, as in the output from find:

    &`{ls -l "&`{find . -name '*.pdf'}"}
This solves a problem that is quite painful with traditional shells.

Using an external shell

The sh tag uses an explicit shell, like the C system() function:

    &sh{lpr -P office *.pdf}
This is equivalent to:
    &`{/bin/sh -c "lpr -P office *.pdf"}

Kawa adds quotation characters in order to pass the same argument values as when not using a shell (assuming no use of shell-specific features such as globbing or redirection). Getting shell quoting right is non-trivial (in single quotes all characters except single quote are literal, including backslash), and not something you want application programmers to have to deal with. Consider:

    (define authors ["O'Conner" "de Beauvoir"])
    &sh{list-books &[authors]}
The command passed to the shell is the following:
    list-books 'O'\''Conner' 'de Beauvoir'

Having quoting be handled by the $construct$:sh implementation automatically eliminates common code injection problems. I intend to implement a &sql form that would avoid SQL injection the same way.
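
No such form exists yet, but the underlying quoting step can be sketched (this is purely hypothetical, not part of Kawa): double any single quotes inside a value and wrap the result in quotes, with a future $construct$:sql applying that escaping to every substituted value before splicing it into the query text:

    ; Hypothetical helper: quote a string for use as a SQL string literal
    ; by doubling embedded single quotes.
    (define (sql-quote value ::string) ::string
      (let loop ((chars (string->list value)) (acc (list #\')))
        (cond ((null? chars)
               (list->string (reverse (cons #\' acc))))
              ((char=? (car chars) #\')
               (loop (cdr chars) (cons #\' (cons #\' acc))))
              (else
               (loop (cdr chars) (cons (car chars) acc))))))

    (sql-quote "O'Conner")   ; => "'O''Conner'"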

In closing

Some (biased) reasons why you might choose Kawa over other languages, concentrating on those that run on the Java platform: Java is verbose and requires a compilation step; Scala is complex, intimidating, and has a slow compiler; Jython, JRuby, Groovy, and Clojure are much slower in both execution and start-up. Kawa is not standing still: plans for the next half-year include a new argument-passing convention (which will enable ML-style patterns); full continuation support (which will help with coroutines and asynchronous event handling); and higher-level optimized sequence/iteration operations. I hope you will try out Kawa, and that you will find it productive and enjoyable.

Comments (18 posted)

Brief items

Quotes of the week(s)

the unix philosophy: do 90% of one thing, and barely do it adequately
Adam Jackson

The original plan when I cooked up Just Solve The Problem Month was that there was a set of problems out there that just needed a few hundred people to contribute time and effort, and some otherwise seemingly insurmountable problems could be solved or really, really beaten down into a usable form.

Aaaaand what instead happened was:

  • We announced and set up a Just Solve The Problem Wiki for the first problem.
  • A lot of people worked on the Wiki.
  • I got very busy.
  • People kept working on the Wiki.
  • It’s been two years.
Jason Scott, who was, for the record, ultimately pleased with the resulting File Formats Wiki.

Special Black Friday deal for software developers: switch to open source tools for 100% off.
Jeff Atwood

Comments (1 posted)

GNU LibreJS 6.0.6 released

Version 6.0.6 of the LibreJS add-on for Firefox and other Mozilla-based browsers has been released. LibreJS is a selective JavaScript blocker that disables non-free JavaScript programs. New in this version are support for private-browsing mode and enhanced support for mailto: links on a page where non-free JavaScript has been blocked.

Full Story (comments: none)

Firefox 34 released

Mozilla has released Firefox 34. This version changes the default search engine, includes the Firefox Hello real-time communication client, implements HTTP/2 (draft14) and ALPN, disables SSLv3, and more. See the release notes for details.

Comments (18 posted)

QEMU Advent Calendar 2014 unveiled

The QEMU project has launched its own "Advent calendar" site. Starting with December 1, each day another new virtual machine disk image appears and can be downloaded for exploration in QEMU. The December 1 offering was a Slackware image of truly historic proportions.

Comments (2 posted)

Rocket, a new container runtime from CoreOS

CoreOS has announced that it is moving away from Docker and toward "Rocket," a new container runtime that it has developed. "Unfortunately, a simple re-usable component is not how things are playing out. Docker now is building tools for launching cloud servers, systems for clustering, and a wide range of functions: building images, running images, uploading, downloading, and eventually even overlay networking, all compiled into one monolithic binary running primarily as root on your server. The standard container manifesto was removed. We should stop talking about Docker containers, and start talking about the Docker Platform. It is not becoming the simple composable building block we had envisioned."

Comments (9 posted)

Newsletters and articles

Development newsletters from the past two weeks

Comments (none posted)

Introducing AcousticBrainz

MusicBrainz, the not-for-profit project that maintains an assortment of "open content" music metadata databases, has announced a new effort named AcousticBrainz. AcousticBrainz is designed to be an open, crowd-sourced database cataloging various "audio features" of music, including "low-level spectral information such as tempo, and additional high level descriptors for genres, moods, keys, scales and much more." The data collected is more comprehensive than MusicBrainz's existing AcoustID database, which deals only with acoustic fingerprinting for song recognition. The new project is a partnership with the Music Technology Group at Universitat Pompeu Fabra, and uses that group's free-software toolkit Essentia to perform its acoustic analyses. A follow-up post digs into the AcousticBrainz analysis of the project's initial 650,000-track data set, including examinations of genre, mood, key, and other factors.

Comments (none posted)

New features in Git 2.2.0

The "Atlassian Developers" site has a summary of interesting features in the recent Git 2.2.0 release, including signed pushes. "This is an important step in preventing man-in-the-middle attacks and any other unauthorized updates to your repository's refs. git push has learnt the --signed flag which applies your GPG signature to a "push certificate" sent over the wire during the push invocation. On the server-side, git receive-pack (the command that handles incoming git pushes) has learnt to verify GPG-signed push certificates. Failed verifications can be used to reject pushes and those that succeed can be logged in a file to provide an audit log of when and who pushed particular ref updates or objects to your git server."

Comments (none posted)

Page editor: Nathan Willis


Copyright © 2014, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds