
LWN.net Weekly Edition for July 4, 2013

Merging lock elision into glibc

By Nathan Willis
July 3, 2013

Glibc, the GNU C library, is a foundational free software project, and one that has served its important role for well over two decades. Recently, though, it has undergone a shift in its development model, with an increase in contributions from volunteers outside of the core team of committers. That sort of change can bring about strife, but it can also inspire important new features. The latter is certainly true, and the former might also be true, in the case of lock elision, which is set to be included for the first time in the just-frozen 2.18 branch. 2.18 will enable developers to test the glibc implementation of lock elision for some lock types but not others, primarily because there is not full consensus within the project on how to enable such experimental new features.

Lock elision, in general, is a technique for speeding up multi-threaded programs that may contend for the same lock. As more and more cores become available, synchronization between threads becomes increasingly expensive. One way to avoid the synchronization overhead is to use transactional memory instead. A memory transaction buffers all of the memory updates made by an operation; in the common case where nothing interferes with the operation, the transaction is committed, but if something does interfere, the transaction can be rolled back atomically.

For locks, this means that multiple threads can acquire locks on the same object, and if they are modifying different parts of the object, both transactions ought to succeed. For example, one thread can acquire a lock on an object (say, tablefoo), update the row it is interested in (tablefoo(N)), then release the lock. Meanwhile, a different thread can also acquire a lock on tablefoo for the purpose of updating tablefoo(M). Using memory transactions, both threads can afford to take the conservative step of locking the whole object while updating only the part they need: most of the time, both transactions will go through; only when both threads try to update the same row, say tablefoo(N), is there a collision, at which point one transaction must be rolled back.

But the real trick is that, in this common case where the two threads do not collide, the locks themselves are unnecessary. If the program is smart enough to recognize this, acquiring and releasing the locks can simply be skipped. This is lock elision. Unfortunately, until recently, real-world implementations of transactional memory (almost always in software) have been too slow to offer a real advantage over manipulating the locks in the traditional manner. That changed, however, with the debut of Intel's Transactional Synchronization Extensions (TSX), a hardware implementation of transactional memory for the Haswell generation of CPUs. Consequently, building TSX support into the lock implementations of common libraries would allow existing programs to take advantage of lock elision speed-ups without even recompiling.
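
In rough outline, the elision pattern looks like the following. This is a minimal sketch using Intel's RTM intrinsics, not glibc's actual code; the toy spinlock, the retry policy, and the assumption that the CPU's RTM support has already been verified via CPUID are all simplifications:

    #include <immintrin.h>     /* RTM intrinsics: _xbegin(), _xend(), _xabort() */

    static volatile int lock;  /* toy lock word: 0 = free, 1 = held */

    static void elided_lock(void)
    {
        if (_xbegin() == _XBEGIN_STARTED) {
            /* Reading the lock word puts it in the transaction's read set,
               so any thread that acquires the lock for real will abort us. */
            if (lock == 0)
                return;              /* lock elided; body runs speculatively */
            _xabort(0xff);           /* someone really holds the lock */
        }
        /* Fallback path: the transaction aborted or never started. */
        while (__sync_lock_test_and_set(&lock, 1))
            ;                        /* acquire the lock for real */
    }

    static void elided_unlock(void)
    {
        if (lock == 0)
            _xend();                     /* commit the elided critical section */
        else
            __sync_lock_release(&lock);  /* normal unlock on the fallback path */
    }

The payoff is visible in elided_unlock(): if the lock word is still zero, the thread never wrote to it, so threads eliding the same lock never serialize on its cache line and their critical sections can commit in parallel.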

Intel's Andi Kleen has been working on a TSX-based lock elision implementation for glibc, which he wrote about back in January 2013. The 14-part patch set has been through many iterations, but in late June the deadline was fast approaching for the glibc 2.18 freeze, and the status of the patches was still a matter of debate.

The patch set adds elision capabilities to both POSIX thread (pthread) mutexes and read/write locks (rwlocks), and uses an adaptive algorithm to decide when to elide locks in a given code path. Essentially, the algorithm keeps track of whether elision succeeds for each mutex; if it fails, elision of that mutex is suspended for a period of time. Not all lock variants are supported; in particular, locks are never elided for recursive mutexes. Elision is automatically attempted when the CPU supports transactional memory, but developers can also explicitly enable or disable it in their code. Kleen's patch set also offers two environment variables, GLIBC_PTHREAD_MUTEX and GLIBC_PTHREAD_RWLOCK, which users can set to explicitly enable or disable each flavor of elision when starting a program.
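
The adaptation can be imagined as a small amount of per-lock state, roughly as below. This is a sketch of the idea only; the names, the structure, and the back-off constant are invented for illustration and are not taken from the patches:

    /* Hypothetical per-mutex elision state. */
    struct elision_state {
        int skip;                /* skip elision while this count is nonzero */
    };

    static int should_try_elision(struct elision_state *s)
    {
        if (s->skip > 0) {
            s->skip--;           /* still backing off after a failure */
            return 0;            /* take the lock normally */
        }
        return 1;                /* try a transactional acquisition */
    }

    static void elision_aborted(struct elision_state *s)
    {
        s->skip = 3;             /* arbitrary back-off period */
    }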

Lock down?

The general consensus on the libc-alpha mailing list was that the mutex patches (patches 1 through 5, plus 7) were ready for inclusion. However, there was less agreement on the other patches, for three reasons. First, the glibc team was wary of the rwlock patches due to a disagreement over how to interpret the POSIX standard. According to the standard, a "normal" (i.e., non-recursive, non-errorcheck, non-timed) mutex is required to deadlock if the owner of the lock attempts to re-lock it while already holding the lock. However, if the lock in question is elided, this required deadlock does not occur. It is certainly debatable whether or not avoiding a deadlock is really a bad thing (after all, deadlocks are bugs), but the glibc project decided to follow the standard to the letter, and elide only non-"normal" mutexes.
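
That required behavior is easy to demonstrate. A minimal program such as the following (ordinary pthreads code, not tied to the patches) hangs forever on the second lock call, exactly as POSIX demands; were the lock elided, the second call would return successfully instead:

    #include <pthread.h>
    #include <stdio.h>

    int main(void)
    {
        pthread_mutexattr_t attr;
        pthread_mutex_t mutex;

        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL);
        pthread_mutex_init(&mutex, &attr);

        pthread_mutex_lock(&mutex);
        printf("re-locking; POSIX requires this to deadlock\n");
        pthread_mutex_lock(&mutex);    /* never returns for a "normal" mutex */
        return 0;
    }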

But what is not clear from the specification is whether or not the same behavior is required for rwlocks. Carlos O'Donell has contacted the Austin Group to ask for clarification; if the official answer is that rwlocks are required to deadlock on re-locks, then the rwlock patches for glibc will not be merged as-is.

Second, the definition of "normal" for mutexes is not a simple affair. The POSIX standard in fact requires specific behavior of "normal" mutexes (such as the deadlock-on-re-lock behavior mentioned above). But the standard also allows for a different type of mutex termed "default" mutexes, in which the implementation is allowed more freedom of behavior. In previous versions of glibc, PTHREAD_MUTEX_NORMAL and PTHREAD_MUTEX_DEFAULT were defined to be identical. Kleen's patch set splits them, in order to allow "default" mutexes to be elided. Technically, a "normal" mutex that did not deadlock on re-lock would violate the standard, even though not deadlocking because the lock had been elided would often be seen as a preferable outcome. But splitting PTHREAD_MUTEX_NORMAL and PTHREAD_MUTEX_DEFAULT could be seen as an ABI change, and, even though it would not affect old binaries, several in the project (such as Roland McGrath) felt that more consideration was needed before making the split, since it would be difficult to reverse after the fact.
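
At the header level, the relationship between the two constants looks roughly like this. This is a simplified sketch of glibc's <pthread.h>, with the post-split arrangement described only in a comment, since the final form was still under discussion:

    /* Historically in glibc (simplified): the two names are one type. */
    enum {
        PTHREAD_MUTEX_NORMAL,
        PTHREAD_MUTEX_RECURSIVE,
        PTHREAD_MUTEX_ERRORCHECK,
        PTHREAD_MUTEX_DEFAULT = PTHREAD_MUTEX_NORMAL
    };

    /* After the proposed split, PTHREAD_MUTEX_DEFAULT would get a distinct
       value: "default" mutexes become eligible for elision, while mutexes
       explicitly created as PTHREAD_MUTEX_NORMAL keep the strict POSIX
       deadlock-on-re-lock guarantee. */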

Finally, there was also a lack of consensus about whether or not environment variables are ultimately the most appropriate mechanism with which to tune optional runtime features like lock elision. In addition to the coarse-grained enable-or-disable-elision functionality of the new environment variables, Kleen's patch set also adds several parameters that can be used to tune the behavior of the adaptive algorithm. Some would prefer adding a "tunables" API, while others see no problem with adding new environment variables under a well-known namespace (namely, GLIBC_) as long as there is sufficient discussion. Plus, since the elision algorithm is brand-spanking-new, with little or no testing outside of the confines of the glibc project, it is still possible that the algorithm itself could undergo a major overhaul before it is ready for real-world use. Offering tunable parameters is one way for real-world tests to help refine the algorithm, but if users become dependent on the specifics exposed, swapping in a different algorithm later becomes trickier.

Testing is a related matter, albeit one not currently holding up inclusion of the lock elision patch set. IBM's Dominik Vogt has been testing the patch set on the company's System z platform, which is the only other widely available processor architecture to offer its own implementation of hardware transactional memory. So far, his tests have produced as many questions as answers (in part because he is still working his way through the internal processes required to publicly release the test suite as free software). But in the long term, providing a test suite is a vital step for the project; doubtless in coming years more processor architectures will add their own implementations of transactional memory.

Friends of the library

O'Donell declared glibc frozen for 2.18 on July 2, after Kleen checked in the approved mutex patches. As of press time, the project was still waiting to hear back from the Austin Group for clarification on the rwlock deadlock-on-re-lock question, and it appears that decisions on the normal/default split, tunables, and other patches may get deferred until later. O'Donell has assembled the contributor input on the tunables question in a page on the project's wiki. That discussion is expected to take longer than anyone is willing to keep glibc 2.18 in freeze.

Apart from the technical details of adding lock elision, Kleen's work to add the feature to glibc can be seen as a case study of the project's new, consensus-driven development style. For many years, development was overseen by Ulrich Drepper, who earned a reputation as a prickly gatekeeper past whom few outside contributions ever made it into the codebase. Other longtime project members (including McGrath) formed a "steering committee" to try to work around the practical problems that arose from Drepper's management style. But Drepper left the project in 2010, and in March 2012, McGrath announced that the steering committee had voluntarily dissolved, to make way for a more open, community-driven model.

Kleen's lock elision patch set, coming as it does largely from outside, is proof that the hard-to-persuade gatekeeper model is indeed gone. In practice, the glibc community arrived at rough consensus on the patch set in a series of epic-length list threads, and it could certainly be argued that the eventual consensus was incomplete. In fact, O'Donell was even a bit apologetic toward Kleen about how difficult the process had been, encouraging him to stick around and continue to contribute. Or, as he put it in a separate message, "We haven't merged something of this complexity in a long time. Please bear with us as we get better with the process."

But, ultimately, at least part of the mutex lock elision implementation has been merged, which might not have happened at all had one project manager made the call in isolation. The consensus model still defers greatly to the experienced contributors (like McGrath), but that is certainly appropriate. In the end, patches were merged, with more or less full agreement, and the remaining issues are largely debates about the long-term impact of the implementation, not vehement opposition.


"Good enough" is good enough

July 3, 2013

This article was contributed by Martin Michlmayr


EuroPython 2013

In a EuroPython keynote (slides [PDF]), Alex Martelli, a founder of the Italian Python association and author of several Python books, shared his thoughts on software development and, more generally, on the path toward perfection. He observed that there is a cultural assumption that we should be striving for perfection at all times. Martelli argued instead that "good enough" is often good enough and that this approach will, in fact, lead to better results in the long run.

Worse is Better


Martelli opened his talk by recounting a debate that was started in 1989 by Richard Gabriel. Gabriel contrasted two approaches to software design and implementation: the New Jersey style, also known as "Worse is Better", and the MIT/Stanford approach, known as "the Right Thing". These approaches can be contrasted according to four core values: simplicity, correctness, consistency, and completeness. Martelli observed that "it's hard to argue against any of these values", but that the two styles weigh the importance of the four values in different ways, which is important when there's a conflict between them.

The "Worse is Better" approach puts strong emphasis on simplicity. Simplicity pertains to both the implementation and the interface, and is a crucial consideration in the design of a system. Martelli gave Unix as an example where this approach can be observed. The question to ask, according to Martelli, is "can I think of a simple implementation of this design concept?" Correctness is obviously important, but it's more important to be simple than to be correct. In terms of consistency, the expectation is not to be overly inconsistent. Finally, completeness can be sacrificed in favor of any of the other values and it must be sacrificed if simplicity is threatened. This can be seen in the Unix philosophy "just do one thing really well", said Martelli, explaining that "well means simple". In the MIT/Stanford approach, or "the Right Thing", correctness is a top priority, as is consistency. The focus of simplicity is on the interface. The back end can be complex as long as the interface is simple. Completeness is roughly as important as simplicity.

What this means in practice is that "the Right Thing" philosophy is dominated by experts — experts who have to make the system perfect before users can access it. On the other hand, the "Worse is Better" approach makes use of incremental development. Martelli paraphrased G. K. Chesterton's quote "if a thing is worth doing, it is worth doing badly", explaining that by doing it "badly", you get there earlier — and you can work on improving it.

Martelli went on to compare Gabriel's model with Eric Raymond's The Cathedral and the Bazaar. While the former covers the software design process, the latter focuses on the development process, but there are many parallels. Martelli observed that the "Cathedral" development style is close to "the Right Thing" approach — a defining characteristic of both models is that experts are in charge. There are also many similarities between the "Bazaar" and the "Worse is Better" model: it's a chaotic, iterative process in which a crowd is in charge. Raymond's mantra "given enough eyeballs, all bugs are shallow" emphasizes that bugs are found and fixed much faster in a crowd-sourced system.

Perfect as a verb

Martelli explained the problem with "perfection", which is that releasing a "perfect" system implies BDUF — Big Design Up Front. Everything must proceed top-down: you need perfect identification of requirements, a perfect architecture, perfect design, and perfect implementation. The problem is that this approach takes forever. Martelli observed that there's always something to improve. The real world also interferes with this approach, as users or customers don't want to wait forever for a new release.

In the real world, requirements change all the time, architecture varies with design choices, design varies with implementation technologies, and the implementation always has some bugs. In fact, most bugs are only discovered in real world deployment. Martelli argued that iterative development is therefore the only viable approach: deploy something, fix bugs, and improve the system based on feedback.

Summing up his thoughts on achieving perfection, Martelli suggested that "perfect" should be understood as a verb rather than an adjective. Perfecting your software is a laudable goal, but it's a process rather than a state as you never reach perfection — the goalposts keep shifting all the time.

Bugs are a normal part of this perfection process. While many programmers believe that they write perfect code, that's not how it actually works. In 1974, Martelli, then a university student of hardware design, and two colleagues had to write a Fortran program together. They used punch cards, and the program had to be perfect as they only had one chance to run it. "You know about pair programming", Martelli remarked, "this was pair punching". As it turns out, the program ran perfectly the first time. Unfortunately, it was the only perfect program he wrote in his 40-year career, so "don't count on it as your mode of development".

The main question to consider with bugs is whether they cause irrecoverable losses. As long as your software only causes problems one can recover from, you're okay, especially if the software is clearly tagged as beta. If your bug could kill someone, for example because you work on medical device control software, "a bug could easily cause irrecoverable losses" and a different approach may be required. However, this is not the case in most situations.

The other aspect to consider is whether your reputation can recover from the damage your bugs cause. Martelli explained that the key is how you respond to bug reports — a courteous, speedy response to issues is vital, even when you're not paid by the user. Users spend their time evaluating your software and reporting issues to you, so they should be respected. "The person who points out the bug is not my enemy but my best friend", Martelli noted.

In a weird way, bugs may even be seen as a feature. Martelli mentioned the service recovery paradox — there is some evidence that the customers with the highest level of satisfaction are not those who never had any problem at all, but those who have had a problem successfully resolved. While this should not encourage programmers to introduce bugs on purpose, it shows that bugs — if properly dealt with — are not the end of the world.

What not to skimp on

While Martelli encouraged a "good enough" approach to software development, he noted that there are some things you cannot skimp on. You absolutely need a lightweight, agile process. Martelli doesn't care which, but the process has to include revision control (which one doesn't matter as they are all "good enough"), code reviews, and testing. Proper release engineering practices are also crucial, so you know what was released as which version. Additionally, you must promote good coding style, clarity, and elegance. Finally, documentation cannot be skipped: "if you're not documenting what you're releasing, you're essentially asking your users to reverse engineer what you've done". Summarizing these requirements, Martelli explained that "there's no condition under which cowboy coding is acceptable".

Martelli added that security must be a concern from the start, as it is very difficult or impossible to add later. He means security in a general sense, including aspects such as privacy and auditability. On the other hand, some features that would be nice to have from the beginning can usually be added by refactoring the code later. Such features include modularity and a plug-in architecture, an API, and scalability. "You can incur some technical debt", Martelli suggested, as long as you do it with care.

Conclusion

Toward the end of the talk, Martelli gave examples from other areas of life and explained that his "good enough" philosophy is not restricted to software design and implementation. For example, he asked whether it makes sense to hire the "perfect employee". It's quite likely that such a person would not be available for hire anyway, that they would exceed the budget, or that they simply don't exist. Instead, he suggested finding a good (not perfect) candidate who is a good match in terms of personality and company culture, and providing training for missing skills.

Finally, Martelli clarified that his aim is not to lower expectations. You should dream big, but the best way to achieve those dreams remains the "release early, release often" paradigm and to learn from real users' interactions. The abstract of Martelli's talk noted that "this talk is probably not perfect, but I do think it's good enough" — in my opinion, and judging from the reaction of the audience, it was certainly "good enough" to provoke a lot of interesting thoughts and discussions.


Philosophy and "for" loops — more from Go and Rust

July 3, 2013

This article was contributed by Neil Brown

Once upon a time, a new programming language could be interesting because of some new mechanism for structured flow control. An if statement that could guard a collection of statements would be so much easier than one which just guarded a goto. Or a for statement which took control of the loop variable could simplify matrix multiplication significantly. An illuminating insight into this earlier age can be found in Knuth's "Structured Programming with go to statements" [PDF]. Many of the issues that seemed important in 1974 seem very dated today, but some are still fresh and relevant.

The work of these early pioneers has left us with five basic forms that appear to be common to most, if not all, procedural languages: two conditional constructs (if and switch/case), two looping constructs (while and for), and one encapsulation construct (the function or procedure).

While interesting new control flow is unlikely to be a headline item for a newly developed language these days, each language must still embody concrete choices concerning these structures, and it is quite clear that, while there is similarity, we are far from uniformity. Exploring how a language handles control flow can provide interesting insights into the philosophy behind the language. In this article, we will continue our explorations of Go and Rust by looking at various control-flow structures, but particularly focusing on the "for" loop.

The background of for loops

The for loop first appeared in programming languages as an easy way to step through a fixed list of values. We can see this in Fortran, which used the word do rather than for (here 10 is the label of the statement that ends the loop body):

    do 10 i = 1, 100, 2

and in Algol58:

    for i := 1(2)100 do

Algol60 added some syntactic sugar:

    for i := 1 step 2 until 100 do

while Pascal dropped the step clause so you would need:

    for j := 0 to 49 do

and then set i := j * 2 + 1 inside the loop.

The Algol60 for loop was actually quite rich, as can be seen by the examples here. It is a richness that probably seems excessive by today's standards. In C, which came a decade later, several of the ideas in Algol were generalized and simplified to encapsulate all the interesting possibilities in just three expressions: initializer, test, and step, thus:

    for (i = 1; i < 100; i += 2)

As the three expressions can be almost arbitrarily complex, very rich looping constructs can be created from this simple form. The effect is that the head of the for loop forms a coroutine that is executed in concert with the body of the loop. Control alternates between one and the other, so that together they achieve the desired result.

The coroutine nature of the for loop's head is made particularly obvious by the many (over 150) for_each macros that appear in the Linux kernel. With these, the code for one routine is physically quite separate from the other, emphasizing the separate roles of the two pieces of code. An example of such a for_each macro, from include/linux/radix-tree.h, is:

    #define radix_tree_for_each_slot(slot, root, iter, start)               \
            for (slot = radix_tree_iter_init(iter, start) ;                 \
                 slot || (slot = radix_tree_next_chunk(root, iter, 0)) ;    \
                 slot = radix_tree_next_slot(slot, iter, 0))

This example is interesting for a couple of reasons.

First, the middle expression — the loop-continue condition — is not simply a condition, but contains an assignment and is sometimes used to find the next value. This makes it clear that they aren't simply expressions with fixed purposes, but rather three separate entry points into a coroutine.

Secondly, it contains two variables that change throughout the loop: slot and iter. The slot variable is the regular loop variable that any for loop would have, while iter contains extra state for tracking the path through the list and is largely of internal interest.

While it is primarily internal, it needs to be visible externally, and, in fact, needs to be declared externally. The for statement has some properties of a coroutine, but (at least in the C89 dialect the kernel is written in) cannot define local variables for use throughout the loop.

So we see in the C for loop, particularly when combined with other features of C such as rich expressions and the macro preprocessor, a very powerful, though not completely satisfactory, looping mechanism, one that will serve as a basis for examining others.
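
For readers who have not met the kernel idiom before, the same shape can be shown with much less noise. Here is a stripped-down for_each macro for a hypothetical singly linked list; list_node and list_for_each are invented for illustration:

    struct list_node {
        struct list_node *next;
        int value;
    };

    /* The macro supplies the loop's "head"; the caller supplies the body
       and, as with the radix tree macro, must declare the loop variable. */
    #define list_for_each(pos, head) \
            for ((pos) = (head); (pos) != NULL; (pos) = (pos)->next)

    /* Usage:
     *     struct list_node *pos;
     *     list_for_each(pos, mylist)
     *             printf("%d\n", pos->value);
     */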

Go for — broke or beautiful?

The for loop in Go comes in three different forms — not quite the range of Algol60, but seemingly more than C. One form is superficially very similar to that in C: the parentheses are dropped, and the loop body must be a "block" rather than a simple statement. But these are syntactic differences that don't affect expressiveness. The earlier iterative example looks much the same in Go as in C:

    for i := 1; i < 100; i += 2 { ... }

The parallel ends there, however. Simple for loops will look much the same, but complex for loops will have to look quite different. This is partly because Go has no macro preprocessor and partly because Go expressions are not as rich as C expressions. While the C for loop simply contains three expressions, the Go for loop contains a "simple statement", an "expression", and another "simple statement", where "simple statement" specifically includes assignments and increments/decrements.

Were we to try a literal translation of the radix tree for_each loop into Go, we would have mixed success. Go allows the declaration of local variables inside a for loop head, so there would be no need to declare slot and iter separately. However, as the condition in a Go for statement cannot contain assignments, we find that a complete literal translation is impossible. Of course, measuring a language by how well literal translations from another language fare is far from reasonable — we may not be using the best tool for the job and, as already noted, there are other forms of the for loop in Go.

The second form is really a reduced version of the first, with the two simple statements missing and, thus, their semicolons discarded:

    for i < 100 { ... }

That form is essentially what many other languages would call a while loop.

This leaves the final form — the for/range loop.

    for x := range expression { ... }

will iterate through the members of the result of the expression in various ways, depending on the type of the result. This makes explicit a difference from the for loops in the earlier languages. For Fortran, Algol, and Pascal, the for loop dealt with sequences of numbers, or possibly "enumerated constants", which are very number-like. As we have seen, C can work with arbitrary values, and the Go range clause makes it clear that this loop is for much more than just numbers.

The value can be an array, a slice (part of an array), a string (of Unicode characters), a map (also known as a "hash", "associative array", or "dictionary" in other languages), or a "channel" (used for communication between goroutines). In the first four cases, the for loop steps through the components of the value in a fairly obvious way. Channels are a bit different and will be examined shortly. As range does not work with user-defined types at all, we cannot translate our "radix_tree" loop directly into for/range and so must look elsewhere.

A reasonable place to look might be some existing body of Go code to see how such things are done. Though the Go compiler is not written in Go, the Go language source distribution includes many tests, libraries, examples, and tools written in Go, with a total of 2418 .go source files, all of which were presumably written by people quite familiar with the language. Altogether, there are over 7000 for loops to consider.

Of these, 1200 are of the while loop form, nearly 2800 are for/range loops, and the remaining 3000 are in the three-part form, the vast majority of which have a numeric loop variable (demonstrating that the numeric loops of yesteryear are very much alive and well). So there are not a lot of examples of iterating user-defined data structures — a fact which itself might be significant.

One example of interest is in src/pkg/container/list/list_test.go:

    for e := l.Front(); e != nil; e = e.Next() {
            le := e.Value.(int)
            ....

This example is not vastly unlike the for_each macros we saw written in C. The syntax is clearly different, but the idea of having a very simple "head" on the for loop, with the actual code for the coroutine being off in a different file, is represented quite clearly. The for loop fragment given could easily be for almost any data structure. If there were a desire to keep the value (le above) more distinct from the iterator (e above), a construct like:

    for slot, iter, ok := l.Front(); ok; slot, ok = iter.Next() {

could return a sequence of slots using an iterator much like the radix_tree_for_each_slot loop we saw earlier. This construct is really quite elegant and extremely general.

Another interesting example occurs in various files in src/pkg/net, such as src/pkg/net/hosts.go and takes the form:

    for line, ok := file.readLine(); ok; line, ok = file.readLine() {

This is very similar to the Front/Next example, except that Front and Next are identical. This could be considered to violate the DRY principle: Don't Repeat Yourself.

In C, this sort of loop is regularly written as:

    while ((line = fgets(buf, sizeof(buf), file)) != NULL) {

but that cannot be used in Go, as expressions do not include assignments.

This issue of expressions excluding assignments has clearly not gone unnoticed by the Go designers. If we look at the if and switch statements, we see that, while they can be given a simple expression, they can also be given a simple statement, such as:

    if i := strings.Index(s, ":"); i >= 0 {

which includes both an assignment and a test. This would work quite nicely for the readLine loop:

    while line, ok := file.readLine(); ok {

except that Go does not provide a while loop — only a for loop. Though the for loop does include two simple statements, neither is executed at a convenient place to make this loop work as expected. So if we are to remove the repetition of the readLine call, we must look elsewhere.

One possibility is to explore the fact that while expressions do not include assignments, they do include function calls and functions can include assignments. Go supports function literals. This means that the body of a function can be given anywhere the name of a function can be used. The body of a function may be assigned to a variable, or it may be called in place. Further, the function so defined can access any variables that are in the same scope as the function. So:

    for line := "";
        func() (ok bool) {
            line, ok = file.readLine()
        }(); {

is a for loop in the three-part form which behaves much the same as the example above from hosts.go but without repetition.

The "initialize" part of the for loop (line := "") declares a new variable, line which is initialized to the empty string (it syntactically needs to be initialized to something, though the value won't be used).

The "condition" part of the loop is an immediate call to a function literal which calls file.readLine(), returns the ok part of the result and has a side effect of assigning the line part of the result to the line variable. The = form of assignment is needed in the function, rather than the := form, so that it does not declare a new line variable, which is local to the function, but instead uses the one local to the for loop.

The "next" part of the loop is empty, and appears between the second ; and the {.

While this does remove the unfortunate repetition of the readLine call, the cure turns out to be much worse than the disease, as the loop is close to unreadable. While function literals certainly have their place, this is not that place.

This leaves one more possibility to explore — it is time to examine that "range channel" construct hinted at earlier.

Channels

Concurrency and multiple threads (known as goroutines) are deeply embedded in Go, and the preferred mechanism for communicating between goroutines is the "channel". A channel is somewhat like a Unix pipe. It conceptually has two ends, and data written to one end can be read from the other. While a pipe can only pass characters or strings of characters, a channel can pass any type known to Go, including other channels.

    for i := range my_channel {

will repeatedly assign to i each value received from my_channel and then run the body of the for loop. This is a lot like our readLine example — if only we could make lines appear on a channel. And, of course, we can.


    func lines (file *file) (<- chan string) {
        ch := make(chan string)
        go func () {
            for {
                line, ok := file.readLine()
                if !ok { break }
                ch <- line
            }
            close(ch)
        }()
        return ch
    }

This lines function creates a channel (the make function) and starts a goroutine (the function literal after the go keyword) that sends lines back over the channel. This could be called as:

    for line := range lines(file) {

which will very cleanly iterate over all the lines in the file with no violation of the DRY principle.

However, further examination shows that this isn't really ideal. It certainly works in the simple case, but problems arise when you break or return out of the for loop. When you do that, the channel is not destroyed, and the goroutine remains in existence trying to write to it, though no one will ever read from it again. Go has built-in garbage collection that will reclaim unreferenced memory, but not unreferenced goroutines.

In order to clean up properly here, we would need to close the channel after breaking out of the for loop. Strangely, only the write end of a channel can be closed and, since the return value of our lines function is currently the read end (<- chan string), we need to change it to return the double-ended channel. We also need to declare a variable to hold the channel:


    func lines (file *file) (chan string) {
        ch := make(chan string)
        go func () {
            for {
                line, ok := file.readLine()
                if !ok { break }
                ch <- line
            }
            close(ch)
        }()
        return ch
    }
    ...
    c := lines(file)
    defer close(c)
    for line := range c { ... }

Now we have a for loop that iterates over lines in a file, but that we can break out of without leaking channels or goroutines. However, it isn't really elegant any more. Needing to return both ends of the channel, needing to declare a separate variable to hold that channel, and the explicit defer close are all warts that tarnish the elegant:

    for line := range lines(file)

The conclusion is that despite the repetition, the form used in the net package of:

    for line, ok := file.readLine(); ok; line, ok = file.readLine() {

does seem to be the best way to implement the task. All of the alternatives fall short.

From loops to philosophy

It is in that last observation that part of the philosophy of Go seems to show itself. While Go offers a lot of functionality, it often seems quite restrictive in how this functionality is accessed. This is reminiscent of the 13th aphorism from the Zen of Python:

There should be one — and preferably only one — obvious way to do it.

We see this restrictiveness in for loops where the range syntax is only available for built-in types, and where the first/next structure is really the only way to do other for loops, even if it involves repeating yourself.

We can see a similar pattern with inter-goroutine communication, where channels have a privileged status. There are several language facilities that only work with raw channels, much like for/range only works with internal data types. Send (ch <- v), receive (v = <-ch), and the select statement (which is a bit like switch, but chooses which of several blocking communication operations is ready to run) are completely unavailable to user-defined types.

Where Python provides a default implementation for "maps", but allows a class to provide an independent implementation using the same syntax, Go provides a built-in "map" data type and permits no substitutes. The Go FAQ makes it clear that this is a conscious decision and not an oversight:

We believe that Go's implementation of maps is strong enough that it will serve for the vast majority of uses.

This is probably why we found so few examples of iterating user-defined data structures in the Go code — maps are used instead.

Finally, even the syntax has an element of restrictiveness. We saw this briefly in a previous article, where the handling of semicolons imposes certain style choices on the programmer. We can see it also in the go fmt command, which will reformat the code in a .go file to follow a particular standard. While this is not imposed on programmers, the language designers recommend the use of go fmt to ensure that code follows the one true layout.

This philosophy certainly has a lot to recommend it. By removing options from the programmer, the language removes the need to make choices and so frees the programmer to focus on the actual functionality that they need. It is a philosophy that also imposes heavy requirements on the language and support environment. If there is only one way to do something, then that one way had better work extremely well. Given the vibrant community that has been built up around Go, and the strong emphasis on performance shown in the recent release of Go 1.1, it seems likely that Go does live up to this requirement.

Rusty loops

Turning to Rust, we see a very different style of for loop. The example loop we started with, which iterates over odd values from 1 to 99, would look like:

    for uint::range_step(1, 100, 2) |i| { ... }

Here the:

    |i| { ... }

piece is a function literal, similar to those we saw when exploring Go, though with a very different syntax and a different name. Rust, like many other languages, calls it a lambda expression. It consists of a list of formal parameters between vertical bars, and a statement block.

The

    uint::range_step(1, 100, 2)

is a reference to a function called range_step in the uint module. The uint::range_step() function actually takes four arguments: start, stop, step, and function. The behavior of range_step() is to call function repeatedly, passing values from start up to stop, incrementing by step each time. Consequently, our for loop could be realized simply by:

    uint::range_step(1, 100, 2, |i| {
            ...
    })

There are two problems with this. A minor point is that the syntax is arguably less pleasing than the first version. More importantly, constructs like break and continue don't have any meaning inside a function literal, so they could not affect the flow of this second loop.

The for statement addresses both of these. It provides syntax for writing the function literal outside the normal list of function parameters and it gives meaning to break, loop (the Rust equivalent of continue), and return.

By convention, the function in the head of for should stop looping when the function argument that it calls returns false. The for statement uses this by effectively translating break to return false and loop to return true. If any return statement appears in the body of the for loop, it is also translated to something that will "do the right thing".

This seems like a fairly complex set of transformations, but the end result is extremely flexible. It allows a very clear separation of the two coroutines that make up a for loop, with the head routine having the full power of a regular function that is able to declare local variables and to communicate in arbitrary ways with the body routine.

Both the "iterate over all the lines in a file" loop which we struggled with in Go, and the radix tree loop from the Linux kernel, would be trivial to implement as an iterator routine in Rust. The first of these would look like:

    pub fn every_line(f: @io::Reader, it: &fn(&str) -> bool) {
        while !f.eof() {
            let line = f.read_line();
            if !it(line) { break }
        }
    }

and could be called as:

    let f = io::file_reader(&Path("/etc/motd")).get();
    for every_line(f) |line| {
        io::println(fmt!("Line is %s", line));
    }

This power to write elegant iterators is not without its cost. While Rust allows an arbitrary function to provide the head of the for loop, it also requires the head of the for loop to be some function. The simple initialize, test, increment form of C and Go cannot be used.

If we go back and look at the nearly 3000 for loops in the Go source code that use a numeric loop variable, we find that the vast majority of them could be implemented using uint::range_step() or even the simple uint::range(). But not all. Some examples include:

    for ; i > 0; i /= 10 {

    for mid = (bot+top)/2; mid < top; mid = (bot+top)/2 {

    for n := 1; n <= 256; n *= 2 {

    for rate := 0.05; rate < 10; rate *= 2 {

    for parent := ".."; ; parent = "../" + parent {

(the last one does not have a numeric variable of course, but is still a useful example).

Several of these could be supported by adding a very small number of extra iterators to the standard library; the rest could just as easily be implemented with a while loop. So this limitation doesn't really limit Rust significantly.

A Rusty philosophy?

We see, in the for loops of Rust, a very different philosophy to that of Go. While Go forces you into a particular mold, Rust lets you build your own mold with enormous freedom. You could even modify the exact behavior of break inside your for loops if that seems like a useful thing to do.

This freedom and flexibility extends to other parts of Rust too. In last month's article, we saw that Rust does not draw a distinction between expressions and statements, so it allows if and match constructs (the latter being similar to switch) deep inside expressions, whereas Go does not permit such things.

Rust goes even further with a rich macro language that can declare which syntactic elements (e.g. identifier, expression, type) may replace each macro parameter, and can repeat the body of the macro if the parameter is a list. This leaning towards extreme flexibility seems to pervade Rust and is reminiscent of the Perl programming motto: There is more than one way to do it.

Summary

There will always be a tension in language design between allowing the programmer freedom of expression and guiding the programmer toward clarity of expression. In a previous article, we saw how the type system of Rust prefers clarity over freedom. Go is not such a stickler, and is satisfied with run-time type checks in places where Rust would insist on compile-time checks. Here, when we look at the structuring of statements and expressions, we find Rust prefers freedom while Go seems more focused on clarity by eliminating unnecessary flexibility.

Which of these is to be preferred is almost certainly a very personal choice. Some people rebel against a constraining environment, others relish the focus it allows them. Both provide room for creativity and productivity. Go and Rust provide very different points in the spectrum of possibilities and it is good to have that choice ... except that it does mean that you have to choose.


The new CFP deadline calendar

For those who might be interested in putting a talk proposal in for an upcoming conference, we have added a new feature to the weekly Announcements page. The LWN events calendar has long been a feature of the site, but the CFP deadlines calendar was added more recently. Now the information from that calendar will also be posted to the Announcements page in tabular form. Hopefully that will help everyone keep track of those deadlines and lead to more submissions of interesting talks to the numerous conferences in our communities.


Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Security: Mayhem finds 1200 bugs; New vulnerabilities in ffmpeg, openstack, ruby, wordpress, ...
  • Kernel: The 3.11 merge window opens; KVM on ARM; When the kernel ABI has to change.
  • Distributions: Fedora 19 and Apple hardware; DoudouLinux, Fedora, GNU Linux-libre, ...
  • Development: FOSSology 2.2.0; Qt 5.1; Google and XMPP; FirefoxOS; ...
  • Announcements: AMD joins TDF advisory board, Bruce Schneier Joins EFF Board of Directors, Doug Engelbart RIP, ...

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds