
An interview with Joey Hess

January 19, 2016

This article was contributed by Lars Wirzenius

In 1992 I interviewed Linus Torvalds for my little Linux newsletter, Linux News. That was traumatic enough that I haven't interviewed anyone since, but a few months ago I decided that it would be fun to interview Joey Hess, who kindly agreed to it.

Joey is known for many things. He wrote, alone and with others, debhelper, its dh component, and the debian-installer, as well as other tools like debconf. He wrote the ikiwiki wiki engine; and co-founded (with me) the Branchable hosting service for ikiwiki. He wrote git-annex, and ran a couple of successful crowdfunding campaigns to work on it full time. He lives in the eastern US, off-grid, in a solar-powered cabin, conserving power when necessary. He has retired from Debian.

The interview was done over email. This is a write-up of the messages into a linear form, to make it easier to read. All questions are by me, all answers by Joey, except in some places edited by me. The interview took several months, because I had a lot of other things I was doing, so sometimes it took me weeks to ask my next question. At one point, Joey pointed out that "the interviewee may become a different person than at the beginning".

Most of the credit for this interview goes to Joey. I merely asked some questions and wrote up the answers.

Lars: You were one of the most productive and respected Debian developers for a very long time. What made you want to leave the project?

Joey: A hard question to start with! Probably you didn't mean for it to be a hard question, but I guess you've read my blog post on leaving and my resignation, and it seems they didn't answer the question well enough. Perhaps they hint vaguely at problems I saw without giving enough detail, or suggest I had some ideas to solve them. And so, I guess, you (and others) ask this question, and I feel I should do my best to find an answer to it.

Thing is, I don't know if I can answer it well. Our experience of big problems can seem vague (recall the blind men and the elephant). Where I had good ideas, I had a very long time indeed to try to realize them, and firing all my dud ideas off as parting shots on the way out is not likely to have achieved much.

I do have the perspective now for a different kind of answer, which is that if I'd known how bothersome the process of leaving Debian turns out to be, I might not have bothered to formally leave.

Perhaps it would be easier to stop participating, just let things slide. Easier to not need to worry about my software going unmaintained in Debian; to not worry about users (or DNS registrars) who might try to contact me at my Debian email address and get an ugly "Unrouteable address" bounce; to not feel awkward when I meet old friends from Debian.

But, if I'd gone that route, I'd lack the perspective I have now, of seeing Debian from the outside. I'd not have even the perspective to give this unsatisfying answer.

Lars: From the blog post, I understand that you prefer to work on smaller projects, where it's easier to make changes. Or perhaps I'm over-interpreting, since that's a feeling I have myself. I have, from time to time, spent a bit of thought on ways to make things better in Debian in this regard. My best idea, mostly untried, is to be able to branch and merge at the distro level: any developer (preferably anyone, not just Debian developers) could do what is effectively "git checkout -b my/feature/branch", make any changes they want in as many packages as they want, have an easy, effective way to build any .debs affected by the changes, and test. If the changes turn out to be useful, there would be a way to merge the source changes back. Do you have any thoughts on that?

Joey: I'm fairly addicted to that point in development of a project where it's all about exploring a vast solution space, and making countless little choices that will hopefully add up to something coherent and well thought out and useful. Or might fail gloriously.

Some projects seem to be able to stay in that state for a long time, or at least re-enter it later; in others it's a one-time thing; and in less fun areas, I hear this may never happen in the whole life cycle of an enterprise thingamajig.

Nothing wrong with the day-to-day work of fixing bugs and generally improving software, but projects that don't sometimes involve that wide-open sense of exploration are much less fun and interesting for me to work on.

Feels like a long time since I got much of that out of working on Debian. It certainly happened back in debian-installer days, and when I added dh to debhelper (though on a smaller scale), but I remember it used to all seem much more wide open.

I don't think this is entirely a social problem; technology is very important too. When I can make changes to data types and a strong type system lets me explore the complete ramifications of my changes, it's easier to do exploratory programming in an established code base than when I'm stumbling over technical debt at every turn. But I feel in the case of Debian, a lot of it does have to do with accumulated non-technical debt.

Lars: You mention a strong type system, and you're known as a Haskell programmer. Previously you used Perl a lot. How would you compare programming in Haskell versus Perl? Especially on non-small programs, such as debhelper and ikiwiki versus git-annex? All are, by now, quite mature programs with a long history.

Joey: It's weird to be known as a Haskell programmer, since I still see myself as a beginner, and certainly not an exemplar. Indeed, I recently overheard someone complaining about some code in git-annex not being a good enough example of Haskell code to merit whatever visibility it has on GitHub.

And they were right, this code is bad code in at least 3 ways; it's doing a lot of imperative I/O work, it's complicated by a hack that was put in to improve behavior without breaking backwards compatibility, and it implements half of an ad-hoc protocol, with no connection to the other half. There should be a way to abstract it out to higher level pure code, something more like this code.

So, I can write bad code in either language. But, I couldn't see so many of the problems with my bad Perl code. And, it's a lot more sane to rework bad Haskell code into better code, generally by improving the types to add abstractions, preventing whole classes of problems from happening, and letting that change seep out into the code. And I continue to grow as a Haskell programmer, in ways that just didn't happen when I was writing Perl.
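
One small, hypothetical illustration of that kind of type-driven rework (not code from git-annex): wrapping raw strings in newtypes, so that two values that used to be interchangeable Strings can no longer be swapped anywhere in the code base.

    -- Hypothetical sketch: distinct newtypes rule out a whole class of
    -- argument-swapping bugs that two bare Strings would silently allow.
    newtype Key       = Key String       deriving (Eq, Show)
    newtype LocalPath = LocalPath String deriving (Eq, Show)

    -- With two Strings the arguments could be passed in either order and
    -- the compiler wouldn't notice; with newtypes a swap is a type error.
    storeKey :: Key -> LocalPath -> IO ()
    storeKey (Key k) (LocalPath p) =
        putStrLn ("storing " ++ k ++ " at " ++ p)

    main :: IO ()
    main = storeKey (Key "example-key") (LocalPath "/tmp/example")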

A couple other differences that I've noticed:

When I get a patch to a Haskell program, it's oh so much easier to tell if it's a good patch than when I get a patch in some other language.

My Haskell code often gets up to a high enough level of abstraction that it's generally reusable. Around 15% of the code in git-annex is not specific to it at all, and I hope to break it out into libraries.

For example, here is a library written for the code I linked to, and then reused in two other places in git-annex. Maybe three places if I get around to fixing that bad code I linked to earlier. Debconf contains an implementation of basically the same thing, but since it's written in Perl, I never thought to abstract it for reuse this way.

Lars: Speaking of Haskell, what got you interested in it initially? What got you to switch?

Joey: I remember reading about it in some blog posts on Planet Debian by John Goerzen and others, eight or nine years ago. There was a lot of mind-blowing stuff, like infinite lists and type inference. And I found some amazing videos of Simon Peyton Jones talking about Haskell. So I started to see that there were these interesting and useful areas that my traditional Unix programming background barely touched on or omitted. And, crucially, John pointed out that ghc can be used to build real world programs that are as fast and solid as C programs, while having all this crazy academic stuff available.

So, I spent around a year learning the basics of Haskell — very slowly. Didn't do much with it for a couple of years because all I could manage were toy programs and xmonad configurations, and I'd get stuck for hours on some stupid type error.

It was actually five years ago this week that I buckled down and wrote a real program in Haskell, because I had recently quit my job and had the time to burn, even though it felt like I could have dashed off in Perl in one day what took me a week to write in Haskell. That turned out to be git-annex.

After around another three years of writing Haskell, I finally felt comfortable enough with it that it seemed easier than using other languages. Although often mind-blowing still.

Lars: Haskell has a strong, powerful type system. Do you feel that does away with the need for unit testing completely? Do you do any unit testing, yourself? How about integration testing of an entire program? If you do that, what kind of tool do you use? Have you heard of my yarn tool and if so, what are your opinions on that?

Joey: It's a myth that strongly typed or functional programs don't need testing. Although they really do sometimes work correctly once you get them to compile, that's a happy accident, and even if they do, so what — some future idiot version of the savant who managed that feat will be in the code later and find a way to mess it up.

Often it's easier to think of a property that some data would have, and write a test for it, than it would be to refine the data's type to only allow data with that property. Quickcheck makes short work of such tests, since you can just give it the property and let it find cases where it doesn't hold.

My favorite Quickcheck example is where I have two functions that serialize and deserialize some data type. Write down:

    prop_roundtrip val = deserialize (serialize val) == val

and it will automatically find whatever bugs there are in edge cases of the functions. This is good because I'm lazy and not good at checking edge cases. Especially when they involve something like Unicode.
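
For anyone who wants to try this, here is a minimal, self-contained sketch of that round-trip property. The Item type and its derived-Show/Read serializer pair are stand-ins invented for illustration, not git-annex's real serialization code:

    import Test.QuickCheck

    -- A toy data type standing in for whatever gets serialized.
    data Item = Item { itemName :: String, itemSize :: Int }
        deriving (Eq, Show, Read)

    -- Tell QuickCheck how to generate random Items.
    instance Arbitrary Item where
        arbitrary = Item <$> arbitrary <*> arbitrary

    -- Stand-in serializer pair; any real pair with this shape fits.
    serialize :: Item -> String
    serialize = show

    deserialize :: String -> Item
    deserialize = read

    prop_roundtrip :: Item -> Bool
    prop_roundtrip val = deserialize (serialize val) == val

    main :: IO ()
    main = quickCheck prop_roundtrip

QuickCheck then generates random Items, including ones with awkward names, and if the property ever fails it shrinks the failing case down to a small counterexample.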

Most of my unit testing is of the Quickcheck variety. I probably do more integration testing overall though. My test infrastructure for git-annex makes temporary git repositories and runs git-annex in them and checks the results. I'm not super happy with the 2000 lines of Haskell code that runs all the tests, and it's too slow, but it does catch problems from time to time and by now checks a lot of weird edge cases due to regression tests.

I generally feel I'm quite poor at testing. I've never written tests that do mocking of interfaces; all that seems like too much work. I don't always write regression tests, even when I don't manage to use the type system to close off any chance of a bug returning. I probably write an average of one to five tests a month. Propellor has twelve thousand lines of code that runs as root on servers and not a single test. I'm not really qualified to talk about testing, am I?

I've read the yarn documentation before, and it's neat how it's an executable human readable specification. I'd worry about bugs in the tests themselves though, without strong types.

The best idea I ever had around testing is: put the test suite in your program, so it can be run at anytime, anywhere. Being able to run "git annex test" or ask users to run it is really useful for testing how well git-annex gets on in foreign environments.
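
A minimal sketch of that idea, assuming nothing about git-annex's actual code: dispatch on a "test" subcommand and run the embedded properties, otherwise carry on with normal operation.

    import System.Environment (getArgs)
    import Test.QuickCheck (quickCheck)

    -- One embedded property standing in for a real test suite.
    prop_reverseInvolutive :: [Int] -> Bool
    prop_reverseInvolutive xs = reverse (reverse xs) == xs

    runTests :: IO ()
    runTests = quickCheck prop_reverseInvolutive

    realMain :: [String] -> IO ()
    realMain args = putStrLn ("normal operation, args: " ++ show args)

    main :: IO ()
    main = do
        args <- getArgs
        case args of
            ("test" : _) -> runTests       -- analogous to "git annex test"
            _            -> realMain args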

Lars: One of the things you're known for, and which is repeatedly remarked on by Hacker News commenters, is that you live off the grid in the middle of the wilderness, relying on a dial-up modem for Internet. You've blogged about that. What led you on this path? What is your current living situation? Why do you stay there? Do you ever think about going somewhere to live in a more mainstream fashion? What are the best and worst things about that lifestyle?

Joey: I seem to have inverted some typical choices regarding life and work...

Rather than live in a city and take vacations to some rustic place in the country, I live a rustic life and travel to the city only when I want stimulation. This gives me a pleasant working environment with low distractions, and is more economical.

Rather than work for some company on whatever and gain only a paycheck and a resume, I work because I want to make something; the resulting free software is my resume, and the money somehow comes when someone finds my work valuable and wants it to continue. (Dartmouth College at the moment.)

Right now I'm renting a nice house with enough woods surrounding it to feel totally apart, located in a hole in the map that none of the nearby states of Tennessee, Virginia, or Kentucky have much interest in, so it's super cheap. It's got grape arbors and fruit trees, excellent earth-sheltered insulation, ancient solar panels and a spring and phone line and not much else by way of utilities or resources. I haul water, chop firewood, and now in the waning days of the year, have to be very careful about how much electricity I use.

I love it. I'm forced to get out away from keyboard to tend to basic necessities, and I feel in tune with the seasons, with the light, with the water, with everything that comes in and goes out. Even the annoying parts, like a week of clouds that mean super low power budget, or having to hike in food after a blizzard, or not being able to load a bloated web page in under an hour, seem like opportunities to learn and grow and have more intense experiences.

I kind of fell into this, by degrees. Working on free software was a key part, and then keeping working on it until I'd done things that mattered. Also, being willing to try a different lifestyle and keep living it until it became natural. Being willing to take chances and follow through, basically.

I've done this on and off for over ten years, but it still seems it could fall apart any time. I'm enjoying the ride anyway, and I feel super lucky to have been able to experience this.

Lars: What got you started with programming? When? What was your first significant program?

Joey: I bought an Atari computer with 128KB of RAM and BASIC. It came with no interesting programs, so it provided motivation to write your own. I think that some of the money to pay for it, probably $50 or so, was earned working on the family tobacco farm. I was ten.

I have a blog post with some other stories about that computer. And I still have the programs I wrote; you can see them at http://joeyh.name/blog/entry/saved_my_atari_programs/.

But "significant" programs? That's subjective. Writing my own Tetris clone seemed significant at the time. The first program that seems significant in retrospect would be something from much later on, like debhelper.

Lars: What got you into free software?

Joey: I got into Linux soon after I got on the Internet at college, and from there learned about the GNU project and free software. I started using the GPL on my software pretty much immediately, mostly because it seemed to be what all the cool kids were doing.

Took me rather longer to really feel free software was super important in its own right. I remember being annoyed in the late 90's to be stereotyped as a Debian guy and thus a free-software fanatic, when I was actually very much on the pragmatic side. Sometime since then, free software has come to seem crucially important to me.

These days feel kind of like when the scientific method was still young and not widely accepted. Crucial foundational stuff is being built thanks to free software, but at the same time we have alchemists claiming to be able to turn their amassed user data into self-driving cars. People are using computers in increasingly constrained ways, so they are cut off from understanding how things work and become increasingly dis-empowered. These are worrying directions when you try to think long-term, and free software seems the only significant force in a better direction.

Lars: What advice would you give to someone new to programming? Or to someone new to free software development?

Joey: Programming can be a delight, a font of inspiration, of making better things and perhaps even making things better. Or it can be just another job. Give it the chance to be more, even if that involves quitting the job and living cheap in a cabin in the woods. Also, learn a few quite different things very deeply; there's too much quick, shallow learning of redundant stuff.


Index entries for this article
GuestArticles: Wirzenius, Lars



An interview with Joey Hess

Posted Jan 19, 2016 23:24 UTC (Tue) by smoogen (subscriber, #97)

It is a bit long:

It's a myth that strongly typed or functional programs don't need testing. Although they really do sometimes work correctly once you get them to compile, that's a happy accident, and even if they do, so what — some future idiot version of the savant who managed that feat will be in the code later and find a way to mess it up.

but for me this is my quote of the week (and possibly the month). Thank you both Lars and Joey.

An interview with Joey Hess

Posted Jan 20, 2016 0:00 UTC (Wed) by Sesse (subscriber, #53779)

Seriously, did anybody ever believe that? I know functional programming is surrounded by a certain aura of… mystique, but nobody in their right mind would ever say that C or Java code doesn't need testing just because it's strongly typed? :-)

/* Steinar */

An interview with Joey Hess

Posted Jan 20, 2016 0:05 UTC (Wed) by smoogen (subscriber, #97)

I have seen programmers say it about pretty much any language... usually about their own code but extending it to the language or coding stance or "well that person was just stupid, if he knew the language that would never have happened."

An interview with Joey Hess

Posted Jan 20, 2016 15:05 UTC (Wed) by nybble41 (subscriber, #55106)

"Strongly typed" isn't enough, you also need an expressive type system, like in Haskell. Of course, you also need to make use of it properly. It's possible to write dynamically-typed (or unityped) code in almost any language.

It's not so much that the code doesn't need testing, as that the type system renders many common errors impossible, and thus does quite a bit of the testing for you in the form of static proofs at compile time. There are usually other properties not captured in the types which still need to be proved through traditional runtime tests.

In a dynamically-typed language, for example, you ought to have tests to show that a function which expects an integer responds correctly when passed a string, or some other non-integer type. In C you wouldn't write such a test, because the compiler won't let the user pass anything but an integer. Haskell is similar, except that its type system can encode much more interesting properties, so if you use it properly there are many more things that the compiler can prove about the code. As a semi-advanced example, say you have this GADT representing a simple language for managing the state of some variable:

data StateA s r where
    SetState :: a -> StateA a ()
    GetState :: StateA a a

The first type parameter represents the type of the state, and the second the type of the result from the action. The SetState constructor captures a value of the same type as the state, and the result type of GetState must match the state type. If you also have a function of type `runStateA :: StateA s r -> s -> (r, s)`, the compiler will take care of proving that if `runStateA (SetState 3) (5 :: Int)` terminates, it produces a value of type `((), Int)`, where the first element must be `()` and the second value can only be 3 or 5—as those are the only values of type `s` that runStateA has access to. The expression `runStateA GetState "Hello"` can only evaluate to `("Hello", "Hello")`; aside from non-termination, the runtime behavior is fully constrained by the types. The compiler will also prove that there are no side-effects or dependencies on anything other than the parameters to the function.

This does not eliminate all testing; you still need to show that `runStateA (SetState y) x == ((), y)`, since the types allow for SetState to be implemented as a no-op. It is also necessary to show that the function terminates, since the types won't prevent unbounded recursion or runtime exceptions. However, there is very little else that can actually go wrong in the implementation of runStateA.
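
To make that concrete, here is one possible implementation of runStateA, a sketch of what the comment describes rather than code from any real project; the StateA declaration is repeated so the snippet stands alone, and the GADTs extension is needed for it:

    {-# LANGUAGE GADTs #-}

    data StateA s r where
        SetState :: a -> StateA a ()
        GetState :: StateA a a

    -- Pattern matching refines the types: in the SetState branch the
    -- result type r is (), in the GetState branch r is the state type s.
    runStateA :: StateA s r -> s -> (r, s)
    runStateA (SetState x) _ = ((), x)
    runStateA GetState     s = (s, s)

    main :: IO ()
    main = print (runStateA (SetState 3) (5 :: Int))   -- ((), 3)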

An interview with Joey Hess

Posted Jan 28, 2016 12:00 UTC (Thu) by HelloWorld (guest, #56129)

People don't say that because C is not strongly typed. In fact C's type system is among the worst ones around.

An interview with Joey Hess

Posted Jan 28, 2016 16:43 UTC (Thu) by nybble41 (subscriber, #55106)

> People don't say that because C is not strongly typed.

Exactly. For example, C has implicit conversions between any two integer types, even when the conversion may lose information or lead to undefined behavior, a null value which can inhabit any pointer type, and implicit conversions to/from pointer-to-void. Java at least requires an explicit cast to narrow the type. It still suffers from the major weakness that null is treated as a valid reference value, and its implicit integer conversions can still lose information. They are both more strongly typed than, say, Perl or Bash, but even Java's type system is much weaker than Haskell's, or most of the ML family for that matter.

An interview with Joey Hess

Posted Feb 4, 2016 13:43 UTC (Thu) by dvandeun (guest, #24273)

You could say that the myth is partially true: it is very likely that a refactored Haskell program is correct as soon as you get it to compile (assuming that the original was correct), and that is thanks to type checking.

An interview with Joey Hess

Posted Jan 21, 2016 14:57 UTC (Thu) by bytelicker (guest, #92320)

Great interview!

I really like the part where Joey talks about being away from the "big city". I'd like to retire early and move to the woods.

An interview with Joey Hess

Posted Jan 21, 2016 16:46 UTC (Thu) by yxejamir (subscriber, #103429)

> My best idea, mostly untried, is to be able to branch and merge at the distro level
More people need to start looking into distros like GuixSD and NixOS, where one can do exactly that. Everyone should be moving in that direction, especially anyone who is currently using the likes of Puppet and SaltStack for configuration management.

An interview with Joey Hess

Posted Jan 21, 2016 22:36 UTC (Thu) by oever (guest, #987)

> be able to branch and merge at the distro level
I did not look at it in those terms, but mostly it is true. NixOS allows branching at almost the distribution level. Every distribution can be branched — look at the distribution family tree — but in Nix(OS) you can install packages which differ from the main system, even starting at libc, by just forking the NixPkgs repo. And switching between the branches is easy without rebooting or logging out.

