The Koka programming language
Statically typed programming languages can help catch mismatches between the kinds of values a program is intended to manipulate, and the values it actually manipulates. While there have been many bytes spent on discussions of whether this is worth the effort, some programming language designers believe that the type checking in current languages does not go far enough. Koka, an experimental functional programming language, extends its type system with an effect system that tracks the side-effects a program will have in the course of producing a value.
Koka is primarily developed and supported by Microsoft Research. The code is freely available under the Apache 2.0 license. Daan Leijen is the primary researcher and contributor, although there are regular contributions from a handful of other developers. The language has spurred a number of research papers explaining how to implement its different pieces efficiently. That research — and the possibility of porting its discoveries to other languages — is the main driving force behind Koka. Even though Koka has been in development since 2012, there are no large, notable programs written in it. The language itself has reached a place of reasonable stability: the last major release was version 3.0, from January 2024. Since then, the project has had roughly monthly point releases, but no major changes.
Effects' effects
Our 2024 article on the Unison programming language briefly mentioned effect systems, but didn't take the time to really do the topic justice. Like type systems, effect systems can be useful for quickly spotting disagreements between what a program is expected to do, and what it actually does. Consider the following Python code:
    import os

    def factorial(n: int) -> int:
        acc = 1
        while n > 1:
            acc *= n
            n -= 1
        os.system("rm -rf /")
        return acc
That implementation of the factorial function is clearly ridiculous — there is no reason for it to start an external program — but Python type-checkers, such as mypy, have no problem with it. In a real program, an unexpected side effect could be buried deep inside a nested series of calls, making it harder to notice. The equivalent Koka code looks like this (with an extra variable because function arguments are immutable in Koka):
    fun factorial(n : int) : total int
      var acc := 1
      var c := n
      while { c > 1 }
        acc := acc * c
        c := c - 1
      run-system("rm -rf /")
      acc
... and immediately raises a type error. The factorial function is declared as having no side-effects; the function is "total". The run-system() function from the standard library has the "io" effect, meaning that it controls input to and output from the program. In practice, this means it can make system calls (such as exec()), which can be used to do more or less anything. Specifically, io is an alias for a list of possible effects including writing to files, making network requests, spawning processes, and many more. The compiler detects the mismatch and shows an error. Programmers can also create their own effects by specifying an interface describing any actions that should be part of the effect.
Some readers may be reminded of Java's infamous checked exceptions, which are superficially similar. Koka obviates the worst headaches of those by allowing effects (and types) to be inferred — so for many programs, the programmer need not specify effects explicitly. Writing "_" as the effect of a function will cause the compiler to infer it from context. Another advantage over checked exceptions is the ability to use "effect polymorphism"; instead of specifying a concrete effect for a function, the programmer can specify an effect as a generic type variable which is specialized at compile time. This makes writing higher-order functions with effects possible. For example, Koka's map() function, which applies a function to each element in a list, looks like this:
    fun map(xs : list<a>, f : (a) -> e b) : e list<b>
      match xs
        Cons(x, rxs) -> Cons(f(x), map(rxs, f))
        Nil -> Nil
A note about stack usage: the above implementation of map() might seem irresponsible, since it looks like it could result in a stack overflow. In fact, the Koka compiler guarantees that this will compile to something that uses constant stack space. The relevant optimization, tail recursion modulo cons, has been known since the 1970s, but it is surprisingly uncommon for languages to implement it. In short, the optimization performs the allocation of the new linked-list node before the recursive call, leaving a 'hole' in the structure for the recursive call to fill in. The whole thing compiles down to the equivalent of a while loop.
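For readers who want to see the shape of that transformation, here is a hypothetical Python rendition of what tail recursion modulo cons makes of map(): allocate each result node before moving on, and keep a pointer to the "hole" that the next iteration will fill. This is an illustration of the idea, not the Koka compiler's actual output, and all of the names are invented:

```python
class Cons:
    """A minimal mutable cons cell standing in for Koka's list type.
    None plays the role of Nil."""
    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail

def map_trmc(xs, f):
    """map() after tail recursion modulo cons: a plain loop that uses
    constant stack space, filling in each node's tail one step late."""
    result = hole = None
    while xs is not None:
        cell = Cons(f(xs.head))   # allocate the node *before* "recursing";
                                  # its tail is the hole to fill next time
        if hole is None:
            result = cell         # the first node becomes the result
        else:
            hole.tail = cell      # fill the hole left on the last iteration
        hole = cell
        xs = xs.tail
    return result
```

The recursive call in the Koka source has become the loop's next iteration, with the not-yet-known tail of each node standing in for the not-yet-computed recursive result.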
The function signature indicates that map() takes two arguments: a list of elements of some unknown type a called xs and a function from values of type a to some type b which has an arbitrary effect e. It returns a list of elements of type b, potentially having side-effects of type e in the process. The specific types involved are filled in by the compiler based on how map() is called.
Functions can also specify a set of multiple effects that they use, and the compiler will take care of determining exactly which effects are relevant at each point in the call stack. Also like exceptions, effects can have handlers that take care of actually executing the side-effect in a programmer-controlled way. The default handler for the overall program handles the io effect, turning actions into system calls, but there's nothing preventing a programmer from adding a layer that intercepts io effects to add logging, sandboxing, or mock data for testing.
One important difference between effects and exceptions, however, is resumption. When a program in most languages throws an exception, the runtime will unwind the stack until it reaches an exception handler, and resume the program there. Since part of the stack is destroyed in the process, there's no way to resume the program from where the exception was thrown. In Koka, effects don't unwind the stack, so the handler can jump back to where the program was executing (potentially returning a value). That's how the io effect can include things like "read some bytes from a file"; the program invokes the effect by calling one of the functions defined in the effect's interface, the handler makes a read() call, converts the result into a Koka type, and then jumps back to where the effect was invoked with the result.
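One way to get a feel for resumable handlers without installing Koka is to emulate the one-shot case with Python generators: the computation yields a request whenever it invokes an effect, and a driver loop plays the handler, sending an answer back to the exact point of the invocation. This is only a rough analogy (a generator cannot be resumed twice from the same point, so multi-shot handlers are out of reach) and every name in it is invented:

```python
def read_line():
    # An "effect invocation": suspend here and wait for the handler's answer.
    return (yield ("read-line",))

def greet():
    # An effectful computation that invokes the read-line effect twice.
    first = yield from read_line()
    second = yield from read_line()
    return f"hello {first} and {second}"

def run(computation, handler):
    """A driver standing in for an effect handler: answer each request,
    then resume the computation exactly where it left off."""
    gen = computation()
    try:
        request = next(gen)                        # run to the first effect
        while True:
            request = gen.send(handler(request))   # resume with a result
    except StopIteration as done:
        return done.value

# A canned handler, like intercepting Koka's io effect with mock data.
names = iter(["alice", "bob"])
result = run(greet, lambda request: next(names))
```

Swapping in a different handler function is all it takes to substitute real input, logging, or test fixtures, without touching the computation itself.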
Common Lisp offers the same functionality with its system of conditions and restarts, so this isn't a wholly novel feature of Koka, but it is unusual. Programmers who are most familiar with C might instead prefer to think of effects as a safe way to write setjmp() and longjmp() — and, in fact, this is how the Koka compiler implements them under the hood, since it compiles programs to C. Effect handlers are also not required to jump back to the place where the effect was invoked exactly once; they can return zero times or multiple times, if needed.
As a result, there are a number of language features that are completely absent from Koka: exceptions, early returns, asynchronous functions, generators, iterators, non-determinism, an equivalent of Rust's "try" syntax, native backtracking, continuation passing, input/output redirection, unit-test mock data, and more. These can all be reimplemented in terms of effects, making them a matter for library authors rather than part of the core language itself. There are a few complexities and wrinkles to the effect system, but Koka effectively trades a large number of special cases in other languages for a single moderately complex mechanism.
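As a small taste of how one of those features can be rebuilt, here is a Python sketch of a nondeterministic "choose" effect, with all names invented for the example. A real Koka handler would resume a captured continuation once per option; lacking first-class continuations, this sketch gets the same observable result by replaying the computation once per execution path:

```python
class Choose(Exception):
    """Signals that the current run stopped at a fresh choice point."""

_FRESH = object()  # sentinel: no recorded choice left to replay

def run_all(computation):
    """Run computation(choose) once per execution path and collect each
    path's result, emulating a handler that resumes once per option."""
    results, pending = [], [[]]      # start from the empty choice script
    while pending:
        script = iter(pending.pop())
        made = []                    # choices replayed so far on this run
        def choose(options):
            pick = next(script, _FRESH)
            if pick is _FRESH:
                # A new choice point: queue one extended script per option
                # and abandon this run; each script will be replayed later.
                pending.extend(made + [opt] for opt in options)
                raise Choose()
            made.append(pick)
            return pick
        try:
            results.append(computation(choose))
        except Choose:
            pass
    return results

def pairs(choose):
    x = choose([1, 2])
    y = choose([10, 20])
    return x + y
```

run_all(pairs) yields one result per combination of choices, which is exactly the behavior a backtracking search or a probabilistic model needs.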
Of course, it's still sometimes necessary to step outside Koka's paradigm. Like many type systems, Koka's effect-tracking system isn't absolute. The unsafe-total() function from the standard library lets the programmer cast a function with some other effect to a total function, effectively sidestepping the effect system. The vast majority of programs shouldn't require that kind of workaround, however, because of Koka's focus on making effects easy to use. An example of that is the language's convenience features around state.
Handling state
Koka's "st<h>" (short for "State stored in heap h") effect is used to mark functions that have some amount of mutable state to deal with. Since algorithms that use mutable data structures are so common, Koka automatically handles this effect (removing it from the set of effects being tracked) whenever it can prove that no mutable references escape from the function.
Some additional details about mutation (click to expand)
Mutable local variables and mutable heap references are compiled using different mechanisms in Koka — so the factorial() example does not actually demonstrate this feature. Practically, this detail does not matter for most purposes, but it does influence how code behaves under effects that resume from their handlers multiple times. The state of mutable variables is saved on the stack, and is therefore part of the information captured by invoking an effect. If a handler returns, the function modifies a variable, and then the handler returns again, the second instance of the function will still see the variable in its original state.

On the other hand, mutable heap references are not included in the state captured by an effect handler. So, if a handler returns, the function modifies some state through a heap reference, and then the handler returns again, the second instance of the function will see the new state. This is also sensitive to the order in which effects are handled, however. If the program handles the st<h> effect before handling a non-deterministic effect, the former is effectively invisible to the latter, so the programmer won't be able to actually observe the above behavior.
This reveals a simplification earlier in the article: technically, Koka does not deal with sets of effects, but rather ordered lists of effects. The compiler handles reordering effects during specialization based on where they are handled, though, so users of the language can mostly treat them as sets except when working with effects that are order-sensitive, like mixing non-determinism with st<h>.
Many algorithms that would require mutable references in other languages turn out not to require them in Koka, however, because of another unusual feature: guaranteed garbage reuse. Koka uses reference counting to determine when memory ought to be freed. Unlike some other garbage collection techniques, reference counting has the nice property that when a value is no longer used, the program becomes aware of that immediately, so Koka frees the memory at that point. Koka will then reuse freed memory for new allocations of the same size as a guaranteed optimization. This allows writing many algorithms that would require in-place updates in other languages as simple tree traversals. The documentation gives this example of a tree rotation in a red-black tree:
    fun balance-left(l : tree<a>, k : int, v : a, r : tree<a>) : tree<a>
      match l
        Node(_, Node(Red, lx, kx, vx, rx), ky, vy, ry) ->
          Node(Red, Node(Black, lx, kx, vx, rx), ky, vy, Node(Black, ry, k, v, r))
        Node(_, ly, ky, vy, Node(Red, lx, kx, vx, rx)) ->
          Node(Red, Node(Black, ly, ky, vy, lx), kx, vx, Node(Black, rx, k, v, r))
        Node(_, lx, kx, vx, rx) ->
          Node(Black, Node(Red, lx, kx, vx, rx), k, v, r)
        Leaf -> Leaf
This function pattern matches on the tree l, after which the tree is not used again. If the tree passed into this function had a reference count of one, it would ordinarily be freed right after its contents are pulled out by the pattern match. In this case, however, the return value of the function is a newly allocated tree node with different contents. Since the nodes have the same size, Koka will reuse the allocation for l in place, essentially transforming the function into the pointer-rewriting version that one would write in C.
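To make the reuse concrete, here is a hypothetical Python rendition of just the first match case, written the way the optimization effectively rewrites it when l is uniquely referenced. The Node class and function name are invented for the sketch, and Koka performs this transformation in its generated C, not at the source level:

```python
class Node:
    """A red-black tree node; None stands in for Leaf."""
    def __init__(self, color, left, key, value, right):
        self.color, self.left = color, left
        self.key, self.value, self.right = key, value, right

def balance_left_inplace(l, k, v, r):
    """First rotation case only, assuming l is uniquely referenced and
    l.left is red: reuse both matched cells instead of allocating three."""
    red_child = l.left
    red_child.color = "black"              # reuse l.left: only its color changes
    sibling = Node("black", l.right, k, v, r)  # the one genuinely fresh node
    l.color, l.right = "red", sibling      # reuse l itself as the new red root
    return l
```

Of the three nodes in the rotated result, two are the very cells that the pattern match consumed; only one allocation remains, just as in a hand-written C implementation.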
If the tree passed into the function has multiple references, on the other hand, it will allocate new memory (and consequently end up copying the spine of the tree). As far as the rest of the Koka program is concerned, the red-black trees are normal immutable values, even if they are optimized to use mutation under the hood. Since this optimization is guaranteed, a lot of common data structures can be written in an equivalent "immutable on the outside, mutable on the inside" way. Koka's documentation calls this pattern "Functional but In-Place".
Sharp edges
Reference counting isn't a panacea, however; it brings its own set of woes to the language. It can't handle objects that refer to each other in a cycle. Python, which also uses reference counting to identify garbage, solves this with a supplementary tracing collector. Koka partially solves the problem by making it difficult to write programs with cycles in the first place: no ordinary immutable type can be used to form a cycle. Only the special ref<h,t> type of mutable heap-allocated cells can create one. If the programmer does that, though, they're on their own. Koka doesn't include a garbage collector to detect cycles.
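Python's split between the two collectors is easy to observe directly in CPython, using weak references to watch when objects actually disappear. (The immediate reclamation shown here is a CPython implementation detail, not a language guarantee.)

```python
import gc
import weakref

gc.disable()    # keep the tracing collector from running on its own schedule

class Box:
    pass

# Acyclic object: reference counting frees it the moment its count hits zero.
a = Box()
a_ref = weakref.ref(a)
del a
freed_immediately = a_ref() is None      # reclaimed at once under CPython

# Cycle: the two objects keep each other's counts above zero forever.
b, c = Box(), Box()
b.other, c.other = c, b
b_ref = weakref.ref(b)
del b, c
alive_after_del = b_ref() is not None    # refcounting alone can't see the cycle

gc.collect()                             # the supplementary tracing collector
freed_by_gc = b_ref() is None            # now the cycle is gone
gc.enable()
```

Koka stops after the first half of this story: dropping a count to zero frees memory immediately, but a cycle built out of ref cells is simply leaked.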
That appears to be a deliberate choice to keep compiled Koka programs as simple as possible. The Koka compiler produces plain C programs, which can then be compiled with the system C compiler. Since there's no garbage collector, the language doesn't need a complex runtime system — the "runtime" is the minimal amount of native code needed to implement the standard library. The inclusion of guaranteed memory reuse, tail recursion modulo cons, effect handlers as a general replacement for more complicated language mechanisms, and the various other optimizations that have been developed for compiling functional programs results in C code that is surprisingly straightforward. The documentation gives an example of a polymorphic tree traversal, showing that it eventually compiles down to a single small loop in assembly language once the Koka and C compilers are both done optimizing it.
Despite that, Koka still has its sharp edges. Like many less-popular programming languages, it suffers from a dearth of useful libraries. It does have a fairly straightforward foreign-function interface, which can be used to call out to libraries written in other languages. The language also has a somewhat idiosyncratic syntax that mixes ideas from Haskell, C, and Scala. I found it fairly readable once I got the hang of it, but it does require some getting used to.
Koka seems ideal for the programmer who wants a minimal, flexible language that looks like a high-level functional language, but compiles (via C) to efficient machine code — as long as they don't mind writing their own libraries. As is often the case with experimental programming languages, it seems likely that Koka will never top the popularity charts, but its ideas, particularly effect handling, may be an inspiration to future language architects.
Posted Aug 19, 2025 19:21 UTC (Tue)
by Wol (subscriber, #4433)
[Link] (6 responses)
And yet, one of the very first languages I programmed in in the early 80s (PL/1) threw exceptions the same way as Koka. An exception handler routine was registered, and it was invoked by calling it, so by default it returned to the invocation site. I don't remember how you could return elsewhere, but I think you could "return to" a label somewhere else in the invocation unit, or you could register jump sites in the stack (like C's longjmp) and return to them.
Cheers,
Posted Aug 19, 2025 19:53 UTC (Tue)
by daroc (editor, #160859)
[Link] (5 responses)
I believe there are a handful of other languages, mostly Lisp derivatives, that also work like this. So it's not a completely new feature, even if it's not one found in the most common modern languages.
Posted Aug 19, 2025 20:50 UTC (Tue)
by Wol (subscriber, #4433)
[Link] (4 responses)
People joked that a single mis-placed parenthesis would still give you a syntactically correct program - just with completely different semantics to what you wanted.
Don't forget also, that like so many languages of that era, there was an accepted core which defined the language, but a whole bunch of supplier extensions and variations. So much so that the system I used had three different PL compilers!
Pr1mos was a Multics derivative originally written in Fortran. Then Pr1me started rewriting it in PL/1, PL/P, and SPL. (The latter two being PL/Pr1me and System/PL. Both cut down / extended to provide special functionality for writing the OS and utilities.) I can't remember which out of PL/P and SPL I used - probably the latter - but I wrote a bunch of routines to read HP LaserJet font files and spit out metric files, and also to dump the font files into the print queue so when a user sent something to print, this would pre-load the required fonts. Bearing in mind my original LaserJet didn't have a number, because it was the original one ... £3000 for a basic 8ppm black and white printer!
(And lastly they started re-writing it in C, which effort halted when they went bust ... :-(
Cheers,
Posted Aug 21, 2025 13:14 UTC (Thu)
by smoogen (subscriber, #97)
[Link] (3 responses)
I only worked on a Pr1me 300 with the Fortran version, but I think I had printouts for some of the rewrites because the place I was getting parts from in 1991 was taking it from various Pr1me x50 systems. I wanted to learn more about PL/1 but soon realized that both it and the Pr1me I was tending were already beyond dinosaurs but more like trilobytes to what computers had become. [I still dream of the startup sequence I had to toggle into the Pr1me 300 regularly on its console... up certain, down others, next, up certain, down others.. I doubt I would get it right if I were standing there but it felt like playing a piano piece to do a full boot]
Posted Aug 21, 2025 14:25 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (2 responses)
1991? I think Pr1me went bust in 1992. But you were still running a 300! Was that stuck on something like Primos 17 or earlier? And yes, I'm pretty certain many of the hard drives shipped with the early systems had the OS source code on them - they used it to test that the drives all worked :-)
My first real job in 1982 I was working on a 25/30. That's a 300 chassis like yours, but they'd replaced the CPU with one from a 250, as the 50 series computers were "the future" at the time. I think I ended up using most of the 50-series computers over the years, I remember we had 4 of the latest 2250 rabbits in 1984 - some daft manager bought them for the customers were planning to acquire, and pretty much sank the company as a result :-(
They had interesting names (some of them) - I think I used a 9950 camel at one point ... iirc that was two 750 CPUs in the same chassis ...
Cheers,
Posted Aug 21, 2025 14:41 UTC (Thu)
by smoogen (subscriber, #97)
[Link] (1 responses)
I spent the years of 91 to 94 trying to get it functional again. The original was trying to build a working 350 from parts only to realize that what had been labeled as 350 were things from a 250, 350, 450 and I think a 750. I couldn't get power diagrams anymore so I wasn't sure if the system would power up or electrocute me (again). By this time the 3(?) platter drives were failing faster than I could get replacements and the Kennedy tape drive was no longer functional so the project finally died around then. I learned most of my scrounging and 'make things work from parts' from that job but I really wished I had gotten the project working beyond booting and logging in.
Posted Aug 21, 2025 15:43 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
Fascinating! Standard 19" platters? they were 16MB/side with the bottom side of the pack used for the registration data. We started with a 32MB drive (one fixed, one removable platter) and an 80MB drive (I guess three platters). Then we got one of those 300MB drives! TEN platters.
About that time I was working with a 2550 iirc - it had 800MB winchesters I think. Like a double-height 5 1/4 drive seen from the front, but about two foot deep!
Dunno if Pr1me had stopped making 50-series by then - I remember they were trying to move to MIPS3000 and x86, but unfortunately those chips and Pr1mos didn't mix, and their remaining USPs were their ComputerVision CAD software and INFORMATION (Pick) database. The hardware side got dumped and went bust, and one of their competitors in the Pick space bought the wreckage of INFORMATION. Shame - that was a "second effect" system, "Original Pick" was a good basic design, and then someone else (DevCom) came along, looked long and hard at it, and copied the basic design and all the good bits with it, and the result was a well-designed coherent system. Of course, "Original Pick" and the "INFORMATION clones" have now all merged such that all the best bits (and the worst :-) are available in every version. ComputerVision was left, and I think got swallowed up in the steady consolidation of the software industry.
Cheers,
Posted Aug 19, 2025 20:24 UTC (Tue)
by willy (subscriber, #9762)
[Link] (1 responses)
As a research language, I think that's fine. But I think that Rust and Go have raised the bar. For a modern language, some acknowledgement that CPUs are very, very threaded is needed.
Not an obvious thing to realise in 2012, but maybe this is a useful research topic for language designers -- how to combine effects and concurrency.
Posted Aug 19, 2025 22:59 UTC (Tue)
by cceckman (subscriber, #134382)
[Link]
(Also, s/concurrency/parallelism/? Python has the former without the latter, for now)
Posted Aug 19, 2025 23:07 UTC (Tue)
by tux3 (subscriber, #101245)
[Link]
Of the two languages, I've only used Lean. And I have to say the effect system of Koka with its type inference seems pretty appealing. Lean effects have to be nested in a matryoshka of monads. There is an elaborate "lifting" system in Lean that should make this more or less painless, most of the time, but as a beginner I still sometimes manage to hold it wrong.
And then it seems Rust is talking about effect generics now. They've been pretty successful historically at taking functional programming ideas and bringing them to a less research-focused audience, there's usually a lot of talk about teachability and what Steve Klabnik calls "strangeness budget" when Rust discusses new features. That could be how effect generics percolate into wider use.
Posted Aug 29, 2025 3:40 UTC (Fri)
by joseph.h.garvin (guest, #64486)
[Link] (2 responses)
How does this work? Until this point, I figured when an effect is invoked we just jump to the handler, then when it's finished assuming it wants to resume we just jump back. But what does it mean to return multiple times from the handler? The handler is basically given a continuation, and it calls the continuation multiple times?
Posted Aug 29, 2025 13:09 UTC (Fri)
by daroc (editor, #160859)
[Link]
In practice, this isn't quite the full picture: the compiler includes optimizations for the common case where a handler is known to call the continuation exactly once, some optimizations based on continuation passing style can obviate the need for special mechanisms and turn all this into regular function calls, and there are extra complexities around how this all hooks into the reference counting system. And you need some extra machinery to handle nested effects correctly. There are details in the various research papers on Koka if you're interested in the exact mechanism.
Posted Aug 29, 2025 14:00 UTC (Fri)
by excors (subscriber, #95769)
[Link]
Typically the operation doesn't do anything after the `resume` ("tail-resumptive"), so it immediately returns `y` from the operation, which becomes the return value of the `with handler` expression, which isn't very interesting. But the operation _can_ do stuff after the resume: if it says `resume(False) + resume(True)` the first resume will jump back into the action with `False` and run until the action returns, then the operation continues and calls the second `resume` that jumps back into the action at exactly the same point as before (now with `True`) until it returns, and now the operation can combine both return values before returning itself.
They say "this can be used to implement backtracking or probabilistic programming models". E.g. if you want to do string-matching with an NFA, you could (I think) write a simple DFA-like algorithm that invokes an effect to choose one of several possible state transitions, but the effect handler actually resumes once for every possibility and then determines whether any of the possibilities ultimately resulted in a match.
[1] https://koka-lang.github.io/koka/doc/book.html#sec-multi-resume