The accelerating adoption of Julia
The Julia programming language has seen a major increase in use and popularity over the last few years. We last looked at it two years ago, around the time of the Julia 1.0 release. Here, we will look at some of the changes since that release, none of them major, as well as some newer resources for learning the language. The main focus of this article, though, is a case study that is meant to help show why the language has been taking off. A follow-up article will introduce a new computational notebook for Julia, called Pluto, that is akin to Jupyter notebooks.
Julia is a programming language that was first released in 2012; its implementation is released under the MIT license. It is a general-purpose language, but with a particular suitability for scientific programming and numerical work. Julia is a dynamic language, with an interactive mode and easy-to-learn syntax that is simple for novice programmers; it also has deeper layers of sophistication for the expert. The language allows introspection and metaprogramming, with Lisp-like macros, an optional Lisp syntax, and access to syntax-tree and assembly-language views of functions. It features a rich type system with performant user-defined types, multiple dispatch of functions, and several flavors of concurrent programming built in.
Julia recently passed a kind of popularity milestone, breaking into the top 20 in the IEEE Spectrum list of programming languages. Beyond that, the language is being adopted in many new research projects, such as: the Climate Machine, the computational engine used by the Caltech Climate Modeling Alliance; a new space weather forecasting initiative, funded by the NSF; quantum machine learning; drug development; and a computational collaboration called Celeste to create a massive star map of the universe.
Professor Mykel Kochenderfer is the creator of an international standard aircraft collision avoidance system, ACAS X. In an email interview, he told me that the Julia version of his system runs as fast as a previous version written in highly optimized C++. Since he wrote the Julia version merely intending to document the algorithm, this was a surprise. He was able to replace the C++ version with the Julia code, which is easier to read and maintain.
The recently concluded annual Julia conference, held online this year, was a good indicator of the audience that Julia is attracting. The presentations (YouTube videos) on the computer-science topics one would expect were outweighed by talks about applications to scientific research in an impressive variety of fields. A recurring theme was the way that the language facilitates collaboration and code reuse, giving scientists an opportunity to take advantage of the packages and algorithms of others.
Case study: the power of combining libraries
Julia's organization around the concept of multiple dispatch (described in part two of our Julia introduction), combined with a certain care taken by package authors to write extensible code, creates an environment where it is unusually easy to combine the features of several packages. Multiple dispatch means that a function can be defined with a variety of available methods, each one operating on a different set of argument types; the particular method used is chosen at run time, based on the types of all of the arguments. Crucially, the user of a library can define new methods for functions in the library, without having to modify the existing library code.
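A toy sketch can make this concrete; the function and type names here are invented for illustration and have nothing to do with any real package:

```julia
# A hypothetical "library" function with methods for built-in types.
describe(x::Number) = "a number: $x"
describe(x::AbstractString) = "a string: $x"

# A user-defined type, defined entirely outside the "library" ...
struct Uncertain
    value::Float64
    err::Float64
end

# ... can be given its own method for the library's function,
# without modifying the library's source code.
describe(x::Uncertain) = "an uncertain value: $(x.value) ± $(x.err)"

println(describe(3))                    # the Number method is selected
println(describe(Uncertain(3.0, 0.1)))  # the new Uncertain method is selected
```

At each call site, the method is selected at run time from the types of the arguments, so library code that calls describe() on whatever it is handed will automatically pick up the user's new method.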
Our case study involves an activity fundamental to most computational science: the numerical solution of a differential equation. The Julia package DifferentialEquations has become the standard for this purpose in the ecosystem. Packages can be added to Julia from the read-eval-print loop (REPL) using the language's package manager. It will download the package from GitHub, along with any dependencies, and store the result in the user's .julia directory.
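For example, pressing "]" at the julia> prompt switches the REPL into package-manager mode (the version shown in the prompt will vary):

```
(@v1.5) pkg> add DifferentialEquations
```

The same thing can be done programmatically with the Pkg standard library: `using Pkg; Pkg.add("DifferentialEquations")`. Pressing backspace at the pkg> prompt returns to the normal REPL.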
Once it is installed, we need to import the functions provided by DifferentialEquations, so that we can use them in calculations. In a program, we would use an import statement to pull in just what we need to use, and keep everything namespaced. But, for convenience in the REPL, we'll use the command:
julia> using DifferentialEquations
That will import everything into the top level and allow us to use the package's functions as bare names. The first time this command is issued on a previously unseen package, it will lead to a spate of pre-compiling. For a complex and large package such as DifferentialEquations, this could take significant time. True story: I entered this command and went to take a shower. When I returned, it was still working. After a total of about 20 minutes, the pre-compilation was complete. The compiled code is stored on disk, however, so starting up the REPL and using DifferentialEquations in future sessions does not incur this delay.
We will want to use the packages Plots and Measurements in our demonstration as well, so we install them and repeat the using command for each of them.
Our example will involve the numerical solution to the differential equation:
f' = f - f²
where f' is the time derivative of f. This is sometimes called the logistic equation; it represents the initial exponential growth of a population, followed by its saturation, as space or food becomes scarce.
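For reference, this equation has the closed-form solution f(t) = f₀eᵗ / (1 + f₀(eᵗ − 1)), where f₀ is the initial value; a quick check in plain Julia, independent of any packages, confirms that this satisfies the equation (the names f, f0, t, and h are of course just local to this sketch):

```julia
# Analytic solution of f' = f - f^2 with f(0) = f0:
f(t, f0) = f0 * exp(t) / (1 + f0 * (exp(t) - 1))

# Verify that f' ≈ f - f^2 at a sample point,
# using a centered finite difference for the derivative.
f0, t, h = 0.05, 2.0, 1e-6
deriv = (f(t + h, f0) - f(t - h, f0)) / (2h)
println(deriv ≈ f(t, f0) - f(t, f0)^2)   # true, to finite-difference accuracy
```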
We begin by defining the differential equation in a form that can be used by the numerical solver:
julia> r(f,p,t) = f - f^2
This uses Julia's concise notation for defining a function on one line. To use the differential-equation solver, we need to define a function of three arguments whose return value is the derivative; the unknown function itself appears as the first argument. The second, p, is for optional parameters that we are not using here, and the third is the independent variable, which we are calling time. The function, which will be passed to the solver, can be given any name.
Next, we need to define a time interval (tint) over which we want the solution, and an initial value (r0), to nail down a particular solution from the infinitely many possible solutions to the equation:
julia> tint = (0.0, 8.0)
julia> r0 = 0.05
Now we give this information to a constructor that creates a "problem". This is a convenience, because it is often the case that we will want to experiment with, for example, different numerical methods, while the definition of the problem itself doesn't change. The constructor, called ODEProblem() (ODE stands for ordinary differential equation), is provided by the DifferentialEquations package:
julia> prob = ODEProblem(r, r0, tint)
Now to calculate a solution:
julia> sol = solve(prob)
That's all there is to it. Julia will respond with some numbers summarizing the numerical solution that it found. The solve() function is also provided by DifferentialEquations; a detailed tutorial is worth looking at for another example of solving a differential equation using ODEProblem() and solve(). The solve() function accepts many optional arguments, including one that selects from a myriad of available approximation methods; but I am making this example as streamlined as possible, and the default, in this case, does an excellent job. I know this, because this differential equation has an analytic solution, so I was able to check Julia's work.
The Plots package is already imported, so we can have a look at the solution:
julia> plot(sol.t, sol.u, key=false)
This command immediately pops up a plot window:
Note that sol is a data type with various bits of information packed into it about the solution; in order to plot it, we passed arrays of the independent (sol.t) and dependent (sol.u) variables.
Above I mentioned that we also wanted to import a package called Measurements. This library, among other things, extends floating-point numbers to include errors, or uncertainties in their values. Once you have imported it, you can write a number as, for example:
1.0 ± 0.1
That attaches an error value to it, which makes nice use of Julia's embrace of Unicode. These numbers behave as normal floats, except that they carry along error ranges with them. Arithmetic with these types treats the error ranges according to linear error propagation theory.
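Under linear error propagation, independent absolute uncertainties add in quadrature for a sum, and independent relative uncertainties add in quadrature for a product. A toy sketch in plain Julia shows the idea; this is not how the Measurements package is implemented (for one thing, the real package tracks correlations between variables), and the Approx type here is invented for illustration:

```julia
# Toy illustration of linear error propagation.
struct Approx
    val::Float64
    err::Float64
end

# For a sum, independent absolute uncertainties add in quadrature ...
Base.:+(a::Approx, b::Approx) = Approx(a.val + b.val, hypot(a.err, b.err))

# ... and for a product, relative uncertainties add in quadrature.
function Base.:*(a::Approx, b::Approx)
    v = a.val * b.val
    Approx(v, abs(v) * hypot(a.err / a.val, b.err / b.val))
end

x = Approx(1.0, 0.1)
y = Approx(2.0, 0.2)
s = x + y
println(s.val, " ± ", s.err)   # 3.0 ± ~0.2236
```

Because the arithmetic is expressed as new methods on the standard + and * functions, generic code that uses those operators works on the new type unchanged; this is the same mechanism that the real Measurements package relies on.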
Suppose there was some uncertainty attached to our knowledge of the initial condition in the above problem. We could then write the initial value as:
julia> r0m = 0.05 ± 0.01
Now, logically, there would be some uncertainty in the solution and, to be consistent, in everything else. So wherever we had exact numbers, we replace them with uncertain numbers, even if we take the errors to be zero:
julia> rm(f,p,t) = (1 ± 0.0)*(f - f^2)
julia> tintm = (0.0 ± 0.0, 8.0 ± 0.0)
Can we use the DifferentialEquations package to solve this equation? It may seem unreasonable to expect this to work. DifferentialEquations only claims to deal with floating-point numbers for initial values and time intervals, and it is a package for solving for functions that map numbers to numbers. Different people created the Measurements package, and DifferentialEquations knows nothing of this exotic new data type. Let's ignore all of this and forge ahead:
julia> probm = ODEProblem(rm,r0m,tintm)
julia> solm = solve(probm)
Julia not only proceeds without complaining, but it prints out the solution in the form of numbers with ± values appended. If DifferentialEquations can somehow handle this data type, can we push our luck with Plots?
julia> plot(solm.t, solm.u, key=false)
Typing that in the REPL gets us this figure:
We see here the same solution plotted as before, but this time with the uncertainties rendered as error bars. I've verified that the errors were propagated correctly, and that they are plotted accurately.
We had no right to expect that any of this would work, yet by some magic, it all does. Chris Rackauckas, the MIT professor who is the chief author of the DifferentialEquations package, and who suggested that I look at combining it with Measurements data types, tells me that his package was written with no knowledge of these data types. It was a delightful surprise for him when he discovered that his code handles numbers with error intervals in the correct way. It works because these libraries are written in a generic fashion, with functions that can be extended to work on new data types. This is made possible by Julia's type system and its organization around multiple dispatch. This is Julia's solution to the expression problem, allowing one to extend and combine existing libraries without having to modify those libraries, or even having to look at their source code.
Another example is the plot() function. It is designed to plot arrays of floating-point numbers, yet is able to produce sensible renderings of other data types, such as the uncertain numbers used in the example. This is because the Measurements package comes with a simple recipe that tells the plotting routines what to do when they encounter this data type; and that, again, is only possible because of Julia's multiple function dispatch.
One can, for example, use quaternions instead of floating-point numbers. These are like complex numbers, but with four components instead of two, and there is a Julia package that defines the data type. DifferentialEquations will happily solve equations for quaternion-valued functions (although, again, it knows nothing about them). They can be combined with Measurements to produce quaternions with error intervals attached to the components, and DifferentialEquations will propagate all of the errors correctly. A plot of the solution using the Plots package will display four lines, one for each component, and each one will have its own error bars.
Language changes
There have been no major breaking changes since version 1.0. There are some minor tweaks to the syntax, which generally make things more convenient; they are detailed in the release notes for v1.5, v1.4, and v1.3. Here I'll describe several more significant changes.
The latest release brings a substantial change to scoping rules when using the REPL. In Julia, blocks such as for loops create their own lexical scopes. Variables that appear in such blocks are local to the block, unless explicitly declared to be global. In Julia 1.5, when using the REPL, an assignment inside a block to an undeclared variable that already exists in global scope will use that global variable. For the same code in a file, the assignment produces a warning, because it is ambiguous; this change does not break any code in files. The new behavior was thought to be more convenient for interactive use in the REPL, where it can be tedious, and surprising for new users, to have to track global variables. Nevertheless, the change generated some controversy, because the same code now behaves differently in a file and in the REPL; the new REPL behavior is, in fact, a return to that of pre-1.0 versions of Julia.
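A sketch of the new REPL behavior, as I understand it in a Julia 1.5 session:

```
julia> s = 0;

julia> for i in 1:3
           s = s + i   # in the 1.5 REPL, this updates the global s
       end

julia> s
6
```

Putting the same loop in a file and running it instead produces an ambiguity warning, and the assignment creates a variable local to the loop, as it did in Julia 1.0; declaring `global s` inside the loop makes the intent explicit in either setting.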
This version also introduces a syntax option that reduces visual noise and typing when using keyword arguments. If one's local variables have the same name as a function's named parameters, which is often the case, then a function call winds up looking like function(var1=var1, var2=var2); this is the normal syntax in Python and previous versions of Julia. Now this can be streamlined to function(; var1, var2).
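A small sketch of the shorthand (the settings() function and its parameter names are invented for illustration):

```julia
# A function taking keyword arguments:
settings(; width=1, height=1) = (width, height)

width, height = 10, 20
old = settings(width=width, height=height)  # the traditional spelling
new = settings(; width, height)             # the v1.5 shorthand
println(old == new)                         # true
```

The semicolon marks where the keyword arguments begin; each bare name is expanded to `name=name`.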
The perennial complaint about Julia is the long startup time when pre-compiling a package. The latest version introduces an experimental module-level setting for the compiler optimization level. Turning off optimizations for code that is not crucial for performance can significantly reduce compilation time and, thus, latency.
Task-based parallelism, introduced in v1.3, is a convenient way to fire off long-running processes. Briefly, a simple @spawn macro allows the programmer to execute a function in a separate thread. The calls can be nested, with the function using @spawn to run other functions, while the language runtime coordinates everything.
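A minimal sketch, using only the Threads standard module:

```julia
using Base.Threads: @spawn

# Launch a computation as a task that may run on another thread;
# fetch() waits for it and returns its result.
task = @spawn sum(1:1_000_000)

# The spawning thread is free to do other work in the meantime.
other = sum(1:100)

println(fetch(task) + other)   # 500000500000 + 5050 = 500000505050
```

Whether the task actually runs in parallel depends on starting Julia with more than one thread (for example, `julia --threads 4`); the code is correct either way, which is part of the convenience.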
Learning resources
In the last few years, some new books and useful websites have appeared, making it easier to get started with the language.
A few of the notable books are: Hands-On Design Patterns and Best Practices with Julia; Think Julia; The Little Book of Julia Algorithms, which is unusual in that it is aimed at middle- and high-school students; and Julia Programming Projects.
The official online documentation is quite comprehensive, and may be all that an experienced programmer will require. But it mixes deep discussions of advanced topics with introductory sections in a way that is likely to bewilder a relative beginner.
Some websites and online resources that provide a gentler approach are Julia By Example, a Wikibook introduction to Julia, a series of classes at the JuliaAcademy, a set of Julia exercises, a Coursera course on Julia for scientists, and an introduction to the language for data scientists. There has been an explosion of learning resources—this is just a tiny sample.
I hope that the small example above helps demonstrate, in miniature, what's special about Julia, and why it is making such a splash in the scientific community. Julia is not the first language to make a multiple dispatch system available to programmers; Common Lisp has had one for decades. But it was a breakthrough to combine this with Fortran-class numerical performance and with a syntax as easy to pick up and read as Python's.
Julia is certainly worth considering for a scientist or engineer who is beginning a new project. Although it still doesn't have the vast coverage of the Python libraries, it is particularly easy to call Python (and Fortran or C) functions from Julia code—and the Julia code will, generally, run fast. For everyone else, I can recommend Julia simply as an interesting, fun, and instructive language to play around in.
| Index entries for this article | |
|---|---|
| GuestArticles | Phillips, Lee |
