Out of order execution

Posted Jun 28, 2023 17:19 UTC (Wed) by SLi (subscriber, #53131)
Parent article: JupyterLab 4.0: a development environment for education and research

That out of order execution is the reason why I, as a data scientist, don't use these notebooks a lot. I may be opinionated, but it's horrible.

It's not even like it wouldn't be possible to do this right (though I grant it's likely nontrivial with a language like Python). In my ideal world, notebooks would behave a lot like any modern spreadsheet. Cells have (implicit) inputs and outputs, and when you update a cell, everything that depends on it either gets automatically updated (not necessarily a great idea if you have huge datasets and non-deterministic behavior, though it might make sense to do this at a sub-cell granularity) or at least gets marked out of date.

Cell order absolutely should matter. Inserting a cell that changes a variable value should affect any cells below it that refer to that variable, and (generally) only those.
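
A rough sketch of the "mark out of date" half of this idea, in Python. It assumes cells are plain source strings and approximates each cell's inputs and outputs from name reads and assignments using the standard `ast` module. The names `reads_and_writes` and `stale_after_edit` are made up for illustration, and real code (imports, function definitions, in-place mutation of objects) would need far more care than this.

```python
import ast

def reads_and_writes(source):
    """Roughly approximate which names a cell reads and which it assigns."""
    reads, writes = set(), set()
    for node in ast.walk(ast.parse(source)):
        # `x += 1` stores to x but also depends on its previous value.
        if isinstance(node, ast.AugAssign) and isinstance(node.target, ast.Name):
            reads.add(node.target.id)
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                writes.add(node.id)
            else:
                reads.add(node.id)
    return reads, writes

def stale_after_edit(cells, edited):
    """Return indices of cells below `edited` that should be marked out of date."""
    _, dirty = reads_and_writes(cells[edited])
    stale = []
    for i in range(edited + 1, len(cells)):
        reads, writes = reads_and_writes(cells[i])
        if reads & dirty:
            stale.append(i)
            dirty |= writes      # whatever this cell produces is now suspect too
        else:
            dirty -= writes      # cleanly redefined: later readers see a fresh value
    return stale

cells = [
    "a = 1",
    "b = a + 1",
    "c = 10",
    "d = b * c",
    "a = 5",
    "e = a",
]
print(stale_after_edit(cells, 0))  # [1, 3]
```

Editing the first cell marks only the cells below it that (transitively) read `a`. The last cell is left alone because `a` is reassigned just above it, which is the order-sensitive behavior described here.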


Out of order execution

Posted Jun 28, 2023 19:04 UTC (Wed) by summentier (subscriber, #100638) [Link] (2 responses)

You may want to take a look at Pluto, which comes fairly close to what you are describing, although it only works with Julia. A Pluto notebook is presented to you as a sequence of cells, but internally it is a directed acyclic graph, where the cells are the vertices and the dependencies are the edges. Changing one cell triggers recomputation of all dependent cells along the graph, regardless of their order on the page.
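
To make the graph idea concrete, here is a minimal sketch in Python (not Julia, and not Pluto's actual implementation). It assumes the dependencies are already known and given as a dictionary from each cell to the cells it depends on; the cell names and the `cells_to_rerun` helper are hypothetical. It uses `graphlib.TopologicalSorter` from the standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

def cells_to_rerun(deps, changed):
    """Return `changed` and everything downstream of it, dependencies first.

    `deps` maps each cell to the set of cells it depends on.
    """
    # Invert the edges so we can walk from a cell to its dependents.
    dependents = {}
    for cell, parents in deps.items():
        for parent in parents:
            dependents.setdefault(parent, set()).add(cell)

    # Collect the changed cell plus all transitive dependents.
    affected, stack = set(), [changed]
    while stack:
        cell = stack.pop()
        if cell not in affected:
            affected.add(cell)
            stack.extend(dependents.get(cell, ()))

    # A topological order guarantees each cell runs after its dependencies.
    order = TopologicalSorter(deps).static_order()
    return [cell for cell in order if cell in affected]

# The order of entries here plays the role of on-page order, which is ignored.
deps = {
    "plot": {"fit"},
    "fit": {"data"},
    "summary": {"data"},
    "data": set(),
    "title": set(),
}
rerun = cells_to_rerun(deps, "data")
```

Here `rerun` contains `data`, `fit`, `plot` and `summary` but not `title`, with `data` first and `fit` ahead of `plot`, even though `plot` is listed first.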

Having seen my students flounder around with the "hidden state" and out-of-order execution problems, this seemed like a godsend to me: with Jupyter, we have to impress upon the students that if something looks weird, they should first try restarting the kernel and executing everything. Even nbgrader, the auto-grading engine we use to grade submissions, does this before handing in the notebook so that students don't submit non-working code by mistake.
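
For anyone who has not run into the hidden-state problem, a small illustration in plain Python. It models a kernel as a shared namespace dictionary and cells as source strings run with `exec`; the `run` helper is made up for this example and is not how Jupyter or nbgrader work internally. A fresh namespace stands in for "restart the kernel and execute everything".

```python
cells = [
    "total = 0",
    "total += 10",
    "report = f'total={total}'",
]

def run(cells, order):
    """Execute cells in the given order against one fresh, shared namespace."""
    namespace = {}
    for index in order:
        exec(cells[index], namespace)
    return namespace["report"]

fresh = run(cells, [0, 1, 2])     # top to bottom, as after a kernel restart
messy = run(cells, [0, 1, 1, 2])  # the second cell was accidentally run twice

print(fresh)  # total=10
print(messy)  # total=20
```

The notebook text is identical in both cases; only the execution history differs, and nothing on the page tells you which history you are looking at.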

However, having worked with Pluto for a while, I have to say that the beautiful concept does not translate that well to the real world: (1) I find I often have a heavy calculation in one of the cells, and I don't always want to recompute it even if I change one of its dependencies; (2) one of the big advantages of out-of-order execution is that I can watch an iterative procedure converge by, e.g., repeatedly executing a sequence of cells; (3) reordering cells works reasonably well for functions, where you can "hide details" down the file, but I find it is a poor match for my mental model when you are describing a sequence of steps; (4) Pluto must be extremely restrictive with global state, which I find does get in the way of experimentation.

Out of order execution

Posted Jun 29, 2023 1:13 UTC (Thu) by rsidd (subscriber, #2582) [Link]

I haven't even tried Pluto, precisely because of these worries. I use Jupyter with Julia heavily (I never "got" JupyterLab, but maybe I will give 4.0 a try). Jupyter is both a testing ground for new module functions and a place to import and run the module. My system is to load the module at the top of the notebook, do testing and development (currently it's a clustering program and I have multiple notebooks open doing different benchmarks), and as and when I write a new function that works and is needed in the module, I put it there. I can also edit existing functions in the module and Revise.jl gets the notebook to "do the right thing" magically.

Out of order execution

Posted Jun 29, 2023 14:03 UTC (Thu) by ballombe (subscriber, #9523) [Link]

Indeed, Pluto should version both states and results, so that you could browse the full history of what was computed and go back and forward as needed.
(Treat each cell as a file in a git repository and do 'git commit' each time a result is computed.)
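
A toy sketch of what that could look like, in Python. It keeps an append-only, hash-chained list of (source, result) snapshots per cell, loosely in the spirit of git commits. It does not use git itself, and the `CellHistory` class and its methods are invented for illustration only.

```python
import hashlib

class CellHistory:
    """Append-only history of (source, result) snapshots for one cell."""

    def __init__(self):
        self.commits = []  # list of (digest, source, result), oldest first

    def commit(self, source, result):
        """Record a computed result and return an identifier for it."""
        parent = self.commits[-1][0] if self.commits else ""
        # Chain the parent digest in, so an id pins down the whole history.
        payload = "\n".join([parent, source, repr(result)])
        digest = hashlib.sha1(payload.encode("utf-8")).hexdigest()
        self.commits.append((digest, source, result))
        return digest

    def checkout(self, digest):
        """Return the (source, result) pair recorded under `digest`."""
        for recorded, source, result in self.commits:
            if recorded == digest:
                return source, result
        raise KeyError(digest)

history = CellHistory()
first = history.commit("k = 3", 3)
second = history.commit("k = 5", 5)
old_source, old_result = history.checkout(first)
```

After the second commit, `history.checkout(first)` still returns the earlier source and result, so going back does not require recomputing anything. Storing large results this way would of course need something smarter than keeping every object in memory.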

Out of order execution

Posted Jun 29, 2023 9:44 UTC (Thu) by spacefrogg (subscriber, #119608) [Link] (5 responses)

I use emacs, org-mode and org-babel for this. It can use the concept of dependent cells forming a graph and has a nice general concept of defining which cells to update when.

It is also language agnostic, so you can use the output of one language as input to a different one. No need to decide in advance which language to use in a notebook.

Out of order execution

Posted Jun 30, 2023 0:58 UTC (Fri) by intelfx (subscriber, #130118) [Link] (4 responses)

> I use emacs, org-mode and org-babel for this. It can use the concept of dependent cells forming a graph and has a nice general concept of defining which cells to update when.
>
> It is also language agnostic, so you can use output of one language as input to a different one. No need to premeditate which language to use in a notebook.

Sounds like this mechanism ought to be limited to feeding text output from cells into other cells? I guess that's something, but it’s absolutely inadequate for most non-trivial uses of Jupyter notebooks.

Out of order execution

Posted Jun 30, 2023 5:00 UTC (Fri) by rsidd (subscriber, #2582) [Link] (1 responses)

I was wondering this too. A jupyter notebook runs a single language kernel (python, julia, whatever), and that is a feature not a bug. You can have a global state, function definitions, etc, not just piping outputs to inputs. I see that babel has "session-based evaluation" for Python and some other languages which presumably allows the same thing.

The other thing is that emacs org-mode may be very useful to some people but will never take over the world: it's just too non-standard and non-intuitive if you haven't already lived much of your life inside emacs. (I use emacs but purely as a code editor. Most people younger than me don't use emacs.) Jupyter has arguably already taken over from proprietary platforms like Mathematica and MATLAB. That's a win. Cf. Economics Nobel laureate Paul Romer's article from 2018.

Out of order execution

Posted Jun 30, 2023 5:34 UTC (Fri) by SLi (subscriber, #53131) [Link]

This made me think of PowerShell. I'm not too familiar with it, being a Linux nerd, but from what I've seen and tried out (it's open source and works on Linux too), it feels like it does something right by allowing you to pipe richer content than just text. Maybe something like that could be combined with a spreadsheet-style auto-update philosophy.

Out of order execution

Posted Jun 30, 2023 13:59 UTC (Fri) by spacefrogg (subscriber, #119608) [Link] (1 responses)

Strong words for somebody who didn't even take a minute to take a look at it. None of this is true. You can, of course, share code between snippets. Let all your code run in a single session, such that later snippets can implicitly use results from previous ones. Just like Jupyter. So, when staying within a single language it runs as a single program. But you can also pass values to and from other languages. Sure, it's not a complete byte-level interface, but it is much more flexible than Jupyter will ever be.

Out of order execution

Posted Jun 30, 2023 14:12 UTC (Fri) by intelfx (subscriber, #130118) [Link]

> Let all your code run in a single session, such that later snippets can implicitly use results from previous ones. Just like Jupyter. So, when staying within a single language it runs as a single program.

I can only assume the whole "concept of dependent cells" goes out of the window once you start using implicit state? Because if so, then it's no better than Jupyter.

(At best, I guess you could define dependencies manually (and maintain them alongside the code), which is nothing but one more place to inevitably mess up. I'm not sold, not in the slightest.)

