
An alternate pattern-matching conditional for Elisp

By Jake Edge
March 1, 2024

One of the outcomes of the (extremely) lengthy discussion about using Common Lisp features in Emacs Lisp (Elisp), which we looked at back in November, was an effort to start removing some of those uses from Emacs. The rewrite of some of the Elisp in Emacs that uses the Common Lisp library (cl-lib) was started by Richard Stallman as a way to reduce the cognitive load needed for maintaining Emacs itself. Since then, he has broadened his efforts to simplify Elisp by adding a new pattern-matching conditional that would be a competitor to pcase, which is a longstanding macro that he finds overly complex.

Complexity

Back in mid-November, Stallman noted that he found the "little language" that pcase defines to be "so concise it is downright cryptic". He recognizes that trying to solve the same set of problems combining simpler Elisp constructs, such as cond and let, is "long-winded and cumbersome", but pcase has taken the desire for conciseness to an undesirable extreme. That imposes a cost on all Emacs developers who have to maintain code using pcase, he said, so he decided to adapt some pcase features in other constructs.

Predictably, that led to a long thread—standard fare for the emacs-devel mailing list—discussing whether there is a need for a pcase alternative, what one might look like, and more. Stallman started a new sub-thread to investigate his ideas for a new macro, cond*, which would provide a simpler pattern-matching construct that is still more concise than using "old-fashioned Lisp". His new macro is meant to combine the conditional cond form, which handles checking for multiple different values—something like switch constructs in other languages—with let, which temporarily binds values to variables within a limited scope. pcase and cond* are both designed to provide an ML-style pattern-matching conditional mechanism for Elisp.
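
As a rough illustration of the "old-fashioned Lisp" that both macros aim to shorten, the sketch below dispatches on a value with cond and uses let for a temporary binding. The function name my-demo-describe-code and its messages are made up for illustration; they are not taken from the mailing-list discussion.

```elisp
;; Hypothetical helper, not from the discussion: classify a return
;; code using only `cond' and `let'.
(defun my-demo-describe-code (code)
  "Return a message string for CODE, a string or a symbol."
  (cond
   ;; Each clause is (CONDITION BODY...); the first true one wins.
   ((stringp code) code)
   ((eq code 'success) "Done!")
   ((eq code 'would-block) "Sorry, can't do it now")
   (t
    ;; `let' binds NAME only within this clause.
    (let ((name (format "%S" code)))
      (concat "Unknown return code " name)))))
```

Every clause has to spell out its own test, and any binding needed by a clause has to be introduced separately inside it; that repetition is what the pattern-matching macros try to remove.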

A simple pcase example may give the general flavor of these constructs, but there is a great deal more that both can do, including pattern matching and pulling lists and other data structures apart, which is known as "destructuring". This example, taken from the pcase documentation, handles several different types (e.g. string, symbol) for a return code, producing an appropriate message for each:

    (pcase (get-return-code x)
      ;; string
      ((and (pred stringp) msg)
       (message "%s" msg))

      ;; symbol
      ('success       (message "Done!"))
      ('would-block   (message "Sorry, can't do it now"))
      ('read-only     (message "The schmilblick is read-only"))
      ('access-denied (message "You do not have the needed rights"))

      ;; default
      (code           (message "Unknown return code %S" code)))
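
The example above only dispatches on a value. A minimal sketch of destructuring with pcase might look like the following; the function my-demo-area and the list shapes it accepts are assumptions chosen for illustration, not something from the pcase documentation.

```elisp
(require 'pcase)

;; Hypothetical example: SHAPE is a list such as (circle 2) or (rect 3 4).
(defun my-demo-area (shape)
  "Return the area of SHAPE, or nil if SHAPE is not recognized."
  (pcase shape
    ;; The backquoted pattern matches a two-element list whose first
    ;; element is the symbol `circle'; ,r binds the second element.
    (`(circle ,r) (* float-pi r r))
    ;; Matches a three-element list starting with `rect'.
    (`(rect ,w ,h) (* w h))
    ;; Anything else, including lists of the wrong length.
    (_ nil)))
```

Here the pattern both tests the shape of the list and pulls its elements into variables, which is the combination that cond and let need separate steps to express.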

Stallman characterized his approach as trying to "avoid the kludginess of pcase's bells-and-whistles-for-everything approach". Naturally, that led to further arguments in favor of pcase. For example, Michael Heerdegen said: "In my opinion `pcase' comes very close to the optimal solution for its task."

By mid-December, Stallman was asking about features needed for cond* "so that it is rare to encounter a pcase that can't be replaced cleanly". That conversation took place in another branch of the original pcase-replacement discussion, which makes it a little hard(er) to follow. In that part of the thread, others were working with Stallman on cond*; once again, there were numerous helpful suggestions and clarification queries, amidst a few grumbles. Stallman was clearly making progress on the feature, however.

In mid-November, Alan Mackenzie had raised an issue that has seemingly lingered with pcase since it was added in 2010: documentation. He pointed to a post he had made in 2015 that described the problems that existed in the pcase documentation; many of those were addressed at the time, but there is an ongoing effort, led by Jim Porter, to improve the documentation for the macro.

Porter listed multiple areas that need attention, including moving the presentation of the backquote ("`") operator, which is used for pattern-matching and destructuring, earlier in the pcase doc string. As Mackenzie had noted, one of the confusing things about pcase is its use of two punctuation marks, backquote and comma (","), that already have established uses in Elisp macro definitions: "pcase complicated the meaning of ` and ,. Before pcase these had definite meanings. Afterwards, they became highly context dependent."
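
The context dependence Mackenzie describes can be seen in a small sketch. Both functions below are hypothetical and exist only to put the two uses of backquote and comma side by side.

```elisp
(require 'pcase)

;; Ordinary use: backquote builds a list, and comma inserts a value.
(defun my-demo-make-point (x y)
  "Build the list (point X Y)."
  `(point ,x ,y))

;; In a pcase pattern: backquote describes the expected shape, and
;; comma binds a variable to whatever is found at that position.
(defun my-demo-point-x (obj)
  "Return the x coordinate if OBJ looks like (point X Y), else nil."
  (pcase obj
    (`(point ,x ,_) x)
    (_ nil)))
```

In the first function, ,x reads the existing value of x; in the second, ,x creates a new binding for x. The syntax is identical, and only the surrounding pcase form tells the reader which meaning applies.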

There were a few responses to Porter's message, largely in favor of his ideas; eventually, Emacs co-maintainer Stefan Kangas copied the post to Stefan Monnier, who is a former Emacs maintainer and the developer of pcase. Monnier was generally in favor as well; he thought that the doc string was not really the place for detailed backquote information, though some reorganization made sense. Stallman took exception to Porter's other suggestion to possibly mention pcase in "An Introduction to Programming in Emacs Lisp": "We should not encourage people learning Emacs Lisp to use pcase."

First draft

In mid-January, Stallman posted a first draft of cond*, asking for more testing, "constructive comments, bug reports, patches, and suggestions". Andrea Corallo asked about one particular feature of cond*:

what is the reason for some of these cond* clauses to keep the binding in effect outside the clause itself and for the whole cond* construct? At first glance it doesn't look very idiomatic in Lisp terms to me.

Corallo is referring to the bind* sub-clause, which can appear anywhere in the body of a cond* and binds variables whose scope does not end with the clause that contains them. Instead, those variables remain bound for the rest of the cond* body from that point onward, which departs from the usual expectation in Lisp that a binding is confined to the form that creates it.

    (cond*
         (CONDITION FORM)
         ((bind* (x 42)))  ; create a binding of 42 to x
         (CONDITION-using-x FORM-using-x)
         ...)
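
For contrast, here is a sketch of how a binding that must cover the remaining clauses is typically written with plain cond and let. It does not use cond* itself, and the function my-demo-classify is hypothetical; the point is only to show the nesting that a binding with "rest of the body" scope would avoid.

```elisp
(defun my-demo-classify (n)
  "Classify the integer N; the names here are illustrative only."
  (cond
   ((< n 0) 'negative)
   (t
    ;; X is needed by every remaining clause, so the rest of the
    ;; conditional has to move inside the `let'.
    (let ((x (* n n)))
      (cond
       ((< x 10) 'small-square)
       (t (list 'square x)))))))
```

Each additional binding of this kind adds another level of let and cond, whereas the bind* form shown above keeps all of the clauses at one level.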

As João Távora noted, others had already asked about that behavior, but he is "not sure we eventually clarified it". He also wondered if cond* could be built using pcase; if it is a strict subset of the features of pcase, it might help to do so. It "could actually facilitate the adoption path for 'cond*' (as questionable as that path may still be, at least for some parties)". But Stallman does not see things that way; he listed the advantages he sees with cond* and said that he plans to add the new macro to Emacs:

cond* has four basic advances over pcase: making bindings that cover the rest of the body, matching patterns against various data objects (not forcibly the same one), use of ordinary Lisp expressions as conditions in clauses, and the [ability] to make bindings and continue with further clauses.

I'm going to do some more testing and then install cond*.

Adam Porter asked Stallman to reconsider installing cond* into Emacs proper, suggesting that he should consider enhancing pcase rather than add a whole new facility that developers will need to learn. One of the complaints about pcase is that it is a burden to learn; "How will that burden be helped by having to learn both Pcase and cond*?" (Adam) Porter pointed out that pcase could handle many of the advances Stallman had listed, so there may be a path to enhance pcase:

Your stated reasons for writing cond* were various shortcomings of Pcase. Some of those, e.g. the documentation, have already had volunteers step up to address. The others could also be addressed in various ways. I've suggested a few, but you haven't explained the reasons for rejecting them.

As might be guessed, Stallman did not agree:

If pcase lacked features for certain specific jobs, it would be [easy] to fix that by adding a few features. However, the problem with pcase is that it has too many features for the job it does. cond* does the same jobs with fewer features because they work together better.

ELPA?

Kangas said that he was not closely following the discussion, but was surprised to hear that "there was a plan to install `cond*', or I would have spoken up sooner". Adding cond* will necessarily make the job of maintaining Emacs harder, since pcase is not going away, thus there will be "not one, but two relatively complex macros" that he will have to know and understand. That might be worth doing if cond* offered substantial benefits, but he does not see that; "What I see instead is a mere _version_ of `pcase'." So, he recommended making a new package for the GNU Emacs Lisp Package Archive (ELPA), which will provide "a good way of exploring an alternative version of an existing macro".

The lack of a plan to wholesale replace pcase with cond* was seen as a good thing by Kangas. That avoids a bunch of code churn and bugs that would naturally result. But Mackenzie disagreed; he thinks that fully replacing pcase would be a good goal for "improving the readability and maintainability of our code at a stroke". He thinks that putting it into an ELPA library is "a way of ensuring it never comes to anything and just gets forgotten about"; a feature branch with an eventual merge into the Emacs mainline would be a better course.

Like several others, Monnier believes that cond* is simply further complicating Elisp. On the other hand, he said, it would be a bit hypocritical for him to oppose it:

So, I'm not super enthusiastic about adding such a new form, but being responsible for the introduction of several such new constructs in ELisp over the years, I do feel a bit like the pot calling the kettle black.

So, rather than oppose it, I'll just point out some things which I think could be improved (or which offend my taste, as the case may be).

What followed was an extended back-and-forth between Stallman and Monnier about cond*, its overlaps with pcase (and how the two could perhaps "meet in the middle"), some deficiencies in the regular cond form with respect to binding inside of the conditional, and more. Much of the discussion revolved around Monnier's concern that the two constructs had overlapping, but still different, pattern languages:

[...] I see a fairly extensive but hardcoded pattern language, which seems like a regression compared to Pcase where the pattern language is built from a few primitives only (with the rest defined on top via `pcase-defmacro`). The worst part, tho, is that the two pattern languages are very similar, and I can't see any good reason for the differences.
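
The extension mechanism Monnier mentions, pcase-defmacro, lets new patterns be defined in terms of existing ones. The sketch below is a minimal illustration under assumed names (my-demo-even, my-demo-even-p, and my-demo-describe-number are all invented here); it is not code from the thread.

```elisp
(require 'pcase)

(defun my-demo-even-p (n)
  "Return non-nil if the integer N is even."
  (= 0 (% n 2)))

;; Define a new pattern, (my-demo-even VAR), in terms of the existing
;; `and' and `pred' patterns.  The backquote here is the ordinary
;; list-building one: the macro returns a pattern as data.
(pcase-defmacro my-demo-even (var)
  "Match an even integer and bind it to VAR."
  `(and (pred integerp) (pred my-demo-even-p) ,var))

(defun my-demo-describe-number (x)
  "Describe X using the pattern defined above."
  (pcase x
    ((my-demo-even n) (format "even: %d" n))
    ((pred integerp) "odd")
    (_ "not an integer")))
```

The new pattern is used like any built-in one, which is the sense in which the pcase pattern language is built up from a small set of primitives.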

Monnier asserted that Stallman could just reuse the low-level pattern language of pcase for cond*, which would have a number of benefits: "Less work, less code duplication, less documentation duplication, less to learn for coders. And presumably you'd then improve Pcase, so everyone wins."

The discussion continues as of this writing, though it is unclear which direction Stallman will take with the pattern language. That language, which is implemented by the match* form that would be added for cond*, could easily be changed to use the pcase machinery, Monnier said; a patch to do so has already been sent. Mackenzie objected to cond* using the pcase pattern handling, in part because of a lack of documentation of the low-level pcase machinery. Alfred M. Szmidt hoped that eventually pcase could be written to use cond*, instead of the reverse, but Monnier said that would not be technically feasible.

Adding cond*

At the end of January, Emacs co-maintainers Eli Zaretskii and Kangas announced that cond* would be installed in the Emacs core as an alternative to pcase; there would be no effort to switch away from pcase, however, as it is "to be considered a matter of stylistic preference". In the post, Kangas makes it clear that political, rather than strictly technical, considerations were part of the decision-making process:

Our responsibility as maintainers is first and foremost to ensure that we can all work together, and unite under a common banner. Our success as a project depends on it. Thus, the last thing we want to do is to alienate any group of contributors, big or small.

We believe that this is a more important concern than the arguments for or against cond* or pcase. The simple fact is that we have different backgrounds and experiences, which have tended to land us on either side of this discussion. This diversity is a strength, and not a weakness.

Even that has not satisfied everyone as there are apparently still some scars from pcase being installed in Emacs 14 years ago, at least for Mackenzie. He believes that the middle path chosen by the maintainers will not help resolve the problems that he and others have in understanding some pcase uses; he would still advocate a wholesale replacement using cond*. Stallman, on the other hand, does not think that pcase should be replaced, but does "hope to discourage its use inside Emacs in favor of cond*". And, of course, the decision did not stop yet another discussion of "cond* versus pcase" from breaking out and, naturally, continuing at length.

At the end of January, Stallman said that he was close to ready to commit the code to the Emacs Git repository, though that has not happened as of this writing. Zaretskii asked that documentation be added to the Elisp reference manual at the same time, so that may have slowed the process some. Before long, though, cond* should make an appearance in Emacs, so there will be two options for conditionals with pattern-matching and destructuring capabilities—for good or ill.




An alternate pattern-matching conditional for Elisp

Posted Mar 1, 2024 22:26 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (9 responses)

I have no actual knowledge of Emacs or Elisp. But every time I read an LWN article about them, I feel like I'm reading an article about the Québécois trying to convince people to stop writing things in English.

An alternate pattern-matching conditional for Elisp

Posted Mar 1, 2024 22:35 UTC (Fri) by Sesse (subscriber, #53779) [Link]

My feeling is more like: I am glad I am not supposed to maintain a program where the original author occasionally pops in and just overrides everything.

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 0:54 UTC (Sat) by Wol (subscriber, #4433) [Link] (5 responses)

It's the difference between a hand-saw, a chain-saw, and some mad inventor's idea of a saw.

I get the impression that Lisp (a functional language?) is the equivalent of a chain saw compared to the typical procedural language being your hand saw.

As such, lisp is typically more powerful, but harder to use. And as with so many things, people can't or won't put the effort into learning how to use powerful (dangerous?) tools. If you do it's often well worth while.

Cheers,
Wol

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 2:11 UTC (Sat) by mgb (guest, #3226) [Link] (4 responses)

Lisp is a simple and expressive procedural language. The beauty of Lisp is that programs are data.

There are only a couple of dozen basic elements to the language. Everything else is built from those elements using macros - real macros, not #defines.

This makes Lisp easy to compile, easy to extend, and easy to reason about. And if macros are used in moderation Lisp programs are also easy to understand.

Don't be put off by all the parens. With automatic indenting ("pretty printing") you barely notice them.

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 15:28 UTC (Sat) by rsidd (subscriber, #2582) [Link] (3 responses)

It is widely accepted that Lisp is a functional language like ML, though not "pure functional" like Haskell. And is the granddaddy of both of those. Most functional paradigms originated in Lisp.

The key point is the second in this well-known article by Paul Graham. In lisp, functions are first class objects. Also point 6, programs are compositions of expressions that return values.

Disclaimer: I have dabbled a little in scheme, less in common lisp, and though I use emacs I haven't written anything in emacs-lisp (only cut-paste from other sources).

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 18:14 UTC (Sat) by mgb (guest, #3226) [Link] (1 responses)

Function pointers are not the same as functional programming. Functional programming languages can perform higher-order functionality such as currying, but that is not a LISP (or C) feature.

Functional programming languages discourage or prohibit side-effects. You can write pure LISP (or pure C) but production LISP (and C) code uses side-effects. LISP and C are at roughly the same place on the imperative-functional spectrum.

LISP's ability to manipulate its own source code - it's all just lists - makes it easy to interpret new languages with different semantics. That is why many object-orientated languages, functional languages, and inference engines all came from the LISP direction, rather than FORTRAN or COBOL.

An alternate pattern-matching conditional for Elisp

Posted Mar 4, 2024 10:52 UTC (Mon) by jem (subscriber, #24231) [Link]

Functional programming is usually defined as a programming paradigm where functions are first-class citizens: functions can be bound to names, they can be passed as arguments to other functions and be returned from functions. Lisp checks all these boxes; to demote Lisp to the level of C on the imperative vs. functional scale you will have to come up with some more examples than just support for currying.

It is true that Lisp code often is sprinkled with calls to setq (like I showed in my other comment). At least one dialect of Lisp, Clojure, prohibits side effects. Clojure is not "pure", though: it does have escape hatches, but ordinary variables are immutable. This also includes data structures: lists, vectors, sets and maps. Changing the value of an element in a collection, no matter how large, always produces a new collection object. Looping is done using recursion.

Clojure does not support currying, but it does provide the function 'partial', which takes a function 'f' as argument plus a number of arguments to 'f', and returns a new function that takes the rest of the arguments normally supplied to 'f'. Clojure could have supported automatic currying, but that would have clashed with Clojure's multi-arity functions.

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 18:19 UTC (Sat) by jem (subscriber, #24231) [Link]

In practice, Emacs Lisp is not functional. Typical Emacs Lisp code looks like this:

    (setq i 10)

    (while (> i 0)
      (print i)
      (setq i (- i 1)))

(Taken from Elisp examples)

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 1:35 UTC (Sat) by dskoll (subscriber, #1630) [Link]

I use Emacs as my text editor (and have done so for 34 years... yikes!). I also spent two years working in Lisp, though Common Lisp and not Emacs Lisp.

Mostly, I watch this drama from the sidelines because I use Emacs... I've never done any Lisp customization of it. In general, I think moving to Common Lisp as much as possible is a good idea, but I'm sure it'll be as painful and drama-filled as the Python 2->3 move.

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 18:18 UTC (Sat) by atai (subscriber, #10977) [Link]

your 2 cents?

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 13:01 UTC (Sat) by gray_-_wolf (subscriber, #131074) [Link]

The bindings being valid for the rest of cond* does seem a bit weird; I am curious how much adoption this will get compared to pcase.

I personally prefer Scheme's approach (taken from GNU Guile's documentation):
    (let ((l '(hello (world))))
      (match l           ;; <- the input object
        (('hello (who))  ;; <- the pattern
         who)))          ;; <- the expression evaluated upon matching
    ⇒ world
It does not require ` for destructuring, which I think is cleaner. While it does suffer from the same problem of having a (somewhat) complex mini-language, it seems to work reasonably in practice. I wonder if you could write it in elisp.

PS: On a somewhat separate note, I feel like lately LWN is having a bit wider scope (no hard data, maybe I am just paying more attention). I like it. Especially given the fact that the articles tend to be written in the very typical LWN style full of backlinks into the discussion making it easy to get wider context. So, thank you for all the work the editors and writers do :)

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 19:31 UTC (Sat) by Phantom_Hoover (subscriber, #167627) [Link] (1 responses)

The weird clunky design of cond*, and particularly the inexplicably crippled way match* works, is frustratingly bad. Case statements with destructuring pattern matching are an extremely common and useful feature of modern programming languages, and Stallman’s perverse efforts to avoid implementing them are hard to understand except by assuming the same mentality that drove the designers of Go: ‘every idea that I hadn’t heard of by 1979 is overcomplicated bloat and I won’t allow it to pollute my language’. I guess I’m glad I never have to use Emacs Lisp.

An alternate pattern-matching conditional for Elisp

Posted Mar 2, 2024 19:49 UTC (Sat) by Phantom_Hoover (subscriber, #167627) [Link]

(I think I might have misread his spec and so this entire comment is in error, but there’s no delete comment button…)


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds