Useability is not good

Posted Dec 17, 2025 13:48 UTC (Wed) by mbunkus (subscriber, #87248)
In reply to: Useability is not good by pizza
Parent article: Conill: Rethinking sudo with object capabilities

> Comments in Javascript (ie the 'J' in 'JSON') are already well-defined. There was no need to reinvent any wheel here, and whether or not the syntax supports comments is orthogonal to how well-tested it is.

The JSON standard that's actually called JSON (aka ECMA-404 aka RFC 8259) does _not_ allow for comments and several other things that would make using it easier for humans (e.g. trailing comma after last array/object element, multi-line strings with \n instead of \\n). Most JSON parsers out there fail to parse JSON documents that contain JavaScript-style comments.
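
A quick way to check this against one strict parser is the sketch below, which feeds four documents to Python's standard `json` module. The helper name `parses` and the sample documents are made up for illustration; other parsers may be more lenient.

```python
import json

def parses(text: str) -> bool:
    """Return True if Python's strict json module accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Hypothetical sample documents, one per feature discussed above.
plain = '{"name": "demo", "ports": [80, 443]}'
with_comment = '{\n  // listen ports\n  "ports": [80, 443]\n}'
with_trailing_comma = '{"ports": [80, 443,]}'
raw_newline = '{"motd": "line one\nline two"}'  # real newline inside the string

for label, doc in [
    ("plain", plain),
    ("comment", with_comment),
    ("trailing comma", with_trailing_comma),
    ("raw newline in string", raw_newline),
]:
    print(f"{label}: {parses(doc)}")
```

Only the first document is accepted; the other three are rejected with a `JSONDecodeError`.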

There are other standards/projects with similar names such as JSON5 that _do_ have those features, or jsonc which is base JSON with comments. But that's not JSON.

And neither is JavaScript.

Sorry for being pedantic here; this is just a pet peeve of mine. In my opinion there's simply no really good text configuration format. Some of my objections:

- JSON: lack of support for comments, multi-line strings, trailing commas
- JSON: no (consistent) support for 64-bit integers (or larger) in a lot of parsers
- YAML: huge complexity due to all the features
- YAML: security implications due to object type thingies
- YAML: way too lenient with string values & trying to auto-guess data types resulting in a lot of surprising conversions that highly depend on the parser used
- YAML: incredibly easy to screw up the format for inexperienced authors
- YAML: sub-par tooling support due to lack of structural information in the format itself
- TOML: nested hashes require repeating all upper-level key names over & over again (e.g. "[settings]" → "[settings.auth]", "[settings.database]" etc.)
- TOML: lack of widespread language support
- XML: attributes vs data in child elements
- XML: easy to create structures that don't map 1:1 into array/hash hierarchies
- XML: without external type information it's impossible to know what's supposed to be array-like & what isn't
- XML: even less type information than any of the others
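
To make the 64-bit integer objection concrete, here is a minimal sketch using Python's standard `json` module. Python itself keeps integers exact, so the `parse_int=float` hook is used here only to emulate a parser that stores every number as an IEEE 754 double; that emulation is an assumption for illustration, not a description of any particular parser.

```python
import json

big = 2**53 + 1  # 9007199254740993, well within the signed 64-bit range
doc = json.dumps({"id": big})

# Python's parser keeps the integer exact.
exact = json.loads(doc)["id"]

# Emulate a double-only parser: every integer literal becomes a float.
lossy = json.loads(doc, parse_int=float)["id"]

print(exact)       # the original value
print(int(lossy))  # off by one after the round trip through a double
```

The document is perfectly valid JSON in both cases; the value that comes out depends entirely on the parser.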

Despite all of its drawbacks I tend to use YAML a lot more than JSON for anything that a human has to touch semi-regularly (e.g. Ansible stuff), simply due to basic JSON being so anti-maintenance.


Useability is not good

Posted Dec 17, 2025 14:09 UTC (Wed) by pizza (subscriber, #46)

> Even despite all of its drawbacks I tend to use YAML a lot more than JSON for anything that a human has to touch semi-regularly (e.g. Ansible stuff), simply due to basic JSON being so anti-maintenance.

Yep, that level of "anti-[human-]maintenance" is my fundamental beef with JSON...

Useability is not good

Posted Dec 18, 2025 9:05 UTC (Thu) by taladar (subscriber, #68407)

Where YAML really completely breaks is when you have to do any kind of templating of the file. Significant whitespace is just a really bad fit for that. Not only is it basically impossible to read but it is also very close to unwritable to anyone who isn't a masochist.
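
A minimal sketch of why text-level templating and significant whitespace clash, using Python's standard `string.Template` as a stand-in for a real template engine. The template, the `motd` key and the sample text are hypothetical.

```python
import textwrap
from string import Template

# Hypothetical template: a YAML block scalar with a text-level placeholder.
TEMPLATE = Template("motd:\n  text: |\n    $body\n")
body = "line one\nline two"

# Naive substitution: only the first line inherits the placeholder's indent,
# so the second line lands in column 0 and falls out of the block scalar.
naive = TEMPLATE.substitute(body=body)

# Workaround: indent every line to the placeholder's column, then strip the
# first line's indent because the template already supplies it.
fixed = TEMPLATE.substitute(body=textwrap.indent(body, "    ").lstrip())

print(naive)
print(fixed)
```

The template author has to know the exact column of every placeholder, which is the kind of bookkeeping a format with explicit delimiters would not need.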

Useability is not good

Posted Dec 18, 2025 12:09 UTC (Thu) by mbunkus (subscriber, #87248)

100% agree. I really hated that back when I was still using SaltStack. In SaltStack, unlike Ansible, the templating happens on the "whole file" level of the YAML rules/roles, meaning it'll be interleaved with regular YAML stuff — often completely breaking formatting, linting etc., as you said.

For example, in SaltStack you could do something like this to distribute two files:

{% set file_list = [ 'vimrc', 'bashrc' ] %}
{% for item in file_list %}
/etc/{{ item }}:
  file.managed:
    - source: salt://{{ item }}
{% endfor %}

In Ansible, though, templating can only be used in YAML values, meaning template expressions are always part of a YAML string. In order to provide basic loops & conditions Ansible itself has special hash keys it recognizes, evaluating the template code in the corresponding values & making the decision based on it. For example:

- ansible.builtin.file:
    src: "files/{{ item }}"
    dest: "/etc/
  loop: "{{ file_list }}"
  vars:
    file_list: [ 'vimrc', 'bashrc' ]

Whole-file is obviously much more powerful as you have a Turing-complete templating language at your disposal, but dealing with all but the simplest cases becomes a real pain.

Useability is not good

Posted Dec 18, 2025 16:03 UTC (Thu) by NYKevin (subscriber, #129325)

It frustrates me that nobody seems to know that textproto exists. That's the text serialization format of protocol buffers.[1] In general, it has the following advantages:

* Bindings for quite a few major languages.
* No semantic type ambiguities (schemas are mandatory, so the type of every field is known in advance). This is also used to propagate static type information into the language bindings (by generating per-schema serialization code).
* Few/no syntactic type ambiguities (strings are quoted, floats have decimal points or an "f" suffix, etc.).
* Comments are supported.
* Losslessly (and easily) converts into a binary format for efficient on-wire representation.
* Supports a JSON-like syntax, which should feel familiar to most people.

Disadvantages:

* Not opinionated enough. Several things can be spelled in multiple ways, and these spellings can be freely mixed.
* Quite a few bits of the linked spec say things like "depending on the implementation...," although this is mostly confined to edge cases where you write something silly and the implementation has to guess what you mean.
* Many enterprisey features that are not required for simple use cases. Expect to see "com.google" etc. show up a lot.
* Similar to TOML, it has support in several languages, but not in every language under the sun (contrast with JSON).
* If your build system is... less than ideal, then you probably think that generating code is scary and problematic. As you might expect, it works fine under Bazel, because that's how Google uses it internally.

Disclaimer: I'm a Google engineer, and Google invented protobufs.

[1]: https://protobuf.dev/reference/protobuf/textformat-spec/

Useability is not good

Posted Dec 19, 2025 10:26 UTC (Fri) by mbunkus (subscriber, #87248)

You're spot-on that I did not know about textproto until now. Thanks for mentioning it. It definitely looks interesting & I'll take a serious look at it.

Useability is not good

Posted Dec 18, 2025 19:49 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

> - XML: even less type information than any of the others

That's not quite correct. There's XSD that even allows you to restrict numbers to a specific subset.

Useability is not good

Posted Dec 19, 2025 10:25 UTC (Fri) by mbunkus (subscriber, #87248)

You're correct that with additional, out-of-band information you can map it unambiguously, but there's out-of-band info for other formats, too, e.g. JSON schema.

What I meant was that the following XML cannot be read by naive parsers & converted into hash-array structures without additional information such as a stylesheet or hints to the parser:

<xml>
  <settings>
    <something>42</something>
  </settings>

  <auth>
    <option/>
  </auth>
</xml>

First of all, the program might expect <settings> to be either a hash or an array; it's not obvious from a stylesheet-less XML alone. Here are two possible corresponding JSON representations:

{
  "settings": {
    "something": 42
  }
}

or even

{
  "settings": [
    { "something": "42" }
  ]
}

Second, there's no info about the type of <option> either. It might be: an empty string; a None/undefined/null type of value…
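
The ambiguity can be reproduced with a naive converter. The sketch below uses Python's standard `xml.etree.ElementTree`; the `to_data` function and its `list_tags` hint are invented for illustration and stand in for whatever out-of-band information a real program would need.

```python
import xml.etree.ElementTree as ET

DOC = """<xml>
  <settings>
    <something>42</something>
  </settings>
  <auth>
    <option/>
  </auth>
</xml>"""

def to_data(elem, list_tags=frozenset()):
    """Naively map an element to nested hashes/arrays.

    list_tags is the out-of-band hint: elements named there become arrays.
    """
    children = list(elem)
    if not children:
        # Leaf: all we have is text ("42" stays a string, <option/> is None).
        return elem.text
    if elem.tag in list_tags:
        return [{child.tag: to_data(child, list_tags)} for child in children]
    return {child.tag: to_data(child, list_tags) for child in children}

root = ET.fromstring(DOC)
as_hash = to_data(root)
as_list = to_data(root, list_tags={"settings"})

print(as_hash)
print(as_list)
```

Both results are defensible readings of the same document, and in both the converter has no way of knowing that "42" was meant to be a number or what the empty <option/> should become.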

Of course this is because XML is capable of expressing more complex structures than nested hash-arrays, but most programs nowadays use nested hash-arrays for any kind of configuration information, because it's more or less natural to build such structures, they're trivial to implement in most programming languages, and they map cleanly to all kinds of binary & text representations. XML's flexibility & capabilities are to its detriment when it is considered as a human-maintainable configuration format.

For me an ideal human-maintainable format has a couple of properties:

  • allows for comments (JSON loses here)
  • has in-band structural information to make tooling a viable option (pretty printing, structure validation, auto-indenting in an editor, easy navigation with "jump to key XYZ…" functionality; YAML loses here)
  • makes it harder to mess up the format (YAML & XML lose here)
  • has little repetition in what I have to type all the time (XML & TOML lose here)
  • is optically easy to grasp for us meatbags, not just agile computers (YAML loses here, but so does JSON when you have to deal with long strings)

I did not actually know about textproto which NYKevin has just mentioned. I will definitely look into that.

Useability is not good

Posted Dec 19, 2025 21:27 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)

Yeah, if you want to carry all the information in-band then there's no real good option. Integers/booleans are especially trippy. E.g. is 129387 an int32 or int64? Then what about JavaScript?

Useability is not good

Posted Dec 20, 2025 9:28 UTC (Sat) by Wol (subscriber, #4433)

Until the target system importing (or exporting) the JSON just doesn't care - it could be a straight infinite precision integer (or number) ...

Cheers,
Wol

Useability is not good

Posted Dec 19, 2025 10:23 UTC (Fri) by gioele (subscriber, #61675)

I'd add:

- TOML: no difference between "this field is set to null" and "this field is not set and thus it has the default value". (See https://lobste.rs/s/h50lml/toml_1_1_0_released#c_wiibz1 for a longer discussion.)

And I'd remove:

- XML: attributes vs data in child elements

Why is that a problem? Many IDLs offer you a way to specify non-structured metadata about an object (attributes) or encapsulated structured/non-structured data (child elements).

Useability is not good

Posted Dec 19, 2025 11:13 UTC (Fri) by mbunkus (subscriber, #87248)

>> - XML: attributes vs data in child elements

> Why is that a problem? Many IDLs offer you a way to specify non-structured metadata about an object (attributes) or encapsulated structured/non-structured data (child elements).

Due to how the distinction between data & metadata isn't actually adhered to or, god forbid, enforced somehow. Just check the Apache Tomcat server.xml configuration files. While you can map that data into hash-array structures, you cannot do the reverse without additional information provided to the writer (stylesheet, explicit writer configuration etc.).

Again, I'm only arguing from a standpoint of having a format that humans can easily maintain. In this situation the duality or overlap of functionality of attributes & child elements is a clear detriment. Not only do you have to remember which options exist, but also whether the parser/application expects those to be an attribute or a child element. Furthermore, element & attribute names are often spelled differently, again placing more cognitive load on us humans.

Sure, good tooling & stylesheets fix some of those concerns. And no, XML isn't unusable, of course. I'm just… frustrated by the lack of a format I can consider really easy to use for us humans with only very minor drawbacks.

Useability is not good

Posted Dec 22, 2025 9:15 UTC (Mon) by taladar (subscriber, #68407)

Where XML really loses is its ridiculous escaping system with giant lookup tables, making every XML library comparatively huge and escaping support a pain to implement.

Useability is not good

Posted Dec 22, 2025 9:37 UTC (Mon) by gioele (subscriber, #61675)

> Where XML really loses is its ridiculous escaping system with giant lookup tables

What are you referring to exactly? XML's escaping system defines exactly 5 entities (lt, gt, amp, apos, quot), a way for DTD authors to define their own entities (sadly known for the "billion laughs" attack), and a generic mechanism for referring to Unicode codepoints (&#248; for ø). None of that requires "giant lookup tables".

Maybe you're mixing XML with HTML, whose predefined list of char entities is quite long and contains unusual things like " &npart; (partial differential, combining long solidus overlay)"? https://en.wikipedia.org/wiki/List_of_XML_and_HTML_charac...
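
For what it's worth, the difference is easy to observe with Python's standard library. This is only a sketch: the `xml_text` helper is made up for illustration, and the behaviour shown is that of Python's bundled parser on a document without a DTD.

```python
import xml.etree.ElementTree as ET
from html.entities import html5

# The five entities predefined by XML itself.
PREDEFINED = ("lt", "gt", "amp", "apos", "quot")

def xml_text(body: str) -> str:
    """Parse body inside a dummy <v> element and return its text."""
    return ET.fromstring(f"<v>{body}</v>").text

# Predefined entities plus a numeric character reference need no lookup table.
decoded = xml_text("".join(f"&{name};" for name in PREDEFINED) + "&#248;")
print(decoded)

# An HTML-only entity is simply undefined in plain XML without a DTD.
try:
    xml_text("&nbsp;")
    nbsp_ok = True
except ET.ParseError:
    nbsp_ok = False
print(nbsp_ok)

# The long table belongs to HTML, not XML.
print(len(html5) > 2000)
```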

Useability is not good

Posted Dec 22, 2025 10:24 UTC (Mon) by taladar (subscriber, #68407)

Well, HTML and pretty much any concrete XML format I have ever had the displeasure of dealing with. The basic design of the entity system is just deeply flawed when compared to the much simpler escaping mechanisms in other text formats. I'd much rather deal with five levels of backslash escaping in a template that generates shell code that uses regular expression parameters to modify some other stuff with backslash escaping than with XML, and that is not because nested backslash escaping is fun to deal with (especially when each level has slightly different things that need backslash escaping).


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds