LWN: Comments on "The XArray data structure" https://lwn.net/Articles/745073/ This is a special feed containing comments posted to the individual LWN article titled "The XArray data structure". en-us Thu, 09 Oct 2025 01:03:00 +0000 Thu, 09 Oct 2025 01:03:00 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net The XArray data structure https://lwn.net/Articles/982909/ https://lwn.net/Articles/982909/ dafnaf <div class="FormattedComment"> <span class="QuotedText">&gt;&gt; The XArray is intended to have performance and usage characteristics as close as possible to a C native array. Unlike, say, a hash table.</span><br> <p> could you clarify that? since afaik a hash table also suppose to have the same complaxity as an array, that is avarage of O(1) for get/set as the linked list in each bucket suppose to be "short enough". So when should we use hash table and when xarray?<br> </div> Tue, 23 Jul 2024 12:58:17 +0000 (XArray) C++ → C naming conversions https://lwn.net/Articles/784451/ https://lwn.net/Articles/784451/ nybble41 <div class="FormattedComment"> <font class="QuotedText">&gt; How would "std::map&lt;unsigned long, std::tuple&lt;void*, bool[3]&gt;&gt;, Alloc&gt; …" even map to C as a name?</font><br> <p> This is a solved problem. After all, G++ already generates C-compatible names which encode their C++ argument types.<br> <p> Just call it St3mapImSt5tupleIJPvA3_bEESt4lessImESaISt4pairIKmS3_EEE_array. :-)<br> </div> Sun, 31 Mar 2019 16:22:21 +0000 (XArray) C++ → C naming conversions https://lwn.net/Articles/784436/ https://lwn.net/Articles/784436/ zenaan <div class="FormattedComment"> <font class="QuotedText">&gt;&gt; (That said, it's totally a std::map&lt;unsigned long, std::tuple&lt;void*, bool[3]&gt;&gt;, Alloc&gt; with internal locking, stable iterators, and atomic value operators...except for the parts where it's different from that)</font><br> <p> <font class="QuotedText">&gt; I'm not a C++ programmer, nor a theoretician. I'm happy for the API to not match C++, as long as it matches people's expectations.</font><br> <p> How would "std::map&lt;unsigned long, std::tuple&lt;void*, bool[3]&gt;&gt;, Alloc&gt; with internal locking, stable iterators, and atomic value operators..." even map to C as a name?<br> <p> Discluding the behaviour characteristics, and taking the first letter of each C++ 'word' → smulstvbA_array perhaps?<br> </div> Sun, 31 Mar 2019 00:49:06 +0000 The XArray data structure https://lwn.net/Articles/746281/ https://lwn.net/Articles/746281/ andresfreund <div class="FormattedComment"> Sorry to be that blunt: Could you perhaps consider commenting a little less frequently? You comment on nearly everything - not infrequently the majority of new comments are yours. Even ignoring them reduces the usability of the "unread comments" page noticeably.<br> </div> Fri, 02 Feb 2018 11:07:37 +0000 The XArray data structure https://lwn.net/Articles/746273/ https://lwn.net/Articles/746273/ Wol <div class="FormattedComment"> No. The USERS don't give a monkeys. The implementers care deeply ...<br> <p> Cheers,<br> Wol<br> </div> Fri, 02 Feb 2018 09:25:34 +0000 The XArray data structure https://lwn.net/Articles/746271/ https://lwn.net/Articles/746271/ Wol <div class="FormattedComment"> That's a sparse array ... :-)<br> <p> Cheers,<br> Wol<br> </div> Fri, 02 Feb 2018 09:24:49 +0000 The XArray data structure https://lwn.net/Articles/745621/ https://lwn.net/Articles/745621/ nix <div class="FormattedComment"> That's exactly what I was thinking. This is not a tree. It's an array that happens to have a tree as its internal storage representation currently. You can't have an array element with no value! (Sure, it's sparse, but all that means is that null elements are cheaper than non-null ones).<br> <p> </div> Mon, 29 Jan 2018 13:50:51 +0000 The XArray data structure https://lwn.net/Articles/745613/ https://lwn.net/Articles/745613/ willy <div class="FormattedComment"> You'd be amazed how few places actually need the preloading functionality. We had 42 calls to various forms of the preload functions.<br> <p> I've entirely removed the radix tree preload from my tree and I have only seven calls to xa_reserve() (in four files). Until a few weeks ago, I hadn't found anywhere with sufficiently complex locking requirements to need xa_reserve(). The general pattern that needs xa_reserve() is:<br> <p> radix_tree_preload();<br> spin_lock(some_other_lock);<br> spin_lock(tree_lock);<br> radix_tree_insert();<br> ...<br> <p> Sometimes you can invert the order of the locks, or combine the two locks, or resort to the advanced API in order to explicitly handle unlocking before allocating memory, but converting it to:<br> <p> xa_reserve();<br> spin_lock(other_lock);<br> xa_store();<br> <p> is easier.<br> <p> If the out-of-tree lustre code has developed new and exciting ways to use the radix tree API, then they'll get to fix it up ... maintaining out of tree patches is expensive and drivers/staging/ is there to lower the cost.<br> </div> Mon, 29 Jan 2018 09:54:13 +0000 The XArray data structure https://lwn.net/Articles/745609/ https://lwn.net/Articles/745609/ willy <div class="FormattedComment"> Hi Zygo, great to hear from you again<br> <p> The XArray is intended to have performance and usage characteristics as close as possible to a C native array. Unlike, say, a hash table. I don't want anyone to be surprised by a mismatch between the naming and the behaviour.<br> <p> You're right that there's no right name for this data structure. The closest thing I've found in the literature is the Judy Array. So I sidestepped the question of what the _data structure_ is called and focused on what it provides (an array abstraction). That enables us to change the underlying implementation as we see fit. There were a number of ways the old API leaked the internal details of the data structure to the callers, and I've eliminated those.<br> <p> I'm not a C++ programmer, nor a theoretician. I'm happy for the API to not match C++, as long as it matches people's expectations.<br> <p> </div> Mon, 29 Jan 2018 08:58:31 +0000 The XArray data structure https://lwn.net/Articles/745607/ https://lwn.net/Articles/745607/ willy <div class="FormattedComment"> Thanks for writing up my talk, Jon!<br> <p> I have a quibble with your article,<br> <p> <font class="QuotedText">&gt; The return value on success is the previous value (if any) that was stored at index.</font><br> <p> This is the kind of radix tree thinking that I'm trying to eradicate. Every index has a value stored at it. An "empty array" has a NULL pointer stored at every index; and a freshly-initialised array is empty. Just deleting the parenthetical would make me happy :-)<br> <p> If you can suggest an improvement to the documentation that would help people get to this mindset, I'd be more than happy to incorporate it. It may just be a matter of time and people getting used to the new way of thinking.<br> <p> </div> Mon, 29 Jan 2018 04:27:46 +0000 The XArray data structure https://lwn.net/Articles/745447/ https://lwn.net/Articles/745447/ ncultra <div class="FormattedComment"> The api is similar to the existing, though sparsely used, flex_array library, which has pre-allocation but not locking. I have found it useful but am concerned that no upstream kernel code seems to be using flex_arrays. <br> </div> Fri, 26 Jan 2018 22:09:28 +0000 The XArray data structure https://lwn.net/Articles/745328/ https://lwn.net/Articles/745328/ zblaxell <div class="FormattedComment"> Everyone working in a new programming environment, even if the language is familiar, has to figure out whether each thing with the word "array" in its name is some flavor of btree, dict, hash table, map, rbtree, set, vector, constant pointer + index * size, "other", or "not implemented here." Once that's known for "array", one then has to learn the local names for all the other things too. They all have different capabilities, limitations, performance, and unimaginable quirks that make the wrong choice wrong for at least one situation.<br> <p> The wrong word was used here, but that was inevitable, as there's always someone who joins the party late and complains "but in my tribe we always called that by another name." There could never have been a correct name for this structure--only a previously unused one, preferably something with a short unique acronym to use as a prefix.<br> <p> (That said, it's totally a std::map&lt;unsigned long, std::tuple&lt;void*, bool[3]&gt;&gt;, Alloc&gt; with internal locking, stable iterators, and atomic value operators...except for the parts where it's different from that)<br> </div> Thu, 25 Jan 2018 21:27:55 +0000 The XArray data structure https://lwn.net/Articles/745323/ https://lwn.net/Articles/745323/ rweikusat2 <div class="FormattedComment"> <font class="QuotedText">&gt; Yes, but I think "a map structure keyed by an integer" is essentially the definition of "array" (at least as an abstract data type), and </font><br> <font class="QuotedText">&gt; it's generally easier to use the shorter name.</font><br> <p> Abstractly, an 'array' is an ordered set of values which can be accessed by position in constant time. That's something entirely different from a map associating members from a set of keys with members from a set of values. Eg, they key set of a map using integer keys might be {-60, -1, 1434, 131072}.<br> <p> <p> </div> Thu, 25 Jan 2018 20:19:08 +0000 The XArray data structure https://lwn.net/Articles/745317/ https://lwn.net/Articles/745317/ bof <div class="FormattedComment"> <font class="QuotedText">&gt; (Except in Javascript, where an array is actually a map structure keyed by the decimal string representation of an integer.)</font><br> <p> Or in PHP, where it can be both, at the same time...<br> <p> <font class="QuotedText">&gt; The concrete implementation of that data type is a separate issue; it could be a series of consecutive memory addresses, or a hash table, or a red-black tree, or a radix tree, or something that changes depending on the density of its contents, etc. But users of an API shouldn't care about those implementation details - they just care that it behaves like an abstract array with certain performance characteristics, so it seems sensible to call it an array.</font><br> <p> In kernel space, users probably do care about such details very much. Otherwise the kernel would have been rewritten in PHP a long time ago...<br> </div> Thu, 25 Jan 2018 18:00:43 +0000 The XArray data structure https://lwn.net/Articles/745310/ https://lwn.net/Articles/745310/ excors <div class="FormattedComment"> Yes, but I think "a map structure keyed by an integer" is essentially the definition of "array" (at least as an abstract data type), and it's generally easier to use the shorter name.<br> <p> (Except in Javascript, where an array is actually a map structure keyed by the decimal string representation of an integer.)<br> <p> The concrete implementation of that data type is a separate issue; it could be a series of consecutive memory addresses, or a hash table, or a red-black tree, or a radix tree, or something that changes depending on the density of its contents, etc. But users of an API shouldn't care about those implementation details - they just care that it behaves like an abstract array with certain performance characteristics, so it seems sensible to call it an array.<br> </div> Thu, 25 Jan 2018 17:20:22 +0000 The XArray data structure https://lwn.net/Articles/745302/ https://lwn.net/Articles/745302/ skitching <div class="FormattedComment"> Isn't this equivalent to a map (aka dictionary) structure keyed by an integer? That metaphore works better for me than a "dynamic array".<br> </div> Thu, 25 Jan 2018 14:42:48 +0000 The XArray data structure https://lwn.net/Articles/745244/ https://lwn.net/Articles/745244/ LawnGnome <div class="FormattedComment"> Apparently there have been issues with video recording lockups at past conferences not being noticed until a talk has already been underway, so there needs to be movement in the frame for a quick sanity check as the speaker is preparing to begin. Some conference venues have clocks, but the linux.conf.au venue doesn't this year, so they've deployed a small clowder of maneki-neko instead.<br> </div> Wed, 24 Jan 2018 21:20:04 +0000 Preallocation https://lwn.net/Articles/745242/ https://lwn.net/Articles/745242/ corbet Indeed, preallocation is not entirely removed. It's not visible in the normal API, though, and has become something that most callers need not worry about. Wed, 24 Jan 2018 21:16:39 +0000 The XArray data structure https://lwn.net/Articles/745241/ https://lwn.net/Articles/745241/ meyert <div class="FormattedComment"> What is the reason for the maneki-nekos at conferences these days?<br> </div> Wed, 24 Jan 2018 20:48:17 +0000 The XArray data structure https://lwn.net/Articles/745202/ https://lwn.net/Articles/745202/ Paf <div class="FormattedComment"> Yeah, the Lustre filesystem (possibly not in the upstream kernel version) has a tricky interaction with a radix tree where it has to do a preload or it can deadlock. I didn’t see how it was possible to not have the preload in some form. Relieved to hear it’s still there.<br> </div> Wed, 24 Jan 2018 13:54:49 +0000 The XArray data structure https://lwn.net/Articles/745199/ https://lwn.net/Articles/745199/ cborni <div class="FormattedComment"> The preloading is used in some cases with tricky locking. Looking at the patches, preloading is not removed, it is now replace by a different scheme with a call names xa_reserve.<br> </div> Wed, 24 Jan 2018 12:37:49 +0000