LWN: Comments on "Git v2.24.1 and others" https://lwn.net/Articles/806972/ This is a special feed containing comments posted to the individual LWN article titled "Git v2.24.1 and others". en-us Mon, 20 Oct 2025 18:44:25 +0000 Mon, 20 Oct 2025 18:44:25 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Git v2.24.1 and others https://lwn.net/Articles/807999/ https://lwn.net/Articles/807999/ geert <div class="FormattedComment"> Looks like we're still lacking a few characters to run a C64 emulator in a terminal window? ;-)<br> </div> Mon, 23 Dec 2019 15:13:33 +0000 Git v2.24.1 and others https://lwn.net/Articles/807795/ https://lwn.net/Articles/807795/ flussence <div class="FormattedComment"> The bulk of existing emojis came from a set already used among Japanese phone carriers around the turn of the century. That's why U+1F5FC is labelled the Tokyo (not Eiffel) Tower, and most of that block is similarly laden with cultural artifacts most people wouldn't be familiar with.<br> <p> We're actually running out of things to add to Unicode. New emoji proposals are in short supply and most of the recent additions have been ancient scripts and increasingly obscure precomposed CJK glyphs. Maybe of more relevance to people reading this, Unicode 13 is adding characters from ancient computer systems (Spectrum, Teletext, C64 and the like): <a href="https://www.unicode.org/charts/PDF/Unicode-13.0/">https://www.unicode.org/charts/PDF/Unicode-13.0/</a><br> </div> Fri, 20 Dec 2019 00:33:19 +0000 Git v2.24.1 and others https://lwn.net/Articles/807576/ https://lwn.net/Articles/807576/ jezuch <div class="FormattedComment"> With things like this, there usually comes a time when you can say that the problem is basically solved. It was not in 2002, but it pretty much is now. Just be glad that Unicode took upon itself the very ungrateful task of developing a universal standard for text encoding, and that you didn't have to do that yourself. It's now done. 
We can finally use it to clean up the royal mess of all the previous attempts.<br> <p> It's like with date handling: do not *ever* write your own date handling library; now that even Java got a decent support for it in its standard library, you don't have to be stupid like this anymore ;) And one standard is enough. We've got it solved, let's move on already.<br> </div> Wed, 18 Dec 2019 12:24:28 +0000 Git v2.24.1 and others https://lwn.net/Articles/807575/ https://lwn.net/Articles/807575/ NAR <div class="FormattedComment"> Aren't emojis using up UTF-8 codepoints? I can very well imagine the human race fill up the UTF-8 codepoints, unfortunately...<br> </div> Wed, 18 Dec 2019 12:06:14 +0000 Git v2.24.1 and others https://lwn.net/Articles/807551/ https://lwn.net/Articles/807551/ Cyberax <div class="FormattedComment"> UCS-2 would have been an improvement over the binary garbage. And anyway, UTF-8 is now universal for actual languages and there's plenty of space for new ones.<br> <p> Pretty much the only case where UTF-8 won't be enough is if the Earth join the Galactic Federation with FTL communications. But in this case I think that migration off UTF-8 would be a good problem to have.<br> </div> Tue, 17 Dec 2019 22:26:33 +0000 Git v2.24.1 and others https://lwn.net/Articles/807549/ https://lwn.net/Articles/807549/ flussence <div class="FormattedComment"> Somehow I doubt the human race will fill up the other 90% of UTF-8 codepoints in less than 0.1% the time it took to invent the first 10%.<br> </div> Tue, 17 Dec 2019 22:02:25 +0000 Git v2.24.1 and others https://lwn.net/Articles/807348/ https://lwn.net/Articles/807348/ zlynx <div class="FormattedComment"> I am glad we have arbitrary binary garbage.<br> <p> Otherwise Linux / Unix would have ended up like the others in love with "the future" and we'd be stuck with UCS-2 circa 2002. 
Which is neither big enough for every character, nor space efficient.<br> <p> Won't it be fun in the year 2100 when users have to create wild WTF-8 hacks to work around the encoding limitations hard coded into their virtual storage backend.<br> </div> Mon, 16 Dec 2019 00:13:03 +0000 Git v2.24.1 and others https://lwn.net/Articles/807330/ https://lwn.net/Articles/807330/ nix <div class="FormattedComment"> One hardware page? Oh, that's interesting -- if there is no lower fixed constraint, you can construct filesystems on systems with big page sizes containing files that you can't access at all on systems with a smaller page size! Most unexpected.<br> </div> Sun, 15 Dec 2019 16:52:45 +0000 Git v2.24.1 and others https://lwn.net/Articles/807323/ https://lwn.net/Articles/807323/ Cyberax <div class="FormattedComment"> Yet the filesystems fail even the trivial part..<br> </div> Sun, 15 Dec 2019 11:45:57 +0000 Git v2.24.1 and others https://lwn.net/Articles/807322/ https://lwn.net/Articles/807322/ adobriyan <div class="FormattedComment"> UTF-8 implies Unicode. People doing Unicode say that UTF-8 part is trivial and the hard part is at glyph/grapheme level.<br> </div> Sun, 15 Dec 2019 11:43:48 +0000 Git v2.24.1 and others https://lwn.net/Articles/807312/ https://lwn.net/Articles/807312/ Cyberax <div class="FormattedComment"> <font class="QuotedText">&gt; Unicode is big, there are 8 newlines.</font><br> Unicode? What Unicode? Unix file names need not be Unicode in any encoding. 
They can be arbitrary binary garbage.<br> <p> Getting filenames to be valid UTF-8 would be an awesome improvement over the status quo.<br> </div> Sun, 15 Dec 2019 00:11:30 +0000 Git v2.24.1 and others https://lwn.net/Articles/807311/ https://lwn.net/Articles/807311/ adobriyan <div class="FormattedComment"> The fix is to not use shell scripts for anything serious.<br> <p> David Wheeler's logic is like this:<br> most shell scripts are buggy because shell authors make it easy to make mistakes despite knowing perfectly well that Unix allows whitespace in filenames, therefore OS kernel should accommodate shell users.<br> <p> We've seen this pattern before:<br> shell can't do system calls therefore OS kernel should interface in text which is inferior in nearly any way.<br> <p> Real programming languages (say Python) don't have the problem with whitespace (subprocess.call()).<br> <p> Maybe it is the Unix shells that should be fixed?<br> <p> Even if whitespace and other characters are banned where will they stop? Unicode is big, there are 8 newlines.<br> </div> Sat, 14 Dec 2019 23:37:59 +0000 Git v2.24.1 and others https://lwn.net/Articles/807170/ https://lwn.net/Articles/807170/ epa Another plug for <a href="https://dwheeler.com/essays/fixing-unix-linux-filenames.html">Fixing Unix/Linux/POSIX Filenames</a>, for those who have not yet read it. Thu, 12 Dec 2019 13:30:13 +0000 Git v2.24.1 and others https://lwn.net/Articles/807134/ https://lwn.net/Articles/807134/ pabs <div class="FormattedComment"> Please file a feature request about detecting this. There is a large backlog of such requests, but they will get to it eventually.<br> </div> Wed, 11 Dec 2019 23:59:03 +0000 Git v2.24.1 and others https://lwn.net/Articles/807103/ https://lwn.net/Articles/807103/ JoeBuck <p> That's a cool tool. I did notice one flaw: it reports the common problem of piping find ... -print output to xargs, recommending use of -print0. 
But it didn't flag the issue that for <pre> find [path] [patterns] -print0 | xargs ... </pre> xargs must be given the -0 option. It said "no problems found" when -print was changed to -print0. Wed, 11 Dec 2019 18:56:34 +0000 Git v2.24.1 and others https://lwn.net/Articles/807101/ https://lwn.net/Articles/807101/ cesarb <div class="FormattedComment"> Besides the bytes 0x00 (NUL) and 0x2f (slash), filenames consisting of exactly zero, one, or two 0x2e (dot) bytes and nothing more are also reserved. Other than these, and a maximum length limit of one hardware page, there are AFAIK no other restrictions on filenames on the Linux VFS.<br> </div> Wed, 11 Dec 2019 18:50:48 +0000 Git v2.24.1 and others https://lwn.net/Articles/807094/ https://lwn.net/Articles/807094/ nix <div class="FormattedComment"> And, of course, Unicode lookalikes are still possible:<br> <p> nix@loom 1179 ~/oracle/tmp/luci% ls -l foo*bar<br> -rw-rw-r-- 1 nix nix 0 Dec 11 17:13 foo⧸bar<br> <p> (That's U+29F8 BIG SOLIDUS.)<br> </div> Wed, 11 Dec 2019 17:14:35 +0000 Git v2.24.1 and others https://lwn.net/Articles/807040/ https://lwn.net/Articles/807040/ abatters <div class="FormattedComment"> I find this useful to find some common problems in shell scripts:<br> <p> <a href="https://www.shellcheck.net/">https://www.shellcheck.net/</a><br> </div> Wed, 11 Dec 2019 14:37:39 +0000 Git v2.24.1 and others https://lwn.net/Articles/807029/ https://lwn.net/Articles/807029/ Vorpal <div class="FormattedComment"> They can also contain newlines. In fact the only disallowed characters by Linux are \0 (null bytes) and / (the path separator). Of course, certain file systems (such as vfat) can have additional restrictions.<br> </div> Wed, 11 Dec 2019 14:27:30 +0000 Git v2.24.1 and others https://lwn.net/Articles/807012/ https://lwn.net/Articles/807012/ bovinespirit <div class="FormattedComment"> "Filenames on Linux/Unix can contain backslashes." <br> <p> $ touch " eat:\ this Bash! 
"<br> $ for f in $(ls); do echo $f; done<br> eat:\<br> this<br> Bash!<br> <p> So they can! I love/hate shell scripting!<br> <p> </div> Wed, 11 Dec 2019 08:44:54 +0000
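Editorial sketch, not part of any comment in the thread: the word-splitting shown in the last comment, and the earlier remark that Python's subprocess calls avoid it, can be illustrated roughly as follows. This assumes a Linux or other POSIX system and Python 3.7 or later. The file name is modeled on the touch example, and the helper names (list_names, echo_via_child, demo) are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# A name modeled on the touch example: leading/trailing spaces and a backslash.
AWKWARD_NAME = " eat:\\ this Bash! "


def list_names(directory):
    """Return directory entries as whole strings, one per file."""
    return sorted(os.listdir(directory))


def echo_via_child(name):
    """Pass the name to a child process as one argv element; no shell is involved."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def demo():
    """Create the awkward file, then compare naive splitting with list-based handling."""
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, AWKWARD_NAME), "w").close()
        names = list_names(tmp)
        # Roughly what `for f in $(ls)` does: split the listing on whitespace.
        naive_words = " ".join(names).split()
        # The name survives unchanged when passed as a list element.
        round_tripped = echo_via_child(names[0])
    return names, naive_words, round_tripped


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that the name stays one string from os.listdir through to the child's argv, whereas splitting on whitespace produces three unrelated words. The same idea is why find ... -print0 needs xargs -0 on the receiving end.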