
NT (Windows kernel) doesn't care about filenames any more than Linux


Posted Mar 28, 2009 15:36 UTC (Sat) by tialaramex (subscriber, #21167)
In reply to: Wheeler: Fixing Unix/Linux/POSIX Filenames by dwheeler
Parent article: Wheeler: Fixing Unix/Linux/POSIX Filenames

It's always worth telling people this, because it tends to make them rock back on their heels if they've been (wrongly) believing that NT is doing something special here.

NT (the kernel API in Windows NT, 2000, XP, etc.) doesn't care about filename encodings. The only thing that makes NT's attitude to such things different from Linux's is that the arbitrary sequences of non-zero code units NT uses for filenames are made of 16-bit code units, whereas in Linux they're 8-bit.

Everything else you see, such as case-insensitivity, bans on certain characters or sequences of characters, is implemented in other layers of the OS or even in language runtimes, not the kernel. Low-level programmers, just as on Unix, can call a file anything they like.
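To make the "arbitrary sequences of non-zero code units" point concrete, here is a minimal Python sketch. It is a simplified model, not kernel code: the helper names are made up for illustration, and the path separator check (`/` on Linux, `\` on NT) is an added assumption on top of the non-zero rule described above.

```python
def kernel_accepts(units, unit_bits, separator):
    """Simplified model: a name is any non-empty run of non-zero code units
    of the given width that does not contain the path separator."""
    limit = 1 << unit_bits
    return len(units) > 0 and all(0 < u < limit and u != separator for u in units)


def is_well_formed_utf8(units):
    """True if the 8-bit units decode as strict UTF-8."""
    try:
        bytes(units).decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def is_well_formed_utf16(units):
    """True if the 16-bit units decode as strict UTF-16 (little endian)."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    try:
        raw.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical names chosen to be acceptable as code units but broken as text.
linux_name = [0x66, 0xFF, 0x6F]        # 'f', stray 0xFF byte, 'o'
nt_name = [0x0066, 0xD800, 0x006F]     # 'f', unpaired high surrogate, 'o'

print(kernel_accepts(linux_name, 8, 0x2F), is_well_formed_utf8(linux_name))
print(kernel_accepts(nt_name, 16, 0x5C), is_well_formed_utf16(nt_name))
```

In both cases the name passes the code-unit test and fails the encoding test, which is exactly the gap a careless program falls into.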

And the consequence is the same thing being lamented in this article: badly written Windows programs crash or do insane things when faced with filenames that don't look like the ones the poor third-rate programmer who wrote the code was familiar with. In the absence of defensive programming this software also doesn't like leap years, or leap seconds, or files that are more than 2GB long, or... you could go on all day; badly written programs suck.

On encodings - I encourage you to use UTF-8. I encourage people with other encodings to migrate to UTF-8, but using UTF-8 and blindly trusting that everything you work with is actually legal and meaningful display-safe UTF-8 are quite different things. People who can't keep them separate are doing a bad job, whether handling filenames or displaying email.
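As a rough illustration of keeping "is UTF-8" and "is safe to display" separate, here is a small Python sketch. The function name and the three result labels are invented for this example, and treating every Unicode "C" (control, format, private use, unassigned) category as unsafe is only one possible policy, not a standard.

```python
import unicodedata


def classify_filename(raw: bytes) -> str:
    """Classify a raw filename; labels are illustrative, not a standard."""
    try:
        text = raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return "not-utf8"
    # Assumed policy: any code point in a Unicode "C*" category (controls,
    # format characters such as bidi overrides, private use, unassigned)
    # makes the name unsafe to display as-is.
    if any(unicodedata.category(ch).startswith("C") for ch in text):
        return "utf8-but-unsafe"
    return "ok"


print(classify_filename(b"report.txt"))
print(classify_filename(b"caf\xe9.txt"))                      # Latin-1 bytes
print(classify_filename("evil\u202etxt.exe".encode("utf-8")))  # bidi override
```

A program would typically escape or replace anything that is not "ok" before showing it to a user, while still passing the original bytes to the OS untouched.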



NT (Windows kernel) doesn't care about filenames any more than Linux

Posted Mar 29, 2009 14:36 UTC (Sun) by epa (subscriber, #39769)

> NT (the kernel API in Windows NT, 2000, XP, etc.) doesn't care about filename encodings. The only thing that makes NT's attitude to such things different from Linux's is that the arbitrary sequences of non-zero code units NT uses for filenames are made of 16-bit code units, whereas in Linux they're 8-bit.

> Everything else you see, such as case-insensitivity, bans on certain characters or sequences of characters, is implemented in other layers of the OS or even in language runtimes, not the kernel. Low-level programmers, just as on Unix, can call a file anything they like.

Does that mean if you code against the NT API directly, you can create files foo and FOO in the same directory? I expect that opens up all sorts of juicy security holes - many of them theoretical, since a typical NT system has just one user and there is not much need for privilege escalation - but still it sounds fun.

> using UTF-8 and blindly trusting that everything you work with is actually legal and meaningful display-safe UTF-8 are quite different things.

Indeed. Hence the benefit of enforcing this at the OS level: it gets rid of the need for sanity checks that slow down the good programmers and were never written anyway by the bad programmers.

NT (Windows kernel) doesn't care about filenames any more than Linux

Posted Mar 30, 2009 10:55 UTC (Mon) by nye (guest, #51576)

>Does that mean if you code against the NT API directly, you can create files foo and FOO in the same directory?

Yes. This is what the POSIX subsystems for NT do; they're implemented on top of the native API, as is the Win32 API. Note that Cygwin doesn't count here as it's a compatibility layer on top of the Win32 API rather than its own separate subsystem.

Unfortunately the Win32 API *does* enforce things like file naming conventions, so it's impossible (at least without major voodoo) to write Win32 applications which handle things like a colon in a file name, and since different subsystems are isolated, that means that no normal Windows software is going to be able to do it.

(I learnt all this when I copied my music collection to an NTFS filesystem, and discovered that bits of it were inaccessible to Windows without SFU/SUA, which is unavailable for the version of Windows I was using.)

http://en.wikipedia.org/wiki/Native_API

NT (Windows kernel) doesn't care about filenames any more than Linux

Posted Mar 30, 2009 15:13 UTC (Mon) by foom (subscriber, #14868)

>> Does that mean if you code against the NT API directly, you can create files foo and FOO in the same directory?
> Yes. This is what the POSIX subsystems for NT do

You can actually do this through the Win32 API: see the FILE_FLAG_POSIX_SEMANTICS flag for CreateFile. However, MS realized this was a security problem, so as of WinXP, this option will in normal circumstances do absolutely nothing. You now have to explicitly enable case-sensitive support on the system for either the "Native" or Win32 APIs to allow it.

(the SFU installer asks if you want to do this, but even SFU has no special dispensation)
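For anyone who has to move files between a case-sensitive view and a case-insensitive one, a quick check for foo/FOO style clashes may help. This is only a sketch in Python: the function name is invented, and `str.casefold()` is an approximation, since Windows compares names using its own uppercase table rather than Unicode case folding.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group names that differ only by case (approximated with casefold)."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only the groups where two or more distinct names collide.
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(find_case_collisions(["foo", "FOO", "bar"]))
```

Running this over a directory listing before copying it to a case-insensitive target shows which files would shadow each other.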

NT (Windows kernel) doesn't care about filenames any more than Linux

Posted Nov 15, 2009 0:06 UTC (Sun) by yuhong (guest, #57183)

Another trick you can use with CreateFile is to start the filename with \\.\.
If that is done, the only processing done on the filename before CreateFile
calls NtCreateFile with the name is that \\.\ is replaced with \??\, which is
an alias of \DosDevices\.
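As a toy model of the prefix rewrite just described (not the actual Win32 implementation, and with a made-up function name and example path), the transformation amounts to this in Python:

```python
WIN32_DEVICE_PREFIX = "\\\\.\\"   # the four characters \\.\
NT_DOSDEVICES_PREFIX = "\\??\\"   # the four characters \??\


def win32_device_path_to_nt(path: str) -> str:
    """Swap the leading \\\\.\\ for \\??\\ and leave the rest untouched."""
    if not path.startswith(WIN32_DEVICE_PREFIX):
        raise ValueError("path does not start with the device prefix")
    return NT_DOSDEVICES_PREFIX + path[len(WIN32_DEVICE_PREFIX):]


# Hypothetical path containing a colon in the final component.
print(win32_device_path_to_nt(r"\\.\C:\music\a:b.mp3"))
```

The point of the sketch is that nothing after the prefix is inspected or normalised, which is why otherwise forbidden names can get through.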

NT (Windows kernel) doesn't care about filenames any more than Linux

Posted Nov 14, 2009 23:58 UTC (Sat) by yuhong (guest, #57183)

"files that are more than 2GB long"
Yep, NT had supported both files and disks larger than 2GB from the first
version (NT 3.1) using the NTFS filesystem. Exercise: compare the design of
the GetDiskFreeSpace and SetFilePointer APIs (look them up using MSDN or
Google), both of which have existed since NT 3.1. Which one was so much more
error-prone that the versions of Windows released in 1996 had to cap the
result to 2GB, even though older versions of NT supported returning more than
2GB using it, and why?
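One plausible reading of the exercise, offered as an assumption rather than as the poster's stated answer: GetDiskFreeSpace reports sectors per cluster, bytes per sector and free clusters as separate 32-bit values, so callers have to multiply them, and doing that in 32-bit arithmetic wraps around. The Python sketch below simulates that wraparound; the function names and the disk geometry are made up for illustration.

```python
def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def free_bytes_32bit(sectors_per_cluster, bytes_per_sector, free_clusters):
    """What a caller gets when multiplying in 32-bit signed arithmetic."""
    product = 1
    for factor in (sectors_per_cluster, bytes_per_sector, free_clusters):
        product = (product * factor) & 0xFFFFFFFF
    return to_signed32(product)


def free_bytes_64bit(sectors_per_cluster, bytes_per_sector, free_clusters):
    """The same product computed without truncation."""
    return sectors_per_cluster * bytes_per_sector * free_clusters


# Hypothetical volume with 3 GiB free: 8 * 512 * 786432 bytes.
print(free_bytes_64bit(8, 512, 786432))   # 3221225472
print(free_bytes_32bit(8, 512, 786432))   # -1073741824
```

A program that sees a negative amount of free space is likely to refuse to install or save, which would explain why capping the reported value was the safer choice for old software.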

