Review: The Linux Programming Interface
Posted Jan 20, 2011 19:15 UTC (Thu) by RobSeace (subscriber, #4435)
And, using filesystem permissions to control access of Unix domain sockets is highly unportable... Many systems totally ignore perms on Unix domain sockets, making them effectively always 0777, like symlinks... (Yes, of course, the abstract namespace is also unportable, so relying on one Linux quirk is as good/bad as relying on another, I suppose...) But, any sane app that needs to restrict who talks to it over a Unix domain socket will use a much better app-level restriction of some kind, probably using SO_PEERCRED or SCM_CREDENTIALS/SO_PASSCRED or something of the sort, if not a full-blown login/authentication mechanism...
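For readers unfamiliar with SO_PEERCRED, here is a minimal Linux-only sketch of the kind of app-level check described above. The function names and the uid-allowlist policy are hypothetical, chosen only for illustration; the socket option itself returns a struct ucred (pid, uid, gid) for the peer.

```python
import os
import socket
import struct

# struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid
UCRED_FORMAT = "iII"

def peer_credentials(sock):
    """Return (pid, uid, gid) of the process at the other end of a Unix socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED,
                          struct.calcsize(UCRED_FORMAT))
    return struct.unpack(UCRED_FORMAT, raw)

def is_allowed(sock, allowed_uids):
    """Hypothetical policy: accept a peer only if its uid is in allowed_uids."""
    _pid, uid, _gid = peer_credentials(sock)
    return uid in allowed_uids

if __name__ == "__main__":
    # socketpair() yields two connected AF_UNIX sockets inside this one
    # process, so the "peer" being inspected here is simply ourselves.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with a, b:
        print(peer_credentials(a))
        print(is_allowed(a, {os.geteuid()}))
```

A real server would presumably call something like `is_allowed()` on each socket returned by `accept()` before reading any request from it.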
Posted Jan 20, 2011 19:20 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)
That's not a shortcoming of lsof, but a shortcoming of having a separate namespace. It goes against all the Unix traditions.
While we're at it, why not move /dev into a separate namespace (like Windows does, BTW)? And then /sys and /proc.
Well, personally I don't care about other Unixes. However, the ability to use AppArmor to restrict access to sockets somehow makes me feel more secure.
Posted Jan 20, 2011 20:10 UTC (Thu) by RobSeace (subscriber, #4435)
Not when it comes to sockets, it doesn't... Do your TCP and UDP sockets exist in the filesystem, as well? How do you think lsof deals with those? Right, it has to deal with a separate namespace... Just like it should be taught how to do with Linux's abstract Unix domain namespace... It's not like it's hard: they're right there in "/proc/net/unix", which it already reads anyway...
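As a rough illustration of how a tool could pick abstract sockets out of "/proc/net/unix", the sketch below parses that table's layout and keeps the entries whose Path column starts with '@', which is how the kernel displays abstract names. The function names and the sample rows are made up for the example; this is not how lsof itself is implemented.

```python
def parse_proc_net_unix(text):
    """Return the Path column of each bound socket in /proc/net/unix output.

    Unbound sockets have no Path column and are skipped.
    """
    paths = []
    for line in text.splitlines()[1:]:      # skip the header row
        fields = line.split(None, 7)        # Path is the optional 8th column
        if len(fields) == 8:
            paths.append(fields[7])
    return paths

def abstract_sockets(text):
    """Keep only the abstract-namespace entries (shown with a leading '@')."""
    return [p for p in parse_proc_net_unix(text) if p.startswith("@")]

# Made-up sample rows in the /proc/net/unix layout, for illustration only.
SAMPLE = """\
Num       RefCount Protocol Flags    Type St Inode Path
0000000000000000: 00000002 00000000 00010000 0001 01 11111 /dev/log
0000000000000000: 00000002 00000000 00010000 0001 01 22222 @example-abstract
0000000000000000: 00000003 00000000 00000000 0001 03 33333
"""

if __name__ == "__main__":
    print(abstract_sockets(SAMPLE))
    # On a Linux system the live table can be read the same way:
    #   with open("/proc/net/unix") as f:
    #       print(abstract_sockets(f.read()))
```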
What, aside from lsof, actually needs to ever reference a Unix domain socket by pathname as if it were a file, anyway? It's not like you can just pass one to an arbitrary app which is expecting a file, and expect it to do anything sensible... Like, for instance, you can do with a named pipe... That's a case where existing in the filesystem is actually useful... There's absolutely no use to socket files; they're not "files" in any meaningful sense, because you can't do file I/O on them... They're just special creatures that happen to be identified via a pathname...
And, your strawman about getting rid of "/dev", "/sys", "/proc", etc. is missing the point entirely... The things under those dirs are perfectly usable AS FILES! They may be special creatures of their own, but plain old file I/O works on them; you can open() them, read() from them, write() to them... You can't do that on a socket "file"... I'm all in favor of "everything as a file", having used Unix-like systems for well over 20 years now... But, I'm not in favor of leaving tons of file-like tokens scattered all over the filesystem, which can't actually be used like files for anything, and which only exist there for the sole purpose of having a unique name to identify them by...
Posted Jan 20, 2011 20:21 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)
Which is a shortcoming of Unix design (which was fixed in 9p, btw).
>What, aside from lsof, actually needs to ever reference a Unix domain socket by pathname as if it were a file, anyway? It's not like you can just pass one to an arbitrary app which is expecting a file, and expect it to do anything sensible...
How about security with AppArmor? Does your namespace work with chroot? Also, unix sockets have a creation time (which helped me once).
>And, your strawman about getting rid of "/dev", "/sys", "/proc", etc. is missing the point entirely... The things under those dirs are perfectly usable AS FILES!
Not really. I can't open a lot of files in /proc for writing or reading. Quite a lot of devices in /dev are IOCTL-only and can't perform any read/write operation and so on.
Posted Jan 20, 2011 21:08 UTC (Thu) by RobSeace (subscriber, #4435)
Perhaps, but given that you need to use completely different syscalls to work with sockets, and can't really just use regular file syscalls on them (well, for the most part; of course you can use read()/write() on them), I'm not sure I agree... How does Plan9 deal with this? Can you just open() up a TCP socket, and somehow specify a host and port to connect to, or one to listen on? (A la bash's "/dev/tcp"...) Once you do, does this process-specific socket exist as a separate file for others to see (and interact with)? I'm not sure I see how it makes much sense, in general...
> How about security with AppArmor?
Maybe, I know nothing about AppArmor... My distro (RHEL/CentOS) doesn't use it, favoring SELinux instead... (And, there's probably SOME method of making SELinux work with abstract Unix domain sockets, but goodness knows if anyone could ever figure out HOW!)
> Does your namespace work with chroot?
My namespace? Thanks for the credit, but I didn't invent it; I'm just a very big fan of it who regularly uses it... ;-)
But, sure, I guess that's a possible valid use... I'm not sure I can really conceive of a real-world use for such a thing, however... Maybe sandboxing something with its own private "/dev/log" that goes somewhere other than the real syslogd? *shrug*
> Also, unix sockets have a creation time (which helped me once).
Well, not really; like any other file, they've got modify and change times... But, yeah, both will generally reflect creation time... I'm not really sure how that'd be of much help in general, though?
> I can't open a lot of files in /proc for writing or reading.
Like which? For reasons other than permissions? Not talking about directories (or symlinks to them), I assume? Do you mean the non-file FD symlinks under "/proc/#/fd/"? Those you can at least readlink() like a real symlink...
> Quite a lot of devices in /dev are IOCTL-only and can't perform any read/write operation and so on.
Is it really a lot? Even if so, at least you can open() them... And, at least all the device special files are confined to "/dev" (in practice), rather than scattered wherever (usually in "/tmp") like most filesystem Unix domain sockets... If they had their own directory to live in which everyone used by convention, I wouldn't mind them nearly as much... I don't mind "/dev/log", since it's stuck in "/dev/" with the other special files...
One huge benefit of abstract Unix domain sockets is no need to worry about unlink()'ing them when you're done, and dealing with the race conditions inherent in that... They just go away when you exit or close() the listening socket... With filesystem sockets, a server needs to see if the socket already exists in the filesystem; if so, maybe that means another copy of itself (or some other app) is using that socket; or, maybe it means it previously crashed before being able to remove that socket... Should it unlink() it and try to bind() it itself? Only way to know is try to connect() to it, and see if someone is really listening on it (or look for it in "/proc/net/unix")... None of this is needed with abstract Unix domain sockets...
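The startup dance for a filesystem socket described above might look roughly like the following sketch. `bind_unix_listener` is a hypothetical helper name; it probes an existing socket file with connect() and unlinks it only when nobody answers. The window between the probe and the unlink() is exactly the race being complained about, and the sketch does not close it.

```python
import errno
import os
import socket
import stat
import tempfile

def bind_unix_listener(path):
    """Bind a listening socket at a filesystem path, clearing a stale file.

    If a connect() to the existing path succeeds, the address is live and
    is left alone. If connect() is refused, the file is treated as a
    leftover from a crashed server and unlinked. Probe-then-unlink is racy.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        mode = None
    if mode is not None:
        if not stat.S_ISSOCK(mode):
            raise OSError(errno.EEXIST, "path exists and is not a socket", path)
        probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            probe.connect(path)
        except ConnectionRefusedError:
            os.unlink(path)                 # nobody listening: stale file
        else:
            raise OSError(errno.EADDRINUSE, "socket is in use", path)
        finally:
            probe.close()
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    return server

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        first = bind_unix_listener(path)
        first.close()                        # the socket file is left behind
        second = bind_unix_listener(path)    # stale file detected and replaced
        second.close()
```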
Posted Jan 21, 2011 15:00 UTC (Fri) by price (subscriber, #59790)
Cyberax's examples of the things you want to do include:
* protect with AppArmor
* hide in a namespace away from some processes (like with chroot)
* see when they were created (presumably for debugging)
Here's my example:
* move them out of the way.
You think you wouldn't want that for a socket? Think again. I once had to deal with a buggy server process (clvmd) that would occasionally hang unkillably (due to a kernel bug), while holding an abstract-namespace socket. This means that when I tried to restart it, the new process would immediately fail because the socket was already bound.
If the clvmd authors had used filesystem sockets like good Unix-respecting developers, I could have simply mv'd or even rm'd the old socket, and the new process would have been free to bind to its socket at the usual name. Instead, I had to restart the box. This was a VM server -- dozens of people's VMs were affected by each restart. The bug recurred a couple of times a day. I *really* wished the program had used sockets in the filesystem, or that somebody had implemented rename() or unlink() for abstract-namespace sockets -- but who would do that? The program should have used sockets in the filesystem.
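To make the failure mode concrete, here is a small Linux-only sketch of how an abstract-namespace name behaves: while one socket holds the name, a second bind() fails with EADDRINUSE, and the name becomes free again only once the holder closes the socket (or exits). The helper names and the socket name are hypothetical; nothing here is taken from clvmd.

```python
import errno
import os
import socket

def bind_abstract(name):
    """Bind a listening socket in Linux's abstract namespace.

    A leading NUL byte in the address selects the abstract namespace;
    no directory entry is created, so there is nothing to mv or rm.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(b"\0" + name)
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock

def name_is_taken(name):
    """Return True if binding the abstract name fails with EADDRINUSE."""
    try:
        bind_abstract(name).close()
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise
    return False

if __name__ == "__main__":
    # Hypothetical name, made unique per process for the demo.
    name = b"example-" + str(os.getpid()).encode()
    holder = bind_abstract(name)
    print(name_is_taken(name))   # the holder still owns the name
    holder.close()
    print(name_is_taken(name))   # the name is released on close
```

If the holding process can neither exit nor close its socket, as in the hang described above, the sketch suggests why a restarted server has no way to reclaim the name.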
If you're unhappy because a system leaves files around in /tmp that aren't used anymore, you're really focusing on the wrong things.
Posted Jan 21, 2011 15:39 UTC (Fri) by RobSeace (subscriber, #4435)
No, I want to do none of those things... Maybe they are things YOU and others want to do... But, personally, I have no need for any of them, and they mostly seem like stretches and grasping at straws to justify them existing in a place they certainly don't belong (the filesystem)...
And, your example of a buggy app and/or kernel is just crazy... You want to be able to kluge around a serious app/kernel bug by stealing its socket out from under it, and replacing it via another running copy? How about just fixing the bug! What if it were holding a TCP port# instead? Do you complain that you can't "rm" listening TCP ports, too?
> If you're unhappy because a system leaves files around in /tmp that aren't used anymore
That's part of why I dislike them... They're scattered around wherever, often somewhere under "/tmp" (which is a really poor place for something designed to be a shared identifier for communication between multiple apps)... But, mostly I dislike them because THEY ARE NOT FILES! Just having them exist as a directory entry in the filesystem does not fulfill some Unix utopia idea of "everything is a file"... In order for that to be fulfilled, the things must actually be usable AS FILES... If they were designed such that you could pass one to an otherwise unsuspecting app, which just open()'d it normally, and that magically let that app talk to whoever is listening on the other end of the socket (a la a named pipe), then I'd be all in favor of them... That would be brilliant... But, no, you can't do that... All you can do with a Unix domain socket "file" is to bind() to it or connect() to it... They're not files; they're filesystem representations of unique socket addresses, and that's all... As such, there's no need for them to live in the filesystem at all... (Unless you have special rare needs like those previously mentioned which can only be solved by them having a pathname in the filesystem...)
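The point that a socket "file" cannot be opened like a file is easy to check. In the sketch below (Linux behaviour assumed; `try_open` and `demo` are hypothetical names), a plain open() on a bound socket path fails with ENXIO, while connect() on the same path works.

```python
import errno
import os
import socket
import tempfile

def try_open(path):
    """Attempt a plain open(); return the errno name on failure, else None."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return errno.errorcode[exc.errno]
    os.close(fd)
    return None

def demo():
    """Return (result of open(), whether connect() succeeded) for a socket path."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            open_result = try_open(path)
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                connected = True
    return open_result, connected

if __name__ == "__main__":
    print(demo())
```

Other systems may report a different error for the open() call, so the ENXIO result should be read as Linux-specific.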
Posted Jan 27, 2011 14:14 UTC (Thu) by renox (subscriber, #23785)
I had this need in one of our applications: to work around an issue, I had to use a small tool which can 'force close' a listening TCP port. Having the possibility to 'rm' listening TCP ports would have been much easier.
Posted Jan 28, 2011 21:32 UTC (Fri) by spitzak (guest, #4593)
I think this is your only legitimate complaint. Why not make a fix so the "abstract name space" is mounted under a permanent name? Then everything is in a predictable place in the filesystem, and you have all the HUGE advantages that they are in the same namespace you can search with existing tools.
/proc is full of files that used to be "namespaces" (actually various kernel calls and tools that peeked into kernel memory maps). I think it is pretty obvious that /proc is VASTLY superior to the old api, in that it is discoverable and many more tools are written to use it.
Posted Jan 21, 2011 14:13 UTC (Fri) by cras (guest, #7000)
Posted Jan 21, 2011 14:54 UTC (Fri) by RobSeace (subscriber, #4435)
In the Linux implementation, sockets which are visible in the filesystem honour the permissions of the directory they are in. Their owner, group and their permissions can be changed. Creation of a new socket will fail if the process does not have write and search (execute) permission on the directory the socket is created in. Connecting to the socket object requires read/write permission. This behavior differs from many BSD-derived systems which ignore permissions for Unix sockets. Portable programs should not rely on this feature for security.
Posted Jan 21, 2011 15:05 UTC (Fri) by cras (guest, #7000)
Posted Jan 25, 2011 9:47 UTC (Tue) by paulj (subscriber, #341)
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds