
sysfs is dumb

Posted Oct 17, 2009 19:54 UTC (Sat) by quotemstr (subscriber, #45331)
In reply to: A few words on DRBD and user space - kernel interfaces by jageorge
Parent article: Infrastructure unification in the block layer

<rant>

Another reason people don't like ioctl is that it's not generically scriptable: to use an interface exposed by an ioctl, someone must write a C program that understands the appropriate structure definitions. Scripts can then only run these wrapper programs, and I suppose people didn't want to undertake the chore of wrapper writing. At first, sysfs seems to solve that problem, but the necessary filesystem structure is so hairy, and the ordering and atomicity requirements are so arcane, that people end up writing wrappers anyway! (Consider lspci and lsof.)

Serious question: how is sysfs better than sysctl? Both give you hierarchically-organized human readable ASCII-based cross-architecture key-value pairs that can be manipulated by scripts, but because sysctl is a single system call, there's at least a possibility of making atomic changes without disgusting hacks or having to implement a full filesystem transaction layer.

I don't see sysfs's filesystem interfaces as much of an advantage. You can grep sysctl output even more easily than you can grep /sys; and speaking of the name /sys: it's a de-facto standard. Mounting it elsewhere isn't particularly useful except in the chroot case, and with a sysctl interface, you wouldn't have to mount anything at all!

Sure, you might eventually be able to do something Plan9-like and mount /sys and /proc over NFS, but the last mention I can find of anyone actually attempting that is from 1998. It doesn't seem terribly useful, and besides, the security implications scare the bejeesus out of me.

Besides: using sysctl is simpler! You don't have to worry about opening files, closing them, and so on. And the BSD people seem to get along fine without a sysfs, after all.

Having a private (per process) sysfs (or procfs) directory where any sysfs hierarchy can be created and later pushed into place (mv?) under a "magic" subdirectory entry in sysfs under your device.
This approach won't be particularly popular with people who like to manipulate sysfs with shell scripts.


sysfs is dumb - that depends

Posted Oct 18, 2009 1:37 UTC (Sun) by jageorge (guest, #61413)

Sysctl under Linux is just a wrapper around /proc... and I'm not saying that the BSD guys got it wrong, but sysfs IS the strategic direction already taken by Linux. However, there are clearly problems with the status quo especially when it comes to atomic operations. Both of my proposals (multi-element sysfs nodes, and process private staging sysfs directory) are compatible with the evolving direction of Linux system resource management from userspace.
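To make the "wrapper around /proc" point concrete, here is a minimal Python sketch. It assumes the usual convention followed by the sysctl(8) tool, where each dot in a key becomes a path separator under /proc/sys. The helper names sysctl_path and read_sysctl are made up for illustration, and keys whose components themselves contain dots (some network interface names, for instance) are not handled.

```python
from pathlib import Path

# Linux exposes sysctl values as ordinary files below this directory.
PROC_SYS = Path("/proc/sys")


def sysctl_path(key: str, root: Path = PROC_SYS) -> Path:
    """Map a dotted key such as 'kernel.ostype' to its file under root.

    Assumption: every dot is a hierarchy separator.
    """
    return root.joinpath(*key.split("."))


def read_sysctl(key: str, root: Path = PROC_SYS) -> str:
    """Read a value through the filesystem, much as `cat` would."""
    return sysctl_path(key, root).read_text().strip()
```

On a Linux box, read_sysctl("kernel.ostype") should read the same file that `sysctl kernel.ostype` reports on; the root parameter is only there so the mapping can be tried against a scratch directory.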

The scriptability issue around my private staging tree proposal is easily addressable by using some sort of token (futex/mutex/semaphore) based approach to opening the staging directory instead of a purely PID based approach. Perhaps I'll try a kernel patch to illustrate what I mean ... if I can drum up some interest.

One way or the other private staging of atomic operations (whether ioctl() or some variation on my proposals) is essential for certain operations, and trying to avoid it _will_ result in race conditions many of which have security implications as well... now that I think about it token based private directories would be cool from a temporary directory perspective as well especially if the OS automatically reaped the result after the last token holder exited... so many cool implications... :-)

sysfs is dumb - that depends

Posted Oct 18, 2009 20:03 UTC (Sun) by quotemstr (subscriber, #45331)

sysfs IS the strategic direction already taken by Linux
It does seem that we're stuck with it for now, though it could be deprecated as many other interfaces have been.

So I agree, there's a need for atomic operations on sysfs. Your ideas seem over-engineered to me though. What's wrong with the following scheme? An application would create a temporary directory anywhere it liked. Under this temporary directory, the application would create a sysfs tree corresponding to the nodes to change, and after that, would write the name of the temporary directory to a new special file, /sys/commit. If the commit is successful, the kernel would remove the temporary directory; if there's an error, it would leave the directory in place and either return an error from write or leave an error file in the temporary directory describing what went wrong.

This scheme doesn't require any new system calls or VFS infrastructure, and it's shell-script compatible.
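For illustration, the userspace half of that scheme might look roughly like the Python sketch below. Note that /sys/commit is purely hypothetical: no such file exists in any kernel, and the function names stage_changes and commit, as well as the example node names, are invented here. The sketch only shows that staging needs nothing beyond ordinary file operations and a single write.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping


def stage_changes(changes: Mapping[str, str]) -> Path:
    """Build a temporary tree mirroring the sysfs nodes to be changed.

    Keys are paths relative to /sys; values are the text to write.
    mkdtemp creates the directory readable only by the calling user.
    """
    staging = Path(tempfile.mkdtemp(prefix="sysfs-stage-"))
    for node, value in changes.items():
        target = staging / node.lstrip("/")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(value + "\n")
    return staging


def commit(staging: Path, commit_file: str = "/sys/commit") -> None:
    """Hand the staging directory to the kernel with one write.

    HYPOTHETICAL: /sys/commit is the special file proposed in the
    comment above. Under that proposal a failed commit would surface
    as an OSError raised by os.write.
    """
    fd = os.open(commit_file, os.O_WRONLY)
    try:
        os.write(fd, str(staging).encode())
    finally:
        os.close(fd)
```

A caller would do something like commit(stage_changes({"block/sda/queue/nr_requests": "256"})). The same steps are just mkdir, echo, and one more echo in a shell script, which is the point of the proposal.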

sysfs is dumb - that depends

Posted Oct 19, 2009 14:40 UTC (Mon) by jageorge (guest, #61413)

Your suggestion is essentially where I started, but there appear to be a couple of potential issues. 1. The commit from physical file system to sysfs seemed as if it could be expensive and/or racy. 2. Anything that exists in the normal file system environment is potentially vulnerable from a security/race standpoint (even to multiple instances of the same monitoring/management software).

Nevertheless, I don't want to over-complicate the implementation, and it is possible that there are already security facilities in the kernel which could serve to isolate something as process private. Furthermore, I agree that shell scripting should be relatively simple with any solution to this problem... to some extent that's one of the key ideas behind sysfs. An obvious first step would be to stage something without resolving the private view security question... perhaps even something like staging from a normal physical file system and using mv to flatten the directory structure into a text file which would be fed into a writable sysfs inode.
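As a rough sketch of that first step, the flattening could be done along these lines in Python. The "relative/path=value" line format is an assumption made up for this example, as is the function name flatten_tree; nothing here corresponds to an existing kernel interface, and the resulting text would still have to be written to whatever multi-element sysfs node ends up accepting it.

```python
from pathlib import Path


def flatten_tree(staging: Path) -> str:
    """Flatten a staged directory tree into one text blob.

    Each regular file becomes a line 'relative/path=value'
    (an assumed format), sorted by path so the output is stable.
    """
    lines = []
    for path in sorted(p for p in staging.rglob("*") if p.is_file()):
        key = path.relative_to(staging).as_posix()
        value = path.read_text().strip()
        lines.append(f"{key}={value}\n")
    return "".join(lines)
```

Because the whole change set arrives in a single write, the kernel side would see either all of it or none of it, which is the property the staging directory was meant to provide.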

Basically the problem space is pretty clear (non-trivial atomic operations on IO devices) as is the high level of how to address it (sysfs nodes in the correct context which manage security and race problems). Once someone (possibly me) creates an implementation I expect many of the details to fall into place pretty quickly... and then it's just a matter of getting it past Greg and Al (shudder). The sad thing is that after 6 years of sysfs/udev as a "production" solution no one has done anything other than ducking the problem.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds