
Configfs - an introduction

Complicated kernel subsystems can require complex configuration. Traditionally, Unix-like subsystems have made this configuration possible either via new system calls, or by way of a complex, ioctl()-based interface. Neither approach is considered to be optimal. New system calls clutter the namespace and must be added separately for each architecture; they are also quite inflexible once defined and used by user-space code. Anybody who uses the ioctl() interface for new code tends to get sneered at; using ioctl() is like adding new system calls but without the clear definition of the interface that a system call gives you.

So how should a new subsystem allow for configuration from user space? In some cases, sysfs can be used. Sysfs, however, was never really meant for this application. It provides a view into the kernel's data structures, and it can be used to cause things to happen with those structures. But sysfs cannot be used to create new objects - at least, not without distorting the interface somewhat. It is the wrong tool for this job.

The right tool might turn out to be a thing called configfs. It is yet another virtual filesystem, but one which is oriented toward user-space configuration tasks. It is currently part of the OCFS2 patch set, but it is likely to be merged separately due to interest from other kernel projects. It could, conceivably, be merged as early as 2.6.14.

Configfs is meant to be mounted on /config. Each subsystem which uses configfs then creates one or more top-level directories within configfs for its configuration; the distributed lock manager code, for example, creates /config/dlm/. That directory can start out empty, or it can be populated with the initial configuration of the subsystem, whichever is appropriate.

Like sysfs, configfs uses directories as the way of representing objects. Directories contain files ("attributes") which display the current state of the object, and which, optionally, may be writable to change that state. A fundamental difference, however, is that a suitably-privileged user-space process can create directories within configfs. That action will result in a callback within the kernel and the creation of the corresponding object. Directories created within configfs will have a set of attribute files from the beginning.
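The mkdir-triggers-a-callback behavior can be pictured with a small in-memory toy model. This is only a sketch of the idea, not the kernel implementation: the class ToyConfigfs, its methods, the "example" subsystem, and the "enabled" and "label" attributes are all hypothetical names chosen for illustration.

```python
class ToyConfigfs:
    """In-memory stand-in for a configfs-like tree (illustration only)."""

    def __init__(self):
        self.callbacks = {}  # subsystem name -> callback run on mkdir
        self.tree = {}       # subsystem name -> {object name -> {attr -> value}}

    def register(self, subsystem, make_item):
        """A subsystem claims a top-level directory and supplies its callback."""
        self.callbacks[subsystem] = make_item
        self.tree[subsystem] = {}

    def mkdir(self, subsystem, name):
        """Creating a directory invokes the subsystem callback, which
        returns the attributes the new object starts out with."""
        if subsystem not in self.callbacks:
            raise FileNotFoundError(subsystem)
        if name in self.tree[subsystem]:
            raise FileExistsError(name)
        self.tree[subsystem][name] = dict(self.callbacks[subsystem](name))

    def listdir(self, subsystem, name):
        return sorted(self.tree[subsystem][name])

    def write(self, subsystem, name, attr, value):
        """Only attributes supplied by the callback can be written."""
        attrs = self.tree[subsystem][name]
        if attr not in attrs:
            raise PermissionError(attr)
        attrs[attr] = value

    def read(self, subsystem, name, attr):
        return self.tree[subsystem][name][attr]


if __name__ == "__main__":
    fs = ToyConfigfs()
    # Hypothetical subsystem whose objects have two attributes.
    fs.register("example", lambda name: {"enabled": "0", "label": name})
    fs.mkdir("example", "obj1")
    print(fs.listdir("example", "obj1"))
    fs.write("example", "obj1", "enabled", "1")
    print(fs.read("example", "obj1", "enabled"))
```

The point of the model is the ordering: the directory never exists without its attributes, because the callback that creates the object also decides which files appear.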

As an example (taken from the configfs documentation), consider a hypothetical network block device driver called "fakenbd." This driver would set up /config/fakenbd, which would start out empty. A system administrator could then use mkdir to create a network disk by creating an appropriately-named subdirectory under /config/fakenbd. That directory (called disk1, say) would be populated by the kernel with the relevant attributes: target for the IP address of the server providing the disk, device for the device on the server, and rw to control whether the disk is to be writable or not. The administrator would simply write the appropriate value into each attribute, and the disk would be configured.
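From a program's point of view, the fakenbd sequence might look roughly like the sketch below. It assumes nothing beyond ordinary file operations: one mkdir, then one open/write/close per attribute. The function names and the attribute values (192.0.2.10, /dev/sda1, 1) are illustrative assumptions, and since fakenbd is hypothetical, the demo runs against a temporary directory with a stand-in function playing the kernel's role of populating the new directory.

```python
import os
import tempfile


def create_disk(root, name):
    """mkdir under the subsystem directory. On a real configfs mount this
    is the step that would trigger the kernel callback."""
    path = os.path.join(root, name)
    os.mkdir(path)
    return path


def set_attribute(disk_path, attr, value):
    """Write one value to one attribute file: open, write, close."""
    # No O_CREAT: the attribute files are expected to exist already,
    # because the kernel (or the stand-in below) created them.
    fd = os.open(os.path.join(disk_path, attr), os.O_WRONLY | os.O_TRUNC)
    try:
        os.write(fd, (value + "\n").encode())
    finally:
        os.close(fd)


def read_attribute(disk_path, attr):
    with open(os.path.join(disk_path, attr)) as f:
        return f.read().strip()


def simulate_kernel_populate(disk_path, attrs=("target", "device", "rw")):
    """Stand-in for the kernel filling the new directory with attribute
    files; only needed when running outside configfs."""
    for attr in attrs:
        open(os.path.join(disk_path, attr), "w").close()


def demo():
    """Configure a disk1 object in a temporary directory and read it back."""
    settings = {"target": "192.0.2.10", "device": "/dev/sda1", "rw": "1"}
    with tempfile.TemporaryDirectory() as root:
        disk = create_disk(root, "disk1")
        simulate_kernel_populate(disk)
        for attr, value in settings.items():
            set_attribute(disk, attr, value)
        return {attr: read_attribute(disk, attr) for attr in settings}


if __name__ == "__main__":
    print(demo())
```

Each attribute costs a separate open, write, and close, which is the trade-off against a single ioctl() call that some of the comments below discuss.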

Some observers have questioned the distinction between configfs and sysfs. Users may well wonder why there are two separate directory trees performing similar tasks - especially since sysfs can be used for certain types of administrative functions. Configfs also has certain problems (such as persistence of attribute permissions) which have already been encountered - and solved - in sysfs. The kernel developers do see the two as being fundamentally different, however, so a merger seems unlikely.

If configfs takes off, one could imagine it being used all over the kernel. Much of what is done with ioctl() now could be moved over. Other patches (such as CKRM) which have their own configuration filesystems could switch to configfs. In the long term, configfs could be the path to a much more consistent - and transparent - way of configuring the many subsystems which make up the Linux kernel.




deja vu

Posted Aug 25, 2005 2:10 UTC (Thu) by ccyoung (guest, #16340) [Link] (3 responses)

isn't this sounding a lot like the Reiser file system?

deja vu

Posted Aug 25, 2005 3:09 UTC (Thu) by dlang (guest, #313) [Link]

not very.

the thing with the reiserfs was to create files that contained attributes (represented as a subdirectory) as well as data

this is the plain old directory-is-a-directory and file-is-a-file approach (with the addition that creating a directory triggers a callback to create files in that directory)

nothing new that would break existing tools here, unlike the reiser4 combo-file proposal

deja vu

Posted Aug 25, 2005 3:11 UTC (Thu) by elanthis (guest, #6227) [Link]

No. ReiserFS is based on the idea of plugins used for manipulating the user-level contents and meta-data for a file. It can be used to turn /etc/passwd into a directory of files, so that your two line shell script with sed and awk can become a two line shell script with cat and ls. (Jibes aside, there are some other theoretical uses that could make life a lot easier for developers and administrators.) ReiserFS exposes plugins based on content stored within the ReiserFS file system itself, and not external data like from the kernel.

ConfigFS is a file system designed solely for exposing kernel structures and allowing addition of new entries and changes to those structures. The file system is self contained and mounted in its own unique location, and does not share the mount namespace with regular files like ReiserFS.

Logically, from a user perspective, these two feature sets are almost the same thing. Internally, however, they are truly different. I suppose, in many ways, it's like saying that OpenLDAP and Active Directory look alike - they're both directory servers, both speak LDAPv3, but they really aren't the same internally and have very different goals and implementations.

deja vu

Posted Sep 1, 2005 19:35 UTC (Thu) by pivo (guest, #32229) [Link]

No, it doesn't sound like reiser4, but it would really benefit from the
new reiser4 syscall.

To set up a nbd device from the example requires many syscalls to
open/write/close the directory and individual files. With reiser4
interface it would likely take only a single syscall. See
http://namesys.com/v4/v4.html#reiser4_call

Configfs - an introduction

Posted Aug 25, 2005 3:47 UTC (Thu) by flewellyn (subscriber, #5047) [Link] (9 responses)

Interesting. If this replaces ioctl() completely, then perhaps it (and the BKL) could finally go away forever?

Configfs - an introduction

Posted Aug 25, 2005 6:41 UTC (Thu) by bronson (subscriber, #4806) [Link] (8 responses)

Can't do it. Not unless you want to break backward compatibility with pretty much every app in existence today.

ConfigFS might allow migrating slowly off ioctls, of course, but that is years away. The BKL doesn't really get in the way anymore, does it?

Configfs - an introduction

Posted Aug 25, 2005 8:07 UTC (Thu) by farnz (subscriber, #17727) [Link] (7 responses)

The obvious way to deal with it is to recognise that ioctl is normally a library function, and have glibc do the ConfigFS magic when ioctl is called. Then make ioctl support a compile-time option, and remove it once everyone's updated to a glibc that does the ioctl->ConfigFS translation.

Configfs - an introduction

Posted Aug 25, 2005 9:42 UTC (Thu) by daniel (guest, #3181) [Link] (6 responses)

"The obvious way to deal with it is to recognise that ioctl is normally a library function"

No it isn't, an ioctl goes straight through to the kernel without interpretation.

"and have glibc do the ConfigFS magic when ioctl is called"

What a perfectly horrible idea. Ioctls are lightweight, configfs is anything but. Configfs is for people. You can echo MySetting >/config/MySystem/frobme. For a program, it is a lot of pointless work opening the file, formatting the parameter, writing to it, closing it. An ioctl is one or two lines.

Configfs - an introduction

Posted Aug 25, 2005 9:51 UTC (Thu) by farnz (subscriber, #17727) [Link] (5 responses)

In my systems, ioctl(2) is called through glibc, just like any other syscall. The fact that glibc normally passes the data straight through to the kernel is irrelevant; ioctl(2) is normally a library function, not a direct kernel call.

If you want to replace ioctl with ConfigFS, this is the obvious transition plan. If you don't want to do so, then of course the transition plan's a bad idea.

Configfs - an introduction

Posted Aug 25, 2005 18:45 UTC (Thu) by elanthis (guest, #6227) [Link] (4 responses)

It's a library call that is often inlined (possibly by specialized compiler support), isn't it?

I tend to be in a minority, but so far as I'm concerned, *any* breakage of *any* user-space application (that isn't doing something unsupported/undefined by the official call interface) is a serious problem. I shouldn't be required to recompile my user-space software to upgrade core components to fix bugs or security holes, ever.

Configfs - an introduction

Posted Aug 25, 2005 19:24 UTC (Thu) by farnz (subscriber, #17727) [Link] (3 responses)

So you do it over a long time if the aim is to phase out ioctl(2). Phase 1 is to update glibc and friends to do the translation, together with a moratorium on new ioctls. Phase 2, some time later, is to provide a kernel option to disable ioctl(2), so that people can see if their software is broken. Phase 3, a couple of years later, is to disable that option by default. Finally, phase 4 is to remove ioctl(2) once no-one uses it.

Configfs vs ioctl

Posted Aug 26, 2005 15:37 UTC (Fri) by giraffedata (guest, #1954) [Link] (2 responses)

I can't see how you could ever phase out ioctl via this strategy. libc (and generic parts of the kernel) have no idea what the argument to ioctl means. Individual device drivers and filesystem drivers assign meaning to it. That's the major reason ioctls are used. Would you put cases for every known use of ioctl in libc? And even if you did, how would libc know which language the particular ioctl is in?

Configfs vs ioctl

Posted Aug 26, 2005 15:50 UTC (Fri) by farnz (subscriber, #17727) [Link] (1 responses)

That's exactly what you'd do; one case for each ioctl goes into the library, translating the ioctl to a ConfigFS access. This allows you to use ConfigFS instead of ioctl.

I don't understand the language comment; how does the kernel do it now? I thought it got a set of binary values from userspace, which it acted on. This code could be moved into the runtime libraries for all languages that provide ioctl access, converting the binary values into text for ConfigFS, which the kernel would then convert back to the binary values it would have acted on.

Let me emphasise again that this is only what you'd do if you'd already decided to phase out ioctl for ConfigFS. There's no reason why you can't do this change, but lots of reasons why you shouldn't.

Configfs vs ioctl

Posted Aug 26, 2005 20:18 UTC (Fri) by giraffedata (guest, #1954) [Link]

"Let me emphasise again that this is only what you'd do if you'd already decided to phase out ioctl for ConfigFS."

I agree that this is the best way given that you are replacing ioctls with configfs. The obvious inference from the fact that you brought it up in response to a concern about backward compatibility is that you're saying it could be a practical way to get backward compatibility; so I'm trying to show that it's not practical, so the backward compatibility objection to configfs has to stand. As long as we agree there's no practical way to get backward compatibility, I have no dispute.

"... one case for each ioctl goes into the library"

I don't think anyone would accept that.

"I don't understand the language comment; how does the kernel do it now? I thought it got a set of binary values from userspace, which it acted on."

It also gets a file descriptor, which has a lot of context with it. In particular, it tells the kernel which ioctl handler to call, and that ioctl handler knows what language (protocol) the argument is in. libc would have to be hacked really hard to have it track open file state and know which open files go with which device/file types.

oops...

Posted Aug 25, 2005 4:59 UTC (Thu) by roelofs (guest, #2599) [Link] (1 responses)

"A fundamental difference, however, is that a suitably-privileged user-space process can create directories within configfs."

Perhaps it just looks like a problem, but I'd be more worried about an inadvertent rm -rf blowing away my devices. Or is unlink() not hooked up via the same kind of callbacks?

Greg

oops...

Posted Aug 25, 2005 7:58 UTC (Thu) by Thalience (subscriber, #4217) [Link]

Seems to me that destroying runtime device configuration is one of the less-harmful things that an inadvertent "rm -rf" could do to a system. Unlike the contents of /etc or /home, /config would be regenerated after a reboot.

Memory?

Posted Aug 25, 2005 15:20 UTC (Thu) by simlo (guest, #10866) [Link] (1 responses)

I remember complaints about sysfs taking too much memory. I am afraid configfs will be the same.

Wouldn't it be better to merge sysfs and configfs into one to have both properties, viewing and setting/configurating?

Memory?

Posted Aug 25, 2005 18:18 UTC (Thu) by daniel (guest, #3181) [Link]

"I remember complaints about sysfs taking too much memory. I am afraid configfs will be the same."

Yes, currently configfs is a memory pig because all its directory inodes are pinned in memory, see my post:

http://lwn.net/Articles/148978/

"Wouldn't it be better to merge sysfs and configfs into one to have both properties, viewing and setting/configurating?"

Indeed. All configfs does is take instantiation events via the filesystem instead of, e.g., the hotplug system as sysfs does. In fact, configfs is just cut & paste of the sysfs code with some special case code here and there to handle the different event source. Except for initialization, the data structures are identical. Nearly all of sysfs is still there in configfs. Hmm, what is the code trying to tell us? I'm checking right now to see how easy it is to put this forked code back together so that a kernel module can specify whether it wants user-driven directory creation or not. Oddly enough, the maintainers think this is hard, but I will see for myself.

Regards,

Daniel

/config != configuration files directory

Posted Aug 25, 2005 15:58 UTC (Thu) by LogicG8 (guest, #11076) [Link] (1 responses)

I don't look forward to having to explain this one. It's hard enough trying to explain /etc without there being a /config to really confuse matters. With the somewhat overwhelming proliferation of virtual file systems exported by the kernel, can't we find a nicer place to put them all? Also, doesn't the FHS forbid adding new directories to /? IIRC there was a similar problem with debugfs. How about expanding the role of /sys to include hosting all the miscellaneous filesystems? sysfs could have a directory fs which would have /sys/fs/debug, /sys/fs/config, /sys/fs/proc (there would have to be a /proc symlink for compatibility), /sys/fs/relayfs and so on. Maybe /virtual could work. It has been said that "The Unix Way: Everything is a file. The Linux Way: Everything is a filesystem." If this is true, shouldn't we adopt a system for how the central metaphor of our OS is extended?

/config != configuration files directory

Posted Sep 21, 2006 10:59 UTC (Thu) by astrand (guest, #4908) [Link]

I fully agree. There are too many magic toplevel directories already. The root should be clean and small, so that the users are not lost when going from /home/user to /media/cdrom via the root. With the current amount of magic toplevel directories, it's no wonder why KDE and GNOME are implementing "magic" "My Computer" style file managers which have their own roots. It's a sad story.

Configfs - an introduction

Posted Aug 25, 2005 20:59 UTC (Thu) by nagar (guest, #4734) [Link]

Jonathan's observation about CKRM (http://ckrm.sf.net) (and similar) projects being able to use configfs is right on the money!

I just finished coding up CKRM's RCFS using configfs and ended up saving 1100 lines of source out of the over 1800 lines of original RCFS code.
Not only that, the complexity of the code is also reduced considerably.

So our project, for one, is certainly interested in configfs.

Regards,
Shailabh Nagar

Configfs vs ioctl

Posted Aug 26, 2005 15:47 UTC (Fri) by giraffedata (guest, #1954) [Link] (2 responses)

I can see that configfs can replace special purpose configuration/control filesystems, but I don't see it eliminating much use of ioctl in its present form.

When you create something, you normally have parameters other than its name. mkdir() doesn't allow for any. ioctl() allows for as flexible a parameter scheme as you need. To use configfs where a directory stands for an object, you'd have to do some complicated thing where the object is in a "being built" state while you write to files and supply the creation parameters with additional system calls. What a mess. Extra code; synchronization nightmares.

I like the idea, but it needs an interface for creating directories that allows for parameters.

Configfs vs ioctl

Posted Oct 18, 2013 11:16 UTC (Fri) by mmorrow (guest, #83845) [Link] (1 responses)

You could of course do some elaboration of:
{
  ioctl(_,&(a_t){.x="abc",.y=42,.z="!"});
}

<===>

# mkdir '{.x="abc",.y=42,.z="!"}'
However, I can't decide if I'm kidding or not.

Configfs vs ioctl

Posted Oct 18, 2013 11:18 UTC (Fri) by mmorrow (guest, #83845) [Link]

(I now realize I've replied to a post from 2005.)

Clean up my namespace!

Posted Sep 4, 2005 18:40 UTC (Sun) by erikharrison (guest, #11204) [Link]

Can we create an official top level directory to hold all these virtual file systems that point to data structures in kernelspace?

You can mount sysfs anywhere, but that doesn't help the dozens of apps that expect /sys to be there if sysfs exists. I'm just getting itchy about the name space pollution in the root.

Perhaps we should have /kernel? Then the kernel devs can make virtual file systems all day long (as seems to be the rage now) without mucking up my root directory.


Copyright © 2005, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds