The internal representation of device numbers
[Posted April 22, 2003 by corbet]
The expanded device number type - one of the big remaining items for the
2.5 development cycle - is getting closer to reality. Much of the
preparation work has been done. There are still a few issues to be
resolved, however; this week's discussion mostly centers around how device
numbers should be represented in the kernel.
One apparent outcome is that the kdev_t type will go away.
Alexander Viro, who has recently resurfaced behind a UK email address, is
pushing strongly for this change. Among other things, he has posted a set
of "kdev_t-ectomy" patches which remove the kdev_t type from the TTY layer
and a few other spots.
kdev_t variables are replaced with direct pointers to driver data
structures or integer indexes, depending on the context. Every instance of
kdev_t, according to Al, is a sign of a problem; he'll be
submitting more cleanup patches in the future.
As this work progresses, device numbers will become less visible throughout
much of the kernel. But there will still be a need to work with device
numbers; they are, after all, the token which is passed between kernel and
user space. A 64-bit device number seems like a done deal, but it's still
not entirely clear how it will be represented. A few schools of thought
exist:
- Many developers have been proceeding on the assumption that a simple,
64-bit integer would be used to hold device numbers in the future.
This approach, of course, is just an extension of the current 16-bit
number scheme.
- While most developers, perhaps, see that 64-bit quantity as being
split into 32-bit major and minor numbers, there are still people who
would like to get rid of the major/minor distinction altogether. The
management of the device number space will make that distinction
increasingly unimportant. Still, retention of the distinction between
major and minor numbers seems likely for now.
- Linus has been advocating a tuple representation, where major and
minor numbers would be carried around independently of each other.
Few others have argued for this representation, however, and Linus
does not appear to feel strongly enough to force the issue.
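To make the difference between these views concrete, here is a minimal sketch in C of a packed 64-bit number with a 32:32 split next to a tuple. Every name here (dev64_t, DEV64_MAJOR(), struct dev_pair, and so on) is hypothetical and chosen for illustration; none of this is taken from a posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; these are not the kernel's definitions. */

/* View 1: a single packed 64-bit integer, assumed here to be split 32:32. */
typedef uint64_t dev64_t;

#define DEV64_MAJOR(dev)    ((uint32_t)((dev64_t)(dev) >> 32))
#define DEV64_MINOR(dev)    ((uint32_t)((dev64_t)(dev) & 0xffffffffu))
#define DEV64_MAKE(ma, mi)  (((dev64_t)(uint32_t)(ma) << 32) | (uint32_t)(mi))

/* View 2: a tuple, with major and minor carried independently. */
struct dev_pair {
    uint32_t major;
    uint32_t minor;
};

/* Convert the packed form into the tuple form. */
static inline struct dev_pair dev_pair_from_dev64(dev64_t dev)
{
    struct dev_pair p = { DEV64_MAJOR(dev), DEV64_MINOR(dev) };
    return p;
}

/* Convert the tuple form back into the packed form. */
static inline dev64_t dev64_from_dev_pair(struct dev_pair p)
{
    return DEV64_MAKE(p.major, p.minor);
}

/* Usage example: the two forms carry the same information. */
static void dev_repr_self_check(void)
{
    dev64_t dev = DEV64_MAKE(8, 1);
    struct dev_pair p = dev_pair_from_dev64(dev);

    assert(p.major == 8 && p.minor == 1);
    assert(dev64_from_dev_pair(p) == dev);
}
```

Either way, code that sticks to the accessor macros sees the same major and minor values; the choice mostly affects how the number is passed around and compared.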
The end result will matter little for most developers, since the
MAJOR() and MINOR() macros will work as always. The real
concern has to do with how backward compatibility will be supported. We
all have filesystems and applications with 16-bit numbers wired deeply into
them; we all expect those filesystems and applications to work with the 2.6
kernel. That means that a 16-bit device number, with eight-bit major and
minor numbers, will look to the kernel like a device number with a major
number of zero and a large minor number. This case is easy to detect, of
course, and it is not that big a deal to map it into the proper large
representation.
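As a rough sketch of that detection and remapping in C, assume a packed 64-bit number split 32:32 and assume that major number zero is never handed out with a minor number below 65536 in the new scheme (otherwise the test would be ambiguous). The names dev64_t, dev64_is_legacy() and dev64_normalize() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packed representation: major in the top 32 bits. */
typedef uint64_t dev64_t;

#define DEV64_MAJOR(dev)    ((uint32_t)((dev64_t)(dev) >> 32))
#define DEV64_MINOR(dev)    ((uint32_t)((dev64_t)(dev) & 0xffffffffu))
#define DEV64_MAKE(ma, mi)  (((dev64_t)(uint32_t)(ma) << 32) | (uint32_t)(mi))

/*
 * An old 16-bit number keeps its major in bits 8..15 and its minor in
 * bits 0..7. Read as a 32:32 value, it shows up with major == 0 and a
 * minor that fits in 16 bits.
 */
static inline int dev64_is_legacy(dev64_t dev)
{
    return DEV64_MAJOR(dev) == 0 && DEV64_MINOR(dev) <= 0xffffu;
}

/* Map a legacy value into the large form; leave anything else alone. */
static inline dev64_t dev64_normalize(dev64_t dev)
{
    uint32_t old;

    if (!dev64_is_legacy(dev))
        return dev;

    old = DEV64_MINOR(dev);
    return DEV64_MAKE((old >> 8) & 0xffu, old & 0xffu);
}

/* Usage example: old-style 0x0801 is major 8, minor 1. */
static void dev64_normalize_self_check(void)
{
    dev64_t dev = dev64_normalize(0x0801);

    assert(DEV64_MAJOR(dev) == 8);
    assert(DEV64_MINOR(dev) == 1);
    /* Normalizing twice gives the same answer. */
    assert(dev64_normalize(dev) == dev);
}
```

The function is idempotent under the stated assumption, which matters if a number might pass through more than one entry point.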
The important thing is that this remapping must happen consistently
everywhere in the kernel. So, in every place where device numbers enter
the kernel, they must be turned into a standard form, be it a combined
device number or some sort of tuple representation. In practice, this
remapping need not happen in many places; the mknod(),
open() and stat() system calls are the big ones.
Peter Anvin proposed a different way of
representing device numbers in a 64-bit word:
This representation appears to be more complicated, since obtaining the
major and minor numbers would require extracting and splicing bit fields.
It's worth noting again, however, that this work would be hidden within the
MAJOR() and MINOR() macros, and invisible to kernel
code. And, with this representation, no remapping of device numbers would
be required.
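To give a feel for what "extracting and splicing bit fields" involves, here is a sketch in C. The exact bit layout is not spelled out in the text here, so the one below is purely an assumption made for illustration and may not match Peter's proposal: the low 16 bits keep the old 8:8 major/minor arrangement, bits 16..39 hold the upper 24 bits of the minor number, and bits 40..63 hold the upper 24 bits of the major number. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMED layout, for illustration only:
 *   bits  0..7   low 8 bits of minor   (same place as in a 16-bit number)
 *   bits  8..15  low 8 bits of major   (same place as in a 16-bit number)
 *   bits 16..39  high 24 bits of minor
 *   bits 40..63  high 24 bits of major
 */

/* Splice the two major-number fields back into one 32-bit value. */
static inline uint32_t spliced_major(uint64_t dev)
{
    uint32_t lo = (uint32_t)((dev >> 8) & 0xffu);
    uint32_t hi = (uint32_t)((dev >> 40) & 0xffffffu);

    return (hi << 8) | lo;
}

/* Splice the two minor-number fields back into one 32-bit value. */
static inline uint32_t spliced_minor(uint64_t dev)
{
    uint32_t lo = (uint32_t)(dev & 0xffu);
    uint32_t hi = (uint32_t)((dev >> 16) & 0xffffffu);

    return (hi << 8) | lo;
}

/* Build a device number by scattering major and minor into the fields. */
static inline uint64_t spliced_make(uint32_t major, uint32_t minor)
{
    return ((uint64_t)(major >> 8) << 40) |
           ((uint64_t)(minor >> 8) << 16) |
           ((uint64_t)(major & 0xffu) << 8) |
           (uint64_t)(minor & 0xffu);
}

/* Usage example: an old 16-bit number decodes correctly with no remapping. */
static void spliced_self_check(void)
{
    assert(spliced_major(0x0801) == 8);
    assert(spliced_minor(0x0801) == 1);
    assert(spliced_make(8, 1) == 0x0801);
}
```

With a layout of this kind, any number whose major and minor both fit in eight bits has the same bit pattern it had as a 16-bit number, which is why no separate remapping step is needed.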
The discussion seemed to wind down in an inconclusive manner. The real
decisions will be made, of course, when the patches appear and are merged.