The unfinished SCSI job
[Posted October 21, 2003 by corbet]
The repository for SCSI patches has just been forked into two separate
trees. One of them is a bugfix-only
repository, with its contents meant to get past Linus's "stability fixes
only" filter and into the 2.6.0-test kernel. The other is for everything
else, which will be held for 2.7, or, at least, a post-2.6.0 release.
This change brought out the question: what about expanding the number of
SCSI disks (and partitions) that can be supported by the kernel? That was,
after all, one of the reasons for expanding the dev_t type in the
first place. The larger device numbers are now in place, but there are no
patches in the mainline to make more SCSI disks available.
There are, as it turns out, a few remaining
issues that must be addressed before the SCSI expansion can be
completed. One of those is naming. Currently, the first 26 SCSI drives
are called sda through sdz. Then a second letter is
added, making sdaa through sdzz available. The default
plan seems to be to go to sdaaa thereafter, and sdaaaa if
need be.
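
The naming progression described above can be sketched in a few lines of Python. This is only an illustration of the lettering rule (sda..sdz, then sdaa..sdzz, then sdaaa); the function name sd_name is invented for this example and is not taken from any kernel patch.

```python
import string


def sd_name(index: int) -> str:
    """Return the disk name for a zero-based drive index (0 -> 'sda')."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = []
    n = index
    while True:
        n, rem = divmod(n, 26)
        letters.append(string.ascii_lowercase[rem])
        if n == 0:
            break
        # The suffix has no "zero" digit (a..z all count), so step down
        # by one before producing the next letter to the left.
        n -= 1
    return "sd" + "".join(reversed(letters))
```

Under this rule the one- and two-letter suffixes cover 26 + 26*26 = 702 drives, so the three-letter names start at index 702.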
Is the number of partitions per drive to be expanded? The current limit of
fifteen is apparently constraining to some. As a result, there has been
persistent talk of raising the limit to 63.
That change, however, would create interesting numbering challenges. The
current numbering scheme divides the (eight-bit) minor number in half; the
upper nibble is the drive number, and the lower nibble is the partition
number. To support more partitions, the portion of the (now 20-bit) minor
number dedicated to the partition number would have to be expanded. A
naive implementation would simply remap the minor number so that bits 0..5
describe the partition, and bits 6..19 the drive number.
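
As a rough illustration of the two layouts just described, the following Python sketch splits a minor number under the current four-bit scheme and under the naive six-bit remapping. The constant and function names are made up for this example; they are not kernel interfaces.

```python
OLD_PART_BITS = 4    # current layout: low nibble is the partition
NAIVE_PART_BITS = 6  # naive remap: bits 0..5 are the partition


def join_minor(drive: int, part: int, part_bits: int) -> int:
    """Pack a drive number and partition number into a minor number."""
    if not 0 <= part < (1 << part_bits):
        raise ValueError("partition number does not fit")
    return (drive << part_bits) | part


def split_minor(minor: int, part_bits: int) -> tuple[int, int]:
    """Unpack a minor number into (drive, partition)."""
    return minor >> part_bits, minor & ((1 << part_bits) - 1)


# The first partition of the second drive, as numbered today.
old_minor = join_minor(drive=1, part=1, part_bits=OLD_PART_BITS)
```

Here old_minor is 17. Read with the current layout it is drive 1, partition 1; read with the naive six-bit layout, the same number would mean drive 0, partition 17.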
The only problem with that approach is that it would break all existing
SCSI device nodes. The kernel hackers have a sense that they might get a
complaint or two if they did that, so they are fairly strongly committed to
ensuring that old device numbers continue to work. As a result, there have
been proposals for more complicated schemes, with the two new partition
bits being placed, for example, up at the high end of the minor number.
This approach would put an end to the manual creation of device nodes for
large SCSI devices - who wants to figure out what number to give to
mknod? - but there is not likely to be much of that going on
anyway.
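
To make the idea of a backward-compatible layout concrete, here is a sketch of one possible arrangement. The exact bit positions are an assumption chosen for illustration, not a description of any posted patch: the low eight bits keep their current meaning, bits 8..17 extend the drive number, and bits 18..19 hold the two new partition bits.

```python
# Hypothetical 20-bit minor layout (an assumption for this example):
#   bits  0..3   partition, low four bits (same as today)
#   bits  4..7   drive, low four bits     (same as today)
#   bits  8..17  drive, high ten bits
#   bits 18..19  partition, high two bits


def encode_compat(drive: int, part: int) -> int:
    """Pack (drive, partition) so that old 8-bit minors are unchanged."""
    if not 0 <= drive < (1 << 14):
        raise ValueError("drive number out of range")
    if not 0 <= part < 64:
        raise ValueError("partition number out of range")
    return (
        ((part >> 4) << 18)      # two new partition bits at the top
        | ((drive >> 4) << 8)    # extra drive bits above the old byte
        | ((drive & 0xF) << 4)   # old drive nibble
        | (part & 0xF)           # old partition nibble
    )


def decode_compat(minor: int) -> tuple[int, int]:
    """Unpack a minor number from the hypothetical layout above."""
    part = (((minor >> 18) & 0x3) << 4) | (minor & 0xF)
    drive = (((minor >> 8) & 0x3FF) << 4) | ((minor >> 4) & 0xF)
    return drive, part
```

With this arrangement every minor number from 0 through 255 decodes exactly as it does now, while partition 17 of drive 1 lands at minor 262161; that is not a number anybody is likely to want to work out by hand.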
A better long-term approach might be to go to one or more completely new
major numbers for SCSI drives. The block layer could then assign numbers
dynamically as the drives are discovered, with a tool like udev creating
device nodes on demand. For sites that need old numbers to work, a small
compatibility module could map between the old and new numbers at device
open time. That is all certainly 2.7 material, however. For 2.6.0, the
most likely scenario might be the merging of a simple patch (like Badari
Pulavarty's patch found in the -mm tree) which expands the number of
disks supported in a relatively unintrusive way. The complete solution can
come later.