Partitioned loopback devices
The expanded device number type in the 2.6 kernel makes it possible, at the
lowest level, to support vast numbers of partitions on every block device
in the system. Unfortunately, the Linux block drivers have not caught up
with this change. SCSI, in particular, is still limited to 15 partitions
per device. There are a few reasons for this lag, but the largest is
simple compatibility: there is no easy way to incorporate support for more
partitions without breaking the existing device numbering scheme. The
block layer assumes that partitions have consecutive minor numbers, so
supporting more partitions means increasing the portion of the minor number
which is dedicated to the partition number. But changing the
interpretation of minor numbers in this way would break existing systems,
and that is something the kernel developers are reluctant to do.
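As a rough illustration of that constraint, the sketch below splits a minor number into a disk index and a partition number. The function names and the bit widths passed in are assumptions made for the example, not taken from any driver source; a 4-bit partition field corresponds to the 15-partition SCSI limit mentioned above, since slot 0 is taken to name the whole disk.

```python
from typing import Tuple


def split_minor(minor: int, partition_bits: int) -> Tuple[int, int]:
    """Return (disk_index, partition_number) for a minor number.

    Assumes the low `partition_bits` bits select the partition and the
    remaining high bits select the disk (hypothetical layout for this sketch).
    """
    mask = (1 << partition_bits) - 1
    return minor >> partition_bits, minor & mask


def max_partitions(partition_bits: int) -> int:
    """Usable partitions per disk: slot 0 is reserved for the whole disk."""
    return (1 << partition_bits) - 1
```

Widening the partition field changes which disk a given minor refers to, which is the compatibility problem in a nutshell: with 4 bits, minor 17 is partition 1 of disk 1; with 7 bits, the same minor would be partition 17 of disk 0.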
Carl-Daniel Hailfinger has recently posted an interesting solution to the partition limit: partitioned loopback devices. A loopback device is a kernel-implemented virtual block device which is backed by something real, usually a disk partition or a file on a disk somewhere. Common uses for loopback devices include mounting regular files as filesystems or the creation of encrypted filesystems (though the device mapper is the preferred means for the latter application in 2.6). Loopback devices do not support partitions in their own right; they simply provide block-level access to the backing store as a single partition.
Carl-Daniel noticed, however, that adding partition support to loopback devices would be a relatively straightforward thing to do. In 2.6, partition handling is (finally) part of the block layer; all that is really required to support partitions in the loopback driver is to tell the block layer that those partitions exist. So, with a small patch, each loopback device can have up to 127 partitions. The bulk of the patch, in fact, is there to ensure continued compatibility for users of non-partitioned loopback devices.
This capability is interesting because it is a simple matter of one losetup command to create a loopback interface to a real disk drive. Thus, by using loopback devices in this mode, system administrators can get around the partition limits enforced by the real hardware drivers and divide their disks into lots of tiny little pieces. There is some small overhead associated with using the loopback device, but, for users in need of more partitions, it may well be a price worth paying.
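The single losetup command could look something like the sketch below, wrapped in Python so that it can be inspected without being run. The device names /dev/loop0 and /dev/hdb are placeholders, and nothing here is specific to the patch; it only shows the ordinary "losetup <loop device> <backing store>" form, which needs root privileges when actually executed.

```python
import subprocess
from typing import List


def losetup_command(loop_device: str, backing_device: str) -> List[str]:
    """Build the argv for attaching a backing store to a loop device."""
    return ["losetup", loop_device, backing_device]


def attach(loop_device: str, backing_device: str, dry_run: bool = True) -> List[str]:
    """Return the command; run it only when dry_run is False (requires root)."""
    cmd = losetup_command(loop_device, backing_device)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

Calling attach("/dev/loop0", "/dev/hdb") leaves the system untouched and simply returns the command that would bind the whole disk to the loop device.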
Partitioned loopback devices Posted Nov 11, 2004 7:16 UTC (Thu) by Duncan (guest, #6647)
This is an interesting solution indeed. A couple months ago, my attention was drawn abruptly to this partition issue. As luck would have it, I had just decided to forgo SATA for another upgrade round and stick with PATA for one more generation, so it wasn't me. However, someone else on the Gentoo AMD64 list ran into the entirely predictable problem, attempting to upgrade his SATA disks from the old IDE side SATA drivers to the newer SCSI side SATA drivers. A good portion of his partitions were suddenly unreachable!!! Unfortunately, there wasn't much to tell him except to go back to the old kernel and drivers at least long enough to grab the data from the extra partitions, store it elsewhere, and repartition into fewer partitions.
I DID thank him, however, for pointing out the problem to this guy who had decided to wait another upgrade cycle for SATA, due to a general feeling that I was already pushing the envelope enough with newer AMD64 gear, and running ~amd64 (Gentoo uses ~ to denote unstable/beta, altho it's supposed to have been tested past alpha at least), and that I didn't want to gamble any further with as yet unstable driver implementations for SATA on TOP of the other leading/bleeding edge stuff I was running.
Anyway, it would have been very useful to have this solution available in the kernel at that time, such that with a couple additional configuration tweaks, he'd have been on his way. Barring some sort of magic and SCSI or at least the SATA-SCSI subset, developing >16 partition support by the time I DO switch, hopefully this solution WILL be in the mainline kernel by then and decently widely deployed and documented. As it happens, I've 20 partitions now, on my 250G PATA, and that's with ~100G still unpartitioned. It's possible I'll have mid-20s partitions by upgrade time, and be ready for even MORE, on what I expect by that time will be my new half terabyte or larger drive. (Or drives, if I go RAID by then, as I might.)
Maybe this'll serve as a bit of a heads-up to some others, thinking about upgrading to SATA, as well. It could certainly add a bit of unexpected complexity to your upgrade, if you aren't ready for it and have the 20-ish partitions I do.
Duncan
Partitioned loopback devices Posted Nov 11, 2004 11:59 UTC (Thu) by ekj (subscriber, #1524)
Just out of curiosity: what exactly are you doing that means it makes sense to make 20 partitions, on a single hard disk, totaling 150GB?
Partitioned loopback devices Posted Nov 11, 2004 12:13 UTC (Thu) by Liefting (subscriber, #8466)
And, more importantly, why are they not under LVM?
Partitioned loopback devices Posted Nov 13, 2004 13:18 UTC (Sat) by Duncan (guest, #6647)
Well, you asked... hda1 boot, 2 and 3 root and root-mirror (root copied to root-mirror periodically, when I know stuff is working, so I can just switch roots at the boot prompt if an update screwed things up and I can't boot my working root), 4 is of course the extended partition, mapping the additional logical partitions. That takes care of the four primary partitions.
5 and 19 are /usr and usr-mirror, giving me a backup /usr in the event an update screws my working copy up. 6-8 are my Gentoo portage partitions (which would normally be under /usr/portage, thus their location after /usr), 6 being the equivalent of /usr/portage, getting it off of /usr as it's rsynced as part of my daily update, 7 being the package sources (as opposed to the Gentoo portage ebuild install scripts on 6), and 8 being binary packages created at source merge time, so I don't have to go recompiling if I have to back up a version or two. The partitions serve to size discipline each of these, of course. 9 and 10 are /usr/src and /usr/local, thus getting them AND the /usr/portage dirs off of the /usr partition, making mirroring it much simpler. src doesn't need mirrored as the stuff there is easily replaced from the net, and local is mirrored to another disk.
11-13 are /var, a separate /var/log for size control reasons, and a separate ccache partition (which by default would be a subdir of /var). 14 is an empty /opt partition. 15 is a 10 gig /home (again, the backup is on another disk). 16-18 are my dedicated mail, news, and media partitions, also relatively large (20 gig mail archive, 8 gig news cache only, 40+ gig media archive, respectively). Thus, the 10 gig home is PLENTY big, even for duplicated backup user dirs. After 18, my media partition, is the 100 gig of blank space, allowing for expansion of the media partition or other flexibility as desired. 19 as I mentioned is the usr-mirror.
20 is a quite large 15 gig /tmp. I could easily do with just a gig, but I have the room, and I decided to appropriate enough space for it so I could stick a couple DVD images there if necessary, when I was partitioning. Also, emerge can take up to 5 gigs or so of tmpspace for packages such as OOo, according to reports, and while that's normally in /var/tmp for security in multi-user situations, that's not an issue here, so I have portage's tmpspace mapped to /tmp, allowing me to avoid yet ANOTHER partition for /var/tmp.
Note that I don't mention swap partitions. I have a gig of memory, and decided to disable swap in my kernel config, as I didn't need it and it only added needless complication and code complexity to the kernel. (On AMD64's flat memory architecture, the memory zone issues that cause problems with swap disabled on ia32 don't apply, and the first one that might hits at 4G, so with only a gig, I'm safe with it too.) I had done that while running Mandrake, so eliminated the swap partitions when I wiped Mandrake and reorganized Gentoo on the remaining space.
I mentioned a second disk. It's far smaller, only 36G, but I still keep two additional copies (backup-working and backup-backup) of / and /usr on it, meaning I have four copies of those critical partitions, a working and a backup copy on each of a working and backup disk. It has additional (single) partitions for /var, /usr/local, and /tmp, and a copy of the critical personal data from /home as well. With all that, I keep two copies of both disks' partition tables in /root, root's home, on the / partition, meaning a total of EIGHT copies of the partition tables, two each in four different /root homedirs. Likewise with fstab in /etc, eight copies of that as well (plus automated edit backups in fstab~).
I could have accomplished the same goal using mount --bind and fewer partitions, putting all the /usr subdir partitions on one partition in different subdirs mount-bound as appropriate, for example.
That would have kept me under the 16-partition barrier, and is actually what I may end up doing when I upgrade to SATA. However, the 20-partition thing has worked out quite well on PATA. I actually had a few more partitions (24, I think) when I was dual booting Mandrake and Gentoo, as I learned about Gentoo and made the switch. However, I reorganized things when I killed my Mandrake install, just as I had for it when I killed my MSWormOS install. As for LVM, I've not learned it yet, and besides, it'd only be something else that could go wrong. I do fine without it, tho I'll probably take the trouble to learn it at some point. Duncan
Partitioned loopback devices Posted Nov 18, 2004 12:04 UTC (Thu) by job (subscriber, #670)
Learn LVM! It's madness not to. All you need to learn are a few more words and two or three simple command line utilities. It's a half hour really well spent. It works just like partitions, but you can resize them at will, and refer to them by name instead of number (which gets really handy when these partitions, called volumes, span multiple disks).
Partitioned loopback devices Posted Nov 18, 2004 17:41 UTC (Thu) by wolfrider (guest, #3105)
--Webmin is your friend for LVM... Best interface I've seen since Yast.
' apt-cache search webmin|grep lvm '
Partitioned loopback devices Posted Nov 11, 2004 16:40 UTC (Thu) by vmole (subscriber, #111)
> Anyway, it would have been very useful to have this solution available in the kernel at that time, such that with a couple additional configuration tweaks, he'd have been on his way.
I don't think that's actually the case (although I haven't looked at the actual patch, so correct me if I'm wrong). The implication of this article was that you could create a loopback device whose backing store was a single SCSI (SATA) partition, and then partition the loopback device. Accessing existing partitions isn't the same thing.
Partitioned loopback devices Posted Nov 11, 2004 18:04 UTC (Thu) by pflugstad (subscriber, #224)
No, I think this patch lets you map a loopback device to an entire block device. See the example: he uses /dev/hdb with 60+ partitions.
libata limits Posted Nov 11, 2004 18:06 UTC (Thu) by pflugstad (subscriber, #224)
So libata is limited to 15 partitions as well? Is that related to the SCSI limitation somehow?
libata limits Posted Nov 13, 2004 13:22 UTC (Sat) by Duncan (guest, #6647)
There are two SATA implementations in the kernel. The older one is under IDE, and has the 64-partition IDE limit. The newer one (that uses libata) is part of the SCSI subsystem, yes, so is limited to the SCSI 16 partitions.
Duncan
Partitioned loopback devices Posted Nov 12, 2004 21:30 UTC (Fri) by giraffedata (subscriber, #1954)
I've always thought that partitioning should be done only by something like the loopback device, which means the logic could go in the loopback device driver instead of the block layer. This means the loopback device driver is an LVM, by the way. Is there some reason I've missed that partition awareness by the block layer is a good thing?
Partitioned loopback devices Posted Nov 15, 2004 12:27 UTC (Mon) by garloff (subscriber, #319)
But unfortunately we have only 255 loopback devices, don't we? So either we use 32k SCSI disks with 16 partitions each, or 256 SCSI disks in loopback mode with 127 partitions each. But not both. Therefore this does not offer a good generic solution :-(
Partitioned loopback devices Posted Nov 19, 2004 11:40 UTC (Fri) by Blaisorblade (guest, #25465)
> But unfortunately we have only 255 loopback devices, don't we?
We had those. But with 32 bit majors/minors, we can build far more (2^20 minors are available). And from reading the patch, it seems it can already take advantage of that (it uses MINOR_BITS to calculate the maximum minor number, and I assume MINOR_BITS is set to 20, i.e. the correct value).
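A back-of-the-envelope check of those numbers, as a sketch only: it assumes each partitioned loopback device occupies a block of 128 consecutive minors (one for the whole device plus the 127 partitions mentioned in the article), which is an inference rather than something read from the patch. The function name is made up for the example.

```python
def loop_device_capacity(minor_bits: int, partitions_per_device: int = 127) -> int:
    """How many partitioned loop devices fit in a minor space of `minor_bits` bits.

    Assumes one minor per partition plus one for the whole device.
    """
    minors_per_device = partitions_per_device + 1
    return (1 << minor_bits) // minors_per_device
```

Under that assumption, 20 minor bits leave room for 8192 partitioned loopback devices, while an 8-bit minor space would allow only 2.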
Partitioned loopback devices Posted Nov 20, 2004 18:33 UTC (Sat) by theraphim (subscriber, #25955)
Loop device partitioning (and 64-bit losetup offset) is handy when doing forensic analysis of entire hard disk images.
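For the disk-image case, the byte offset handed to losetup is a partition's starting sector multiplied by the sector size. The sketch below reads the four primary entries of a classic MBR partition table and computes those offsets; it assumes 512-byte sectors, ignores logical partitions inside an extended partition, and the function name is invented for the example.

```python
import struct
from typing import List, Tuple

SECTOR_SIZE = 512        # assumed sector size
TABLE_OFFSET = 446       # partition table starts here in the MBR sector
ENTRY_SIZE = 16          # each of the four primary entries is 16 bytes


def primary_partition_offsets(mbr: bytes) -> List[Tuple[int, int, int]]:
    """Return (partition_number, byte_offset, byte_length) for used primary entries."""
    if len(mbr) < SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("sector does not carry an MBR signature")
    found = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        entry = mbr[start:start + ENTRY_SIZE]
        partition_type = entry[4]
        # Starting LBA and sector count are little-endian 32-bit fields.
        lba_start, sector_count = struct.unpack_from("<II", entry, 8)
        if partition_type == 0 or sector_count == 0:
            continue  # unused slot
        found.append((index + 1, lba_start * SECTOR_SIZE, sector_count * SECTOR_SIZE))
    return found
```

Each offset could then be passed to losetup's -o option to expose one partition of the image through an unpatched loop driver; with partitioned loop devices that per-partition bookkeeping would no longer be necessary.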
Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds