Driver porting: Char devices and large dev_t
This article is part of the LWN Porting Drivers to 2.6 series.
Major and minor numbers
With the expanded dev_t, it is no longer possible to assume that major and minor numbers fit within eight bits. To the greatest extent possible, the relevant interfaces have been changed in ways that will not break existing drivers. In particular, a driver which uses the longstanding register_chrdev() function to register a char device will never see minor device numbers greater than 255. Attempts to open a device node with a larger minor number will simply fail with a "no such device" error.
One change that is visible to all drivers, however, is the elimination of the kdev_t type. Device numbers are now a simple dev_t throughout the kernel. For most drivers, this change will be most apparent in the type of the inode i_rdev field. Drivers which need to get major or minor numbers from inodes should use the two new helper functions:
unsigned iminor(struct inode *inode);
unsigned imajor(struct inode *inode);
Use of these functions will help keep a driver working in the future, even if the representation within inodes changes again.
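To make the "more than eight bits" point concrete, here is a small user-space sketch of how a device number can be packed and unpacked. It assumes the 12-bit major / 20-bit minor in-kernel layout used by 2.6 kernels; the model_ names are hypothetical stand-ins, not kernel interfaces, and real drivers should keep using iminor() and imajor() rather than picking numbers apart by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: a 12-bit major number above a 20-bit minor number. */
#define MODEL_MINORBITS 20
#define MODEL_MINORMASK ((1u << MODEL_MINORBITS) - 1)

typedef uint32_t model_dev_t;   /* hypothetical stand-in for dev_t */

/* Pack a major/minor pair into a single device number. */
static model_dev_t model_mkdev(unsigned major, unsigned minor)
{
    assert(major < (1u << 12));
    assert(minor <= MODEL_MINORMASK);
    return ((model_dev_t)major << MODEL_MINORBITS) | minor;
}

/* Extract the two halves again. */
static unsigned model_major(model_dev_t dev)
{
    return dev >> MODEL_MINORBITS;
}

static unsigned model_minor(model_dev_t dev)
{
    return dev & MODEL_MINORMASK;
}
```

In this model, a minor number such as 300 survives the round trip intact, which would not be the case if a driver truncated it to eight bits.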
The new way
register_chrdev() continues to work as it always did, and drivers which use that function need not be changed. Unchanged drivers, however, will not be able to use the expanded device number range, or take advantage of the other features provided by the new code. Sooner or later, it is worthwhile to get to know the new interface.
The new way to register a char device range is with:
int register_chrdev_region(dev_t from, unsigned count, char *name);
Here, from is the device number of the first device in the range, count is the number of device numbers to register, and name is the base name of the device (it appears in /proc/devices). The return value is zero if all goes well, and a negative error number otherwise.
Note that from is a device number, not a major number. This interface allows the registration of an arbitrary range of device numbers, starting from anywhere. So the from argument specifies both the beginning major and minor number. If the count argument exceeds the number of minor numbers available, the allocation will continue on into the next major number; this is a design feature.
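The spill into the next major number follows from device numbers being consecutive integers. The user-space sketch below works through the arithmetic, assuming a 20-bit minor field (the layout used by 2.6 kernels); the model_ helpers are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: 20-bit minor field below the major number. */
#define MODEL_MINORBITS 20
#define MODEL_MINORMASK ((1u << MODEL_MINORBITS) - 1)

/* Last device number in a region of `count` numbers starting at `from`. */
static uint32_t model_region_last(uint32_t from, unsigned count)
{
    assert(count > 0);
    return from + count - 1;
}

/* How many distinct major numbers the region touches. */
static unsigned model_majors_spanned(uint32_t from, unsigned count)
{
    uint32_t last = model_region_last(from, count);

    return (last >> MODEL_MINORBITS) - (from >> MODEL_MINORBITS) + 1;
}
```

For example, a region of four numbers starting at major 250, minor 1048574 (the second-to-last minor) covers the last two minors of major 250 and then minors 0 and 1 of major 251.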
register_chrdev_region() works if you know which major device number you wish to use. If, instead, your driver expects to work with dynamic major number allocation, it should use:
int alloc_chrdev_region(dev_t *dev, unsigned baseminor,
unsigned count, char *name);
In this case, dev is an output-only parameter which will be set to the first device number of the allocated range. The input parameters are baseminor, the first minor number to use (usually zero); count, the number of device numbers to allocate; and name, the base name of the device. Once again, the return value is zero or a negative error code.
Connecting up devices
Some readers may have noticed that the above functions, unlike register_chrdev(), do not have a file_operations argument. Registering a device number range sets those numbers aside for your use, but it does not actually make any device operations available to user space. There is now a separate object (struct cdev) which represents char devices, and which must be set up by your driver to actually make a device available.
To work with struct cdev, your code should include <linux/cdev.h>. Then, the usual way of getting one of these structures is with:
struct cdev *cdev_alloc(void);
If all goes well, the return value will be a pointer to a newly allocated, initialized cdev structure. Check that value, though; there is a memory allocation involved, and things can always fail.
It is also possible to declare a static cdev structure, or to embed one within another structure. In this case, you should pass it to:
void cdev_init(struct cdev *cdev, struct file_operations *fops);
before doing anything else with it.
Your driver will need to set a couple of fields in the cdev structure before adding it to the system. The owner field should be set to the owning module, usually THIS_MODULE. The device's file_operations structure should be pointed to by the ops field. And, to get a directory in sysfs, you should also set the name field in the embedded kobject, with something like:
struct cdev *my_cdev = cdev_alloc();
kobject_set_name(&my_cdev->kobj, "my_cdev%d", devnum);
Note that kobject_set_name() takes a printf()-like format string and associated arguments.
Once you have the structure set up, it's time to add it to the system:
int cdev_add(struct cdev *cdev, dev_t dev, unsigned count);
cdev is, of course, a pointer to the cdev structure; dev is the first device number handled by this structure, and count is the number of devices it implements. Thus, one cdev structure can stand in for several physical devices, though you will usually not want to do things that way.
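The meaning of the dev and count arguments can be sketched as a simple range lookup: an open of a given device number reaches a cdev only if that number falls within the range the cdev was added with. The user-space model below shows only those semantics; the structure and function names are hypothetical, and the kernel's actual lookup machinery is organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one cdev_add()-style registration. */
struct model_range {
    uint32_t first;    /* first device number covered */
    unsigned count;    /* how many consecutive numbers follow */
    const char *name;  /* which driver object owns the range */
};

/* Return the entry covering `dev`, or NULL if no entry does. */
static const struct model_range *
model_lookup(const struct model_range *map, size_t n, uint32_t dev)
{
    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* dev lies in the half-open range [first, first + count) */
        if (dev >= map[i].first && dev - map[i].first < map[i].count)
            return &map[i];
    }
    return NULL;
}
```

A number one past the end of a range finds nothing, which is the model's equivalent of the open failing before any driver code runs.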
There are two important things to bear in mind when calling cdev_add(). The first is that this call can fail. If the return value is nonzero, the device has not been added and is not visible to user space. If, instead, the call succeeds, the device becomes immediately live. You should not call cdev_add() until your driver is completely ready to handle calls to the device's methods.
Adding a device also creates a directory entry under /sys/cdev, using the name stored in the kobj.name field. As of this writing, that directory is empty, but one assumes that all sorts of good things (the associated device numbers, if nothing else) will eventually show up there.
Deleting devices
If you need to get rid of a cdev structure, the usual way of doing things is to call:
void cdev_del(struct cdev *cdev);
This function should only be called, however, on a cdev structure which has been successfully added to the system with cdev_add(). If you need to destroy a structure which has not been added in this way (perhaps cdev_add() failed), you must, instead, manually decrement the reference count in the structure's kobject with a call like:
kobject_put(&cdev->kobj);
Calling cdev_del() on a device which is still active (if, say, a user-space process still has an open file reference to it) will cause the device to become inaccessible, but it will not actually delete the structure at that time. The reference count in the structure will keep it around until all the references have gone away. That means that your driver's methods could be called after you have deleted your cdev object - a possibility you should be aware of.
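The lifetime rule described here can be illustrated with a toy reference-counting model in user space. It is a deliberately simplified sketch: the model_ structure and functions are hypothetical, and the kernel does this work through the cdev's embedded kobject rather than through code like this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted cdev. */
struct model_cdev {
    int refcount;    /* references currently held */
    bool reachable;  /* can new opens still find the device? */
    bool freed;      /* has the structure been released? */
};

/* A new structure starts with one reference, held by its creator. */
static void model_init(struct model_cdev *c)
{
    c->refcount = 1;
    c->reachable = false;
    c->freed = false;
}

/* Adding the device makes it visible to new opens. */
static void model_add(struct model_cdev *c)
{
    c->reachable = true;
}

/* An open succeeds only while the device is reachable. */
static bool model_open(struct model_cdev *c)
{
    if (!c->reachable)
        return false;
    c->refcount++;
    return true;
}

/* Drop one reference; the last one frees the structure. */
static void model_put(struct model_cdev *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0)
        c->freed = true;
}

/* Deleting hides the device and drops the creator's reference. */
static void model_del(struct model_cdev *c)
{
    c->reachable = false;
    model_put(c);
}
```

In this model, deleting a device that still has an open file leaves the structure alive but unreachable; it is freed only when the last holder calls model_put(). Until then, methods can still be invoked through the existing open file.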
The reference count of a cdev structure can be manipulated with:
struct kobject *cdev_get(struct cdev *cdev);
void cdev_put(struct cdev *cdev);
Note that these functions change two reference counts: that of the cdev structure, and that of the module which owns it. It will be rare for drivers to call these functions, however.
Finding your device in file operations
Most of the methods provided by the driver in the file_operations structure take a struct inode (or a struct file which can be used to find the associated inode) as an argument. Traditionally, Linux drivers have looked at the device number stored in the inode's i_rdev field to determine which device is being operated upon. That technique still works, but, in many cases, there is a better way. In 2.6, struct inode contains a field called i_cdev, which contains a pointer to the associated cdev structure. If you have embedded one of those structures within your own, device-specific structure, you can use the container_of() macro (described in the kobject article) to obtain a pointer to that structure.
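The container_of() technique can be tried out in user space. In the sketch below, struct cdev and struct inode are minimal mock-ups rather than the kernel's definitions, the macro is a simplified version of the kernel's, and struct my_device is a hypothetical driver-specific structure.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock-ups: only what this example needs, not the kernel's layouts. */
struct cdev {
    int placeholder;
};

struct inode {
    struct cdev *i_cdev;
};

/* Hypothetical device-specific structure with an embedded cdev. */
struct my_device {
    int unit;
    struct cdev cdev;
};

/* Recover the enclosing my_device from the inode's i_cdev pointer. */
static struct my_device *my_device_from_inode(struct inode *inode)
{
    assert(inode->i_cdev != NULL);
    return container_of(inode->i_cdev, struct my_device, cdev);
}
```

A driver's open() method could use a helper of this shape to find its per-device data without consulting the device number at all. This works only if the cdev really is embedded in the driver's structure; it does not apply to a cdev obtained from cdev_alloc().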
Why things were done this way
The new interface may seem rather more complex to many. Before, a single call to register_chrdev() was all that was necessary; now a driver has to deal with the additional hassle of managing cdev structures. This approach provides a great deal of flexibility, however, in how the device number space can be managed. Each device gets exactly the number range it needs, and its operations will never be invoked for device numbers outside that range. In the past, it has been noted that many drivers had incorrect range checks on minor numbers; with the new scheme, all those range checks can go away altogether.
The new method also makes it easy for each device to have its own file_operations structure without the need for big switch statements in the open() method. Separate cdev structures can also have separate entries in /sys/cdev.
In general, char devices have become proper objects within the kernel, with all the advantages that come with that status. A little bit of extra object management is a small price to pay.
