
Kernel development

Brief items

Kernel release status

The current 2.6 prepatch is 2.6.19-rc4, released by Linus on October 30. The changelog notes that this kernel is "not scary," but it does contain a problem in the block layer resulting from a missed warning (see below). Quite a few fixes made it into this release, including a fix for the change that broke ndiswrapper. The long-format changelog has all the details.

Patches continue to accumulate in the mainline git repository; post-rc4 changes include some networking fixes, some eCryptfs changes, and a few large architecture updates.

Adrian Bunk continues to maintain a list of known regressions in the current 2.6 prepatches.

The current -mm tree is 2.6.19-rc4-mm1. Recent changes to -mm include the dropping of the ACPI and driver core trees (due to various problems) and some i386 paravirtualization support patches.


Kernel development news

Quote of the week

How many times have you seen some code coming out of a "GPL code release" from one of the many (mostly embedded) vendors that was actually useful to be contributed back to an existing Free Software project, or even that spawned a new Free Software project? I for my part am certain to say: Zero. The actual number might be close to zero, but very small anyways.

-- Harald Welte


Buried in warnings

The 2.6.19-rc4 prepatch release did not go quite as well as the developers might have liked; some confusion over the return type for an internal function led to an undesirable mixing of pointer and integer types in the depths of the block layer. As it turns out, gcc noticed this problem and duly issued warnings about it, but nobody saw them before the mistaken patch was merged and the resulting kernel shipped. This is, in other words, a problem which should have been easily avoidable.

Linus responded this way:

And I have SYSFS enabled, so I should have seen this warning.

But I've become innoculated against warnings, just because we have too many of the totally useless noise about deprecation and crud, and ppc has it's own set of bogus compiler-and-linker-generated warnings..

At some point we should get rid of all the "politeness" warnings, just because they can end up hiding the _real_ ones.

A few kernel developers were doubtless wondering just why it took so long to reach this point - there have been complaints about excessive warnings for some time now. There is a lot of support for having the computer find problems whenever possible, and that has led to an increasing number of "must check" annotations and other changes which cause warnings to be issued whenever something looks suspicious. On top of that, gcc generates a fair number of warnings in situations where no real problems exist. The end result is that warnings which refer to real problems tend to get lost in the flood.

Patches which address many of the spurious "this variable might not be initialized before being used" warnings have been circulating for some time. There is resistance to applying them, however; some developers resent cluttering up the code (and bloating the kernel) with unneeded initializations to deal with what they see as a gcc bug. There is no real sign that this latest episode has changed the thinking on that score; the initialization patches may well continue to languish.

A different approach has been taken by Al Viro. He has developed a little tool called "remapper" which tracks how blocks of code move around from one kernel version to the next. Using the generated information, a set of compiler warnings from an old kernel can be remapped to their line numbers in a newer kernel. Then, a tool like diff can be used to compare the output from old and new compiles; the end result is a listing of the warnings which first appear in the new kernel - and only those. With this filtered output, developers can quickly find places where the compiler has pointed out real problems.

Remapper can be had via git from:


Dave Jones also makes daily snapshots available.

Use of remapper is relatively straightforward: after building the remap-log tool, one starts with a command like this:

    diff-remap-data 2.6.19-rc2 2.6.19-rc3 >

The resulting "map" file is full of file names and numbers; they simply map line numbers from the old directory tree to the new one - and mark blocks of code which were removed altogether. There is another tool (git-remap-data) which performs the same task for two commits in a git repository; in this case, file renames can be handled properly as well.

The remap-log tool can then be used to move old compile logs into the present:

    remap-log < 2.6.19-rc2.log > 2.6.19-rc2-remapped.log

If the new log is then compared to the output from a 2.6.19-rc3 build with diff, the only output will be any warnings (or errors) which have appeared or disappeared between the two kernel versions. Those which have only moved due to changes elsewhere in the file will be filtered out. The short documentation file packaged with the code offers some other potential uses, such as carrying forward annotated grep output as an ongoing "to do" list.
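The remapping step itself can be pictured with a small sketch. The in-memory layout below is hypothetical (the real map file format is not described here, and the names line_range and remap_line are invented for illustration); it only assumes that each entry says where a run of old lines ended up, or that the run was removed:

```c
#include <stddef.h>

/*
 * Hypothetical map entry: `count` lines which began at `old_start` in the
 * old tree now begin at `new_start` in the new one. A new_start of 0 marks
 * a block of code which was removed altogether.
 */
struct line_range {
    unsigned long old_start;
    unsigned long count;
    unsigned long new_start;
};

/*
 * Translate a line number from the old tree into the new tree. Returns 0
 * if the line was removed or is not covered by any entry in the map.
 */
static unsigned long remap_line(const struct line_range *map, size_t n,
                                unsigned long old_line)
{
    for (size_t i = 0; i < n; i++) {
        const struct line_range *r = &map[i];

        if (old_line >= r->old_start && old_line - r->old_start < r->count) {
            if (r->new_start == 0)
                return 0;       /* the code is gone in the new tree */
            return r->new_start + (old_line - r->old_start);
        }
    }
    return 0;
}
```

A warning logged against line 20 of the old file, with a map saying that lines 16 and up moved down by five, would be rewritten to refer to line 25; once every old warning has been rewritten this way, a plain diff against the new log shows only what has truly changed.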

Some developers swear by this tool. Jeff Garzik, however, is not entirely pleased; in an earlier discussion he said:

I think it's both sad, and telling, that the high level of build noise has trained kernel hackers to tune out warnings, and/or build tools of ever-increasing sophistication just to pick out the useful messages from all the noise.

Jeff has, instead, put together a separate kernel tree with many of the bogus warnings silenced. It is a labor-intensive task - each warning must be investigated and shown to be spurious before being quieted. This work is not intended for merging; instead, it's meant to help create a development platform in which the useful warnings can actually be seen. This set of changes has been part of the -mm tree since 2.6.18-mm3.

Yet another approach to the "may be uninitialized" warnings was floated last May; it introduces a special macro which "initializes" a variable without actually doing anything. That silences the warning without adding to the size of the kernel. The macro is only supposed to be used in cases where the code paths have been audited. The objection that was raised at the time was that, while the current use of a variable might be correct, future changes to the code could introduce a path where that variable is, indeed, used without initialization. The warning would still be suppressed, however, and the bug might not be caught until much later. So the patch was never merged.

Compiler bugs can, perhaps, eventually be fixed. But the increasing interest in the use of automated tools to find potential bugs all but guarantees that there will continue to be a stream of spurious warnings for developers to deal with. If those automated warnings are to lead to real fixes - before somebody gets burned - ways of keeping the noise level down will have to be found.


Upcoming API change: struct path

The file structure, representing an open file, is passed into the vast majority of filesystem and driver-oriented operations. It contains a couple of useful fields:

	struct dentry		*f_dentry;
	struct vfsmount         *f_vfsmnt;

Josef Sipek recently noticed that in fs/namei.c there is a similar-looking structure defined:

    struct path {
	struct vfsmount *mnt;
	struct dentry *dentry;
    };

He then decided that struct path deserved wider circulation; the result was a series of patches moving struct path into <linux/namei.h> and changing struct file to use struct path in place of the two separate fields listed above.
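As a rough illustration of what the change means for code using struct file, here is a user-space sketch with stand-in types. The names file_old, file_new, get_dentry_new(), and same_location() are invented for this example, and the member name f_path is an assumption rather than something taken from the patches:

```c
/* Stand-in types; the real kernel structures are far larger. */
struct dentry   { const char *name; };
struct vfsmount { const char *devname; };

/* The pair, as defined in fs/namei.c. */
struct path {
    struct vfsmount *mnt;
    struct dentry *dentry;
};

/* Old layout: two separate pointers in the file structure. */
struct file_old {
    struct dentry   *f_dentry;
    struct vfsmount *f_vfsmnt;
};

/* New layout: the pair travels together (member name assumed). */
struct file_new {
    struct path f_path;
};

/* Code which used file->f_dentry must now go through the embedded path. */
static struct dentry *get_dentry_new(const struct file_new *f)
{
    return f->f_path.dentry;
}

/* A helper needing both halves can take one argument per location. */
static int same_location(const struct path *a, const struct path *b)
{
    return a->mnt == b->mnt && a->dentry == b->dentry;
}
```

The second helper shows the claimed benefit: anything which works on a dentry/vfsmount pair can pass a single struct path around instead of two loose pointers.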

Of course, a certain amount of kernel code is accustomed to struct file in its older configuration; in particular, the f_dentry field is widely used. This move is thus an internal API change which takes a bit of work to fix up. When the whole patch set went into 2.6.19-rc3-mm1, Andrew Morton annotated it as "102 patches to do something rather pointless."

So what is the point? When asked, Josef explained it like this:

It's little cleaner than having two pointers. In general, there is a number of users of dentry-vfsmount pairs in the kernel, and struct path nicely wraps it

"A little cleaner" tends to be fairly faint praise for a patch which touches this many files and will affect a lot of out-of-tree code as well. It has made it as far as -mm, however, suggesting that it has a good chance of getting into 2.6.20. Pointless or not, struct path appears to be coming.


Video4Linux2 part 3: Basic ioctl() handling

The Video4Linux2 API series.
Anybody who has spent any amount of time working through the Video4Linux2 API specification will have certainly noted that V4L2 makes heavy use of the ioctl() interface. Perhaps more than just about any other type of peripheral, video hardware has a vast number of knobs to tweak. Video streams have many parameters associated with them, and, often, there is quite a bit of processing done in the hardware. Trying to operate video hardware outside of its well-supported modes can lead to poor performance at best, and often no performance at all. So there is no alternative to exposing many of the hardware's features and quirks to the end application.

Traditionally, video drivers have included ioctl() functions of approximately the same length as a Neal Stephenson novel; while the functions often come to more satisfying conclusions than the novels, they do tend to drag a lot in the middle. So the V4L2 API was changed in 2.6.18; the interminable ioctl() function has been replaced with a large set of callbacks which implement the individual ioctl() functions. There are, in fact, 79 of them in 2.6.19-rc3. Fortunately, most drivers need not implement all - or even most - of the possible callbacks.

What has really happened is that the long ioctl() function has been moved into drivers/media/video/videodev.c. This code handles the movement of data between user and kernel space and dispatches individual ioctl() calls to the driver. To use it, the driver need only use video_ioctl2() as its ioctl() method in the video_device structure. Actually, most drivers should be able to use it as unlocked_ioctl() instead; the locking within the Video4Linux2 layer can handle it, and drivers should have proper locking in place as well.
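The general shape of this arrangement can be shown with a toy model in plain C. This is not the code in videodev.c; every toy_ and demo_ name is invented for the example, and returning -EINVAL for a missing callback is an assumption:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Invented command codes standing in for the real VIDIOC_* values. */
enum toy_cmd { TOY_QUERYCAP, TOY_LOG_STATUS, TOY_G_FMT };

struct toy_capability {
    char driver[16];
    unsigned int capabilities;
};

/* One optional callback per command; a driver fills in what it supports. */
struct toy_ioctl_ops {
    int (*querycap)(void *priv, struct toy_capability *cap);
    int (*log_status)(void *priv);
};

/* The one long switch lives here, once, instead of in every driver. */
static int toy_ioctl2(const struct toy_ioctl_ops *ops, void *priv,
                      enum toy_cmd cmd, void *arg)
{
    switch (cmd) {
    case TOY_QUERYCAP:
        if (!ops->querycap)
            return -EINVAL;
        /* Hand the driver a zeroed structure to fill in. */
        memset(arg, 0, sizeof(struct toy_capability));
        return ops->querycap(priv, arg);
    case TOY_LOG_STATUS:
        if (!ops->log_status)
            return -EINVAL;
        return ops->log_status(priv);
    default:
        return -EINVAL;     /* command not handled by this model */
    }
}

/* A driver which implements only a single callback. */
static int demo_querycap(void *priv, struct toy_capability *cap)
{
    (void)priv;
    strcpy(cap->driver, "demo");
    cap->capabilities = 0x1;
    return 0;
}

static const struct toy_ioctl_ops demo_ops = {
    .querycap = demo_querycap,
};
```

With this structure, a driver which leaves a callback unset simply causes the corresponding command to fail; it never needs a switch statement of its own.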

The first callback your driver is likely to implement is:

    int (*vidioc_querycap)(struct file *file, void *priv, 
                           struct v4l2_capability *cap);

This function handles the VIDIOC_QUERYCAP ioctl(), which asks a simple "who are you and what can you do?" question. Implementing it is mandatory for V4L2 drivers. In this function, as with all other V4L2 callbacks, the priv argument is the contents of the file->private_data field; the usual practice is to point it at the driver's internal structure representing the device at open() time.

The driver should respond by filling in the structure cap and returning the usual "zero or negative error code" value. On successful return, the V4L2 layer will take care of copying the response back into user space.

The v4l2_capability structure (defined in <linux/videodev2.h>) looks like this:

    struct v4l2_capability
    {
	__u8	driver[16];	/* i.e. "bttv" */
	__u8	card[32];	/* i.e. "Hauppauge WinTV" */
	__u8	bus_info[32];	/* "PCI:" + pci_name(pci_dev) */
	__u32   version;        /* should use KERNEL_VERSION() */
	__u32	capabilities;	/* Device capabilities */
	__u32	reserved[4];
    };

The driver field should be filled in with the name of the device driver, while the card field should have a description of the hardware behind this particular device. Not all drivers bother with the bus_info field; those that do usually use something like:

    sprintf(cap->bus_info, "PCI:%s", pci_name(&my_dev));

The version field holds a version number for the driver. The capabilities field is a bitmask describing various things that the driver can do:

  • V4L2_CAP_VIDEO_CAPTURE: The device can capture video data.
  • V4L2_CAP_VIDEO_OUTPUT: The device can perform video output.
  • V4L2_CAP_VIDEO_OVERLAY: It can do video overlay onto the frame buffer.
  • V4L2_CAP_VBI_CAPTURE: It can capture raw video blanking interval data.
  • V4L2_CAP_VBI_OUTPUT: It can do raw VBI output.
  • V4L2_CAP_SLICED_VBI_CAPTURE: It can do sliced VBI capture.
  • V4L2_CAP_SLICED_VBI_OUTPUT: It can do sliced VBI output.
  • V4L2_CAP_RDS_CAPTURE: It can capture Radio Data System (RDS) data.
  • V4L2_CAP_TUNER: It has a computer-controllable tuner.
  • V4L2_CAP_AUDIO: It can capture audio data.
  • V4L2_CAP_RADIO: It is a radio device.
  • V4L2_CAP_READWRITE: It supports the read() and/or write() system calls; very few devices will support both. It makes little sense to write to a camera, normally.
  • V4L2_CAP_ASYNCIO: It supports asynchronous I/O. Unfortunately, the V4L2 layer as a whole does not yet support asynchronous I/O, so this capability is not meaningful.
  • V4L2_CAP_STREAMING: It supports ioctl()-controlled streaming I/O.

The final field (reserved) should be left alone. The V4L2 specification requires that reserved be set to zero, but, since video_ioctl2() sets the entire structure to zero, that is nicely taken care of.

A fairly typical implementation can be found in the "vivi" driver:

    static int vidioc_querycap (struct file *file, void  *priv,
					struct v4l2_capability *cap)
    {
	strcpy(cap->driver, "vivi");
	strcpy(cap->card, "vivi");
	cap->version = VIVI_VERSION;
	cap->capabilities =	V4L2_CAP_VIDEO_CAPTURE |
				V4L2_CAP_STREAMING     |
				V4L2_CAP_READWRITE;
	return 0;
    }

Given the presence of this call, one would expect that applications would use it and avoid asking specific devices to perform functions that they are not capable of. In your editor's limited experience, however, applications tend not to pay much attention to the VIDIOC_QUERYCAP call.

Another callback, which is optional and not often implemented, is:

    int (*vidioc_log_status) (struct file *file, void *priv);

This function, implementing VIDIOC_LOG_STATUS, is intended to be a debugging aid for video application writers. When called, it should print information describing the current status of the driver and its hardware. This information should be sufficiently verbose to help a confused application developer figure out why the video display is coming up blank. Your editor would also recommend, however, that it be moderated with a call to printk_ratelimit() to keep it from being used to slow the system and fill the logfiles with junk.

The next installment will start in on the remaining 77 callbacks. In particular, we will begin to look at the long process of negotiating a set of operating modes with the hardware.


Page editor: Jonathan Corbet

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds