
Resurrecting fbdev

By Jake Edge
January 19, 2022

The Linux framebuffer device (fbdev) subsystem has long languished in something of a purgatory; it was listed as "orphaned" in the MAINTAINERS file and saw fairly minimal maintenance, mostly driven by developers working elsewhere in the kernel graphics stack. That all changed, in an eye-opening way, on January 17, when Linus Torvalds merged a change to make Helge Deller the new maintainer of the subsystem. But it turns out that the problems in fbdev run deep, at least according to much of the rest of the kernel graphics community. By seeming to take on the maintainer role in order to revert the removal of some buggy features from fbdev, Deller has created something of a controversy.

Part of the concern within the graphics community is the accelerated timeline that these events played out on. Deller posted his intention to take over maintenance of the framebuffer on Friday, January 14, which received an ack from Geert Uytterhoeven later that day. Two days later, before any other responses had come in, Deller sent a pull request to Torvalds to add Deller as the fbdev maintainer, which was promptly picked up. On January 19, Deller posted reversions of two patch sets that removed scrolling acceleration from fbdev. In the meantime, those reversions had already been made in Deller's brand new fbdev Git tree.

The patch sets that were being targeted for reversion had been posted and merged some time ago. Daniel Vetter disabled accelerated scrolling for the framebuffer console (fbcon) back at the end of 2020. At the time, he added a "todo" item to garbage collect the code that supported that accelerated scrolling. Claudio Suarez posted a patch completing that todo item in September 2021, which was committed in October. On January 13, shortly before deciding to take on maintenance of fbdev, Deller asked for a reversion of the latter patch (or parts of it).

Once Monday January 17 rolled around, Vetter and others noticed the flurry of activity that had occurred over the weekend and weighed in. Vetter suggested that it might have been premature to make a maintainer change "without even bothering to get any input from the people who've been maintaining it before". In particular, he was concerned about moving fbdev and fbcon to a tree separate from the DRM tree; the subsystem may have been marked as orphaned but the situation is more complicated than that:

Because the status isn't entirely correct, fbdev core code and fbcon and all that has been maintained, but in bugfixes only mode. And there's very solid&important reasons to keep merging these patches through a drm tree, because that's where all the driver development happens, and hence also all the testing (e.g. the drm test suite has some fbdev tests - the only automated ones that exist to my knowledge - and we run them in CI [continuous integration]). So moving that into an obscure new tree which isn't even in linux-next yet is no good at all.

Now fbdev driver bugfixes is indeed practically orphaned and I very much welcome anyone stepping up for that, but the simplest approach there would be to just get drm-misc commit rights and push the oddball bugfix in there directly.

Beyond that, Jani Nikula was taken aback by the whirlwind pace of the changes. In particular, he was not happy to see the reversions being made in the new fbdev tree almost immediately, even though the objection was only made a few days earlier. "I'm heavily in favor of maintainers who are open, transparent, collaborative, who seek consensus through discussion, and only put their foot down when required." Deller said that he had just started going through the backlog of patches; "nothing has been pushed yet". He said that Nikula should simply ignore the state of the fbdev tree at this point.

In response to Vetter, Deller said that having a separate tree was not important. He listed four goals for maintaining fbdev going forward:

  1. to get fixes which were posted to fbdev mailing list applied if they are useful & correct,
  2. to include new drivers (for old hardware) if they arrive (probably happens rarely but there can be). I know of at least one driver which won't be able to support DRM.... Of course, if the hardware is capable to support DRM, it should be written for DRM and not applied for fbdev.
  3. reintroduce the state where fbcon is fast on fbdev. This is important for non-DRM machines, either when run on native hardware or in an emulator.
  4. not break DRM development

Vetter pointed Deller to the documentation for coming up to speed on DRM development and for getting commit rights in the drm-misc tree, which is the proper path for fbdev fixes, he said. After that:

I think once we've set that up and got it going we can look at the bigger items. Some of them are fairly low-hanging fruit, but the past 5+ years absolutely no one bothered to step up and sort them out. Other problem areas in fbdev are extremely hard to fix properly, without only doing minimal security-fixes only support, so fair warning there. I think a good starting point would be to read the patches and discussions for some of the things you've reverted in your tree.

Anyway I hope this gets you started, and hopefully after a minor detour: Welcome to dri-devel, we're happy to take any help we can get, there's lots to do!

Deller eventually decided to keep the fbdev tree, though he does plan to coordinate with the rest of the graphics development community:

I'm not planning to push code to fbdev/fbcon without having discussed everything on dri-devel. Everything which somehow would affect DRM needs to be discussed on dri-devel and then - after agreement - either pushed via the fbdev git tree or the drm-misc tree.

It is clear there are differences of opinion on how to proceed. The hardware-accelerated scrolling that was removed was dependent on the 2D bit-blit acceleration features of older hardware. But the code that used it in the fbdev drivers was apparently rather buggy; over the years, syzbot repeatedly found problems in that code, which is why it was eventually removed. The DRM subsystem does not have support for 2D acceleration, and will not, due to some serious technical difficulties in doing so.

On the other hand, Deller and others have graphics hardware that uses the fbdev drivers and, formerly, had reasonable performance using the hardware-accelerated scrolling. That scrolling performance went away when the code was removed, and they would like to get it back. But reverting the removals simply brings back the buggy code. From the perspective of the DRM developers, the right way forward is to create DRM-based drivers for these devices, but Deller and others disagree.

The larger issue is how the transition has been handled, Vetter said in the reversion thread:

The other side is that being a maintainer is about collaboration, and this entire fbdev maintainership takeover has been a demonstration of anything but that. [...] This entire affair of rushing in a maintainer change over the w/e [weekend] and then being greeted by a lot of wtf mails next Monday does leave a rather sour aftertaste. Plus that thread shows a lot of misunderstandings of what's all been going on and what drm can and cannot do by Helge, which doesn't improve the entire "we need fbdev back" argument.

Vetter strongly believes that if the removed features are to return, the fbdev code needs to be modernized to a point "where we can still tell distros that enabling it is an ok thing to do and not just a CVE subscription". In addition, he believes there is a more straightforward path toward improving the scrolling behavior without bringing back all of the problems that syzbot has found:

Also wrt the issue at hand of "fbcon scrolling": The way to actually do that with some speed is to render into a fully cached shadow buffer and upload changed areas with a timer. Not with hw accelerated scrolling, at least not if we just don't have full scale development teams for each driver because creating 2d accel that doesn't suck is really hard. drm fbdev compat helpers give you that shadow buffer for free (well you got to set some options).
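The shadow-buffer approach Vetter describes can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the DRM fbdev helper code: the class name, its fields, and the explicit flush() call (which stands in for the timer) are all hypothetical, and a list of text rows stands in for video memory.

```python
class ShadowConsole:
    """Toy text console: draws into a cached shadow buffer and uploads
    only the rows that changed when flush() is called.
    All names here are hypothetical, chosen for this sketch."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        # Cached copy in ordinary memory; cheap to modify.
        self.shadow = [[" "] * cols for _ in range(rows)]
        # Stands in for slow video memory; written only by flush().
        self.device = [[" "] * cols for _ in range(rows)]
        self.dirty = set()        # rows changed since the last flush
        self.cursor_row = 0
        self.rows_uploaded = 0    # counts simulated device writes

    def write_line(self, text):
        if self.cursor_row == self.rows:
            # Scroll inside the shadow buffer only; the device is untouched.
            self.shadow.pop(0)
            self.shadow.append([" "] * self.cols)
            self.cursor_row -= 1
            self.dirty.update(range(self.rows))
        self.shadow[self.cursor_row] = list(text[:self.cols].ljust(self.cols))
        self.dirty.add(self.cursor_row)
        self.cursor_row += 1

    def flush(self):
        """Upload dirty rows; a real implementation would run this from a timer."""
        for row in sorted(self.dirty):
            self.device[row] = list(self.shadow[row])
            self.rows_uploaded += 1
        self.dirty.clear()


console = ShadowConsole(rows=4, cols=20)
for i in range(1000):
    console.write_line(f"line {i}")
console.flush()
print(console.rows_uploaded)
```

In this toy model, a thousand lines of output scroll through the shadow buffer but reach the simulated device as a single upload of four rows, which is the effect the redraw-limiting argument relies on.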

But Deller sees things differently; there are existing drivers that need the support that was removed. He intends to try to restore that support, while also presumably fixing whatever problems syzbot or others find:

But in addition fbdev/fbcon is the kernel framework for nearly all existing graphic cards which are not (yet) supported by DRM. They need fbdev/fbcon to show their text console and maybe a simple X server. If you break fbdev for those cards, they are completely stuck. Hopefully those drivers will be ported to DRM, but that's currently not easily possible (or they would be so slow that they are [unusable]).

The DRM developers seem skeptical that the problems already identified can be addressed, but it would seem that they should be giving Deller some time to do so. The "orphaned" status of fbdev was perhaps not the right choice, though it is clear that the DRM community believed any change to that status would come by way of discussion and agreement, rather than via a surprise weekend takeover. Be that as it may, a new maintainer for a long-unloved part of the kernel should be seen as a good thing. We will have to wait to see how it all works out.





Resurrecting fbdev

Posted Jan 20, 2022 3:49 UTC (Thu) by flussence (guest, #85566) [Link] (18 responses)

Making fbcon render asynchronously from a shadow buffer should be the top priority here. To demonstrate a point:

~ $ time find /usr/share
[~140k lines snipped]
# on a vt, backed by radeondrmfb and a very fast CPU
real 1m4.837s	user 0m0.128s	sys 0m31.975s

# xfce4-terminal, which according to actual science is one of the slowest X11 terms:
# https://lwn.net/Articles/751763/
real 0m0.873s	user 0m0.192s	sys 0m0.199s

A bare-bones fbcon is almost 75x slower than a (imho) somewhat bloated terminal emulator that happens to have a simple redraw-throttling optimisation - that's the kind of thing that leads to cargo-cult boot optimisation guides telling people to disable all kernel and init output to shave off a few centiseconds.
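For anyone who wants to check the arithmetic, the ratio can be recomputed from the two timings above. This is just a quick sketch; the helper names are made up for the example.

```python
def to_seconds(minutes, seconds):
    """Convert the 'XmY.YYYs' form printed by time to plain seconds."""
    return minutes * 60 + seconds


def slowdown(slow, fast):
    """How many times longer the slow run took than the fast one."""
    return slow / fast


# Figures copied from the two runs quoted above.
vt_real = to_seconds(1, 4.837)
term_real = to_seconds(0, 0.873)
vt_sys = to_seconds(0, 31.975)
term_sys = to_seconds(0, 0.199)

print(f"wall-clock slowdown: {slowdown(vt_real, term_real):.1f}x")
print(f"kernel-time slowdown: {slowdown(vt_sys, term_sys):.1f}x")
```

The wall-clock ratio comes out a little above 74, and the kernel-time (sys) ratio is roughly 160, so most of the difference is time spent inside the kernel.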

Resurrecting fbdev

Posted Jan 20, 2022 9:12 UTC (Thu) by blackwood (guest, #44174) [Link] (17 responses)

The problem with fbcon is that people want two things from it, which aren't very compatible:

Either a fast console because they don't want to run bloated X or something like that, or because boot times or whatever. Those people want the shadow buffered fbcon and redraw limiting. Because all the people capable of writing high performance graphics are busy hacking on wayland (not even X anymore), there's just not many folks who care enough to push the kernel console forward.

The others want the dumbest possible console which can be slow, but should try the hardest to get an oops to the screen when the kernel is on fire. In that case you do not want shadow buffering, and the dumber and more immediate the code is, the better. This also means the least amount of hw touching, so definitely no acceleration or anything.

You can't really have it both ways, at least not without some solid effort by a pile of people.

Resurrecting fbdev

Posted Jan 20, 2022 11:29 UTC (Thu) by Karellen (subscriber, #67644) [Link] (3 responses)

> You can't really have it both ways, at least not without some solid effort by a pile of people.

I guess the answer would be two separate modules, fbcon_basic and fbcon_fast, and you pick which one you want at either kernel build time, or build both and pick at initrd creation time?

But I suppose that falls under "some solid effort by a pile of people".

Resurrecting fbdev

Posted Jan 20, 2022 18:32 UTC (Thu) by JoeBuck (subscriber, #2330) [Link] (2 responses)

Or on a panic, do a switch to dumb mode and print raw, but this needs to work in the face of unknown kernel corruption.

Resurrecting fbdev

Posted Jan 20, 2022 21:42 UTC (Thu) by MrWim (subscriber, #47432) [Link] (1 responses)

> work in the face of unknown kernel corruption.

I quite like the scheme described here: https://apenwarr.ca/log/20190216 . The kmsg buffer is at a fixed location in memory, so you can write there which should be reliable even in the face of significant kernel memory corruption. On the next boot the kernel checks if there are valid messages there and preserves them so you can see what went wrong. It's beautiful in its simplicity.

The patches were posted upstream, but it never went anywhere.

https://lore.kernel.org/lkml/1331617001-20906-1-git-send-...
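The general idea can be sketched in user space with a bytearray standing in for the fixed memory region. This is an illustration only, not the posted patches: the header layout, the magic value, and the use of a CRC to decide whether the contents are valid are all assumptions made for this sketch.

```python
import struct
import zlib

MAGIC = 0x4B4D5347               # hypothetical marker ("KMSG"), chosen for this sketch
HEADER = struct.Struct("<III")   # magic, payload length, CRC32 of the payload


def store_log(region, text):
    """Write the log text into the fixed region, keeping the newest bytes if it overflows."""
    capacity = len(region) - HEADER.size
    payload = text.encode()
    payload = payload[max(0, len(payload) - capacity):]
    HEADER.pack_into(region, 0, MAGIC, len(payload), zlib.crc32(payload))
    region[HEADER.size:HEADER.size + len(payload)] = payload


def recover_log(region):
    """On the 'next boot', return the saved text, or None if the region is not valid."""
    magic, length, crc = HEADER.unpack_from(region, 0)
    if magic != MAGIC or length > len(region) - HEADER.size:
        return None
    payload = bytes(region[HEADER.size:HEADER.size + length])
    if zlib.crc32(payload) != crc:
        return None
    return payload.decode(errors="replace")


region = bytearray(64)           # stands in for memory that survives a reboot
store_log(region, "Oops: NULL pointer dereference")
print(recover_log(region))
```

Because the location and layout are fixed, the writer needs no allocation or pointer chasing, and the reader can reject anything that does not carry a matching marker and checksum.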

Resurrecting fbdev

Posted Jan 21, 2022 18:29 UTC (Fri) by bnorris (subscriber, #92090) [Link]

You mean something like this?

https://www.kernel.org/doc/html/latest/admin-guide/ramoop...

That's been upstream for quite a while, and it works just fine. Chrome OS uses it heavily for field crash logging. Despite its naming, it can be configured to dump logs to RAM even on non-crashes.

Resurrecting fbdev

Posted Jan 20, 2022 16:22 UTC (Thu) by ibukanov (subscriber, #3942) [Link]

If one needs an absolutely reliable way to print kernel errors, then just output them to a serial port, or log them over the network if the former is not available. Modern graphics hardware is way too complex to allow even the dumbest frame buffer to be reliable. The probability of hardware quirks, if not outright bugs, is just too high.

Resurrecting fbdev

Posted Jan 20, 2022 18:22 UTC (Thu) by gnoutchd (guest, #121472) [Link] (2 responses)

Whatever became of kmscon? Putting the "smart" console in userspace sounds like a solution, no? Should people be thinking about converting hw-accelerated fbdev drivers into kmscon backends, or somesuch?

Resurrecting fbdev

Posted Jan 20, 2022 20:57 UTC (Thu) by josh (subscriber, #17465) [Link] (1 responses)

I agree; putting the simplest possible thing in kernel space and all the complexity into userspace seems like the right answer here. Keep a terminal up-to-date, and render it to a framebuffer based on a frame rate (e.g. vsync).

Resurrecting fbdev

Posted Jan 20, 2022 23:09 UTC (Thu) by blackwood (guest, #44174) [Link]

It stalled for a few years, but a few folks from rh and suse have picked up the pieces again. We have simpledrm now merged, which means firmware fbdev drivers (efifb, vesafb, ...) aren't needed anymore, and you can have a pure kms world.

Next step is to get some kernel console on top of drm landed, for (emergency) logging. This should probably wait for the remaining console_lock rework so that we don't have to maintain our own kthread for offloading.

At that point all the pieces are in place for distros to ditch CONFIG_VT and fbcon and use kmscon for non-graphical shell logins on the local machine. Apparently logind/seatd already can do all the vt switching without CONFIG_VT (since that's needed for all the additional seats anyway), so that part is covered.

It might all actually happen this decade!

Resurrecting fbdev

Posted Jan 20, 2022 19:08 UTC (Thu) by flussence (guest, #85566) [Link] (8 responses)

Right now we have neither:

When things are working normally, the console's only function is as blinkenlights so people know their boot hasn't frozen. It's also a significant bottleneck as shown (restoring hwaccel would only hide this symptom) and as an interactive terminal it's pretty bad too. There's no reason to have "every frame a painting" here.

When things aren't working, the console is useless because it scrolls far too fast for an operator at the terminal to read it all (restoring hwaccel would only make this worse), and fbcon scrollback rarely works at the best of times so any important output that scrolls offscreen is lost. There are already half a dozen ways to get organic-readable boot output, this part doesn't need further fixing.

It's basically the atime thing all over again.

Resurrecting fbdev

Posted Jan 21, 2022 8:49 UTC (Fri) by geert (subscriber, #98403) [Link] (7 responses)

> fbcon scrollback rarely works

Fbcon scrollback was removed in v5.9 (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/...).

Resurrecting fbdev

Posted Jan 21, 2022 11:51 UTC (Fri) by ballombe (subscriber, #9523) [Link] (6 responses)

I hope it will be restored. This breaks consolation.

Resurrecting fbdev

Posted Jan 21, 2022 12:11 UTC (Fri) by geert (subscriber, #98403) [Link] (2 responses)

I wasn't aware of that. So this should be reported as a userspace regression.

Resurrecting fbdev

Posted Jan 22, 2022 16:55 UTC (Sat) by zdzichu (guest, #17118) [Link] (1 responses)

It's been years, isn't that too long to claim regression?

Resurrecting fbdev

Posted Jan 23, 2022 20:39 UTC (Sun) by ballombe (subscriber, #9523) [Link]

Everyone knew it was a regression when it was done. shift-page up is a documented user interface. The IOCTL is documented. consolation already existed. It is never too late to do the right thing.

Resurrecting fbdev

Posted Jan 22, 2022 8:28 UTC (Sat) by tdz (subscriber, #58733) [Link] (2 responses)

Hi, could you please link a bug report for this problem?

Resurrecting fbdev

Posted Jan 22, 2022 9:25 UTC (Sat) by ballombe (subscriber, #9523) [Link] (1 responses)

Is this what you want?
https://bugs.debian.org/988039

Resurrecting fbdev

Posted Jan 22, 2022 16:24 UTC (Sat) by tdz (subscriber, #58733) [Link]

Thanks a lot.

Resurrecting fbdev

Posted Jan 20, 2022 6:17 UTC (Thu) by error27 (subscriber, #8346) [Link] (4 responses)

The fbdev subsystem has needed a maintainer for years. It's a super frustrating subsystem because when you report bugs the attitude is "Yeah. Fbdev is a known root exploit so who cares if it crashes a bit?"

For example, here are some bugs I reported back in 2014.
https://marc.info/?l=kernel-janitors&m=13906030980688...

Here is the fix for that from last December but it still has not been applied. I asked George Kennedy to wait for a bit and resend it through Andrew Morton.
https://lkml.org/lkml/2021/12/7/1040

On the other hand, I really don't want the accelerated scrolling patch to be revived. The main problem with fbdev is all the crashing and the memory corruption bugs. Memory corruption bugs confuse syzbot so instead of showing up as one bug, it shows up as dozens of crazy bugs in random parts of the kernel. It's horrible and we spent a long time fixing it. Let's not go backwards.

Resurrecting fbdev

Posted Jan 20, 2022 9:05 UTC (Thu) by blackwood (guest, #44174) [Link] (3 responses)

From drm side we do care a bit about fbdev, but it's extremely limited to just the core code, fbcon, and the handful of firmware drivers that run before drm drivers take over.

So yeah any driver patches for something like cirrusfb tend to get ignored. Run the drm/cirrus driver instead, if that blows up there will be people who take a look.

The thing _is_ a root exploit pretty much end-to-end :-/

Resurrecting fbdev

Posted Jan 20, 2022 10:56 UTC (Thu) by ballombe (subscriber, #9523) [Link] (1 responses)

> The thing _is_ a root exploit pretty much end-to-end :-/

If this is the case, there is nothing Deller can do to make it worse.
Since KMS made it all but impossible to use the VGA console, all systems are running fbdev now.
I, for one, am happy there is a new maintainer who sees fbdev as something other than a liability.

Resurrecting fbdev

Posted Jan 28, 2022 3:23 UTC (Fri) by HelloWorld (guest, #56129) [Link]

> If this is the case, there is nothing Deller can do to make it worse.

Of course there is. He can make people think that using fbdev is a reasonable idea because, after all, it is being maintained now! Or he can delay the removal of that code.

This whole situation is very unfortunate. Just let it die already.

Resurrecting fbdev

Posted Jan 20, 2022 10:56 UTC (Thu) by error27 (subscriber, #8346) [Link]

Huh... I know that patching fbdev is a waste of time, but sometimes code annoys me so much that I just fix it anyway. I just looked at my outbox and I discovered that *none* of my fbdev patches from 2021 were applied.

I sent four patches to fbdev in 2021. In one case, the original author fixed his bug before me so that's fine. In another case, my patch was Acked but not applied. And for the remaining two patches I was just ignored. There is no other subsystem right now that's as bad as that.

It's such a discouraging thing because newbies are like:

Step 1: Find an easy bug
Step 2: Fbdev hasn't applied bugfixes for years so it has the most obvious bugs
Step 3: Send a patch
Step 4: Wait for feedback before going further
Step 5: Die of old age

Resurrecting fbdev

Posted Jan 20, 2022 9:37 UTC (Thu) by shiftee (subscriber, #110711) [Link]

Easier to ask for forgiveness than permission

Resurrecting fbdev

Posted Jan 20, 2022 15:05 UTC (Thu) by daenzer (subscriber, #7050) [Link]

"The DRM subsystem does not have support for 2D acceleration" is not correct. The linked article is specifically about a generic 2D acceleration UAPI, which fbdev doesn't have either (because it's basically impossible to define a sensible one).

A DRM driver can however implement the same fbdev acceleration hooks as a native fbdev driver can, and some DRM drivers actually did. Many DRM drivers haven't done this because the HW is more complex than a traditional 2D accelerator and cannot safely be used under all circumstances where fbdev acceleration hooks can get called.


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds