Resurrecting fbdev
The Linux framebuffer device (fbdev) subsystem has long languished in something of a purgatory; it was listed as "orphaned" in the MAINTAINERS file and saw fairly minimal maintenance, mostly driven by developers working elsewhere in the kernel graphics stack. That all changed, in an eye-opening way, on January 17, when Linus Torvalds merged a change to make Helge Deller the new maintainer of the subsystem. But it turns out that the problems in fbdev run deep, at least according to much of the rest of the kernel graphics community. By seeming to take on the maintainer role in order to revert the removal of some buggy features from fbdev, Deller has created something of a controversy.
Part of the concern within the graphics community is the accelerated timeline that these events played out on. Deller posted his intention to take over maintenance of the framebuffer on Friday, January 14, which received an ack from Geert Uytterhoeven later that day. Two days later, before any other responses had come in, Deller sent a pull request to Torvalds to add Deller as the fbdev maintainer, which was promptly picked up. On January 19, Deller posted reversions of two patch sets that removed scrolling acceleration from fbdev. In the meantime, those reversions had already been made in Deller's brand new fbdev Git tree.
The patch sets that were being targeted for reversion had been posted and merged some time ago. Daniel Vetter disabled accelerated scrolling for the framebuffer console (fbcon) back at the end of 2020. At the time, he added a "todo" item to garbage collect the code that supported that accelerated scrolling. Claudio Suarez posted a patch completing that todo item in September 2021, which was committed in October. On January 13, shortly before deciding to take on maintenance of fbdev, Deller asked for a reversion of the latter patch (or parts of it).
Once Monday January 17 rolled around, Vetter and others noticed the flurry of activity that had occurred over the weekend and weighed in. Vetter suggested that it might have been premature to make a maintainer change "without even bothering to get any input from the people who've been maintaining it before". In particular, he was concerned about moving fbdev and fbcon to a tree separate from the DRM tree; the subsystem may have been marked as orphaned but the situation is more complicated than that:
Because the status isn't entirely correct, fbdev core code and fbcon and all that has been maintained, but in bugfixes only mode. And there's very solid&important reasons to keep merging these patches through a drm tree, because that's where all the driver development happens, and hence also all the testing (e.g. the drm test suite has some fbdev tests - the only automated ones that exist to my knowledge - and we run them in CI [continuous integration]). So moving that into an obscure new tree which isn't even in linux-next yet is no good at all.
Now fbdev driver bugfixes is indeed practically orphaned and I very much welcome anyone stepping up for that, but the simplest approach there would be to just get drm-misc commit rights and push the oddball bugfix in there directly.
Beyond that, Jani Nikula was taken aback by the whirlwind pace of the changes. In particular, he was not happy to see the reversions being made in the new fbdev tree almost immediately, even though the objection was only made a few days earlier. "I'm heavily in favor of maintainers who are open, transparent, collaborative, who seek consensus through discussion, and only put their foot down when required." Deller said that he had just started going through the backlog of patches; "nothing has been pushed yet". He said that Nikula should simply ignore the state of the fbdev tree at this point.
In response to Vetter, Deller said that having a separate tree was not important. He listed four goals for maintaining fbdev going forward:
- to get fixes which were posted to fbdev mailing list applied if they are useful & correct,
- to include new drivers (for old hardware) if they arrive (probably happens rarely but there can be). I know of at least one driver which won't be able to support DRM.... Of course, if the hardware is capable to support DRM, it should be written for DRM and not applied for fbdev.
- reintroduce the state where fbcon is fast on fbdev. This is important for non-DRM machines, either when run on native hardware or in an emulator.
- not break DRM development
Vetter pointed Deller to the documentation for coming up to speed on DRM development and for getting commit rights in the drm-misc tree, which is the proper path for fbdev fixes, he said. After that:
I think once we've set that up and got it going we can look at the bigger items. Some of them are fairly low-hanging fruit, but the past 5+ years absolutely no one bothered to step up and sort them out. Other problem areas in fbdev are extremely hard to fix properly, without only doing minimal security-fixes only support, so fair warning there. I think a good starting point would be to read the patches and discussions for some of the things you've reverted in your tree.
Anyway I hope this gets you started, and hopefully after a minor detour: Welcome to dri-devel, we're happy to take any help we can get, there's lots to do!
Deller eventually decided to keep the fbdev tree, though he does plan to coordinate with the rest of the graphics development community:
I'm not planning to push code to fbdev/fbcon without having discussed everything on dri-devel. Everything which somehow would affect DRM needs to be discussed on dri-devel and then - after agreement - either pushed via the fbdev git tree or the drm-misc tree.
It is clear there are differences of opinion on how to proceed. The hardware-accelerated scrolling that was removed was dependent on the 2D bit-blit acceleration features of older hardware. But the code that used it in the fbdev drivers was apparently rather buggy; over the years, syzbot repeatedly found problems in that code, which is why it was eventually removed. The DRM subsystem does not have support for 2D acceleration, and will not, due to some serious technical difficulties in doing so.
On the other hand, Deller and others have graphics hardware that uses the fbdev drivers and, formerly, had reasonable performance using the hardware-accelerated scrolling. That scrolling performance went away when the code was removed, and they would like to get it back. But reverting the removals simply brings back the buggy code. From the perspective of the DRM developers, the right way forward is to create DRM-based drivers for these devices, but Deller and others disagree.
The larger issue is how the transition has been handled, Vetter said in the reversion thread:
The other side is that being a maintainer is about collaboration, and this entire fbdev maintainership takeover has been a demonstration of anything but that. [...] This entire affair of rushing in a maintainer change over the w/e [weekend] and then being greeted by a lot of wtf mails next Monday does leave a rather sour aftertaste. Plus that thread shows a lot of misunderstandings of what's all been going on and what drm can and cannot do by Helge, which doesn't improve the entire "we need fbdev back" argument.
Vetter strongly believes that if the removed features are to return, the fbdev code needs to be modernized to a point "where we can still tell distros that enabling it is an ok thing to do and not just a CVE subscription". In addition, he believes there is a more straightforward path toward improving the scrolling behavior without bringing back all of the problems that syzbot has found:
Also wrt the issue at hand of "fbcon scrolling": The way to actually do that with some speed is to render into a fully cached shadow buffer and upload changed areas with a timer. Not with hw accelerated scrolling, at least not if we just don't have full scale development teams for each driver because creating 2d accel that doesn't suck is really hard. drm fbdev compat helpers give you that shadow buffer for free (well you got to set some options).
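The approach Vetter describes can be illustrated with a small userspace model. The sketch below is not kernel code and does not use the DRM fbdev helpers; the class and method names (`ShadowConsole`, `put_line`, `flush`) are hypothetical, and a plain list of rows stands in for slow video memory. It only shows the idea: scrolling happens entirely in a cached shadow buffer, and a periodic flush (which a timer would trigger in a real implementation) uploads only the rows that actually changed.

```python
class ShadowConsole:
    """Toy model of a shadow-buffered text console (illustration only)."""

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        # Cached shadow buffer: cheap to read and write.
        self.shadow = [[" "] * cols for _ in range(rows)]
        # Stand-in for slow, uncached video memory.
        self.device = [[" "] * cols for _ in range(rows)]
        self.dirty = set()           # rows touched since the last flush
        self.device_row_writes = 0   # counts the expensive uploads

    def put_line(self, text):
        """Scroll the shadow buffer up one row and write text on the last row."""
        self.shadow.pop(0)
        self.shadow.append(list(text[:self.cols].ljust(self.cols)))
        # Scrolling moves every row, but only in cached memory.
        self.dirty.update(range(self.rows))

    def flush(self):
        """What the periodic timer would do: upload only rows that differ."""
        for r in sorted(self.dirty):
            if self.device[r] != self.shadow[r]:
                self.device[r] = list(self.shadow[r])
                self.device_row_writes += 1
        self.dirty.clear()


if __name__ == "__main__":
    con = ShadowConsole(rows=4, cols=10)
    for i in range(1000):
        con.put_line(f"line {i}")
    con.flush()
    # 1000 scrolled lines cost 4 row uploads instead of 4000.
    print(con.device_row_writes)
```

In this model, output that scrolls past between two flushes never touches the device at all, which is where the speedup comes from. It is also why the same design is a poor fit for getting an oops onto the screen immediately.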
But Deller sees things differently; there are existing drivers that need the support that was removed. He intends to try to restore that support, while also presumably fixing whatever problems syzbot or others find:
But in addition fbdev/fbcon is the kernel framework for nearly all existing graphic cards which are not (yet) supported by DRM. They need fbdev/fbcon to show their text console and maybe a simple X server. If you break fbdev for those cards, they are completely stuck. Hopefully those drivers will be ported to DRM, but that's currently not easily possible (or they would be so slow that they are [unusable]).
The DRM developers seem skeptical that the problems already identified can be addressed, but it would seem that they should be giving Deller some time to do so. The "orphaned" status of fbdev was perhaps not the right choice, though it is clear that the DRM community believed any change to that status would come by way of discussion and agreement, rather than via a surprise weekend takeover. Be that as it may, a new maintainer for a long-unloved part of the kernel should be seen as a good thing. We will have to wait to see how it all works out.
Posted Jan 20, 2022 3:49 UTC (Thu)
by flussence (guest, #85566)
[Link] (18 responses)
Making fbcon render asynchronously from a shadow buffer should be the top priority here. To demonstrate a point: A bare-bones fbcon is almost 75x slower than a (imho) somewhat bloated terminal emulator that happens to have a simple redraw-throttling optimisation - that's the kind of thing that leads to cargo-cult boot optimisation guides telling people to disable all kernel and init output to shave off a few centiseconds.
Posted Jan 20, 2022 9:12 UTC (Thu)
by blackwood (guest, #44174)
[Link] (17 responses)
Either a fast console because they don't want to run bloated X or something like that, or because boot times or whatever. Those people want the shadow buffered fbcon and redraw limiting. Because all the people capable of writing high performance graphics are busy hacking on wayland (not even X anymore), there's just not many folks who care enough to push the kernel console forward.
The others want the dumbest possible console which can be slow, but should try the hardest to get an oops to the screen when the kernel is on fire. In that case you do not want shadow buffering, and the dumber and more immediate the code is, the better. This also means the least amount of hw touching, so definitely no acceleration or anything.
You can't really have it both ways, at least not without some solid effort by a pile of people.
Posted Jan 20, 2022 11:29 UTC (Thu)
by Karellen (subscriber, #67644)
[Link] (3 responses)
I guess the answer would be two separate modules, fbcon_basic and fbcon_fast, and you pick which one you want at either kernel build time, or build both and pick at initrd creation time? But I suppose that falls under "some solid effort by a pile of people".
Posted Jan 20, 2022 18:32 UTC (Thu)
by JoeBuck (subscriber, #2330)
[Link] (2 responses)
Posted Jan 20, 2022 21:42 UTC (Thu)
by MrWim (subscriber, #47432)
[Link] (1 responses)
I quite like the scheme described here: https://apenwarr.ca/log/20190216 . The kmsg buffer is at a fixed location in memory, so you can write there which should be reliable even in the face of significant kernel memory corruption. On the next boot the kernel checks if there are valid messages there and preserves them so you can see what went wrong. It's beautiful in its simplicity.
The patches were posted upstream, but it never went anywhere.
https://lore.kernel.org/lkml/1331617001-20906-1-git-send-...
Posted Jan 21, 2022 18:29 UTC (Fri)
by bnorris (subscriber, #92090)
[Link]
https://www.kernel.org/doc/html/latest/admin-guide/ramoop...
That's been upstream for quite a while, and it works just fine. Chrome OS uses it heavily for field crash logging. Despite its naming, it can be configured to dump logs to RAM even on non-crashes.
Posted Jan 20, 2022 16:22 UTC (Thu)
by ibukanov (subscriber, #3942)
[Link]
Posted Jan 20, 2022 18:22 UTC (Thu)
by gnoutchd (guest, #121472)
[Link] (2 responses)
Posted Jan 20, 2022 20:57 UTC (Thu)
by josh (subscriber, #17465)
[Link] (1 responses)
Posted Jan 20, 2022 23:09 UTC (Thu)
by blackwood (guest, #44174)
[Link]
Next step is to get some kernel console on top of drm landed, for (emergency) logging. This should probably wait for the remaining console_lock rework so that we don't have to maintain our own kthread for offloading.
At that point all the pieces are in place for distros to ditch CONFIG_VT and fbcon and use kmscon for non-graphical shell logins on the local machine. Apparently logind/seatd already can do all the vt switching without CONFIG_VT (since that's needed for all the additional seats anyway), so that part is covered.
It might all actually happen this decade!
Posted Jan 20, 2022 19:08 UTC (Thu)
by flussence (guest, #85566)
[Link] (8 responses)
When things are working normally, the console's only function is as blinkenlights so people know their boot hasn't frozen. It's also a significant bottleneck as shown (restoring hwaccel would only hide this symptom) and as an interactive terminal it's pretty bad too. There's no reason to have "every frame a painting" here.
When things aren't working, the console is useless because it scrolls far too fast for an operator at the terminal to read it all (restoring hwaccel would only make this worse), and fbcon scrollback rarely works at the best of times so any important output that scrolls offscreen is lost. There are already half a dozen ways to get organic-readable boot output, this part doesn't need further fixing.
It's basically the atime thing all over again.
Posted Jan 21, 2022 8:49 UTC (Fri)
by geert (subscriber, #98403)
[Link] (7 responses)
Fbcon scrollback was removed in v5.9 (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/...).
Posted Jan 21, 2022 11:51 UTC (Fri)
by ballombe (subscriber, #9523)
[Link] (6 responses)
Posted Jan 21, 2022 12:11 UTC (Fri)
by geert (subscriber, #98403)
[Link] (2 responses)
Posted Jan 22, 2022 16:55 UTC (Sat)
by zdzichu (guest, #17118)
[Link] (1 responses)
Posted Jan 23, 2022 20:39 UTC (Sun)
by ballombe (subscriber, #9523)
[Link]
Posted Jan 22, 2022 8:28 UTC (Sat)
by tdz (subscriber, #58733)
[Link] (2 responses)
Posted Jan 20, 2022 6:17 UTC (Thu)
by error27 (subscriber, #8346)
[Link] (4 responses)
For example, here are some bugs I reported back in 2014.
Here is the fix for that from last December but it still has not been applied. I asked George Kennedy to wait for a bit and resend it through Andrew Morton.
On the other hand, I really don't want the accelerated scrolling patch to be revived. The main problem with fbdev is all the crashing and the memory corruption bugs. Memory corruption bugs confuse syzbot so instead of showing up as one bug, it shows up as dozens of crazy bugs in random parts of the kernel. It's horrible and we spent a long time fixing it. Let's not go backwards.
Posted Jan 20, 2022 9:05 UTC (Thu)
by blackwood (guest, #44174)
[Link] (3 responses)
So yeah any driver patches for something like cirrusfb tend to get ignored. Run the drm/cirrus driver instead, if that blows up there will be people who take a look.
The thing _is_ a root exploit pretty much end-to-end :-/
Posted Jan 20, 2022 10:56 UTC (Thu)
by ballombe (subscriber, #9523)
[Link] (1 responses)
If this is the case, there is nothing Deller can do to make it worse.
Posted Jan 28, 2022 3:23 UTC (Fri)
by HelloWorld (guest, #56129)
[Link]
Of course there is. He can make people think that using fbdev is a reasonable idea because, after all, it is being maintained now! Or he can delay the removal of that code.
This whole situation is very unfortunate. Just let it die already.
Posted Jan 20, 2022 10:56 UTC (Thu)
by error27 (subscriber, #8346)
[Link]
I sent four patches to fbdev in 2021. In one case, the original author fixed his bug before me so that's fine. In another case, my patch was Acked but not applied. And for the remaining two patches I was just ignored. There is no other subsystem right now that's as bad as that.
It's such a discouraging thing because newbies are like:
Step 1: Find an easy bug
Posted Jan 20, 2022 9:37 UTC (Thu)
by shiftee (subscriber, #110711)
[Link]
Posted Jan 20, 2022 15:05 UTC (Thu)
by daenzer (subscriber, #7050)
[Link]
A DRM driver can however implement the same fbdev acceleration hooks as a native fbdev driver can, and some DRM drivers actually did. Many DRM drivers haven't done this because the HW is more complex than a traditional 2D accelerator and cannot safely be used under all circumstances where fbdev acceleration hooks can get called.
~ $ time find /usr/share
[~140k lines snipped]
# on a vt, backed by radeondrmfb and a very fast CPU
real 1m4.837s user 0m0.128s sys 0m31.975s
# xfce4-terminal, which according to actual science is one of the slowest X11 terms:
# https://lwn.net/Articles/751763/
real 0m0.873s user 0m0.192s sys 0m0.199s
Whatever became of kmscon? Putting the "smart" console in userspace sounds like a solution, no? Should people be thinking about converting hw-accelerated fbdev drivers into kmscon backends, or somesuch?
I hope it will be restored. This breaks consolation
https://marc.info/?l=kernel-janitors&m=13906030980688...
https://lkml.org/lkml/2021/12/7/1040
Since KMS made it all but impossible to use the VGA console, all systems are running fbdev now.
I, for one, am happy there is a new maintainer who sees fbdev as something other than a liability.
Step 2: Fbdev hasn't applied bugfixes for years so it has the most obvious bugs
Step 3: Send a patch
Step 4: Wait for feedback before going further
Step 5: Die of old age