LWN: Comments on "Rolling stable kernels" https://lwn.net/Articles/871989/ This is a special feed containing comments posted to the individual LWN article titled "Rolling stable kernels". en-us Thu, 11 Sep 2025 03:02:28 +0000 Thu, 11 Sep 2025 03:02:28 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Rolling stable kernels https://lwn.net/Articles/927729/ https://lwn.net/Articles/927729/ tsaeger <div class="FormattedComment"> I git-golf from time-to-time :)<br> <p> # git merge -s "theirs"<br> THEIRS=v5.15.1<br> git merge -s ours --no-commit "${THEIRS}"<br> git read-tree -u --reset "${THEIRS}"<br> git commit --no-edit<br> <p> </div> Wed, 29 Mar 2023 19:34:13 +0000 Rolling stable kernels https://lwn.net/Articles/873772/ https://lwn.net/Articles/873772/ flussence <div class="FormattedComment"> <font class="QuotedText">&gt; How do you do this without a serial console?</font><br> printk.delay with a camera and/or a lot of patience :)<br> Or less conventionally, fbcon=rotate:1 to fit more lines in? I agree, it sucks.<br> <p> <font class="QuotedText">&gt; How do you do this on NFS servers that your desktops depend on (so if they don&#x27;t come up, the desktops are probably frozen waiting for them and can&#x27;t run a serial emulator?)</font><br> I&#x27;ve used EFI pstore for (precisely) that, but no one should take this as a recommendation - it&#x27;s not always available and it&#x27;s risky (you&#x27;re depending on the BIOS vendor&#x27;s code to work reliably... LOL). And it still requires an extra reboot cycle into a working kernel to get at the log when things go south.<br> </div> Fri, 22 Oct 2021 18:36:15 +0000 Rolling stable kernels https://lwn.net/Articles/873344/ https://lwn.net/Articles/873344/ Wol <div class="FormattedComment"> Look on gentoo-user ... there are patches to put that buffer back.<br> <p> It&#x27;s maintained there, if somebody could get it back into the kernel that would be good ...<br> <p> Cheers,<br> Wol<br> </div> Tue, 19 Oct 2021 18:56:37 +0000 Rolling stable kernels https://lwn.net/Articles/873340/ https://lwn.net/Articles/873340/ nix <div class="FormattedComment"> <font class="QuotedText">&gt; - some features being dropped (how many people complained about the loss of the frame buffer scroll-back buffer)</font><br> <p> I would still really like this to come back. I just went through hell diagnosing boot-time oopses that happened before syslogd was up because I couldn&#x27;t bloody scroll back any more. How do you do this without a serial console? How do you do this on NFS servers that your desktops depend on (so if they don&#x27;t come up, the desktops are probably frozen waiting for them and can&#x27;t run a serial emulator?)<br> <p> </div> Tue, 19 Oct 2021 18:08:54 +0000 Rolling stable kernels https://lwn.net/Articles/872595/ https://lwn.net/Articles/872595/ marcH <div class="FormattedComment"> <font class="QuotedText">&gt; The first is that there is an increased need for testing under the rolling model that is being pushed onto the users. But he disagrees. Currently users think they do not need to retest much if they go from x.y.50 to x.y.51, but that is not true. The same is true with rolling. 
Testing must be done whenever the kernel changes.</font><br> <p> <font class="QuotedText">&gt; But it all goes back to testing, he said; it is possible that the merge window introduced a lot of bugs, but it is also possible that a single stable patch does the same thing.</font><br> <p> And all these bold quality assertions were supported by which detailed metrics? I found none in the report which is strange when advocating for more testing. I mean &quot;testing&quot; sorry; &quot;more&quot; would not be binary enough.<br> <p> <p> <p> </div> Mon, 11 Oct 2021 20:57:36 +0000 Rolling stable kernels - just say no https://lwn.net/Articles/872592/ https://lwn.net/Articles/872592/ marcH <div class="FormattedComment"> <font class="QuotedText">&gt; Of course the last time I expressed an opinion to the stable kernel developers they pretty much told me in no uncertain terms (paraphrased), &quot;You use out-of-tree modules, so we don&#x27;t care about your use case or the problems that we are going to cause you.&quot; Sigh.</font><br> <p> They could have said &quot;Sorry, it&#x27;s not practically possible for us to support out-of-tree modules&quot; but they may not be able to see the non-technical difference. The same persons likely have a limited knowledge of what &quot;QA&quot; or &quot;CI&quot; means and they may even be unaware that out of the gazillions of devices running Linux that you can buy off the shelf, practically none runs a &quot;pure&quot; commit signed by Linus.<br> <p> Either you use out of tree code or you don&#x27;t: a binary world older than social media. Or maybe just an earlier... version of it :-)<br> <p> </div> Mon, 11 Oct 2021 20:39:54 +0000 Rolling stable kernels https://lwn.net/Articles/872588/ https://lwn.net/Articles/872588/ marcH <div class="FormattedComment"> <font class="QuotedText">&gt; And of course, testing cannot catch all, or even much, of the breakage. Your best defense against breakage is don&#x27;t change things.</font><br> <p> The best defense is to test more before upgrading production systems, ideally: Unfortunately, validation is not sexy and all but the super-rich big corps can&#x27;t or don&#x27;t want to pay for it anyway.<br> <p> To be fair there are projects where an insane amount of testing is performed on every single line changed but they&#x27;re generally &quot;pure software&quot; project not held back by hardware that admittedly makes everything ... harder.<br> <p> </div> Mon, 11 Oct 2021 20:16:05 +0000 Rolling stable kernels https://lwn.net/Articles/872418/ https://lwn.net/Articles/872418/ farnz <p>There's a system design corollary to this: systems designed to recover from component failure are more likely to have 100% system uptime in the long run than systems designed around components that never fail. If you absolutely must have no outages ever, then you need a system designed so that every component of the system can be taken out completely without notice and without the system breaking. <p>Anecdotally, this seems to happen because systems designed to recover from expected failures (like a host rebooting) end up with the hooks needed to let you recover briskly when the unexpected happens. These hooks can be reused nicely to handle maintenance - if you can cope with an instance of a component going down because it crashed, you can build a maintenance regime that takes instances down one by one for kernel upgrades. Systems designed around 100% uptime of all components don't have those hooks to begin with. 
<p>And, of course, you need to budget for all of this - redundancy is inefficient and expensive, so you may well find that the same people who want "no outages ever" are unable to fund their requirement. Mon, 11 Oct 2021 08:35:01 +0000 Rolling stable kernels https://lwn.net/Articles/872465/ https://lwn.net/Articles/872465/ Chousuke <div class="FormattedComment"> Yeah, though in practice, there&#x27;s no such thing as an unlimited budget, and all systems will have downtime. <br> <p> However, in well-managed systems, regular downtime due to maintenance is managed such that it will not impact day-to-day operations more than what is deemed acceptable, and the risk of extended downtime due to unexpected events is accounted for and mitigated within the budget, and any residual risk that can&#x27;t be reasonably mitigated is explicitly accepted.<br> <p> You can&#x27;t simply demand that things won&#x27;t fail, because reality doesn&#x27;t bend to your will, but *managing* the impact of failure is definitely possible.<br> </div> Mon, 11 Oct 2021 08:15:57 +0000 Rolling stable kernels https://lwn.net/Articles/872463/ https://lwn.net/Articles/872463/ nilsmeyer <div class="FormattedComment"> <font class="QuotedText">&gt; I customer demanding &quot;no outages ever&quot; is basically just saying &quot;I want you to take all the blame for everything that goes wrong.&quot;</font><br> <p> It&#x27;s possible with an unlimited budget. <br> <p> <font class="QuotedText">&gt; I see systems with a long time between maintenance or failure events as accumulating risk. If you never test what happens when your important system goes down, you are essentially just hoping that whenever it inevitably does happen, it&#x27;s &quot;oh, it recovered in 15 minutes&quot; instead of &quot;oh, we&#x27;re out of business&quot;, and that you&#x27;re not the one taking blame for it. </font><br> <p> I&#x27;ve seen more than my share of systems like that in my time. Very often they are also mission-critical. It&#x27;s a huge liability that very often is hidden from the business owners. <br> </div> Mon, 11 Oct 2021 07:58:14 +0000 Rolling stable kernels https://lwn.net/Articles/872458/ https://lwn.net/Articles/872458/ Lennie <div class="FormattedComment"> Well, maybe not every release, right ?:<br> <p> <a href="https://www.youtube.com/watch?v=SYRlTISvjww">https://www.youtube.com/watch?v=SYRlTISvjww</a><br> </div> Mon, 11 Oct 2021 05:34:12 +0000 Rolling stable kernels https://lwn.net/Articles/872455/ https://lwn.net/Articles/872455/ wtarreau <div class="FormattedComment"> In my opinion this is completely wrong. All this will result in is users not even applying fixes anymore by fear of even more breakage.<br> <p> Currently the problem is not solely to encourage users to migrate to a new branch but also to follow minor updates inside the branch they&#x27;re in! When you see phones being released with 2-years old kernel while the branch they use are 100 versions later, or some boards sold with a 2-year old BSP, you figure there is some serious fear of upgrading. This will not go away with this, quite the opposite. 
All those who face a boot failure once will roll back to their old bogus kernel and never move anymore.<br> <p> And sources of breakage on major upgrades are numerous:<br> - make oldconfig taking hours to run by hand, or olddefconfig not<br> always picking that tiny bit you need because some drivers were<br> split in two<br> - new drivers offered for some hardware you&#x27;re using and not<br> working similarly<br> - some features being dropped (how many people complained<br> about the loss of the frame buffer scroll-back buffer)<br> - new entries appearing in /sys and accidentally being found by<br> poorly written scripts<br> - some even depend on external drivers (network, filesystems, etc)<br> that are not necessarily ported quickly<br> <p> Branches ARE made exactly for this, to inform users on the risk of breakage. Linus himself is very careful about not changing certain things in the middle of an -rc cycle to have enough time to communicate on possibly breaking changes. Breaking changes ARE changes and ARE needed, to fix design mistakes or outdated designs that prevent from improving. You can&#x27;t just decide for users when it&#x27;s best for them to take that risk.<br> <p> What I suspect would happen if such a rolling update became popular is that users would start to implore stable maintainers &quot;not to merge the new one yet&quot;, so we&#x27;ll get an new-LTS, current-LTS, previous-LTS and older-LTS rolling updates. Right now these are called &quot;5.10&quot;, &quot;5.4&quot;, &quot;4.19&quot;, &quot;4.9&quot; and so on.<br> <p> I think that for desktop users, having the choice between &quot;current-LTS&quot; or &quot;previous-LTS&quot; would certainly be sufficient, i.e. stay on current and switch to previous when things break. But honestly, among desktop users, how many are building their own kernels instead of using their distro&#x27;s ? Probably most only kernel developers. And these ones are very likely not those who will use a rolling LTS tree because they want to follow more closely what they&#x27;re using.<br> <p> I personally find as a user that the current stable offering is really awesome and gives us a lot of choice. I personally don&#x27;t apply fixes often, but I always find a properly working kernel for any of my machines, regardless of the branch they&#x27;re on, and can decide to switch to a newer branch depending on the time I have available to assign to that task. I.e. my laptop is systematically updated if rebooted (by default I suspend it, but battery outages or failures to wakeup happen). And if I have enough time and am not on the latest LTS anymore, I take this opportunity to upgrade. For my ARM-based firewall, I reserve a week-end to switch the LTS branch, because I *know* that certain things will not work anymore and deserve more scrutiny if I want to be able to connect back home from work later. Same for most of my other servers (NAS, reverse proxies etc).<br> <p> </div> Mon, 11 Oct 2021 03:02:14 +0000 Rolling stable kernels https://lwn.net/Articles/872361/ https://lwn.net/Articles/872361/ Chousuke <div class="FormattedComment"> I customer demanding &quot;no outages ever&quot; is basically just saying &quot;I want you to take all the blame for everything that goes wrong.&quot;<br> <p> I see systems with a long time between maintenance or failure events as accumulating risk. 
If you never test what happens when your important system goes down, you are essentially just hoping that whenever it inevitably does happen, it&#x27;s &quot;oh, it recovered in 15 minutes&quot; instead of &quot;oh, we&#x27;re out of business&quot;, and that you&#x27;re not the one taking blame for it. <br> </div> Sat, 09 Oct 2021 14:41:37 +0000 Debian testing https://lwn.net/Articles/872360/ https://lwn.net/Articles/872360/ ballombe <div class="FormattedComment"> The article is referencing debian testing as an example of rolling release.<br> However only between 5-10% of Debiansystems are running testing compared to a stable version<br> &lt;<a href="https://popcon.debian.org/index.html">https://popcon.debian.org/index.html</a>&gt;<br> <p> <p> </div> Sat, 09 Oct 2021 14:01:24 +0000 Rolling stable kernels https://lwn.net/Articles/872336/ https://lwn.net/Articles/872336/ NYKevin <div class="FormattedComment"> <font class="QuotedText">&gt; Agreed on the second.. many developers seem to believe that because a new kernel was released all systems should have rebooted to it. [they are also the first to email sysadmin about why is the build systems are down because said reboot did not work.] Those complaints then get escalated.. and finally some higher manager/VP/CEO says &quot;Enough&quot; and you start getting change control procedures and committees to make sure that whatever caused the last reboot fiasco doesn&#x27;t happen again (only to find some new one). At which point the developers complain that nothing ever gets rebooted and why are we sitting on A.B.C when the &#x27;real world is running X.Y.Z&#x27;</font><br> <p> IMHO this is an antipattern. I strongly prefer the SRE way of handling this, which is, basically, &quot;Estimate the approximate size and scale of a plausible outage that results from pushing a bad build, then compare that to your error budget. If your error budget is too low, either make the push process safer (slower, usually) or delay it until your error budget has recovered from the last outage.&quot; If/when arguments arise, you always resolve them by going back to the error budget and looking at the data. If we can&#x27;t afford to have an outage this day/week/month/release-cycle, then we don&#x27;t push until we can afford it. It really is that simple.<br> <p> Of course, this also means your execs and decision makers need to be willing to express their business requirements in terms of error budgets, rather than saying something like &quot;no outages ever&quot; (might as well be &quot;I want a pony&quot; for all the good that&#x27;s going to do). 
It also means you need to have monitoring and alerting to calculate your remaining error budget, and notify you when it is at risk (but you really should have those things anyway!).<br> </div> Sat, 09 Oct 2021 07:00:12 +0000 Rolling stable kernels - just say no https://lwn.net/Articles/872319/ https://lwn.net/Articles/872319/ abatters <div class="FormattedComment"> Standing down from red alert then :)<br> </div> Fri, 08 Oct 2021 21:15:11 +0000 Rolling stable kernels - just say no https://lwn.net/Articles/872307/ https://lwn.net/Articles/872307/ sashal <div class="FormattedComment"> It&#x27;s just another option, nothing is going away or changing :)<br> </div> Fri, 08 Oct 2021 19:26:09 +0000 Rolling stable kernels - just say no https://lwn.net/Articles/872299/ https://lwn.net/Articles/872299/ mfuzzey <div class="FormattedComment"> Agreed.<br> <p> Rolling stable may be fine for *end users* that just trust their distribution or vendor to give them the &quot;best&quot; updates.<br> <p> But when you are *developing* on top of the kernel (which is basically everyone in the embedded space) there&#x27;s a huge difference as you say.<br> <p> I&#x27;m doing something similar to you: bumping a stable point release normally takes less than a day to update and a few days of tests (which I don&#x27;t get involved with unless there are problems).<br> <p> Whereas updating to a new &quot;major&quot; release is going to take at least a week for me fixing up various breakages in our patches (when the merge or build fails that&#x27;s the easy part, it&#x27;s more complicated if there are behaviour or performance changes). And I&#x27;m not talking about some franken kernel with thousands of local patches many touching core code. This is &lt;200 patches mostly in drivers. Fortunately we don&#x27;t have any 3rd party code though.<br> <p> Now of course the standard reply to that is &quot;upstream first&quot;. That&#x27;s fine for bugfixes and drivers for publicly available hardware that help everyone (and we do that when we can) but do we really want to clutter the upstream kernel with thousands of &quot;one off&quot; drivers that are only useful to the few people in the world that have the custom hardware they support?<br> <p> </div> Fri, 08 Oct 2021 16:56:58 +0000 Rolling stable kernels https://lwn.net/Articles/872289/ https://lwn.net/Articles/872289/ vgoyal <div class="FormattedComment"> I think it worked by accident. overlayfs never supported selinux. And while switching creds it gave additional capabilities to the current task (but did not change the LSM security context of the task). So effectively DAC creds were changing but MAC creds were not changing. And that&#x27;s why it probably worked, because now MAC cred checking was happening in the caller&#x27;s context (and not the mounter&#x27;s context).<br> <p> Anyway, do propose patches again upstream and let&#x27;s have a discussion on this and see if this new model conflicts with something else which is already in the code.<br> </div> Fri, 08 Oct 2021 15:30:00 +0000 Rolling stable kernels https://lwn.net/Articles/872286/ https://lwn.net/Articles/872286/ bluca <div class="FormattedComment"> It&#x27;s a bit off topic, as the point of the comment was to highlight how the &quot;we don&#x27;t break userspace&quot; rule is not what happens in reality.
But on the issue, whatever the justification might be, it used to work and now it&#x27;s broken, and everyone I know who writes policy unanimously considers the new behaviour unacceptable, so either the feature gets yanked from the build or the broken behaviour gets patched out-of-tree. Anyway, I believe patches should be forthcoming in the near future, so we&#x27;ll see if v19 will have better luck.<br> </div> Fri, 08 Oct 2021 14:41:27 +0000 Rolling stable kernels https://lwn.net/Articles/872282/ https://lwn.net/Articles/872282/ smoogen <div class="FormattedComment"> Now that would be an interesting master&#x27;s thesis. Do 1000-LOC changes in stable kernels actually run more or less than bare statistics would show? In a general sense I would believe that any 1000-line change might be less likely to ever be run, but since these are &#x27;stable&#x27; kernels the fixes going into them may be more likely to hit &#x27;hot-spots&#x27; in code than other changes. [AKA someone ran into a problem with X.Y.Z and this X.Y.Z+N is going to fix it. The more likely it is to be hit, the more likely a change would be accepted into the &#x27;stable&#x27; branch.]<br> <p> Agreed on the second.. many developers seem to believe that because a new kernel was released all systems should have rebooted to it. [they are also the first to email sysadmin about why is the build systems are down because said reboot did not work.] Those complaints then get escalated.. and finally some higher manager/VP/CEO says &quot;Enough&quot; and you start getting change control procedures and committees to make sure that whatever caused the last reboot fiasco doesn&#x27;t happen again (only to find some new one). At which point the developers complain that nothing ever gets rebooted and why are we sitting on A.B.C when the &#x27;real world is running X.Y.Z&#x27; <br> </div> Fri, 08 Oct 2021 14:28:26 +0000 Rolling stable kernels https://lwn.net/Articles/872253/ https://lwn.net/Articles/872253/ vimja <div class="FormattedComment"> In my experience it is not true. This sounds all nice in theory, but it&#x27;s just not what I have experienced over the past 10 years.<br> <p> In practice it feels like a &quot;major&quot; new kernel (going from x.y to x.y+1) rather often breaks some minor thing or another whereas I almost never encounter any issues with &quot;minor&quot; upgrades (going from x.y.z to x.y.z+1). Quite the opposite, actually - usually, as a kernel matures, the &quot;minor&quot; upgrades fix some of the issues I encountered during the earlier &quot;major&quot; upgrade.<br> </div> Fri, 08 Oct 2021 09:45:19 +0000 Rolling stable kernels - just say no https://lwn.net/Articles/872251/ https://lwn.net/Articles/872251/ nilsmeyer <div class="FormattedComment"> <font class="QuotedText">&gt; Of course the last time I expressed an opinion to the stable kernel developers they pretty much told me in no uncertain terms (paraphrased), &quot;You use out-of-tree modules, so we don&#x27;t care about your use case or the problems that we are going to cause you.&quot; Sigh.</font><br> <p> Sounds like a very unfortunate situation, given that &quot;just get your modules into the kernel&quot; is also a daunting proposition. <br> <p> Good on you that you at least try to stay as current as possible given the constraints. <br> </div> Fri, 08 Oct 2021 09:07:40 +0000 Rolling stable kernels https://lwn.net/Articles/872248/ https://lwn.net/Articles/872248/ pbonzini <div class="FormattedComment"> Yes, exactly.
What one would need here is a &quot;theirs&quot; merge strategy, but it doesn&#x27;t exist.<br> </div> Fri, 08 Oct 2021 07:12:48 +0000 Rolling stable kernels https://lwn.net/Articles/872246/ https://lwn.net/Articles/872246/ pbonzini <div class="FormattedComment"> Of course the commits are different. In &quot;it becomes the same as 5.15.1&quot;, &quot;it&quot; refers to &quot;the content of rolling-stable&quot; from the previous sentence.<br> </div> Fri, 08 Oct 2021 07:11:29 +0000 Rolling stable kernels - just say no https://lwn.net/Articles/872231/ https://lwn.net/Articles/872231/ thwalker3 <div class="FormattedComment"> Honestly, I think &quot;rolling stable&quot; is meant for folks like Fedora, who do more or less the same thing anyway (at home on 5.14.9-300.fc35.x86_64 right now, expecting 5.14.10-300.fc35.x86_64 any day now). But for enterprise users who roll their own, jumping from LTS to LTS every year or so, &quot;rolling LTS&quot; already sounds a lot like what we&#x27;re already doing.<br> <p> Perhaps I&#x27;m not parsing the article correctly, but it seems to be doing little more than collapsing &quot;*a* stable branch&quot; down into &quot;*the* stable branch&quot; for &quot;rolling stable&quot;. However, if &quot;rolling LTS&quot; meant going back on the multi-year support for prior LTS kernels, that would be a big problem. With 10&#x27;s of thousands of systems and customers who hate upgrades, I&#x27;m still trying to flush 4.14 out of the environment. Not that it was much better in a prior life at a big bank with a beefy Redhat support contract. I still shudder remembering trying to get rid of the last couple hundred AS2.1 boxes 3 years after RHEL3 hit EOL (in the same timeframe we still had Solaris 8 kicking around so...)<br> </div> Fri, 08 Oct 2021 03:09:26 +0000 Rolling stable kernels https://lwn.net/Articles/872226/ https://lwn.net/Articles/872226/ amboar <div class="FormattedComment"> Ugh, that should have been the other way around with respect to the branch state and merged tag, but the important bit was the existence of the &quot;ours&quot; merge strategy :)<br> </div> Fri, 08 Oct 2021 00:28:28 +0000 Rolling stable kernels https://lwn.net/Articles/872224/ https://lwn.net/Articles/872224/ amboar <div class="FormattedComment"> Playing git-golf a bit here, but this can be done in one operation with `git merge -s ours 5.15.1`. As I was reading the article I thought it sounded awfully similar to something I blogged recently: <a href="https://amboar.github.io/notes/2021/09/16/history-preserving-fork-maintenance-with-git.html">https://amboar.github.io/notes/2021/09/16/history-preserv...</a><br> </div> Fri, 08 Oct 2021 00:18:45 +0000 Rolling stable kernels https://lwn.net/Articles/872222/ https://lwn.net/Articles/872222/ jwarnica <div class="FormattedComment"> If we are measuring lines of code, a 1000loc change is statistically unlikely to change anything a given user does, at all. It&#x27;s statistically likely they never even run that code.<br> <p> In the real world, a x.y.1 change will be YOLO&#x27;d by basically everyone who isn&#x27;t crazy paranoid.<br> <p> Speaking of the crazy paranoid, I have enterprise customers who put off upgrading RHEL for months &quot;to be safe&quot;. (Presumably someone was burned in 1993 by a point release in NetWare, and the mindset became encoded in &quot;the way we do things&quot;). 
Note that they don&#x27;t actually test things, they just wait.<br> <p> In any case, and notwithstanding that RH would insulate customers from whatever the kernel calls itself, this proposal helps neither side of what real people do.<br> <p> </div> Thu, 07 Oct 2021 23:22:26 +0000 Rolling stable kernels https://lwn.net/Articles/872216/ https://lwn.net/Articles/872216/ ibukanov <div class="FormattedComment"> How can it be the same as 5.15.1 if the commit messages are different? Surely the tree sha stays the same, but the sha of the commit, which is derived from both the tree and the message, will be different.<br> <p> Plus git commit-tree allows one to perform the above manipulation without any merge or conflicts.<br> </div> Thu, 07 Oct 2021 21:57:51 +0000 Rolling stable kernels https://lwn.net/Articles/872198/ https://lwn.net/Articles/872198/ vgoyal <div class="FormattedComment"> Ok, so I think this is not just SELinux related as such. The issue is with how both DAC and MAC checks happen. The current model is that overlay inode checks happen with the caller&#x27;s creds but checks on the underlying layer happen with the mounter&#x27;s creds. And I think the android folks were asking that all checks (and overlay operations) happen with the creds of the caller. That&#x27;s a different security model altogether.<br> <p> overlayfs does a bunch of operations internally which require privileges, and initially it would prepare creds with appropriate capabilities, use these creds to do the operation and then switch back to the task&#x27;s creds. Later we changed to use the mounter&#x27;s creds instead.<br> <p> The point is, overlayfs relies on some privileged operations which require capabilities which typically unprivileged tasks don&#x27;t have. So overlayfs does these operations on the task&#x27;s behalf as long as the mounter had the capability. And we don&#x27;t want to do it unconditionally, otherwise what a task would not normally be able to do, it would be able to do just by creating an overlayfs mount.<br> <p> So that&#x27;s how this existing notion of using the mounter&#x27;s creds has evolved. And yes, this will need the mounter&#x27;s task to have appropriate DAC and MAC (SELinux included) privileges to access.<br> <p> I don&#x27;t recall all the details of the override_creds proposal, but I think the crux was that operations on the underlying fs<br> also happen with the creds of the caller (and not the mounter). That means all the callers will have to be more privileged to be able to do privileged operations, otherwise most of the operations will fail. So I was wondering how this is more useful. Practically all the users will have to be effectively &quot;root&quot; to be able to use that kind of security model. That was my understanding. And I think that&#x27;s one reason those patches did not get a lot of traction.<br> <p> I think if it really matters, propose the patches one more time and we can have a discussion one more time. With later versions of the patches I think I got busy with other things and could not follow them. I think Amir was looking at it though.<br> </div> Thu, 07 Oct 2021 19:37:10 +0000 Rolling stable kernels - just say no https://lwn.net/Articles/872174/ https://lwn.net/Articles/872174/ abatters <div class="FormattedComment"> The small company I work for is currently using 5.10.x. Our plan is to upgrade to a new LTS series once a year, about 5 months or so after its initial release, to give it time to stabilize, and to have extra time for testing and validation.
We apply a number of patches that have to be forward-ported to each new version, and a number of out-of-tree modules that have to be udpated for each new major version due to the constantly-changing kernel internal APIs. Some of the out-of-tree modules come from 3rd parties, and we have to wait for them to release new versions that support the new kernel version before we can upgrade. Major kernel upgrades almost always cause regressions that I have to fix due to the huge churn.<br> <p> We use Yocto to build our firmware. Yocto seems to have a problem building upstream kernels newer than the release of Yocto, even when using custom recipes. So upgrading to a newer upstream kernel would also require backporting patches from upstream Yocto.<br> <p> I have a complicated set of Yocto recipes for each major kernel version going back a long time that applies the right set of patches for each kernel, so that I can test old kernels when bisecting for regressions. The version numbers are *important* because they determine which set of patches need to be applied.<br> <p> We upgrade to a new stable kernel for every new firmware release, just before beginning testing/validation of the new firmware. Usually there is no need to update patches or external modules.<br> <p> In summary, upgrading to a new major version is completely different than upgrading to a new stable patchlevel:<br> - Porting patches<br> - Updating out-of-tree modules<br> - Waiting for 3rd-party out-of-tree modules to catch up<br> - More regressions<br> - More testing<br> - Waiting for 3rd-party support from e.g. Yocto kernel build system<br> - Checking and updating config options<br> <p> In addition, the kernel version numbers are themselves useful to indicate which patches to apply.<br> <p> This whole rolling stable kernel idea sounds like a nightmare to me. Please don&#x27;t. Don&#x27;t take away the freedom to choose the best time to switch to a new major kernel version. Not everyone works the same way as you.<br> <p> Of course the last time I expressed an opinion to the stable kernel developers they pretty much told me in no uncertain terms (paraphrased), &quot;You use out-of-tree modules, so we don&#x27;t care about your use case or the problems that we are going to cause you.&quot; Sigh.<br> <p> </div> Thu, 07 Oct 2021 15:31:05 +0000 Rolling stable kernels https://lwn.net/Articles/872163/ https://lwn.net/Articles/872163/ bluca <div class="FormattedComment"> <a href="https://lore.kernel.org/lkml/20201021151903.652827-1-salyzyn@android.com/">https://lore.kernel.org/lkml/20201021151903.652827-1-saly...</a><br> <p> <a href="https://source.android.com/devices/bootloader/partitions/system-as-root#implementing-vendor-overlay">https://source.android.com/devices/bootloader/partitions/...</a><br> <p> The process setting up the overlay must now have the same permissions as _every_ single process that _uses_ the overlay. Which is a gigantic overreach and everyone doing selinux policy goes &quot;nope, no way&quot; as it means adding way way more access rights than needed to that process, so it&#x27;s de-facto unusable.<br> </div> Thu, 07 Oct 2021 14:58:45 +0000 Rolling stable kernels https://lwn.net/Articles/872161/ https://lwn.net/Articles/872161/ giraffedata <p>Sasha (at least as reported here) doesn't seem even to acknowledge the very purpose of a stable branch: changes bring breakage. Stable means unchanging; as long as the code doesn't change much, your system will keep working. 
<p> But you don't want totally stable either, because then broken systems stay broken. <p> So it's a combination of size and value. Stable branches get low-risk high-reward changes. Normally, that means bug fixes, but a really complex bug fix would be excluded if the bug isn't that bad, while a trivial but important new feature (maybe the ability to turn off something that presents security risks) would be included. <p> And of course, testing cannot catch all, or even much, of the breakage. Your best defense against breakage is don't change things. <p> That is why people want someone to continue releasing low-risk high-reward changes to 5.7 long after 5.8 is available. Some will even stay on 5.7 after the bug fixes have stopped rather than risk going to 5.8. Thu, 07 Oct 2021 14:52:29 +0000 Rolling stable kernels https://lwn.net/Articles/872155/ https://lwn.net/Articles/872155/ smoogen <div class="FormattedComment"> The issue is that Sasha is describing the world as he thinks kernel developers see the kernel and you are describing the world as users see the kernel. Kernel developers are saying that they expect you to have done all the testing you did between 5.9 and 5.10 when you switch from 5.10.70 to 5.10.71. Those 1000 code lines may have significantly changed all kinds of functionality or it could have been some patching or it could have been that instead of doing the million line refactor all they needed to do for this functionality change was 1000 lines of crufty code that don&#x27;t need to be maintained in 5.11 and beyond. It could also be a horrible bodge that will need to be fixed in 5.10.72. [It looks right, and the test compiled and it seemed to work but really you need to run a massive back test over a week if you want a &#x27;guarentee&#x27;.]<br> <p> Users on the other hand tend to go by experience of activity. The system didn&#x27;t seem to be any different and my applications work so it must be ok. The difference between 5.10.70 and 5.10.71 is small so it shouldn&#x27;t be a problem. [If it crashes we go back to 5.10.70 and wait for 5.10.72 to see if it got fixed then usually discounting it as a small problem.]<br> <p> For the most part these two worlds work fine.. we don&#x27;t realize the gulf of viewpoints until someone starts talking about them. <br> </div> Thu, 07 Oct 2021 13:43:57 +0000 Rolling stable kernels https://lwn.net/Articles/872120/ https://lwn.net/Articles/872120/ pbonzini <p>Let's say the latest stable release is 5.14.9 and 5.15.1 comes out. The content of rolling-stable is 5.14.9, but when you do <pre>git merge -n 5.15.1 # fails with horrible conflicts git restore -s 5.15.1 . # some of you may know this as "git checkout 5.15.1 -- ." git commit -m'advance rolling stable-tree to 5.15.1'</pre> <p>... it becomes exactly the same as 5.15.1 and can be fast-forwarded from 5.14.9 to 5.15.1. Of course this works because 5.14.10 will never be merged into the rolling-stable tree. Thu, 07 Oct 2021 12:49:00 +0000 Rolling stable kernels https://lwn.net/Articles/872121/ https://lwn.net/Articles/872121/ vgoyal <div class="FormattedComment"> Curious about following.<br> <p> &quot;As another example, at some point OverlayFS completely broke down and is unusable when combined with SELinux - intentionally. The combination is still broken to this day, and the largest user of these (Google on Android) still has to carry out of tree patches to work around them.&quot;<br> <p> Can you provide some more details. Is it broken in upstream kernel? What was broken and how? 
I had added SELinux support to overlayfs and I did not hear complaints about it being broken.<br> </div> Thu, 07 Oct 2021 12:47:38 +0000 Rolling stable kernels https://lwn.net/Articles/872119/ https://lwn.net/Articles/872119/ cwhitecrowdstrike <div class="FormattedComment"> I wondered the same. I looked online, but the video doesn&#x27;t appear to be on YouTube yet, and I can&#x27;t find the slides (if there are any). Could anyone who attended post whatever details you may have? Thanks!<br> </div> Thu, 07 Oct 2021 12:43:17 +0000 Rolling stable kernels https://lwn.net/Articles/872113/ https://lwn.net/Articles/872113/ atnot <div class="FormattedComment"> Perhaps I should have said &quot;to a degree which is very difficult for the kernel&quot;. But ultimately it comes out to the same thing.<br> </div> Thu, 07 Oct 2021 11:23:49 +0000 Rolling stable kernels https://lwn.net/Articles/872104/ https://lwn.net/Articles/872104/ vegard <div class="FormattedComment"> I would tend to agree with you; it&#x27;s not true, and it&#x27;s easy to check with git. As an example, going from 5.10.70 to 5.10.71 has 1,766 lines changed while going from 5.10.71 to 5.11 has 1,181,674 lines changed. This is an orders-of-magnitude difference.<br> </div> Thu, 07 Oct 2021 07:41:27 +0000
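A couple of sketches related to the git examples in the comments above. First, vegard&#x27;s line-count comparison can be reproduced with a plain diffstat; this assumes a local kernel clone with both the mainline and stable tags fetched, and the counts in the comments are approximate:<br> <pre>
# Size of a stable point release vs. a jump to the next mainline release:
git diff --shortstat v5.10.70 v5.10.71   # roughly 1,800 lines changed
git diff --shortstat v5.10.71 v5.11      # roughly 1.2 million lines changed
</pre> Second, the &quot;git commit-tree&quot; variant ibukanov mentions could look roughly like this; the branch name rolling-stable and the v5.14.9/v5.15.1 versions are taken from pbonzini&#x27;s example, and this is only a sketch, not a workflow the stable maintainers have published:<br> <pre>
NEW=v5.15.1
PARENT=$(git rev-parse rolling-stable)   # branch currently at v5.14.9
# Reuse the tree of the new tag verbatim; record both the old branch tip and
# the tag as parents so followers of rolling-stable can still fast-forward.
COMMIT=$(git commit-tree -p "${PARENT}" -p "${NEW}" \
         -m "advance rolling-stable to ${NEW}" "${NEW}^{tree}")
git update-ref refs/heads/rolling-stable "${COMMIT}" "${PARENT}"
</pre>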