The future of realtime Linux in doubt
nothing has materialized". Assuming that doesn't change, Gleixner plans to cut back on development and on plans to get the code upstream. "After my last talk about the state of preempt-RT at LinuxCon Japan, Linus told me: 'That was far more depressing than I feared'. The mainline kernel has seen a lot of benefit from the preempt-RT efforts in the past 10 years and there is a lot more stuff which needs to be done upstream in order to get preempt-RT fully integrated, which certainly would improve the general state of the Linux kernel again."
Posted Jul 8, 2014 18:09 UTC (Tue)
by fiver22 (guest, #96348)
[Link] (89 responses)
Posted Jul 8, 2014 19:42 UTC (Tue)
by roblucid (guest, #48964)
[Link] (81 responses)
Posted Jul 8, 2014 20:19 UTC (Tue)
by SEJeff (guest, #51588)
[Link] (76 responses)
Realtime kernels are better suited for realtime tasks such as robotics or automation / safety critical systems.
Posted Jul 8, 2014 21:00 UTC (Tue)
by gus3 (guest, #61103)
[Link] (5 responses)
Posted Jul 8, 2014 21:12 UTC (Tue)
by SEJeff (guest, #51588)
[Link] (3 responses)
http://wiki.automotive.linuxfoundation.org/index.php/SCHE...
Linux makes a great soft realtime system, but isn't that awesome for hard realtime. VxWorks is better at hard RT, which is why it runs most of those types of systems (up for debate I 'spose)
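(For concreteness: by the time of this article the mainline kernel had itself grown a deadline-based scheduling policy, SCHED_DEADLINE, merged in 3.14. Below is a minimal sketch of putting a periodic task under it; there was no glibc wrapper for sched_setattr() at the time, so the raw syscall is used, and the 1ms-of-CPU-every-100ms parameters are purely illustrative.)

/* Sketch: put the calling task under SCHED_DEADLINE (Linux 3.14+).
 * No glibc wrapper existed at the time, so the raw syscall is used;
 * the struct layout follows sched_setattr(2).  The runtime/deadline/
 * period numbers are illustrative only. */
#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE 6
#endif

struct sched_attr {
    uint32_t size;
    uint32_t sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;
    uint32_t sched_priority;
    uint64_t sched_runtime;
    uint64_t sched_deadline;
    uint64_t sched_period;
};

int main(void)
{
    struct sched_attr attr = {
        .size           = sizeof(attr),
        .sched_policy   = SCHED_DEADLINE,
        .sched_runtime  = 1000000,      /* 1ms of CPU time...   */
        .sched_deadline = 100000000,    /* ...every 100ms       */
        .sched_period   = 100000000,
    };

    if (syscall(SYS_sched_setattr, 0, &attr, 0))
        perror("sched_setattr");        /* e.g. EBUSY: bandwidth refused */

    /* the periodic real-time work would go here */
    return 0;
}

Under this policy the kernel performs admission control: it refuses the task outright if the requested bandwidth cannot be guaranteed, which is much closer to the "guarantee" sense of real time discussed further down this thread.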
Posted Jul 8, 2014 23:08 UTC (Tue)
by tglx (subscriber, #31301)
[Link] (2 responses)
Well, that's why preempt-RT is there.
> VxWorks is better at hard RT,
Unfortunately nobody is allowed to publish a fair and objective comparison of preempt-RT and VxWorks. All you can get are the WindRiver marketing whitepapers...
> which is why it runs most of those types of systems
Sure, according to statistics provided by the most objective source on that matter
> (up for debate I 'spose)
Indeed.
Thanks,
tglx
Posted Jul 9, 2014 13:30 UTC (Wed)
by bokr (guest, #58369)
[Link] (1 responses)
[...] more important work than what the(!) CPU was doing, with hardware interrupt signals being the primary way to yank the CPU's chain, so to speak.
But if you can design a system where every critical waiting-to-preempt thread already has its own CPU, and its response latency is that of a smart spinlock, then I'd argue the idea of preemption changes its focus -- presumably to prioritizing access to communication resources, or guaranteeing dedicated ones, like fixed packet slots in synchronous fast-enough streams (or dedicated parallel wire paths between things, if serial packet slot multiplexing is not fast enough), and so on.
Reading about RT scheduling and CPU affinity etc. in TLPI it seems like the pieces are there to enable developers to configure Linux to do most anything, but getting the kernel to adapt various flavors of niceness automatically to any arbitrary mix of apps thrown at it is probably tricky.
Posted Jul 9, 2014 20:14 UTC (Wed)
by klossner (subscriber, #30046)
[Link]
Posted Jul 8, 2014 22:27 UTC (Tue)
by rgmoore (✭ supporter ✭, #75)
[Link]
Automotive Grade Linux is about user-facing electronics like navigation and entertainment systems; they're actually building on Tizen. It's not the kind of serious RT application that RT Linux is targeting.
Posted Jul 8, 2014 21:14 UTC (Tue)
by drag (guest, #31333)
[Link] (51 responses)
Posted Jul 9, 2014 11:16 UTC (Wed)
by jrigg (guest, #30848)
[Link] (50 responses)
Posted Jul 9, 2014 11:38 UTC (Wed)
by jrigg (guest, #30848)
[Link]
Posted Jul 9, 2014 15:20 UTC (Wed)
by drag (guest, #31333)
[Link] (44 responses)
Like the man below said, Linux has benefited greatly from this RT kernel work.
I played around a lot with Linux audio years ago, and getting a responsive system for USB MIDI keyboard -> Software Synth (jackd) -> ALSA Modular Synth (jackd) -> speaker output required realtime patches; otherwise my choice was to have a very audible delay between pressing a key and hearing audio versus having frequent dropouts and audio artifacts.
I expect that with modern kernels and big multiprocessor machines things are much better now than they used to be.
'Realtime' is relative. Even 'hard' realtime.
From my perspective, 'hard realtime' means counting CPU cycles. You know how many cycles it takes to accomplish a certain task. This is obviously not the same definition that other people use.
Posted Jul 9, 2014 18:36 UTC (Wed)
by Tara_Li (guest, #26706)
[Link]
Posted Jul 9, 2014 20:36 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (42 responses)
There is a conceptually very simple and unambiguous definition of real time in hard (!) science, with a lot of serious maths and research behind it. I've always wondered why so many people use "real time" for other, often quite fuzzy concepts, with so many discussions about what they are supposed to mean.
[1] If something is guaranteed to happen ALWAYS before some deadline, no matter how many seconds away the deadline is, then it's real time.
[2] If something happens "only" 99.9% of the time before the deadline then it's not real time - even if it's under microseconds 99.9% of the time.
Simple, isn't it? Computer science real time is not about "low latency", whatever "low" means. It's only about determinism and DEADlines, which is what matters in safety-critical systems (arguably not Linux's field).
Proving [1] real-time does not necessarily involve counting cycles. As long as they demonstrate determinism, coarser-grained proofs and measurements can be used.
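(A concrete way to read [1] and [2]: measurement can refute a real-time claim but never prove it. Here is a minimal sketch of a deadline-miss probe in the spirit of tools like cyclictest, with an arbitrary 1ms period standing in for the deadline:)

/* Measure wakeup lateness of a periodic task.  Under definition [1],
 * ONE observed miss disproves the real-time claim; a clean run of any
 * length proves nothing by itself.  Period and loop count are arbitrary. */
#include <stdio.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L
#define PERIOD_NS       1000000L    /* 1ms period = deadline */

int main(void)
{
    struct timespec next, now;
    long max_late = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < 100000; i++) {
        /* advance the absolute deadline by one period */
        next.tv_nsec += PERIOD_NS;
        if (next.tv_nsec >= NSEC_PER_SEC) {
            next.tv_nsec -= NSEC_PER_SEC;
            next.tv_sec++;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        clock_gettime(CLOCK_MONOTONIC, &now);

        /* how far past the deadline did we actually wake? */
        long late = (now.tv_sec - next.tv_sec) * NSEC_PER_SEC
                  + (now.tv_nsec - next.tv_nsec);
        if (late > max_late)
            max_late = late;
    }
    printf("worst observed lateness: %ld ns\n", max_late);
    return 0;
}

One large "worst observed lateness" refutes determinism; a clean run, however long, only raises confidence.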
Posted Jul 9, 2014 22:32 UTC (Wed)
by Wol (subscriber, #4433)
[Link] (2 responses)
Because you get too many idiots who believe in the Humpty Dumpty school of language - you know - those journalists who think that computer "memory" is the space on the hard disk, and in this particular example I came across a journalist (writing in an industry magazine, for fsck's sake!) who redefined "real time" as meaning "online". And when I complained he came back at me and said "what's the difference?" or "who cares" - something like that!
At the end of the day, we've got too many clueless journalists writing in industry magazines who don't know what words mean but are only too happy to mis-use them and teach newbies their mis-understandings. Bit like the difference between a cracker and a hacker. How many opinion-formers (journalists) are even aware of the difference?!
(and how many, when you point out their error, simply come back like Humpty Dumpty?!)
Cheers,
Wol
Posted Jul 12, 2014 14:27 UTC (Sat)
by clump (subscriber, #27801)
[Link] (1 responses)
Posted Jul 12, 2014 22:45 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
And in this case, that's exactly what happened - the guy used "real time" in a manner I didn't understand, and when I queried it and said "I think you mean "online" ", his reaction was "well, they are the same thing now, who cares".
Sorry, I do care very much because if we define away the meaning of words, we then have no way of expressing that concept, and the meaning of "real time" is life and death! As others have said, real-time is about deadlines, and when you're dealing with industrial plant, that could well be a DEAD line.
And if you're stupid enough to change one word into another, like here using "real time" when there's a perfectly good word "online", then I'm sorry but you damn well shouldn't be a journalist! The grief people have caused me (when I did PC support) because they refused to learn the difference between "memory" and "disk space", or "PC" and "terminal", and probably various other terms as well. If I have to teach them, then fine. If they refuse to learn, well, I'm sorry if I trashed your system by mistake because you didn't want to express yourself clearly ...
Cheers,
Wol
Posted Jul 10, 2014 8:23 UTC (Thu)
by dgm (subscriber, #49227)
[Link]
Posted Jul 10, 2014 13:34 UTC (Thu)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (28 responses)
In fact, your definition is so simple that it excludes the possibility of any real-world system being a real-time system. After all, if you show me a real-time system, I can show you a hammer that will cause it to miss its deadlines. Of course, if you don't like hammers, there are any number of other implements and natural disasters that will do the trick.
So if you need a real-world system that meets latency specifications of some sort, you will need a somewhat more robust definition of "real time". On the other hand, if you don't need your solutions to work in the real world, carry on as before!
Posted Jul 10, 2014 14:48 UTC (Thu)
by dgm (subscriber, #49227)
[Link] (1 responses)
And if you don't agree I can reach for a bigger hammer.
Posted Jul 10, 2014 16:22 UTC (Thu)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
As to your bigger hammer, if my guess as to its intended use is correct, then you are going to have to get in line. And it is a very long line. :-)
Posted Jul 10, 2014 15:05 UTC (Thu)
by jch (guest, #51929)
[Link] (1 responses)
After contact with the hammer, the system ceases having the realtime property.
> So if you need a real-world system that meets latency specifications of some sort, you will need a somewhat more robust definition of "real time".
No, the realtime property is only guaranteed under a given set of conditions (correct hardware, stable power supply, lack of contact with a hammer). This is no different from e.g. a filesystem only guaranteeing consistency if the hardware doesn't lie about its buffers being flushed to stable storage.
--jch
Posted Jul 10, 2014 15:52 UTC (Thu)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
Posted Jul 10, 2014 20:06 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (1 responses)
Yeah, sure: if you really drive too fast or if your brake pads are worn out, then the real-time software in your ABS will not stop you from crashing.
More seriously: the definition of real-time software is obviously only about *software* guarantees. At the end of the day your entire system is only as good/safe/secure as its weakest part.
Posted Jul 11, 2014 12:44 UTC (Fri)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
And I completely agree that real time is a property of the entire system and its environment, not just of the software. "Hard real time" software doesn't help much unless the rest of the system and the environment are also amenable to hard real time.
Posted Jul 11, 2014 8:41 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (1 responses)
Okay. Let's change the subject of the definition. Make it the definition of "a real-time software system".
At which point your hammer is simply "defined out of existence". :-)
"real time" exists only in maths, anyways, so defining hardware out of the equation isn't really a big thing. The software is real-time when run on appropriate hardware. Damaging the hardware promptly takes it out of spec and invalidates the maths...
Cheers,
Wol
Posted Jul 11, 2014 12:31 UTC (Fri)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
Although I agree that math can be used to approximate some important behaviors of the real world, it is quite easy to take things too far. In particular, any attempt to define a real-world hammer out of existence is unlikely to end well. ;-)
Posted Jul 11, 2014 12:17 UTC (Fri)
by ortalo (guest, #4654)
[Link] (13 responses)
IIRC, we were discussing the problem of cascading scheduling constraints in those hard realtime systems where users define the deadline incorrectly because they do not allow enough time for potential failure-management procedures to execute (think of the functions associated with a self-destruct procedure in a rocket, for example).
Globally, my comment is that, yep, you speak of realtime when there are deadlines (definitive ones); but the real fun starts when you think about what your software has to do once these deadlines have been missed and the world is starting to collapse but there is still a chance to avoid it... I guess this is where you can really start to speak about hard realtime. ;-)
My idea was to make the distinction between purely best effort systems (where the distinction between normal and realtime-specific systems is sometimes pretty slim) and more sophisticated environments where a lot of infrastructure is in place for providing guarantees to the user and means for activating recovery (or failure-attenuating) code.
Posted Jul 11, 2014 12:25 UTC (Fri)
by ortalo (guest, #4654)
[Link] (3 responses)
- dramatic realtime: strong deadlines, but still hope if you miss them, and a lot of complex software to manage the outcome;
- tragic realtime: strong deadlines, no hope if you miss them, just do your best and live your remaining moments honorably - your own definition of honor matters a lot.
As an exercise, I'll let you propose definitions for: comic realtime, poetic realtime, theatrical realtime, etc. ;-))
Posted Jul 11, 2014 12:52 UTC (Fri)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (2 responses)
Dramatic realtime? Tragic realtime? Comic realtime? Poetic realtime? Well, there have been some times when a bit of comic realtime would have been quite welcome. And some might argue that many of these comments, no doubt including some of my own, constitute theatrical realtime. ;-)
Posted Jul 13, 2014 3:17 UTC (Sun)
by nevets (subscriber, #11875)
[Link] (1 responses)
Posted Jul 13, 2014 14:06 UTC (Sun)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
But it is really hard to believe that article appeared nine years ago!
Posted Jul 11, 2014 14:00 UTC (Fri)
by roblucid (guest, #48964)
[Link] (8 responses)
If I had a car engine which tried to recover after missing a deadline which meant damage, I'd be pretty annoyed when it turned itself off to avoid further problems. Or say, brake fluid pressure low: best not to allow acceleration, but warn and put the hazard lights on when travelling at speed.
Much better would be to see that it might miss the deadline and for the system to take avoiding action, so it meets a goal, perhaps with degraded performance.
A processor might normally run down-clocked in a power-saving frequency state. If a process which required 1ms of CPU time every 100ms according to some worst-case analysis was in 'danger', then the scheduler engaging a turbo mode 10ms before expiry and running that task as a priority would provide the CPU time without reducing other tasks' resources.
Presumably it's possible to have hardware normally use interrupts but fall back to polling of hardware registers, for instance.
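(A rough userspace prototype of that "engage turbo mode when in danger" idea is possible with the standard cpufreq sysfs interface. A sketch, where the slack accounting is assumed to come from elsewhere, the check_deadline() hook is hypothetical, and error handling and the need for root privileges are glossed over:)

/* Sketch of "up-clock when a deadline is in danger": switch cpu0's
 * cpufreq governor via the standard sysfs file.  The slack/remaining-
 * work accounting is assumed to exist elsewhere; error handling and
 * privilege checks are elided. */
#include <stdio.h>

static void set_governor(const char *gov)
{
    FILE *f = fopen("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor", "w");
    if (!f)
        return;                        /* no cpufreq support: nothing to do */
    fprintf(f, "%s\n", gov);
    fclose(f);
}

/* hypothetical hook, called periodically from a monitoring loop */
void check_deadline(long remaining_work_ns, long slack_ns)
{
    if (remaining_work_ns > slack_ns)  /* "in danger": can't finish in time */
        set_governor("performance");   /* up-clock, at a power cost */
    else
        set_governor("powersave");     /* conserve power again */
}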
Posted Jul 11, 2014 20:18 UTC (Fri)
by dlang (guest, #313)
[Link] (1 responses)
you aren't going to be able to predict that you will miss the deadline with any sort of reliability.
Posted Jul 13, 2014 16:28 UTC (Sun)
by roblucid (guest, #48964)
[Link]
The idea of an "up-clocking" strategy to increase resources "on demand, at the cost of power" was to mitigate the inherent indeterminism of modern CPUs.
I considered how you can make the case for being able to meet "hard" deadlines, and assumed any "hard" RT program that risks paging from disk, waits on a non-predictable event, or (as in the Mars Pathfinder case) blocks on a lower-priority process due to inversion is inherently "broken" and thus not "hard" RT.
This came out of a conversation in late '89 with an RT developer colleague working on control systems. Hard RT just loved polling because of its predictability and simplicity, never mind the performance disadvantages. That just runs counter to the philosophy of Linux, which appreciates performance over predictability or simplicity.
A "fast path" which conserves power, but falls back to brute force, polling of registers etc., might be a viable hybrid strategy.
Posted Jul 12, 2014 13:40 UTC (Sat)
by ianmcc (subscriber, #88379)
[Link] (5 responses)
I think, going into the future, where even simple microcontrollers have pretty substantial CPU power, the issue of 'hard' real time is less important than robustness under failure conditions. The surrounding hardware has some failure rate too, and (1) the software needs to cope with that as best it can, and (2) there is no point going to extreme lengths to engineer software that has a failure rate of X if the failure rate of the installation as a whole is 100000*X.
Posted Jul 13, 2014 4:26 UTC (Sun)
by mathstuf (subscriber, #69389)
[Link] (1 responses)
Posted Jul 15, 2014 15:11 UTC (Tue)
by Wol (subscriber, #4433)
[Link]
The cambelt fell off!!! Which was a known failure mode :-( The fact it wrecked the engine was a minor point ... I was on a motorway doing 70. Fortunately, late at night, there was no traffic, so getting onto the hard shoulder wasn't hard. But if it had been daytime and busy ...
Cheers,
Wol
Posted Jul 13, 2014 15:40 UTC (Sun)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (2 responses)
The more-reliable software might consume more memory. Adding more memory will degrade the overall system reliability, perhaps even to the point that the system containing more memory and more-reliable software is less reliable than the original system. As the old saying goes: "Be careful, it is a real world out there!"
Posted Jul 13, 2014 16:34 UTC (Sun)
by roblucid (guest, #48964)
[Link] (1 responses)
With enough "more" memory though, things like ECC and a fault tolerant technique like say triple channel with independent implementations allowing a "vote", then you gain greater reliability, like with modern planes which may tolerate multiple engine failures.
Posted Jul 16, 2014 19:19 UTC (Wed)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
Posted Jul 11, 2014 19:37 UTC (Fri)
by marcH (subscriber, #57642)
[Link] (5 responses)
Not sure why you assume that defining the software forbids defining the rest of the system... strange.
Posted Jul 13, 2014 14:01 UTC (Sun)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (4 responses)
Nice try, but you missed the mark.
The problem is that as you push towards the limits of the system's capabilities, you must be increasingly careful about when and how you apply abstraction. Out at the limits, little things that might otherwise be confined to a given piece of the system start making themselves felt globally. Therefore, out at the limits, premature abstraction is the root of all evil.
In short, at the extremes, you don't have the luxury of defining the software independently of the rest of the system.
One saving grace is that today's hardware can trivially meet requirements that were out at the limits a few decades ago. So the limits have moved out considerably over that time. However, the global awareness required at the limits remains unchanged.
Posted Jul 13, 2014 16:46 UTC (Sun)
by roblucid (guest, #48964)
[Link] (1 responses)
The cost of slightly more performant hardware may be a wise investment to reduce system complexity and developer time, and to improve QA of the result.
If "all you care about" is meeting deadlines, sailing close to the limits seems perverse and more of a "soft" RT thing, where errors in assumptions are tolerable, e.g. network sound streaming, where no one dies if it fails in unusual circumstances.
Having seen the specs and done a requirements analysis for a nuclear power station project, I can assure you squeezing the most out of the HW was the last thing desired. I suspect (I left the project for another opportunity) any error would have been in over-provisioning, causing too great an increase in system complexity and making it hard to analyse.
Posted Jul 16, 2014 19:11 UTC (Wed)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
But if you require deadlines down in the tens of microseconds, overprovisioning may not help much. Last I heard, nuclear power stations had much longer deadlines, but you would know better than I.
Posted Jul 15, 2014 21:37 UTC (Tue)
by marcH (subscriber, #57642)
[Link] (1 responses)
So, why do you go there?
On life-critical systems (or... weapon systems) you do have this "luxury". Except for the name; I guess it's rather called "certification requirement" or something similar.
Posted Jul 16, 2014 19:13 UTC (Wed)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
Posted Jul 10, 2014 18:09 UTC (Thu)
by dlang (guest, #313)
[Link] (6 responses)
The problem comes from the words "guaranteed" and "ALWAYS", because in the real world there are always qualifiers there.
If the building holding the computer gets blown up, the event is not going to happen
If the computer sends a signal out, but the wire it's sending it over is broken, the event is not going to happen
If the device the computer is controlling sticks (or otherwise fails), the event is not going to happen.
In practice, there are almost never absolute requirements for reliability.
Posted Jul 10, 2014 19:00 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link]
The qualifiers are implied. A guarantee is only as good as the entity making it; in this case, any guarantee by software can only hold so long as the hardware does its part. This qualification underlies *any* statement we may want to make about software behavior, so there is no need to complicate the definitions by making that dependency explicit.
Posted Jul 10, 2014 23:57 UTC (Thu)
by nevets (subscriber, #11875)
[Link] (4 responses)
In my presentations, I've always called the -rt patch a "hard real time designed" system. That is, all decisions are made to meet hard real time requirements (handling responses in a given time frame with no outliers). But with something as big as Linux, there are bound to be bugs. If there is an outlier, we consider it a bug and try to fix it. Hence, the "design" aspect.
Posted Jul 11, 2014 9:15 UTC (Fri)
by marcH (subscriber, #57642)
[Link] (3 responses)
The problem with any big and complex system is that you cannot reason and make proofs about it. One example is memory management: since it's shared across the whole system, you can never predict how long it will take, since it depends on too many other things.
So, again as an example, would you qualify as a "bug" any involvement (direct OR indirect) of the memory manager in your real-time thread(s)?
Now some more general questions about the -rt branch (pardon my ignorance): is it possible to use -rt to implement something which is both "hard" real-time and non-trivial, and that involves neither any memory management nor any other unpredictable part of the kernel? Or are people using the -rt branch just because it gives typically lower latencies?
Posted Jul 11, 2014 9:30 UTC (Fri)
by dlang (guest, #313)
[Link]
remember that "hard real time" only means that you meet your stated target. defining that you get the result of 1+1 within 1 second could be a "hard real time" target
more specifically, burning a optical disk is a 'hard real-time' task, if you let the buffer empty out the resulting disk is trash, but Linux has been successfully burning CDs since burners were first available. Back at the start it wasn't uncommon to have a buffer under-run, but it's become almost unheard of in recent years (unless you have a _really_ heavily loaded system)
That said, and answering what you were really asking :-)
anything you do with linux will involve memory management and "unpredictable" parts of the kernel.
The way that -rt addresses this is to work to "guarantee" that none of these parts will stop other things from progressing for "too long". Frequently this is done in ways that allow for more even progress between tasks, but lower overall throughput.
There isn't any academic measurement of the latency guarantees that Linux can provide (stock or with -rt); it all boils down to people doing extensive stress tests (frequently with a specific set of hardware) and determining if the result is fast enough for them.
As noted elsewhere in this topic, stock Linux is good enough for many "hard real-time" tasks, the -rt patches further reduce the max latency, making the result usable for more tasks.
many people use -rt because they think it's faster, even though the system throughput is actually lower, but there are people who use it for serious purposes.
The LinuxCNC project suggests using -rt and when driving machinery many people report substantially better results when using -rt
Posted Jul 12, 2014 11:27 UTC (Sat)
by Wol (subscriber, #4433)
[Link] (1 responses)
The problem with any system that you CAN reason and make proofs about is that those proofs have nothing whatsoever to do with the real world.
"As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." Albert Einstein.
Cheers,
Wol
Posted Jul 15, 2014 21:10 UTC (Tue)
by marcH (subscriber, #57642)
[Link]
Yes we all know that 2 + 2 is actually 5 in the "real world".
> "As far as the laws of mathematics refer to reality, they are not certain;"
... while "not certain" actually means "nothing whatsover to do" in the same real world.
Posted Jul 11, 2014 16:32 UTC (Fri)
by farnz (subscriber, #17727)
[Link] (1 responses)
You've phrased that slightly differently to the version I've worked with, and I'm wondering if we're thinking about the same concept.
In the version I know, the definitions are:
This definition has the advantage of coping with Paul McKenney's hammer; if he hits the system with the hammer, it's failed. It makes the distinction between "hard" and "soft" about the consequences of a missed deadline; in a hard real time system, once you've missed a deadline, there's no point continuing on (you've failed and need repairing), whereas in a soft real time system, you might as well keep going, because you could meet your next deadline.
Posted Jul 11, 2014 20:27 UTC (Fri)
by dlang (guest, #313)
[Link]
so audio recording would be hard-real-time. If it's a live source, you permanently lose data, but if it's recording from a different source, you can restart.
P.S. the ECU in a car needs to malfunction for a significant time period before it will do real harm to the engine
Posted Jul 9, 2014 19:12 UTC (Wed)
by daniel (guest, #3181)
[Link] (3 responses)
You could say that studio monitors are overkill for sound reproduction but some might disagree. If you are OK with the occasional pop, buzz or stutter in your Mahler then hard realtime might indeed be overkill for you.
Posted Jul 10, 2014 10:06 UTC (Thu)
by nye (subscriber, #51576)
[Link] (1 responses)
>You could say that studio monitors are overkill for sound reproduction but some might disagree. If you are OK with the occasional pop, buzz or stutter in your Mahler then hard realtime might indeed be overkill for you.
Bearing in mind that Linux has never been capable of hard realtime, and that vast amounts of audio production are done using general-purpose operating systems like Linux, Windows, or OS X, I think it is an *objective fact*, however much it outrages your delicate sensibilities.
(Low-latency, of course, is going to be a requirement for the majority of audio work, but that typically comes at the expense of hard realtime.)
Posted Jul 10, 2014 20:14 UTC (Thu)
by marcH (subscriber, #57642)
[Link]
If a general-purpose operating system makes the audio deadlines 99.9% of the time, then it should indeed be good enough for audio production. It's just a matter of letting your antivirus or disk indexer decide when you take your tea breaks.
Posted Jul 10, 2014 11:13 UTC (Thu)
by jrigg (guest, #30848)
[Link]
Posted Jul 8, 2014 23:48 UTC (Tue)
by marcH (subscriber, #57642)
[Link] (17 responses)
No offence, but software for safety-critical systems is developed on a completely different planet than Linux - and rightly so. It's not so much about hard versus soft real-time or open versus closed source... the differences that matter here are about formal development, review, and verification processes. All very costly, BTW. See for instance: http://en.wikipedia.org/wiki/DO-178C
I suspect Linux has simply grown too big to ever stand a chance of being retro-certified and used in that space. Use the right tool for the job!
Posted Jul 13, 2014 14:18 UTC (Sun)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (16 responses)
That said, formal proof is but one road to safety-critical certification. For some applications in some jurisdictions, practical demonstration is permitted. For example, if a given system does the right thing for five years, then it will be considered to be suitable for that particular safety-critical application.
Of course, given that each application needs a separate long-term safety test, and that any failure means that you start over, the practical-demonstration approach could take even more time and be even more costly than formal proof. The advantage of the practical-demonstration approach is that the formal-proof approach simply cannot handle anything but the smallest of systems. And rumor has it that Linux actually is used today in some carefully chosen safety-critical applications.
Nicholas McGuire has done some excellent work in this area: http://www.slideshare.net/DTQ4/gnu-linux-for-safety-relat...
Posted Jul 15, 2014 22:29 UTC (Tue)
by marcH (subscriber, #57642)
[Link] (15 responses)
Like you mentioned, Linux has grown too big and general purpose to qualify. I think it's OK; it's good at pretty much everything else.
"Practical demonstration" should definitely be a requirement for safety-critical certifications - but not just! Far from it. I mean, testing should never be an excuse not to use other, common quality processes, especially not for safety-critical code.
> And rumor has it that Linux actually is used today in some carefully chosen safety-critical applications.
Meanwhile, people are kept in fear of terrorists...
Posted Jul 16, 2014 19:06 UTC (Wed)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (14 responses)
Posted Jul 16, 2014 21:06 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (13 responses)
Of course I have, who would not? Which software engineer with more than a tiny bit of experience would believe something as lame as "it worked for five years, so it will work forever even in new situations; no need to look at how it works". Would you? This type of subpar software QA is not even considered acceptable by regular, non-critical software!
What next? "It compiles without any warning, so it's good for production"?
> I suggest you take that up with the authorities in the jurisdictions where this is the case.
Names? Interesting to know in which places lives are put at risk.
Posted Jul 17, 2014 14:24 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Linux. Debian 6, specifically.
Posted Jul 18, 2014 4:05 UTC (Fri)
by ssokolow (guest, #94568)
[Link] (1 responses)
Posted Jul 18, 2014 10:42 UTC (Fri)
by marcH (subscriber, #57642)
[Link]
Posted Jul 17, 2014 14:35 UTC (Thu)
by raven667 (subscriber, #5198)
[Link] (8 responses)
That's probably a bit hyperbolic, I mean clearly you can look outside and see that it is not the apocalypse yet.
Posted Jul 17, 2014 21:07 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (7 responses)
> That's probably a bit hyperbolic,
A "risk" is anything with a non-zero probability; leaves plenty of room. It's all in the details.
Posted Jul 17, 2014 21:25 UTC (Thu)
by raven667 (subscriber, #5198)
[Link] (6 responses)
Non-zero doesn't mean significant, there is always risk, managing it is about assessing significance and making subjective value judgements.
8-)
Posted Jul 18, 2014 11:01 UTC (Fri)
by marcH (subscriber, #57642)
[Link] (5 responses)
By the way, when talking about safety, security is never far away; for example:
http://arstechnica.com/security/2013/07/disabling-a-cars-...
Approving a system after ONLY a successful five-year-long demo gives even less confidence about security than about safety. Most other QA tools and processes tackle both at the same time. If you don't use all known working QA options when working on a safety-critical system, then you are not really considering it as safety-critical. By the way: the MP3 player in the car is probably not safety-critical. Unless it can hack into the brakes or steering. Modularity and "less is more"; here we go again.
Note about the car example: unlike carnivorous butterflies, cars have already killed and will continue to kill millions of people. But again the key question is: for any given specific crash, who can you blame and who can you sue?
Posted Jul 18, 2014 12:41 UTC (Fri)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (4 responses)
Posted Jul 18, 2014 14:23 UTC (Fri)
by marcH (subscriber, #57642)
[Link] (3 responses)
Posted Jul 26, 2014 22:01 UTC (Sat)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (2 responses)
If the plaintiff's attorney is better than the defense's attorney, the jury will believe that what was required was whatever you did, plus a lot more. There are always more tests that could have been run, and there are always more types of formal validation that you could have brought to bear. Even if you somehow managed to run all conceivable tests and carried out all conceivable formal validation techniques, more will have been conceived of after the fact.
Of course, this would mean that the only safe way to produce a safety-critical widget would be to invest an infinite amount of time and money into it, that is to say, to not produce it at all. And in some cases, the lack of that safety-critical widget will be costing lives, which clearly indicates a need for a balanced approach to this issue.
And this in turn is one reason that there are laws, rules, and regulations that specify what is required for various safety-critical classifications. And for some of those classifications, the powers that be have determined that a long testing period suffices. Other classifications also require formal validation. Which is in fact a reasonably balanced approach to this issue.
Posted Jul 27, 2014 9:53 UTC (Sun)
by marcH (subscriber, #57642)
[Link] (1 responses)
What I really meant (and was too lazy to write) is: the quality bar for safety-critical applications should at the very least be one big step up from non safety-critical applications. This means using at the very least all the QA tools and methods which are well-known and *routine* - and a bit more. I think no jury would like to hear that some basic code review process or some common and off the shelf static analyser was ignored. This obviously includes testing as well.
By the way: I would be surprised to hear about some place that does not bother mentioning anything beyond testing to qualify safety critical applications, taking all the rest for granted. Now I would be even more surprised to hear about a place that explicitly states that using other, common QA processes is NOT required!
> And in some cases, the lack of that safety-critical widget will be costing lives,
I'm not sure about this one: there is often the option of doing something simpler without using (too) complex software. Or worst case, not do something at all and wait until it can be done safely (and keep prototyping).
I was just reading http://www.dwheeler.com/essays/heartbleed.html again. Security and safety are close in the sense that most Software Quality processes can be used for both. Quote from David:
"code should be refactored over time to make it simple and clear, not just constantly add new features. [...] The goal should be code that is obviously right, as opposed to code that is so complicated that I can't see any problems."
Complexity is exactly why general-purpose operating systems cannot DRIVE planes and trains. Cars are probably OK: it's much easier to lie and pretend the driver made a mistake ;-)
> what is required for various safety-critical classifications. And for some of those classifications, the powers that be have determined that a long testing period suffices. Other classifications also require formal validation.
I think this sentence shows that some overdue clarification of the meaning of safety-"critical" is needed. For instance, I would not have called a device that merely monitors and alerts "safety-critical", because in many (though not all) situations its failure would not have any effect.
For instance, DO-178B seems to use "critical" only for the highest level; an even more limited meaning than I thought:
http://en.wikipedia.org/wiki/DO-178B
Posted Jul 18, 2014 12:34 UTC (Fri)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
As to places in which lives are put at risk, that would be pretty much anywhere that grants drivers licenses, and especially those that grant drivers licenses to teenagers on the one hand and to the elderly and infirm on the other, now wouldn't it? ;-)
Posted Jul 8, 2014 23:49 UTC (Tue)
by marcH (subscriber, #57642)
[Link] (3 responses)
FPGAs are used now. Software is too slow - any software.
Posted Jul 9, 2014 19:06 UTC (Wed)
by daniel (guest, #3181)
[Link] (2 responses)
Posted Jul 9, 2014 19:39 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (1 responses)
Sure, but is this majority still in need of crazy low latencies now that FPGAs are used or are they happy with the default Linux ones? That was the question above.
Posted Jul 12, 2014 20:11 UTC (Sat)
by daniel (guest, #3181)
[Link]
Posted Jul 8, 2014 21:41 UTC (Tue)
by lamawithonel (subscriber, #86149)
[Link] (4 responses)
Posted Jul 9, 2014 1:40 UTC (Wed)
by areilly (subscriber, #87829)
[Link] (3 responses)
Seems that the solution du jour for mixing a Linux stack and hard-real-time processing is to just add some more DSPs to the SoC, which seems to work fine...
Posted Jul 9, 2014 4:01 UTC (Wed)
by raven667 (subscriber, #5198)
[Link]
Posted Jul 9, 2014 15:36 UTC (Wed)
by drag (guest, #31333)
[Link]
The timing for certain types of games is critical. The 'Mortal Kombat' style fighting games, for instance, can be extremely competitive.
One of the biggest problems faced by consoles is that while they are much faster than older systems in some ways, they are also much slower in others. For your average RTS or whatever it's pretty irrelevant, and for FPS games network latency is a much bigger problem that masks other, lesser sources of latency, but it's still an issue that needs to be addressed.
The 'pipeline' from a button press on a USB device, through the kernel, through the display server, through to the application, and then back through the GPU and being rendered on the screen is an extremely long one.
Ideally, response should probably be under 10ms, or at least it should be very consistent if you can't get that.
I think people underestimate what it takes to actually have a human interface device. It's not easy, and when things are latency sensitive the intense complexity of a modern OS is a huge liability.
> PC hardware was the GPU drivers locking the bus for excessive periods (to improve draw performance).
Shit graphics drivers have always been an especially big problem on Linux, but once you get that tackled you still have to deal with the rest of the system.
> DSPs to the SoC, which seems to work fine...
And it adds money, development time, and unfixable bugs. It also reduces flexibility.
If you can do everything in software using generic off-the-shelf parts, you have a huge advantage.
Posted Jul 10, 2014 23:36 UTC (Thu)
by zlynx (guest, #2285)
[Link]
I wish that more game developers did care and used real-time coding techniques. The last game developer that seemed to care was John Carmack. The id Tech 5 engine used for Rage does 60 fps. All the time. The textures may not be right yet, but the game WILL run at full speed. I've seen the textures pop, but I've never seen it jank like other games do.
On an unloaded system (like a console) real-time scheduling and such isn't needed, because if the CPU is doing only one thing and it is fast enough there isn't any need to prioritize things. But the *techniques* used in RT coding still apply. Don't let your RT threads block on IO, not even memory paging. Use work queuing and lock-free algorithms instead of mutexes. Etc. Maybe the game AI falls behind the player, but the UI should remain smooth and interactive under ALL conditions.
I do think it would be nice if game developers went that little extra way to use a real-time scheduling class and memory locking to guarantee good game play no matter what other applications start running or how much memory they use.
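(For reference, that "little extra way" is only a couple of system calls on Linux. A minimal sketch; the priority value is arbitrary, and CAP_SYS_NICE or a suitable RLIMIT_RTPRIO grant is assumed:)

/* Sketch of "use a real-time scheduling class and memory locking" on
 * Linux.  Needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO; the priority
 * value 50 is arbitrary. */
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

int go_realtime(void)
{
    struct sched_param sp = { .sched_priority = 50 };

    /* keep every current and future page resident: no paging stalls */
    if (mlockall(MCL_CURRENT | MCL_FUTURE))
        perror("mlockall");

    /* fixed-priority FIFO scheduling for the calling thread */
    if (sched_setscheduler(0, SCHED_FIFO, &sp)) {
        perror("sched_setscheduler");
        return -1;
    }
    return 0;
}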
Posted Jul 9, 2014 8:50 UTC (Wed)
by zdzichu (subscriber, #17118)
[Link] (1 responses)
Posted Jul 9, 2014 10:00 UTC (Wed)
by mtaht (subscriber, #11087)
[Link]
Sure hope it's been patched for heartbleed, at least.
Posted Jul 9, 2014 7:36 UTC (Wed)
by jpfrancois (subscriber, #65948)
[Link] (1 responses)
http://lwn.net/Articles/540368/
SpaceX is, according to this article, using Linux for their spacecraft flight system. In a controlled environment, how much need is there for an RT kernel? If you have no swap, enough memory, and a known software footprint, isn't soft realtime good enough?
Posted Jul 9, 2014 8:27 UTC (Wed)
by tialaramex (subscriber, #21167)
[Link]
If you work in the space industry you'd better learn to tolerate failure, because you will fail A LOT. The question is never whether failure can be eliminated, but only whether it can be reduced in a cost-effective way.
Posted Jul 9, 2014 14:41 UTC (Wed)
by abacus (guest, #49001)
[Link] (3 responses)
Posted Jul 9, 2014 17:14 UTC (Wed)
by HIGHGuY (subscriber, #62277)
[Link]
Posted Jul 9, 2014 17:53 UTC (Wed)
by torch (guest, #35932)
[Link] (1 responses)
Posted Jul 10, 2014 12:50 UTC (Thu)
by da4089 (subscriber, #1195)
[Link]
Posted Jul 9, 2014 20:06 UTC (Wed)
by landley (guest, #6789)
[Link] (1 responses)
Posted Jul 10, 2014 5:09 UTC (Thu)
by xkr47 (guest, #48763)
[Link]
Posted Jul 11, 2014 8:46 UTC (Fri)
by russh347 (guest, #97826)
[Link] (2 responses)
Average latencies are acceptable (<< 100usec) either way. Unfortunately, there were occasional 4ms latencies in the 'stock' kernel. With the RT patch, the max latency dropped to something less than 50usec.
For many apps, 4ms is just fine. For ours, not so much.
Russ Hill
Posted Jul 11, 2014 13:03 UTC (Fri)
by ortalo (guest, #4654)
[Link] (1 responses)
Both things are obviously time-related and usually benefit from each other; but not always, so you should not necessarily expect the RT kernel to offer you low latency, or even lower latency than a regular kernel on your hardware.
However, from an outsider's point of view, I can assure you that the people working on the RT kernel can give you adequate advice for low latency (and certainly better than myself)!
Posted Jul 11, 2014 20:23 UTC (Fri)
by dlang (guest, #313)
[Link]
When the -rt work started, this was probably more like 90%/9%.
In many cases, while the percentage of cases where -rt helped was small, the reduction in latency experienced in those cases can be drastic (enough to seriously affect the average, not just the worst case).
* percentages are not measured, but are just my understanding
Posted Jul 12, 2014 1:40 UTC (Sat)
by xxiao (guest, #9631)
[Link]
Posted Feb 14, 2017 15:19 UTC (Tue)
by tlsmith3777 (guest, #114103)
[Link]
Have the funding issues been resolved, and if so, how were they resolved?
What are the future plans for the Linux real-time kernel? Will it continue to be maintained by the community in the foreseeable future?
How does the Linux real-time kernel compare with ROS for real-time determinism and latency? Are there any papers for comparison, especially for automotive autonomous driving solutions?