Nothing has been moved
Posted Oct 31, 2012 10:09 UTC (Wed) by dlang (guest, #313) In reply to: Nothing has been moved by Arker
Parent article: Airlie: raspberry pi drivers are NOT useful
It seems to me that if the card firmware had a high level API (like we are talking about here), I would not have to decide between using the latest kernel (self compiled, with various 'odd' config options) or getting good performance.
I would actually prefer a card like that to what I can currently buy for my systems.
Posted Oct 31, 2012 23:36 UTC (Wed)
by Arker (guest, #14205)
[Link] (21 responses)
It seems to me that if the card were architected as you say you would like, you might indeed get the performance of the proprietary drivers while still using a Free shim. But you would also get the instability, and there would be absolutely no way you could fix it except to get new hardware. So it doesn't seem like an improvement to me, quite the opposite. With my ATI hardware, I at least have a choice.
I would love to have Free drivers that were stable and reliable, supported the full feature set, and ran as fast as or faster than the blobs. That's what we should expect, frankly, but I know we aren't getting it right now. Free drivers that are stable, predictable, and reliable at least give me the opportunity to use the ATI hardware in my system without the bugginess of the proprietary driver, at some cost. If it were architected like the Pi, it sounds like I would no longer have that option: whatever bugginess the binary has would live in the GPU, which you have no access to, but which can corrupt or deliberately overwrite anything you do with the ARM chip.
No?
Posted Nov 1, 2012 1:20 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (15 responses)
Posted Nov 1, 2012 22:56 UTC (Thu)
by Arker (guest, #14205)
[Link] (14 responses)
Free that up from the control of this GPU and associated binary blob and it could be quite useful. If only it were as simple as tearing that chip off the die and soldering on a serial port...
Posted Nov 1, 2012 23:49 UTC (Thu)
by dlang (guest, #313)
[Link] (13 responses)
There is nowhere near enough processing power in the ARM chip to simply scale video from 320x200 to full screen without help from the GPU. I accidentally triggered this with mplayer a few weeks ago, and it takes 10-20 seconds to play ONE second's worth of video when the scaling is done on the ARM chip.
Posted Nov 2, 2012 3:20 UTC (Fri)
by Arker (guest, #14205)
[Link] (11 responses)
I wrote and played very complex games on an 8-bit processor at around 3 MHz with 2 *kilobytes* of RAM. If you can't do the same with many thousands of times the resources, you are doing something wrong. Then you need to look at your software stack, because the hardware is MORE than capable of it. I could do that smoothly with no problems well over 10 years ago on a 386sx with 1 megabyte of RAM and a simple SVGA card. Presumably you are using an encoding with a significantly higher overhead than MPEG-1, but you are also looking at a system with many *hundreds* of times the horsepower. If it were programmed specifically for the task, it could probably drive several different videos of that size to several different monitors at once without dropping a frame. I hear all the time that things which were done routinely in earlier decades with far less power are now 'impossible', and it makes me laugh. These things aren't impossible. Programmers have just been trained that their time is too valuable to spend optimising anything, and that the proper solution is to throw more hardware at it instead. It sells hardware, I guess.
Posted Nov 2, 2012 3:28 UTC (Fri)
by dlang (guest, #313)
[Link] (5 responses)
the added screen data does make a significant difference.
I agree that lots of stuff is very bloated today, but your 386 was not expected to do full motion HD video output without GPU assistance, and it would not have been able to do so.
Posted Nov 2, 2012 3:43 UTC (Fri)
by Arker (guest, #14205)
[Link] (4 responses)
Now I had a very special software setup to do this, of course. A 'stock' configuration on the same machine would crap itself trying to play much smaller videos. In fact, I started that project as a dare, because the buddy that owned that machine was complaining it was obsolete: it was performing just like you described with your Pi, taking a minute to play a second or two of video, even with tiny low-res files. But the hardware was still perfectly capable of doing the job.
Posted Nov 2, 2012 16:19 UTC (Fri)
by bronson (subscriber, #4806)
[Link] (3 responses)
Time moves on, ya know?
Posted Nov 2, 2012 23:44 UTC (Fri)
by Arker (guest, #14205)
[Link] (2 responses)
And this blunt comparison is a *severe* underestimate of the real difference, because an ARM11 can do a lot more with a clock cycle. That 386 chip didn't even have a floating point unit, let alone tricks like SIMD, branch prediction, out of order completion... clock for clock the ARM chip would still be far more powerful. And that's before you even consider the cache architecture, the system bus... over 500 times the main memory.
I have no doubt at all that if you could get a few thousand of those ARM chips into the hands of promising young programmers WITHOUT the fancy GPU to fall back on, one of them would shock you all by making it do things you think are impossible. But if he's told instead that he has to use the high-level interface and pass OpenGL to a blob he cannot inspect or modify, he'll probably just pass messages until he gets bored, or finds a bug he can't fix, and then move on to something less frustrating than proprietary computing, like playing football with a bunch of guys twice his size or having molars extracted for fun.
Posted Nov 3, 2012 0:19 UTC (Sat)
by dlang (guest, #313)
[Link]
nobody is disputing that more access would be better, but you are making the assumption that doing new and interesting things with the video is the primary purpose of all users of the device.
It may surprise you that most people who use computers aren't going to try and debug video drivers or firmware, even where they do have that capability. They will usually just download the latest version to see if it's fixed, live with the problem, or revert to a prior version.
We saw this with the Intel video drivers a few years ago. They were fully open-source drivers, but when there were problems with the drivers in an Ubuntu release, 99.999+% of the people just stuck with an older version.
For those people, the difference between a high-level API and a low-level API is meaningless. To be fair, probably 90% of them wouldn't care if the entire driver was a binary blob, but that still leaves a very large group of people who benefit from having all the kernel and userspace stuff being open, even while the firmware is closed and has a high-level API
Posted Nov 3, 2012 15:28 UTC (Sat)
by bronson (subscriber, #4806)
[Link]
On the Pi, the FB is in shared RAM clocked at 400MHz. RAM probably has a bandwidth of around 250MB/sec (wild ass guess based on parts). If you're driving 1080p, that doesn't leave much bandwidth for anything else. Plus ça change, eh?
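To put rough numbers on the bandwidth point, here is a minimal sketch of the arithmetic. The 250 MB/s figure is the guess from the comment above; the 60 Hz refresh rate and the 16-bit and 32-bit pixel depths are assumptions chosen for illustration, not measured values for the Pi.

```python
def scanout_bandwidth(width, height, bytes_per_pixel, refresh_hz):
    """Bytes per second needed just to read the framebuffer out to the display."""
    return width * height * bytes_per_pixel * refresh_hz


# Guessed RAM bandwidth from the comment above, in bytes per second.
RAM_BANDWIDTH = 250 * 10**6

# Assumed pixel depths (2 and 4 bytes) and refresh rate (60 Hz).
for bytes_per_pixel in (2, 4):
    need = scanout_bandwidth(1920, 1080, bytes_per_pixel, 60)
    share = need / RAM_BANDWIDTH
    print(f"{bytes_per_pixel * 8}-bit 1080p at 60 Hz: "
          f"{need / 1e6:.1f} MB/s ({share:.0%} of the guessed bandwidth)")
```

This counts only display scan-out. Any decoding, scaling, or blitting into the framebuffer would add further reads and writes on top of it.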
Posted Nov 2, 2012 7:09 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (4 responses)
The 386dx was a little bit better. It could run simple games like Doom 3D, though it had to use page flipping with non-standard display modes, because blitting a 64 KB framebuffer was too taxing for these systems.
Posted Nov 2, 2012 11:03 UTC (Fri)
by Arker (guest, #14205)
[Link] (3 responses)
This is exactly what I was talking about. You are so secure in your knowledge. Yet have you actually sat down and done the math? I know it's possible because I did it, so I know that if you actually did the math it would have to be possible. There is a huge difference between what it takes to decode and display a video on top of a multi-user general purpose software stack, mostly written in ultra-high-level languages and essentially unoptimised, and what is actually possible given a highly optimised decoder running without interference on the bare hardware. Even given the significant increases in resolution, and the modern codecs which require quite a bit more processor time, the increase in demand on the hardware is orders of magnitude smaller than the actual increase we have seen in hardware capability over time. That 386 ran at 16 MHz, and it was overclocked to do it, and clock for clock it was vastly inferior to your ARM11, which is running at over 60 times the clock frequency. Not to mention having 512 times the RAM on a much faster system bus...
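For anyone who wants to do that math, a rough sketch of a per-pixel cycle budget follows. Only the 16 MHz clock comes from the comment above. The 700 MHz ARM clock, the resolutions, and the frame rates are assumptions picked for illustration, and the calculation ignores memory bandwidth, decode cost, and how much work each architecture does per cycle.

```python
def cycles_per_pixel(clock_hz, width, height, fps):
    """CPU cycles available per output pixel if the CPU did nothing else."""
    return clock_hz / (width * height * fps)


# 16 MHz is taken from the thread; 320x200 at 15 fps is an assumed workload.
old_budget = cycles_per_pixel(16_000_000, 320, 200, 15)

# 700 MHz and 1080p at 24 fps are assumptions for the ARM side.
new_budget = cycles_per_pixel(700_000_000, 1920, 1080, 24)

print(f"386 at 16 MHz, 320x200 @ 15 fps: {old_budget:.1f} cycles per pixel")
print(f"ARM at 700 MHz, 1920x1080 @ 24 fps: {new_budget:.1f} cycles per pixel")
```

Changing the assumed resolution or frame rate moves the budget a great deal, which is part of why both sides of this argument can point at the same hardware and reach different conclusions.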
Posted Nov 2, 2012 15:55 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Anyway, Pi's bandwidth is barely enough for full HD video as it is. If you throw in non-trivial rendering - it's simply not enough, again.
Unless, of course, you're ready to limit yourself to "Tetris" or maybe "Digger".
Posted Nov 2, 2012 17:24 UTC (Fri)
by nix (subscriber, #2304)
[Link] (1 responses)
I note that your 386 almost certainly did not have to decompress compressed video and blit it at the same time as everything else.
Posted Nov 2, 2012 23:50 UTC (Fri)
by Arker (guest, #14205)
[Link]
I didn't say the codecs are written in ultra-high-level languages, although I wouldn't be shocked if you found an instance of it, particularly on a less common architecture like ARM. But what I did say was that the rest of the system is often written so. And regardless of how good your codec is, it is still running inside of a much larger, looser system which has very significant performance costs.
Posted Nov 2, 2012 16:56 UTC (Fri)
by intgr (subscriber, #39733)
[Link]
The fact that MPlayer cannot do it isn't evidence that the CPU is incapable. Are you sure that MPlayer is actually utilizing everything that the ARM core has to offer? SIMD instructions etc?
It might just be that MPlayer on ARM is using some generic C scaling routine that nobody has bothered to optimize, because common x86 desktops are all running another assembly-optimized implementation.
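As an illustration of what a generic, unoptimised scaling routine looks like, here is a minimal nearest-neighbour sketch. It is not MPlayer's code, and the function name and the flat row-major pixel layout are assumptions made for the example. An optimised implementation would do the same mapping with SIMD instructions or hand-written assembly rather than one pixel at a time.

```python
def scale_nearest(src, src_w, src_h, dst_w, dst_h):
    """Scale a row-major list of pixels using nearest-neighbour sampling."""
    # Precompute which source column feeds each destination column.
    x_map = [x * src_w // dst_w for x in range(dst_w)]
    dst = []
    for y in range(dst_h):
        # Offset of the source row that feeds this destination row.
        row_start = (y * src_h // dst_h) * src_w
        dst.extend(src[row_start + sx] for sx in x_map)
    return dst


# Scale a tiny 2x2 image up to 4x4; each source pixel becomes a 2x2 block.
small = [1, 2,
         3, 4]
big = scale_nearest(small, 2, 2, 4, 4)
for row in range(4):
    print(big[row * 4:(row + 1) * 4])
```

Every output pixel costs an index lookup and a copy, so scaling 320x200 up to 1920x1080 touches about two million pixels per frame even in this simplest possible form.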
Posted Nov 1, 2012 6:29 UTC (Thu)
by dlang (guest, #313)
[Link] (4 responses)
Personally, I suspect that most of the problems are in the latter category.
The closed driver is having to interact in a multi-threaded environment with other processes manipulating memory, with allocating memory in the same space as the rest of the kernel, and with all the locking that the rest of the kernel expects (and in some cases requires). And the closed driver is trying to do this without being modified from kernel version to kernel version, even though the rules for the kernel are changing (the locking rules in particular, although memory management changes somewhat as well).
If stuff running on the GPU limits itself to reading and writing buffers that are explicitly allocated for it, almost all of the problems mentioned go away, and the remaining 'shim' driver can evolve along with the rest of the kernel.
In this case, they talked about how part of the difference was the closed drivers supporting a newer version of OpenGL. With the high-level interface this would not vary from driver to driver (unless specific drivers required different versions of the firmware), so I would expect things to be a lot closer to feature parity between the two modes.
In any case, even with the ATI mode of doing things, if there is bugginess in the firmware, it can cause problems for the overall system.
Posted Nov 1, 2012 23:04 UTC (Thu)
by Arker (guest, #14205)
[Link] (3 responses)
A distinction without a difference. The driver has no purpose and no function other than inside the kernel.
you can run a simple game, but not a more complex game.
Nonsense.
> the question is if the stability issues are due to the driver, or due to the driver's interaction with the rest of the kernel.
> If stuff running on the GPU limits itself to reading and writing buffers that are explicitly allocated for it, almost all of the problems mentioned go away, and the remaining 'shim' driver can evolve along with the rest of the kernel.
But as long as the stuff running on the GPU is an opaque blob we cannot audit or replace there is absolutely no way we can ever have any confidence that it is limited like that.
Posted Nov 1, 2012 23:47 UTC (Thu)
by dlang (guest, #313)
[Link] (2 responses)
> A distinction without a difference. The driver has no purpose and no function other than inside the kernel.
Actually, in this case it is a very important distinction.
let's put it another way.
Are the bugs in the graphics logic, or in the interaction with the rest of the kernel?
If the bugs are in the graphics logic, then they would remain if they were separated the way the Pi Broadcom driver is.
If the bugs are in the interaction with the rest of the kernel, then an API like the Pi has would allow us the best of both worlds: good graphics performance and clean interaction with the kernel.
The driver vendors keep wanting to have a stable API for their interaction with the kernel, and the kernel devs (for good reason) refuse to freeze the kernel internal APIs. But if the API to the device is defined and frozen by the firmware interface, everybody wins (except those people who want to make the graphics hardware do different things)
Yes, the graphics hardware could start scribbling to any part of memory that it wants, but technically, so could any bus-mastering controller card, and there have been very few cases where bus-mastering network or drive interface cards have caused problems from this.
Posted Nov 2, 2012 3:32 UTC (Fri)
by Arker (guest, #14205)
[Link] (1 responses)
No, I don't agree. There is no win there for me at all (other than simply not buying it.) The current situation with my ATI card is far preferable. You may call it a win if you get what you want out of it, but you do not get to define it as a win for me. What I want is a system where there is nothing running that I did not put there, nothing that I cannot edit, no code that I cannot audit. That is the whole point of free software. The hardware I pay for should respond to my commands, not anyone else's. Your 'solution' gives me exactly zero of what I want. It's not a compromise, it's a total loss.
Posted Nov 2, 2012 7:25 UTC (Fri)
by dlang (guest, #313)
[Link]
you conveniently left off the caveat that covered you
>> (except those people who want to make the graphics hardware do different things)