
Nothing has been moved

Nothing has been moved

Posted Nov 1, 2012 23:49 UTC (Thu) by dlang (subscriber, #313)
In reply to: Nothing has been moved by Arker
Parent article: Airlie: raspberry pi drivers are NOT useful

you can run a simple game, but not a more complex game.

there is nowhere near enough processing power in the ARM chip to simply scale video from 320x200 to full screen without help from the GPU. I accidentally triggered this with mplayer a few weeks ago, and it takes 10-20 seconds to play ONE second worth of video when you are doing the scaling on the ARM chip



Nothing has been moved

Posted Nov 2, 2012 3:20 UTC (Fri) by Arker (guest, #14205) [Link]

> you can run a simple game, but not a more complex game.

I wrote and played very complex games on an 8-bit processor at around 3 MHz with 2 *kilobytes* of RAM. If you can't do the same with many thousands of times the resources, you are doing something wrong.

> there is nowhere near enough processing power in the ARM chip to simply scale video from 320x200 to full screen without help from the GPU. I accidentally triggered this with mplayer a few weeks ago, and it takes 10-20 seconds to play ONE second worth of video when you are doing the scaling on the ARM chip

Then you need to look at your software stack, because the hardware is MORE than capable of it. I could do that smoothly with no problems well over 10 years ago on a 386sx with 1 megabyte of RAM and a simple SVGA card. Presumably you are using an encoding with a significantly higher overhead than MPEG-1, but you are also looking at a system with many *hundreds* of times the horsepower. If it were programmed specifically for the task, it could probably drive several different videos of that size to several different monitors at once without dropping a frame.

I hear all the time that things which were done routinely in earlier decades with far less power are now 'impossible', and it makes me laugh. These things aren't impossible. Programmers have just been trained that their time is too valuable to spend optimising anything, and that the proper solution is to throw more hardware at it instead. It sells hardware, I guess.

Nothing has been moved

Posted Nov 2, 2012 3:28 UTC (Fri) by dlang (subscriber, #313) [Link]

did your 386 have a 1920x1080 32 bit screen? or was it a 640x480 8 bit screen (i.e. VGA)?

the added screen data does make a significant difference.

I agree that lots of stuff is very bloated today, but your 386 was not expected to do full motion HD video output without GPU assistance, and it would not have been able to do so.

Nothing has been moved

Posted Nov 2, 2012 3:43 UTC (Fri) by Arker (guest, #14205) [Link]

It would actually go up to 1024x768, but I didn't think the monitor looked as good like that, so I usually ran it at 800x600 instead. Of course it didn't do "HD"; that buzzword wasn't invented yet. But full-screen, full-motion video without dropping a frame it could definitely do, and did many times, and 'accelerated graphics' was also something yet to come, at least in my price range.

Now, I had a very special software setup to do this, of course. A 'stock' configuration on the same machine would crap itself trying to play much smaller videos. In fact, I started that project as a dare, because the buddy who owned that machine was complaining it was obsolete: it was performing just like you described with your Pi, taking a minute to play a second or two of video, even with tiny low-res files. But the hardware was still perfectly capable of doing the job.

Nothing has been moved

Posted Nov 2, 2012 16:19 UTC (Fri) by bronson (subscriber, #4806) [Link]

The job being talked about is pushing 1920x1080x4x24 (24 at a minimum) = ~190MB/s to the screen. You're talking about pushing 800x600x2?x24 = ~20MB/s.
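For anyone who wants to check the arithmetic, here is a minimal sketch of the same calculation. It assumes every pixel of every frame is written exactly once, and it takes 2 bytes per pixel for the 800x600 case (the "2?" above is itself a guess). The function name `stream_rate` is made up for this example.

```python
def stream_rate(width, height, bytes_per_pixel, fps):
    """Bytes per second needed to write every pixel of every frame once."""
    return width * height * bytes_per_pixel * fps


MIB = 1024 * 1024

hd = stream_rate(1920, 1080, 4, 24)   # 32-bit colour at 24 frames per second
svga = stream_rate(800, 600, 2, 24)   # assumes 16-bit colour, per the "2?" above

print(f"1080p: {hd / MIB:.1f} MiB/s")   # about 190
print(f"SVGA:  {svga / MIB:.1f} MiB/s")  # about 22
print(f"ratio: {hd / svga:.2f}x")
```

The rounded figures in the comment (~190 and ~20) match these when read as MiB/s; the unrounded ratio comes out a little under 9.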

Time moves on, ya know?

Nothing has been moved

Posted Nov 2, 2012 23:44 UTC (Fri) by Arker (guest, #14205) [Link]

I don't want to beat it to death, but please. Using your figures, the video has scaled up 9.5 times (190/20). Comparing the clock rates of the processors, the hardware has increased over the same period by a factor of 62.5 (1000/16).
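The two ratios can be worked through directly. This is only a sketch that takes the thread's round numbers at face value: 190 and 20 MB/s for the video streams, and 1000 and 16 MHz for the clocks. The 700 MHz line is an added assumption for comparison, since that was the Pi's stock clock rather than the round 1000 MHz used here.

```python
video_growth = 190 / 20        # MB/s figures quoted earlier in the thread
clock_growth = 1000 / 16       # MHz figures used in this comment
stock_clock_growth = 700 / 16  # assumption: the Pi's stock 700 MHz clock

print(video_growth)        # 9.5
print(clock_growth)        # 62.5
print(stock_clock_growth)  # 43.75

# How far raw clock growth outpaces the growth in pixel traffic.
print(round(clock_growth / video_growth, 1))
```

Clock rate alone says nothing about memory bandwidth, so this only bounds the compute side of the argument.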

And this blunt comparison is a *severe* underestimate of the real difference, because an ARM11 can do a lot more with a clock cycle. That 386 chip didn't even have a floating point unit, let alone tricks like SIMD, branch prediction, out of order completion... clock for clock the ARM chip would still be far more powerful. And that's before you even consider the cache architecture, the system bus... over 500 times the main memory.

I have no doubt at all that if you could get a few thousand of those ARM chips in the hands of promising young programmers WITHOUT the fancy GPU to fall back on, one of them would shock you all by making it do things you think are impossible. But if he's told instead he has to use the high level interface and pass OpenGL to a blob he cannot inspect or modify, he'll probably just pass messages until he gets bored, or finds a bug he can't fix, and then move on to something less frustrating than proprietary computing, like playing football with a bunch of guys twice his size or having molars extracted for fun.

Nothing has been moved

Posted Nov 3, 2012 0:19 UTC (Sat) by dlang (subscriber, #313) [Link]

> I have no doubt at all that if you could get a few thousand of those ARM chips in the hands of promising young programmers WITHOUT the fancy GPU to fall back on, one of them would shock you all by making it do things you think are impossible. But if he's told instead he has to use the high level interface and pass OpenGL to a blob he cannot inspect or modify, he'll probably just pass messages until he gets bored, or finds a bug he can't fix, and then move on to something less frustrating than proprietary computing, like playing football with a bunch of guys twice his size or having molars extracted for fun.

nobody is disputing that more access would be better, but you are making the assumption that doing new and interesting things with the video is the primary purpose of all users of the device.

It may surprise you that most people who use computers aren't going to try and debug video drivers or firmware, even where they do have that capability. They will usually just download the latest version to see if it's fixed, live with the problem, or revert to a prior version.

We saw this with the Intel video drivers a few years ago. They were fully open-source drivers, but when there were problems with the drivers in an Ubuntu release, 99.999+% of the people just stuck with an older version.

For those people, the difference between a high-level API and a low-level API is meaningless. To be fair, probably 90% of them wouldn't care if the entire driver were a binary blob, but that still leaves a very large group of people who benefit from having all the kernel and userspace stuff open, even while the firmware is closed and has a high-level API.

Nothing has been moved

Posted Nov 3, 2012 15:28 UTC (Sat) by bronson (subscriber, #4806) [Link]

Comparing processor clocks is just silly. The framebuffer is not stored in L1 cache.

On the Pi, the FB is in shared RAM clocked at 400MHz. RAM probably has a bandwidth of around 250MB/sec (wild ass guess based on parts). If you're driving 1080p, that doesn't leave much bandwidth for anything else. Plus ça change, eh?
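A quick sketch of what that leaves, using only numbers from this thread: the ~190 MB/s figure for 1080p at 24 frames per second and the 250 MB/s bandwidth guess above. Neither is a measurement, so treat the result as illustrative.

```python
framebuffer_rate = 190  # MB/s, the 1080p figure quoted earlier in the thread
ram_bandwidth = 250     # MB/s, the guess above, not a measured value

share = framebuffer_rate / ram_bandwidth
left_over = ram_bandwidth - framebuffer_rate

print(f"framebuffer writes use {share:.0%} of the guessed bandwidth")
print(f"{left_over} MB/s left for decoding, scaling and everything else")
```

If the real bandwidth is higher than the guess the picture improves, but the source data for each frame also has to be read from the same RAM.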

Nothing has been moved

Posted Nov 2, 2012 7:09 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

Nonsense. 386sx couldn't even do anything useful along with fluid full-screen animation in 13h mode (that's 320x200 with 256 colors) without palette tricks or hacks like Wolf3D's runtime code generation.

The 386dx was a little bit better: it could run simple games like Doom, though it had to use page flipping with non-standard display modes, because blitting a 64 KB framebuffer was too taxing for these systems.
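The framebuffer size mentioned here follows from the mode itself. A minimal sketch, assuming one byte per pixel (256 colours means one palette index per pixel); the frame rates are illustrative choices, with 70 being the refresh rate of mode 13h, and `blit_rate` is a made-up helper name.

```python
WIDTH, HEIGHT = 320, 200
BYTES_PER_PIXEL = 1  # 256 colours: one palette index byte per pixel

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL


def blit_rate(fps):
    """Bytes per second needed to copy a full frame to video memory."""
    return frame_bytes * fps


print(frame_bytes)  # 64000
for fps in (30, 70):  # illustrative frame rates
    print(f"{fps} fps: {blit_rate(fps) / 1e6:.2f} MB/s")
```

Page flipping avoids this copy entirely: the program draws into an off-screen page and then changes which page the card displays.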

Nothing has been moved

Posted Nov 2, 2012 11:03 UTC (Fri) by Arker (guest, #14205) [Link]

> Nonsense.

This is exactly what I was talking about. You are so secure in your knowledge. Yet have you actually sat down and done the math?

I know it's possible because I did it, so I know that if you actually did the math it would have to be possible. There is a huge difference in what it takes to decode and display a video on top of a multi-user general purpose software stack mostly written in ultra-high-level languages and essentially unoptimised, versus what is actually possible given a highly optimised decoder running without interference on the bare hardware.

Even given the significant increases in resolution, and the modern codecs which require quite a bit more processor time, the increase in demand on the hardware is orders of magnitude off in comparison to the actual increase we have seen in hardware capability over time. That 386 ran at 16 MHz, and it was overclocked to do it, and clock for clock it was vastly inferior to your ARM11, which is running at over 60 times the clock frequency. Not to mention having 512 times the RAM on a much faster system bus...

Nothing has been moved

Posted Nov 2, 2012 15:55 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

I'm old enough to have actually used a 386sx-based computer, and to have written simple demos on it. It was possible to write fluid (30fps) animation on it, but simply filling the screen with a solid color was already taxing its RAM bandwidth.

Anyway, Pi's bandwidth is barely enough for full HD video as it is. If you throw in non-trivial rendering - it's simply not enough, again.

Unless, of course, you're ready to limit yourself to "Tetris" or maybe "Digger".

Nothing has been moved

Posted Nov 2, 2012 17:24 UTC (Fri) by nix (subscriber, #2304) [Link]

Er, *none* of the video playback and decoding engines are written in 'ultra-high-level languages': the inner loops of most of them are handcrafted assembler. None of them are 'essentially unoptimized'. The pixel display routines in X are also, these days, mostly done by pixman and to a considerable degree done in handcrafted assembler, taking advantage of SSE and the like.

I note that your 386 almost certainly did not have to decompress compressed video and blit it at the same time as everything else.

Nothing has been moved

Posted Nov 2, 2012 23:50 UTC (Fri) by Arker (guest, #14205) [Link]

I didn't say the codecs are written in ultra-high-level languages, although I wouldn't be shocked if you found an instance of it, particularly on a less common architecture like ARM. But what I did say was that the rest of the system is often written so. And regardless of how good your codec is, it is still running inside of a much larger, looser system which has very significant performance costs.

Nothing has been moved

Posted Nov 2, 2012 16:56 UTC (Fri) by intgr (subscriber, #39733) [Link]

> there is nowhere near enough processing power in the ARM chip to simply scale video from 320x200 to full screen without help from the GPU. I accidentally triggered this with mplayer a few weeks ago

The fact that MPlayer cannot do it isn't evidence that the CPU is incapable. Are you sure that MPlayer is actually utilizing everything that the ARM core has to offer? SIMD instructions etc?

It might just be that MPlayer on ARM is using some generic C scaling routine that nobody has bothered to optimize, because common x86 desktops are all running another assembly-optimized implementation.
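To make "generic scaling routine" concrete, here is a minimal nearest-neighbour scaler. It is a sketch only: it is not MPlayer's code, the name `scale_nearest` and the flat row-major pixel list are assumptions for this example, and it is written in Python for readability where a real player would use C or assembler.

```python
def scale_nearest(src, src_w, src_h, dst_w, dst_h):
    """Scale a row-major list of pixels using nearest-neighbour sampling."""
    dst = [0] * (dst_w * dst_h)
    for y in range(dst_h):
        src_y = y * src_h // dst_h          # source row for this output row
        for x in range(dst_w):
            src_x = x * src_w // dst_w      # source column for this output pixel
            dst[y * dst_w + x] = src[src_y * src_w + src_x]
    return dst


small = [1, 2,
         3, 4]
big = scale_nearest(small, 2, 2, 4, 4)

for row in range(4):
    print(big[row * 4:(row + 1) * 4])
```

A loop like this does a multiply, a divide and a memory access for every output pixel, and 1920x1080 is about two million output pixels per frame. Optimized versions precompute the column mapping once per frame and process several pixels per instruction with SIMD, which is the kind of work that may simply not have been done for ARM.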


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds