
Embedded is a special case

Posted Oct 4, 2007 6:00 UTC (Thu) by tshow (subscriber, #6411)
In reply to: Embedded is a special case by filker0
Parent article: Memory part 2: CPU caches

> Embedded systems are a special case and all of the rules change.

That's fair enough, but there are an awful lot of game machines and embedded systems out there; more than there are PCs, if you count game systems, cellphones, PDAs, set-top boxes, the control systems in cars...

Our game engine deals with tightly-coupled address-mapped memory on all the platforms it supports; on platforms that don't actually have such memory (PCs, mostly), we fake it with a block of normal memory. We've built our engine as an OS (and support libraries) for games; the idea being that a game will compile on any platform that the engine supports with minimal resorting to #ifdef. You *can* write fast platform-agnostic game code that crosses (very different) platforms.
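
As a rough illustration of faking tightly-coupled memory with a block of normal memory, here is a minimal sketch in C. Every name in it (`PLATFORM_HAS_TCM`, `FAST_MEM_BASE`, `fast_mem_alloc`, the 16 KiB size and the fixed address) is a hypothetical stand-in, not the engine's actual API or any real platform's memory map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names and values here are hypothetical, for illustration only. */
#define FAST_MEM_SIZE (16u * 1024u)   /* assumed size of the fast region */

#if defined(PLATFORM_HAS_TCM)
/* Real tightly-coupled memory sits at a fixed address from the platform's
   memory map. The address below is an assumed placeholder. */
#define FAST_MEM_BASE ((uint8_t *)0x01000000u)
#else
/* No such region on a PC, so an ordinary static buffer stands in for it. */
static _Alignas(64) uint8_t fake_fast_mem[FAST_MEM_SIZE];
#define FAST_MEM_BASE (fake_fast_mem)
#endif

static size_t fast_mem_used = 0;

/* Bump allocator: hands out consecutive 16-byte-aligned slices of the
   region and returns NULL once the region is exhausted. */
void *fast_mem_alloc(size_t bytes)
{
    size_t aligned = (bytes + 15u) & ~(size_t)15u;

    assert(fast_mem_used <= FAST_MEM_SIZE);
    if (aligned < bytes || aligned > FAST_MEM_SIZE - fast_mem_used)
        return NULL;   /* size overflowed or the region is full */

    void *p = FAST_MEM_BASE + fast_mem_used;
    fast_mem_used += aligned;
    return p;
}

/* Release everything at once, e.g. at the end of a frame or level. */
void fast_mem_reset(void)
{
    fast_mem_used = 0;
}
```

Game code written against `fast_mem_alloc` never sees which branch was compiled in, which is one way the `#ifdef` can stay inside the platform layer.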

A whole lot of the techniques that I'm sure this series of articles is going to delve into are just as applicable to embedded systems as they are to PCs: walking memory in address order whenever possible; aligning data structures to (ideally) machine word size, (hopefully) cache line size or (at worst) hardware page size; keeping transitions across page boundaries to a minimum; avoiding loop unrolling, which is no longer a good idea; strategies for preventing icache misses. Arguably more so; caches on embedded systems and game systems tend to be significantly smaller than on PCs, so misses are more frequent and their cost adds up that much faster.
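
Two of the techniques listed above, address-order traversal and cache-line alignment, can be sketched in a few lines of C11. The 64-byte line size and the `particle_t` layout are assumptions made for the example; the real line size depends on the target CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size; check the target CPU */

/* Hypothetical record padded and aligned so that each element of an
   array of these occupies exactly one (assumed) cache line. */
typedef struct {
    _Alignas(CACHE_LINE) float pos[3];
    float vel[3];
    uint32_t flags;
} particle_t;

static_assert(sizeof(particle_t) == CACHE_LINE,
              "particle_t should fill exactly one cache line");

/* Walks a row-major grid in address order: consecutive iterations touch
   adjacent ints, so every fetched line is fully used before moving on. */
long sum_address_order(const int *grid, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += grid[r * cols + c];
    return total;
}

/* Same result, but each step jumps a whole row ahead in memory. On a
   grid larger than the cache this tends to miss far more often. */
long sum_strided(const int *grid, size_t rows, size_t cols)
{
    long total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += grid[r * cols + c];
    return total;
}
```

Both functions compute the same sum; only the order of memory accesses differs, and how much that matters depends on the grid size relative to the cache.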

With relatively little effort and a little discussion of the wider realms beyond the beige (or black, or possibly silvery; your mileage may vary) desktop space heater, this could be a significantly more useful treatise.

No disagreement here

Posted Oct 5, 2007 1:10 UTC (Fri) by filker0 (guest, #31278)

Embedded systems may outnumber general purpose PCs, but I doubt that any single embedded platform
outnumbers them on its own. Also, far fewer programmers ever get a chance to program one.
It is an open question whether all programmers have to know how to deal with systems with 4
different types of RAM, or with demand-paged high-speed static RAM that is paged from a larger
SDRAM, which in turn is paged from NOR or NAND Flash by a separate microprocessor that implements
a predictive pre-fetch. Each platform is a special case.

A game engine such as the one you describe provides a virtual machine, and makes a heck of a
lot of sense. All you have to port, as you said, is the VM. (Not all VMs use byte-codes, after all.)

My current project (I'm the low-level platform guy) involves a lot of cache performance
optimization in the application level code -- aligning data on cache line boundaries, using burst
DMA to do memory-to-memory transfers in parallel with continued code execution, and explicitly
loading and flushing the cache. But in our system, everything has to be deterministic (by the
rules of our industry). Determinism is extremely hard on a pipelined RISC architecture, and when
you add cache to the picture, it becomes almost impossible. In our case, though we need to
squeeze out every drop of performance that we can, that comes second to a specific operation
always taking the same amount of time.
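
For readers who have not done this kind of work, here is a minimal sketch in C11 of the first and last items in that list: buffers that start on a cache line boundary, and an explicit request to load them into cache. It is not the code from the project described above. The 64-byte line size and the function names are assumptions, and `__builtin_prefetch` is a GCC/Clang extension that is only a hint, so it offers none of the timing guarantees a deterministic system needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64u   /* assumed line size; check the target CPU */

static_assert((CACHE_LINE & (CACHE_LINE - 1u)) == 0,
              "CACHE_LINE must be a power of two");

/* Round a byte count up to a whole number of cache lines. */
size_t round_up_to_line(size_t bytes)
{
    return (bytes + CACHE_LINE - 1u) / CACHE_LINE * CACHE_LINE;
}

/* Allocate a buffer that starts on a line boundary and owns whole lines,
   so it never shares a line with unrelated data. Release with free(). */
void *alloc_line_aligned(size_t bytes)
{
    if (bytes == 0)
        return NULL;
    /* C11 aligned_alloc expects the size to be a multiple of the alignment. */
    return aligned_alloc(CACHE_LINE, round_up_to_line(bytes));
}

/* Ask the CPU to start pulling the buffer into cache ahead of use.
   This is only a hint; the hardware is free to ignore it. */
void warm_buffer(const void *buf, size_t bytes)
{
    const unsigned char *p = buf;
    for (size_t off = 0; off < bytes; off += CACHE_LINE) {
#if defined(__GNUC__)
        __builtin_prefetch(p + off, 0, 3);   /* read access, keep in cache */
#else
        (void)p;                             /* no portable equivalent */
#endif
    }
}
```

Real cache loads, locks and flushes of the kind described above are usually done with platform-specific instructions or vendor routines, which is exactly why each platform ends up being a special case.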

Most programmers don't have to know the kind of cache details that game console and some
other embedded programmers (avionics, in my case) do. Still, I think it's good that more
programmers understand the concepts and techniques for improving cache performance in a
general multi-programming environment.

Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds