Drivers as documentation

Posted Nov 23, 2011 23:50 UTC (Wed) by magnuson (subscriber, #5114)
Parent article: Drivers as documentation

As a digital hardware engineer myself, I have to say that I got some hearty LOLs out of this article. Sadly, all of this is true. Specs are written before the hardware is coded, and they are seldom updated after the fact once the rubber hits the road.

As for the example, I can tell you precisely why it's documented that way. "DMA engine randomly locks up" is a hardware bug that customers may demand be fixed at great expense. A typical full-layer spin for an IC will run you about $500,000 these days, and that's if you push a lot of volume. "Reserved" is no problem at all. Just don't do that. Simple.

I've coded a few magic constants in my day, but they've been clearly labeled as such, with at least an attempt to describe what they do. Mostly they are escape hatches in case of bugs elsewhere... In any case, my neck remains un-wrung, so I must be doing something right.

Drivers as documentation

Posted Nov 24, 2011 11:57 UTC (Thu) by gnb (subscriber, #5132)

> "Reserved" is no problem at all. Just don't do that. Simple.

Ideally yes, but in the example given the bit defaults to the wrong value and needs to be set in defiance of the datasheet. This does happen.
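For illustration, here is a minimal C sketch of what such a workaround can look like in a driver when it is at least labeled honestly. Everything in it is hypothetical: the register layout, the bit position, and the names `DMA_CTRL_ENABLE`, `DMA_CTRL_RSVD7_LOCKUP_FIX`, `dma_ctrl_apply_workaround()` and `dma_init()` are invented here and do not come from the article or from any real datasheet.

```c
#include <stdint.h>

/* Hypothetical control-register bits, invented for this sketch. */
#define DMA_CTRL_ENABLE            (1u << 0)

/*
 * Assumption for illustration: bit 7 is listed as "reserved" in the
 * datasheet, resets to 0, and must be set to 1 to keep the DMA engine
 * from locking up. Naming the bit and writing down why it is set is
 * the difference between a workaround and an unexplained magic number.
 */
#define DMA_CTRL_RSVD7_LOCKUP_FIX  (1u << 7)

/* Return the control value with the workaround bit set; other bits unchanged. */
uint32_t dma_ctrl_apply_workaround(uint32_t ctrl)
{
    return ctrl | DMA_CTRL_RSVD7_LOCKUP_FIX;
}

/* Read-modify-write the control register: apply the workaround, then enable. */
void dma_init(volatile uint32_t *ctrl_reg)
{
    uint32_t value = *ctrl_reg;

    value = dma_ctrl_apply_workaround(value);
    value |= DMA_CTRL_ENABLE;
    *ctrl_reg = value;
}
```

On real hardware `ctrl_reg` would point at a memory-mapped register; for experimenting, a plain `uint32_t` variable works as a stand-in.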

Drivers as documentation

Posted Nov 25, 2011 5:56 UTC (Fri) by jzbiciak (subscriber, #5246)

In some cases, the "reserved" bits exist to disable or modify some feature, say as part of a test mode. They get marked "reserved", because a future product in the same family might want to use that bit for some other functionality. Or, the behavior of the test modes themselves might change between different revisions of the device.

So it doesn't surprise me at all that there might not be reasonable customer-level documentation for some of these "reserved" bits. At the same time, I can also see that workarounds for flaky designs ("Oops, the frobnitz accelerator sometimes frotzes when it should gronk, if two quux DMAs come in consecutive cycles") might rely on these "reserved" bits. The bit in the article above might be documented internally as "disable frobnitz acceleration (internal test purposes only; do not use for normal operation)".

Other times, it's due to the same peripheral getting used in different configurations on different chips, and the field in question should be "irrelevant" for this particular chip. So the bit exists and the feature it controls exists, but the feature isn't necessarily useful on this chip, or wasn't spec'd to be on this chip, and so it might not be tested. The fact that enabling it anyway stops DMAs from randomly crashing might be a happy accident.

In my day job, I'm at the head of one of these pipelines, writing specs that lead to the hardware and later to the customer documentation. I also had some good chuckles reading through this article. From what I've seen, the process of turning my specs into end-customer documentation involves a lot of deleting (removing some of the internal implementation details, but invariably also deleting some important detail customers need), and inserting several *ahem* interesting grammatical twists and confusing diagrams. I don't envy the folks who have to make the end-customer documentation from my specs, but sausage making is sausage making, no matter who makes it.

Drivers as documentation

Posted Dec 1, 2011 16:41 UTC (Thu) by LeftCoastDave (guest, #81645)

I'm glad you are a HW engineer and you admit it. In all my years of firmware, I've met very few HW engineers who will admit this.

This article hit very close to home, especially the part about DMA randomly locking up. I previously spent a lot of time on a deeply embedded firmware design where I would scan all DMA operations looking for certain lockup conditions, then abort the DMA hardware operation and replace it with a firmware memcpy.
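A rough C sketch of that kind of fallback follows. It is a simplification under stated assumptions: the lockup condition used here (any address or length that is not 4-byte aligned) is invented purely for illustration, the check runs before the transfer is issued rather than aborting one in flight, and `dma_hw_copy()` is a software stand-in for programming a real engine. The names `dma_would_lock_up()` and `safe_copy()` are hypothetical as well.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical lockup condition, invented for this sketch: assume the
 * engine hangs when source, destination, or length is not 4-byte aligned.
 */
bool dma_would_lock_up(const void *dst, const void *src, size_t len)
{
    uintptr_t bits = (uintptr_t)dst | (uintptr_t)src | (uintptr_t)len;

    return (bits & 3u) != 0;
}

/* Stand-in for the hardware path; real firmware would program the engine. */
void dma_hw_copy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/*
 * Copy len bytes, using the DMA path only when the transfer is known to
 * be safe. Returns true if the hardware path was used, false if the
 * transfer was replaced by a plain memcpy.
 */
bool safe_copy(void *dst, const void *src, size_t len)
{
    if (dma_would_lock_up(dst, src, len)) {
        memcpy(dst, src, len);   /* firmware fallback */
        return false;
    }
    dma_hw_copy(dst, src, len);
    return true;
}
```

The cost of this approach is that every transfer pays for the check, and the list of conditions has to be found empirically when the root cause is not documented.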

I believe the issue with digital HW engineers comes from their work environment. Here are a few reasons:

1) Their designs tend to be quite in-depth, and Verilog/VHDL code is difficult to read, so the code is heavily supported by Word docs with block diagrams. This of course results in the code slowly drifting away from the documentation, and without a review process only the designer of the module knows how it works. They've done this for so long that they believe this is the only way to do development.

2) Because of (1), brick walls form around modules, and only black-box testing is done by simulations. No one looks at anyone else's code, and no one appreciates being told that their code isn't up to par for readability, so designers get very defensive and don't like to admit any fault internal to their design.

3) On top of all that, chip schedules are always rushed, and designers have no easy way to go back and fix small mistakes, so they try to sweep them under the carpet and blame firmware, or expect firmware to just figure out how to make it work.

4) It takes months for a chip to come back from the fab plant, months more for firmware to finish a dev kit for it and start shipping it out to customers, and months more before a customer finds a bug. A year's turnaround is normal; longer is not uncommon. Digital HW engineers are usually on to another project by that time.

Now, every time a bug is found there is a knee-jerk reaction to blame firmware, even when the firmware team approaches the problem as "we don't know where the bug is, but we need help locating it". The HW engineer just doesn't want to open up that can of worms on a design they know was a rush job. Excuses like "We aren't going to re-spin the chip for this bug, so why should I spend time looking into it?" are the norm. This really doesn't help firmware implement a workaround, because the firmware team still doesn't know the root cause.

I believe the only way this can change is through a management viewpoint that digital designers aren't just designing a chip right now; they need to support it for years, and that should be budgeted for up front. Documentation, including Verilog/VHDL code comments, can happen well after the chip has taped out. This needs to be a serious priority in any schedule.
