I can't see how they themselves could solve this problem without having a perfect blacklist of broken motherboards.
PCIe, power management, and problematic BIOSes
Posted Jun 30, 2011 1:46 UTC (Thu) by BenHutchings (subscriber, #37955)
Microsoft generally doesn't have this specific information; it relies on OEMs to do the integration.
The OEMs buy in hardware components, which come with Windows drivers already written, and they buy firmware from other companies. Normally they get some specific hardware documentation and source code for the drivers and firmware. They also have documentation and tools for Windows (the DDK).
They need to make the combination of hardware, firmware, OS and drivers work, and quickly. Since they control both firmware and driver code (except for generic drivers provided by Microsoft) they may resort to dirty hacks in either or both of these. The result may work only as a result of undocumented behaviour of Windows.
The challenge for free OS developers is to replicate these driver hacks and undocumented behaviour of Windows where necessary.
Posted Jun 30, 2011 7:26 UTC (Thu) by ebirdie (subscriber, #512)
I see! That explains the phenomenon we have wondered about for years while working as Windows reinstaller junkies. The question was: what makes decent hardware become so unusable with Windows over shorter and longer periods of time?
Now I understand that the frequent patching of Windows also ruins the underlying assumptions made when producing "nice, working, decent hardware". The common solution is to fix the hardware problems (slow hardware, sleep states working poorly, poor battery life, a device not working after wake-up, irritating moments in prepared presentations where a reboot is often the only way to rescue the presentation, and so on, after which the laptop is returned for "repair") by buying new hardware, which keeps the wheels turning for all the parties involved.
Posted Jun 30, 2011 19:29 UTC (Thu) by BenHutchings (subscriber, #37955)
Posted Jul 6, 2011 0:59 UTC (Wed) by cortana (subscriber, #24596)
Posted Jul 6, 2011 1:21 UTC (Wed) by BenHutchings (subscriber, #37955)
Posted Jul 6, 2011 12:09 UTC (Wed) by nye (guest, #51576)
If they know how they're supposed to act, why would they deliberately act differently in normal circumstances?
Posted Jul 7, 2011 17:13 UTC (Thu) by farnz (guest, #17727)
As an example: DirectX has a DoesDriverSupport method that it calls to see what functionality the driver supports. It's obvious that an implementation that always returns TRUE is faster than one that returns an accurate result.
Less obvious, but still true, is that a driver that can currently support everything the platform uses can return TRUE without checking, and will be a tiny bit faster. There are similar cases throughout any significant sized API, where being wrong happens to work for today's software, and is faster because you do less work - and when those cases are on the fastpath, the driver will do them.
WHQL tries to deliberately break these sorts of things - it looks for cases where the answer can be predicted, and checks that the driver gives the right answer; if it's lying, WHQL will break things.
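To make the idea concrete, here is a minimal sketch in Python. All names in it (HonestDriver, LyingDriver, probe, the feature strings) are made up for illustration; it is not the DirectX or WHQL interface, just a toy model of a capability query that is checked with a question whose answer is known in advance.

```python
# Hypothetical names throughout; this is not the DirectX or WHQL API.

SUPPORTED = {"alpha_blend", "bilinear_filter"}


class HonestDriver:
    def does_driver_support(self, feature):
        # Does the real work: one set lookup per query.
        return feature in SUPPORTED


class LyingDriver:
    def does_driver_support(self, feature):
        # Skips the lookup. Correct only while callers ask about
        # features that happen to be supported.
        return True


def probe(driver):
    """Certification-style check: ask about a feature that cannot exist.

    Returns True if the driver answers truthfully.
    """
    return not driver.does_driver_support("feature_that_does_not_exist")


print(probe(HonestDriver()))  # True
print(probe(LyingDriver()))   # False
```

For every feature in SUPPORTED the two drivers give the same answer, which is why the shortcut goes unnoticed until somebody asks a question outside that set.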
Hypothetically, for example, imagine that your GPU only has a single thread of execution, used for 3D commands and for putting buffers in the hands of scanout to display, but lets you access buffers from the CPU directly, bypassing the GPU execution. A driver could implement glXSwapBuffers and friends by putting the swap in the GPU's thread of execution, and returning immediately; it could then make glFinish and glFlush no-ops, and not break anything obvious. If Microsoft thought drivers were doing this sort of trick, WHQL could do a glReadPixels immediately after a glFinish, and get the wrong result - the driver's been caught lying.
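A toy model of that hypothetical GPU, sketched in Python. ToyGpu, draw, finish, read_pixels and finish_is_honest are invented names, and a single integer stands in for the framebuffer; it only illustrates how a read that bypasses the command queue exposes a no-op finish, and says nothing about how any real OpenGL driver behaves.

```python
from collections import deque


class ToyGpu:
    """Hypothetical GPU with one command queue and direct CPU buffer access."""

    def __init__(self, honest_finish):
        self.framebuffer = 0      # stands in for the pixel contents
        self.queue = deque()      # the single GPU thread of execution
        self.honest_finish = honest_finish

    def draw(self, value):
        # Queue the command and return immediately.
        self.queue.append(value)

    def finish(self):
        # By the spec, this blocks until all queued commands have executed.
        # The cheating variant returns without doing anything.
        if self.honest_finish:
            self.drain()

    def drain(self):
        # What the GPU would eventually do on its own.
        while self.queue:
            self.framebuffer = self.queue.popleft()

    def read_pixels(self):
        # CPU access that bypasses the GPU queue entirely.
        return self.framebuffer


def finish_is_honest(gpu):
    """Test-suite-style check: draw, finish, then read back immediately."""
    gpu.draw(42)
    gpu.finish()
    return gpu.read_pixels() == 42


print(finish_is_honest(ToyGpu(honest_finish=True)))   # True
print(finish_is_honest(ToyGpu(honest_finish=False)))  # False
```

In the cheating case the queued draw is still pending when the read happens, so the readback sees stale contents even though nothing would look wrong on screen once the queue drains.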
In the meantime, of course, the driver is faster than the competition's driver in benchmarks people care about - because it's not doing things by the spec, and hoping that you'll never notice the lie.
Posted Jul 12, 2011 13:06 UTC (Tue) by nye (guest, #51576)
That would presumably be caught by some things visibly breaking at some point, otherwise there's no point in having it in the first place. (I wonder what the modified version does in that example when it catches the driver lying.)
>Less obvious, but still true, is that a driver that can currently support everything the platform uses can return TRUE without checking, and will be a tiny bit faster. There are similar cases throughout any significant sized API, where being wrong happens to work for today's software, and is faster because you do less work - and when those cases are on the fastpath, the driver will do them.
This does at least make more sense - if it definitely isn't causing any problems now, then I can imagine somebody saying 'we can always update it in the future' - and possibly even believing it.
>In the meantime, of course, the driver is faster than the competition's driver in benchmarks people care about - because it's not doing things by the spec, and hoping that you'll never notice the lie
One might hope that driver authors would expect people to care whether their very fast driver is unstable or has rendering glitches, and if they have a more accurate WHQL-passing driver (as posited upthread) to provide that as an option.
I guess worse things happen at sea.
Posted Jul 12, 2011 13:56 UTC (Tue) by farnz (guest, #17727)
In Raymond's example, the implementation handles a detected lie by assuming that DoesDriverSupport always returns FALSE, and not using the accelerated paths. In other words, if you're ever caught lying, you're never going to be trusted to do anything sophisticated, even if you could do some acceleration.
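Roughly, the caller-side policy looks like the Python sketch below. CapabilityChecker and the driver classes are hypothetical names of mine, and the real implementation is certainly more involved; the point is only that one detected lie turns every later answer into FALSE.

```python
# Hypothetical names; a sketch of the "caught once, never trusted" policy.


class HonestDriver:
    SUPPORTED = {"alpha_blend"}

    def does_driver_support(self, feature):
        return feature in self.SUPPORTED


class LyingDriver:
    def does_driver_support(self, feature):
        return True  # claims everything, including things that do not exist


class CapabilityChecker:
    def __init__(self, driver):
        self.driver = driver
        # Ask once about a bogus feature; a TRUE answer marks the driver as a liar.
        self.trusted = not driver.does_driver_support("feature_that_does_not_exist")

    def supports(self, feature):
        if not self.trusted:
            # Treat every query as FALSE and stay on the unaccelerated path.
            return False
        return self.driver.does_driver_support(feature)


print(CapabilityChecker(HonestDriver()).supports("alpha_blend"))  # True
print(CapabilityChecker(LyingDriver()).supports("alpha_blend"))   # False
```

The lying driver loses acceleration even for features it may genuinely support, which is the penalty described above.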
Unfortunately, too many people buy hardware on the basis of benchmarks - for an example, look at the QUACK.EXE incident - a GPU driver was set up to detect a specific application used as a benchmark, and cheat.
The problem for buyers of devices with complex drivers is that until you work out what the cheats are, you don't know whether the driver is fast in benchmarks because it cheats, or because it's buggy, or because it's genuinely that fast, or because your applications are buggy and relying on things not guaranteed by the API. Add in closed-source drivers, which can do things like detect the presence of WHQL certification tests on the machine, and you end up with a driver that (for example) is slow and stable when you run the WHQL test suite (thus always passes), but takes shortcuts when WHQL is not running. As benchmarkers rarely have WHQL installed, the driver author gets the "best" of both worlds - stability if you try and test it with WHQL (so you have a WHQL-compliant driver), and fast if you try and benchmark it without WHQL.
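As a purely illustrative sketch of that "best of both worlds" behaviour, in Python: BenchmarkAwareDriver, its test_suite_installed flag and the checksum operation are all invented here, and no real driver or WHQL detection mechanism is being described.

```python
class BenchmarkAwareDriver:
    """Hypothetical driver that changes behaviour when a test suite is installed."""

    def __init__(self, test_suite_installed):
        self.test_suite_installed = test_suite_installed

    def checksum(self, samples):
        """Return (result, work_done) for a list of integer samples."""
        if self.test_suite_installed:
            # Slow, by-the-spec path: touch every sample.
            return sum(samples), len(samples)
        # Shortcut path: process every other sample and extrapolate.
        subset = samples[::2]
        return sum(subset) * 2, len(subset)


samples = [1, 2, 3, 4, 5, 6, 7, 8]
under_test = BenchmarkAwareDriver(test_suite_installed=True)
in_the_wild = BenchmarkAwareDriver(test_suite_installed=False)

print(under_test.checksum(samples))   # (36, 8)
print(in_the_wild.checksum(samples))  # (32, 4)
```

The test suite only ever sees the correct result at full cost, while a benchmark run without it sees half the work and a result that is merely close.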
Now throw in the idea that applications don't use complex functionality at first, and you see just how painful things can get - the bit that fails on you might be something that no application today uses, at which point it can be years before anyone writes test code that shows the problem is the driver. For some classes of driver (e.g. graphics drivers), people build up a whole set of mythology around things you cannot do, and you develop a set of shared assumptions that aren't actually in the spec, but that "everyone knows" are things that don't work, because drivers traditionally cheated.
same for everybody, not just us
Posted Jul 1, 2011 9:47 UTC (Fri) by tialaramex (subscriber, #21167)
My current laptop "forgets" it has built-in speakers if suspended with the headphones plugged in while running Windows. Only a reboot _into Fedora_ fixes the problem (rebooting just Windows has no effect).
The outwardly normal USB to SATA/IDE drive cases I bought recently turn out to occasionally give erroneous read results with IDE drives on Windows. The data on disk is undamaged, but understandably that's not good enough for me. Can't reproduce in Linux. If this had been the other way around, would the retailer have taken them back as faulty? I'm glad I don't have to find out.
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds