February 22, 2012
This article was contributed by Neil Brown
I must admit that I love bugs. I also hate bugs which leads to a very
conflicted relationship with the blighters, but for now I just want to
focus on the positive - I love bugs.
The bugs I am talking about are, of course, the incorrect behavior of
software systems due to incorrect code and the reason that I love them
is that they are so educational. There are a number of reasons for
this, the significant one for the present being that they provide
motivation and focus to read and study new code.
One of the freedoms that free and open source software provides is the
freedom to study the source code. However having that freedom doesn't
mean it's easy to use it. When faced with a large body of code such as the
Linux kernel it can be hard to know where to start, or when to move
on. There is no story-line, no curriculum, no plot to guide your
study. This is where a bug comes in: it can provide a story line.
This may be a particular sequence of events or a particular
combination of features, or just a starting point to spiral out from.
But by providing a clear focus and a clear finishing point, it makes
study easier and more rewarding.
More than a phone
I have recently been spending some time trying to make the Linux
kernel run well on the
GTA04
"Phoenux" replacement motherboard for the
Openmoko "Freerunner" and "Neo 1973" mobile phones. This is an
attempt to revitalize the Openmoko effort to provide an open (or at
least "as open as possible") platform for a mobile computing device.
It fits in the same case, uses the same display and speakers, but
provides more memory, faster processor, and faster networking plus a few
other extras. During this (ongoing) effort I have discovered a number
of bugs and missing features, most of which only came with smaller
learning experiences. One, though, stands out for leading me on a
somewhat longer path of discovery than most, and so I would like to
share that path and those discoveries so that others might benefit. I
should note first that the path is still open. I have not actually
fixed the bug, so others could follow the path themselves. Anyone keen
to do that should stop after the next paragraph, get a GTA04,
and start hunting themselves. Others can read on.
The symptom of the bug is easy to describe. Like most computer
systems the GTA04 has an RTC (real time clock) which has an alarm
function. When the time in the clock matches the time in the alarm,
it generates an interrupt. The problem is that, though this interrupt
works perfectly while the system is awake, it does not reliably wake the
device when it is asleep (i.e. in suspend-to-RAM state). Other
interrupts configured much the same way (for example data received on
the serial console) wake the system with 100% reliability. However
the RTC alarm almost always fails. Understanding this bug leads on a
path through the interrupt management code, the suspend management
code, and even the USB code. But it must start with a brief
description of the hardware.
In the GTA04 there are two particular integrated circuits (ICs) that
are relevant. One is the OMAP3 SoC (system-on-a-chip) from TI. The
other is the PMIC (power management IC) which is referred to as
"twl4030" in the various drivers - various because, like the OMAP3,
it is a multi-function chip combining diverse functions such as battery
charging, keyboard control, and the USB electrical interface. There are a
number of connections between these chips but only two are relevant
for now. These are a single interrupt line and an I2C bus.
The interrupt line signals the interrupt controller (INTC) in the
OMAP3 that something has happened. The I2C (inter-integrated-circuit)
bus is a simple 2-wire bus that allows the CPU to read and write
registers in the PMIC. In particular it can be used to find out which
of the several functions generated the interrupt, and why. It can
also be used to clear the source of the interrupt.
Having one interrupt line that indicates any of a number of possible
events like this is not uncommon. For example the UARTs (Universal
Asynchronous Receiver Transmitter - a serial port) in the OMAP3 each
have a single interrupt, but it can be generated by a character
arriving, a character transmit having completed, or an error
condition. As with most such devices there is a register that can be
inspected which has one bit for each possible interrupt source.
When all the interrupt sources are related to a single device and
handled by a single driver this is all quite easy to manage. However
when the different interrupt sources are related to logically separate
devices, having to manage all these behind a single interrupt could
get clumsy.
So many interruptions
To address this situation, Linux has an abstraction called an
"irq_chip". An irq_chip represents a set of
interrupt sources each of which is assigned an interrupt number (as
listed in /proc/interrupts). It provides functions to enable or
disable each interrupt, to allow the interrupt to wake the device from
suspend, to set the trigger type (edge or level), and various other
functions. It also arranges that the interrupt handler for each
interrupt will be called when appropriate.
irq_chips with sources
Using this, the driver for the INTC in the OMAP3 defines an
irq_chip for all the interrupts that it knows about directly,
and the driver for the PMIC can define a separate irq_chip
which describes the various interrupt sources in the PMIC. Each
irq_chip will provide very different functions to configure
and control the interrupt sources, but will provide a uniform
interface to all device drivers. When the single PMIC interrupt
arrives at the INTC, it will call an interrupt handler which will
examine the PMIC registers and then call the appropriate handler
that was registered with the second irq_chip. This way the
drivers for the various parts of the PMIC don't need to know too much
about each other - the irq_chip mediates between them.
In our PMIC there is a "Primary Interrupt Status Register" which has
one bit for each module that can generate interrupts. Each of these
modules has their own "Secondary Interrupt Status Register" to report
what actually caused the interrupt, much like the UART described
earlier. There is normally no need for a tertiary irq_chip
to represent these registers as the one driver manages the whole
module and all of its interrupts.
There are two exceptions, only one of which interests us. There is a
"power interrupts" module in the PMIC which combines interrupts from a
diverse range of sources: press of the power button, insertion of USB
cable, over-temperature warning, and the RTC alarm. These are
sufficiently varied that a separate irq_chip makes sense here.
The twl4030-irq driver manages all this quite elegantly, providing a
secondary irq_chip for the device as a whole, and supporting
tertiary irq_chips, including for the Power Interrupts module.
So when the RTC alarm triggers, the primary interrupt is handled by
the INTC. This calls an interrupt handler which inspects the PMIC,
determines that a power interrupt is active and calls the relevant
handler. It in turn examines the PMIC again and finds that the RTC
alarm has fired, and so calls the handler that was registered for that
particular interrupt. This all works perfectly. But it doesn't seem
to wake the device from suspend. At least, not always. It seems that
something happens on the way to suspend that interferes with this.
A funny thing happened on the way to suspend
One of the many things that happens on the way to suspend is that
each individual interrupt gets disabled - unless it is flagged as
IRQF_NO_SUSPEND in which case it is left alone. However this doesn't
mean exactly what it sounds like it means. Being "disabled" just
means that the handler routine will not be run, it doesn't mean that
the interrupt will not be generated. We have a different word for
that, which is "masking". When an interrupt is masked the originating
source of the interrupt is told to never post that interrupt.
Linux uses a lazy scheme for disabling interrupts. When the disable
request is made, the fact is recorded in internal data structures, but
that is all. If the interrupt is subsequently delivered, only then
might the interrupt be masked. This can be a useful optimization as
masking an interrupt can take a lot longer than just setting a flag in
memory.
So, on the way to suspend, interrupts are disabled but not masked. If
the interrupt does actually arrive before we reach full suspend, the
fact is recorded. If it was an interrupt that should wake from
suspend, this is detected in
check_wakeup_irqs()
and suspend aborts.
If the interrupt doesn't arrive before full suspend, then it is still
unmasked and will successfully wake up the device, which will resume
and then handle the interrupt. This might all seem a bit complex, but
once it is fully understood it is actually quite neat and it works
well ... except for my RTC alarm.
Who turned out the lights?
As promised we will need to glimpse at the USB code as well and now is
the time to bring that in. Like most modern phones, the GTA04 can be
attached to a computer or a charger via a USB port. When the USB
interface detects 5 volts on the power line, it will route this to the
battery charger (BCI) which will use it to charge the battery and
power the device.
This is relevant because - when tracing is added to the twl4030-irq
driver - we see that during suspend we get an interrupt from the PMIC
which turns out to be due to the battery charger losing power and saying
"Hey, just thought you should know that I'm running on battery again
now".
What is happening is that when the USB port is told to go to sleep it
turns off its various power supply regulators. One of these (USB3V1) also
powers the voltage comparator which determines if 5V is present.
As soon as that presence is no longer signaled, the current stops
being routed to the BCI, and the BCI justifiably complains.
This results in the interrupt line from the PMIC being asserted
which causes the interrupt management code to run. This notices that
the interrupt was disabled, and so masks it. When we get to
check_wakeup_irqs(), the PMIC IRQ is pending, but as it has not been
set for wakeup it is just ignored.
The net result here is that all interrupts from the PMIC are ignored
during suspend, so the RTC alarm doesn't work, the power button
doesn't wake the device up, and plugging or unplugging the USB cable
has no effect either.
So now that we understand the problem ...
So, what to do? We could just unplug the USB cable when testing the
RTC alarm. Then the BCI would not notice the current disappearing and
so wouldn't generate an interrupt. This certainly works - once. It
doesn't seem to work a second time, but we'll leave that issue for the
moment.
We could tell the USB controller not to turn off USB3V1 when the BCI
is active and power is available on the USB bus. This is certainly a good
idea and would mean that the battery can charge during suspend. It
also brings up the interesting issue of how these separate drivers
should communicate, but it doesn't really solve the general problem,
just this specific one.
It could be that an interrupt occurs that we really do want to ignore
during suspend, that we have even explicitly disabled (but due to
lazy-masking has not been masked). So we want to be able to mask
those interrupts without masking the main interrupt. A few moments
reflection on what we have learned so far suggests that setting
IRQF_NO_SUSPEND on the intermediate interrupts should help. That way
they will not be disabled during suspend so the interrupt management
will trace through the chain of irq_chips far enough to find
the originating interrupt source and will mask just that. The other
interrupt source will still be available.
This may sound convincing, but reality has a way of getting in the
way: testing shows that just setting IRQF_NO_SUSPEND is not enough. Diving
back into the code and the data sheets reveals an interesting and important
fact. The PMIC allows each individual interrupt source to be masked, but it
does not allow a whole module as a set of interrupt sources to be masked. So
when an interrupt arrives from the BCI (one of the modules) the IRQ manager
notices that interrupt is disabled and so the secondary irq_chip is
asked to mask it. But as the irq_chip cannot mask it (the hardware
doesn't support that), it simply ignores the request. You can imagine what
happens next - an unending stream of attempts to mask the BCI interrupt, none
of which are effective and the interrupt keeps re-firing.
The root problem seems to be that while the TWL4030 PMIC appears to
have an interrupt structure that lends itself to being represented as
a small tree of irq_chips, this isn't actually the case. The
details of the behavior of suspend makes it essential that a working
"mask" function be provided, and that cannot be provided for modules
in the twl4030, only for individual interrupts.
So it seems that a complete fix requires some change to the structure
of the irq_chips in the twl4030 driver. Possibly we could
make the structure simpler (no tree), or possibly we could make the
implementation more clever in some way. The important point is that
each interrupt that is visible to the Linux IRQ management code must
support being masked in the hardware in some way.
Another funny thing on the way to resume
As mentioned, after one successful wake-from-alarm, the RTC alarm in
the GTA04 doesn't work a second time. In fact it stops working
completely, even when not suspended. It seems that something has gone
wrong on the way to resume.
During resume all the interrupts that were disabled by suspend are re-enabled.
The three interrupts which relate to the RTC - the base PMIC interrupt, the
interrupt for the "power interrupts" module, and the actual RTC interrupt - are
enabled in that order with PMIC first. As this interrupt is still asserted (as
it is what triggered resume) the PMIC interrupt handler runs as soon as it
is enabled. It reads the interrupt status register to see which module
was responsible, and triggers the relevant interrupt for that module.
This secondary interrupt is still disabled so nothing happens. In
particular the interrupt source is not cleared, nor is it masked,
so we again get a repeated sequence of attempts to handle an interrupt
which cannot be handled. This one isn't unending as the loop that
enables all the interrupts is still running and it eventually gets to
the one for the target module (the power interrupts module). This is
triggered again and this time it actually does something. It reads
the final status register (thus clearing the interrupt) and calls the
interrupt handler for the RTC alarm. Unfortunately that is still
disabled so nothing else happens.
Eventually the RTC interrupt is enabled but by now it is too late.
In some cases an interrupt that fires while disabled gets the
IRQS_PENDING flag set so that when it is re-enabled, the interrupt is
resent. But that is not the case when the twl4030-irq interrupt
controller tries to handle a tertiary IRQ. Whether this is a bug or a
misunderstood feature is as yet unclear. Another journey of discovery
would be needed to fully understand that issue.
Meanwhile, because the interrupt is delivered but the interrupt service routine
isn't run, the handshaking between the driver and the device gets
confused and no other interrupt is ever seen from the RTC.
We can work around this problem by a simple expedient. When
re-enabling the interrupts during resume, do it in the reverse order.
This ensures that the RTC interrupt is enabled before it is called,
and everything works. This even feels like the "right" thing to do -
it is balanced for "enable" to happen in the reverse order of
"disable". But even if it is "right", it is not really a fix. There
would still be bugs lurking and ready to spring out again.
Journey's end
The point of this little tour is not to point out bugs in the code or
to shame the relevant developers. In fact the code is in general
quite elegant, well structured, largely very effective, and the
developers are only to be congratulated. Rather, the point is to
highlight the complexity of the task being tackled by developers
working the embedded space:
-
There are complex interactions between distinct hardware. Here the
USB interface and the battery charger have important interdependencies
that need to be reflected in the drivers. Some interdependencies already
are but there are subtleties that are easy to miss.
-
The requirements of an "irq_chip" are not really documented
anywhere, and given the current rate of development there is a good
chance that such documentation would be out of date. Extracting the
requirements from the code is far from trivial. In this case a seemingly
correct and obvious implementation was only found to be deeply wrong
by a fairly obscure test case.
-
The IRQF_NO_SUSPEND flag is clearly important, but not easy to
understand. It is fairly obvious what it does, but much less obvious
why you would want to do that. I imagine that as a developer my
approach would be to not set it until I found a bug which could be
fixed by setting it. Then I would set it and hope it didn't break
anything. This is not really a robust way to do development.
With complex hardware interactions and complex software dependencies
it is not surprising that we seem to see a new class of problems in
the embedded space. As we have observed, there are often ways to work
around the bug rather than fix it, and that is usually much simpler
and quicker. They may be tempting, but for this developer at least
they are nowhere near as satisfying.
Because if you follow the bug all the way to the root cause, and if
you understand all that is happening and find the right fix, then you
have not only helped other future developers, but have learned a whole
lot in the process. And that is why I love bugs.
(
Log in to post comments)