My complaint about formal methods is that they aren't any kind of cure-all. They rely on the specifications being complete and that the specifications are correctly translated into the formal verification method.
In almost all programming, the specifications are incomplete or incorrect. Programmers, even without realizing what they are doing, fill in the gaps in incomplete specifications and may not even realize the specification was missing information.
Many bugs are caused by programmers on each side of an interface assuming different things about the specification. An example of this is the POSIX select() timeout parameter.
To fully specify the behavior of a system under all conditions is in my opinion, as prone to oversights and misunderstanding (aka bugs) as writing the program itself in assembly or C. (Well, you can make a lot more typo type mistakes writing in C, but I think it holds for logic errors.)
Reading through some of that seL4 paper you linked, I see they had to make about 300 changes in the spec during the process. In other words, they had to debug it. I wonder how they know if they found all the problems in their spec?