I moved some windows around to make it stutter a bit more, and the output from my test library shows it: 200 ms max latency, which corresponds to around 5 FPS. Note that the average time between glXSwapBuffers calls approximately matches the FPS printout from glxgears.
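To make the arithmetic explicit, here is a minimal sketch in C of how the max and average interval between swaps can be tracked and turned into an FPS figure. It is not the code from my library; the names (frame_stats, frame_stats_add, interval_ms_to_fps) are made up for illustration, and it assumes the intervals are already measured in milliseconds.

```c
#include <stddef.h>

/* Hypothetical accumulator for the time between buffer swaps. */
struct frame_stats {
    double max_ms;  /* longest interval seen so far */
    double sum_ms;  /* sum of all recorded intervals */
    size_t count;   /* number of recorded intervals */
};

/* Record one interval (in milliseconds) between two consecutive swaps. */
static void frame_stats_add(struct frame_stats *s, double interval_ms)
{
    if (interval_ms > s->max_ms)
        s->max_ms = interval_ms;
    s->sum_ms += interval_ms;
    s->count++;
}

/* Average interval in milliseconds, or 0 if nothing was recorded. */
static double frame_stats_mean_ms(const struct frame_stats *s)
{
    return s->count ? s->sum_ms / (double)s->count : 0.0;
}

/* Convert a frame interval to the equivalent frame rate. */
static double interval_ms_to_fps(double interval_ms)
{
    return interval_ms > 0.0 ? 1000.0 / interval_ms : 0.0;
}
```

With a worst-case interval of 200 ms, interval_ms_to_fps gives 1000 / 200 = 5 FPS, and applying it to the mean interval gives the number to compare against the glxgears printout.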
It should be quite simple for anyone who sees latency problems that seem to be cured by BFS to run the same 3D game with something like this library under both the mainline scheduler and BFS, and see whether the output shows any difference. Of course, the code I posted can be enhanced to get better statistics (such as a histogram of the latencies); I put the code under Creative Commons CC0 (a bit similar to "public domain").
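As a rough idea of what the histogram could look like, here is a sketch in C using power-of-two buckets. This is only one possible layout, not something the posted code does; the names (latency_hist, latency_bucket, latency_hist_add) and the bucket count are assumptions chosen for illustration.

```c
#include <stddef.h>

#define LATENCY_BUCKETS 8

/*
 * Hypothetical histogram of swap-to-swap intervals.
 * Bucket 0 holds everything below 2 ms, bucket i (1..6) holds
 * [2^i, 2^(i+1)) ms, and the last bucket holds 128 ms and above.
 */
struct latency_hist {
    unsigned long counts[LATENCY_BUCKETS];
};

/* Map an interval in milliseconds to its bucket index. */
static size_t latency_bucket(double interval_ms)
{
    size_t bucket = 0;
    double upper = 2.0;  /* exclusive upper bound of the current bucket */

    while (bucket < LATENCY_BUCKETS - 1 && interval_ms >= upper) {
        upper *= 2.0;
        bucket++;
    }
    return bucket;
}

/* Count one measured interval. */
static void latency_hist_add(struct latency_hist *h, double interval_ms)
{
    h->counts[latency_bucket(interval_ms)]++;
}
```

A 60 FPS frame (about 16.7 ms) falls into the 16-32 ms bucket, while a 200 ms stall ends up in the last one. Comparing the tail buckets between a mainline run and a BFS run should tell more than a single max value.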