I meant that the *implementation* of puts is faster than that of printf, since puts doesn't have to parse a format string the way printf does with its first argument. In your example, gcc simply optimized the call to printf into a call to puts, which it can only do in a very limited number of cases (e.g. when the first argument is a fixed string constant, which is not the case this article covers). In this very similar example, you can see that the generated code is quite different and will be slower to run:
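To make the distinction concrete, here is a minimal sketch of the two cases (my own illustration, not the code from the article; `print_constant` and `print_variable` are hypothetical names). It assumes gcc with optimization enabled and built-ins left on; other compilers or flags may behave differently.

```c
#include <stdio.h>

/* Format is a string literal with no conversion specifiers and a
   trailing newline, and the return value is unused: gcc can typically
   replace this call with puts("hello"). */
void print_constant(void)
{
    printf("hello\n");
}

/* Format is only known at run time: printf has to scan it for
   conversion specifiers on every call, so the compiler has no way
   to substitute puts here. */
void print_variable(const char *fmt, int value)
{
    printf(fmt, value);
}
```

Compiling this with something like `gcc -O2 -S` and reading the assembly should show a call to puts in the first function and a call to printf in the second.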