Not so. -mfpmath=sse only causes SSE to be used for math within a single function (including temporaries) and for calls between functions with static linkage. Calls to or from functions with external linkage must still conform to the ABI, which on 32-bit x86 means floating-point arguments are passed in memory on the stack and return values come back in the x87 register st(0). Thus, -mfpmath=sse can actually slow code down, because values have to be moved from SSE registers to x87 and back (through memory) at those boundaries.
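To see the difference for yourself, here is a minimal sketch. The function names (scale_extern, scale_local, use_both) and the file name are made up for illustration. It assumes GCC targeting 32-bit x86, built with something like `gcc -m32 -O2 -msse2 -mfpmath=sse -S example.c`, and the exact assembly you get will depend on your compiler version.

```c
/* External linkage: callers in other translation units rely on the
   standard i386 convention, so the argument arrives on the stack and
   the result is returned in x87 st(0), even with -mfpmath=sse. */
double scale_extern(double x)
{
    return x * 1.5;
}

/* Static linkage: the compiler can see every caller, so it is free to
   pick a different convention for this function. noinline is only here
   so that the call stays visible in the generated assembly. */
static __attribute__((noinline)) double scale_local(double x)
{
    return x * 1.5;
}

/* Calls both versions so the two call sequences can be compared. */
double use_both(double x)
{
    return scale_extern(x) + scale_local(x);
}
```

In the output, look at how each function hands back its result: for scale_extern you should see the value stored to memory and loaded onto the x87 stack before the return, which is the extra traffic described above.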
The option you're thinking of is -msseregparm, which comes with a prominent warning in the GCC documentation because it breaks the ABI: you must build every single thing that you pass floating-point arguments to, or receive floating-point return values from, with the same option.
This includes libm, which you'll probably need to hack to expect its arguments in SSE registers, since a lot of its 32-bit code is written for the standard convention and x87. You would also sacrifice compatibility with everyone else's 32-bit x86 code, since nobody else uses that option. If you're doing that these days, you may as well use x32. :)
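For completeness, GCC also has a per-function form of the same convention, the sseregparm function attribute, which limits the ABI break to functions you control. A rough sketch follows, with a hypothetical SSE_ABI macro and halve function. The guard is my own assumption: the attribute is only meaningful on 32-bit x86 with SSE2 enabled, so the macro expands to nothing elsewhere.

```c
/* SSE_ABI is a hypothetical helper macro, not something GCC provides.
   It applies sseregparm only where the attribute makes sense. */
#if defined(__i386__) && defined(__SSE2__)
#define SSE_ABI __attribute__((sseregparm))
#else
#define SSE_ABI
#endif

/* Every declaration that callers see must carry the same attribute,
   otherwise caller and callee disagree about where the value lives. */
SSE_ABI double halve(double x);

SSE_ABI double halve(double x)
{
    return x * 0.5;
}
```

The same caveat applies as with the global switch: anything that calls halve has to see this exact declaration, so it does not help with libm or other people's code.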