I agree with you that the management of deallocation is a significant issue/change.
As for the memory bloat, I think that these "separated spinlock" would be useful for only the most contended spinlocks which would somewhat mitigate the issue.
Of course contention isn't the same on all the systems and developers working on embedded or supercomputers would probably disagree on which spinlocks has to be "separated"..