It depends on how much communication you need between the processes you are running.
Using a bunch of 1-package machines imposes a much higher communication cost than using multi-package machines, since the processes then have to talk over the network instead of within one system.
There's also a question of the features provided on the motherboards. There are very few 1-package motherboards out there that have 'server' features like ECC memory, serial or network consoles, etc. If any of these features are desirable to you, you will probably be using 2-package motherboards at the smallest.
In terms of cost for computing power, 2-package machines still seem to be the sweet spot: you avoid paying for multiple power supplies, drives, and other per-system infrastructure, but you don't pay the price premium that larger multi-package systems tend to carry.
But even a 2-package machine benefits from proper NUMA handling. The gain is not as drastic as on larger systems, but it is still there.
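
To make "NUMA handling" concrete, here is a minimal sketch of one simple form of it on Linux: keeping a process on the CPUs of a single node so that its memory tends to be allocated on that node too. It assumes the usual sysfs layout under /sys/devices/system/node and the kernel's default local-allocation policy. The helper names (parse_cpulist, node_cpus, pin_to_node) are made up for this example, and real deployments would more likely use numactl or libnuma.

```python
import os

def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-3,8-11' into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a node with memory but no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def node_cpus(node, sysfs_root="/sys/devices/system/node"):
    """Return the CPUs of a NUMA node, or None if sysfs does not expose that node."""
    path = os.path.join(sysfs_root, f"node{node}", "cpulist")
    try:
        with open(path) as f:
            return parse_cpulist(f.read())
    except OSError:
        return None

def pin_to_node(node):
    """Restrict the current process to one node's CPUs.

    Returns the CPU list used, or None if the node is unknown, has no CPUs,
    or the platform lacks sched_setaffinity (it is Linux-only in Python).
    """
    cpus = node_cpus(node)
    if not cpus or not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return cpus
```

Calling pin_to_node(0) early in a worker, before it allocates its working set, is the intended use. On a 2-package box you would typically start one worker per node.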
In addition, much of the workload on large systems nowadays is virtual-machine based, with the hardware oversubscribed. A VM can be moved from one machine to another, but that is a fairly expensive thing to do, so larger systems work better in practice: they let the peaks and valleys of demand average out better, which means you can oversubscribe your hardware more.
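
The averaging argument can be illustrated with a toy model. The sketch below is not a measurement of anything: it assumes each VM's demand is an independent uniform random value per time step (real VM loads are often correlated, which weakens the effect), and the function names are invented for the example. It computes how much capacity you must provision if every host has to cover the peak of the VMs placed on it.

```python
import random

def simulate_demand(num_vms, num_steps, seed=0):
    """Toy demand traces: one list of per-step CPU demand (0..1) per VM."""
    rng = random.Random(seed)
    return [[rng.uniform(0.0, 1.0) for _ in range(num_steps)]
            for _ in range(num_vms)]

def capacity_needed(demand, vms_per_host):
    """Total capacity required when VMs are packed vms_per_host to a host.

    Each host must be sized for the peak of the summed demand of its own VMs,
    so the result is the sum of those per-host peaks.
    """
    total = 0.0
    for start in range(0, len(demand), vms_per_host):
        group = demand[start:start + vms_per_host]
        aggregate = [sum(step) for step in zip(*group)]  # host load per step
        total += max(aggregate)
    return total

if __name__ == "__main__":
    demand = simulate_demand(num_vms=16, num_steps=1000)
    for size in (1, 4, 16):
        print(size, "VMs per host ->", round(capacity_needed(demand, size), 2))
```

With one VM per host you provision for every VM's individual peak. As more VMs share a host, the peak of the sum can never exceed the sum of the peaks, and in this model it is usually well below it, which is the headroom that oversubscription uses.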