One big powerful server's architect and circuit design is not that simple as what it looks like, in the other hand, 2000 standard servers are not that complex as what they are.
The way of how you think is the key. You can think that the 2000 servers is a big computer cluster, but I prefer to think each one of them is a simple replaceable black box.
I can show you some scenarios at the following to explain why 2000 servers make more sense than one big machine,
Firstly, When I want to upgrade the system to deal with a suddently increased load I don't need to call vendor to arrange an onsite upgrade service, I can do it by connecting more servers right away, and when the load decreased, then I can disconnect some servers. This makes the maintenance job is more flexible and more convenient.
The second one is when I estimate the performance of the whole system, I can focus on estimating the performance for one server first, and then add them on, even there are some variables need to be involved into the calculation, it is still more clear and simpler than estimating the performance by looking at the vendor provided system specification.
The last scenario is that you cannot separate one big machine into different geolocations to handle the access all over the world, but seperate 2000 servers is much easier and doable. Moreover, different geolocation deployment can provide a genric 24*7 online service as some nature disasters happens.
When we engineer things, we should split one big(complex) thing into multiple small parts and keep it with a simplest structure or function, and later, we connect these small parts together.
English is not my first language, so if somewhere sounds wired, please excuse me.