Short version: Scale models (like wind tunnels) are useful because the most accurate simulations are extremely computationally expensive or outright intractable, and the faster, less accurate simulations are often so inaccurate that they are untrustworthy. Scale models are not 100% trustworthy themselves, and to construct and use them you need to understand similarity theory.
Long version:
The general field is called computational fluid dynamics (CFD for short). There are broadly two types of turbulent computer simulations of flows: DNS and not-DNS.
DNS stands for direct numerical simulation. These simulations are very accurate, and sometimes are regarded as more trustworthy than experiments because in a particular experiment you may not be able to set a variable precisely, but you can always set variables precisely in a simulation.
However, in DNS you need to resolve all scales of the flow. Often this includes the "Kolmogorov scale" where turbulent dissipation occurs. It could also include even smaller scales, like those involved in multiphase flows or combustion. This is so computationally expensive that it's impractical (in the sense of something you could run daily and iterate on) for anything but toy problems like "homogeneous isotropic turbulence". In terms of real-world problems, DNS is limited to fairly simple geometries like pipe flows, and those simulations take weeks on the most powerful supercomputers today. It's very rare for someone to attempt a DNS of a flow with a more complex geometry, and I'd argue that such works are mostly a waste of resources. Here's an interesting perspective on that: https://wjrider.wordpress.com/2015/12/25/the-unfortunate-myt...
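To give a rough sense of why: the ratio of the largest flow scales to the Kolmogorov scale grows with the Reynolds number, so for simple estimates (e.g., homogeneous turbulence) the number of grid points a DNS needs scales roughly like

    \eta / L \sim \mathrm{Re}^{-3/4} \quad\Rightarrow\quad N_\mathrm{grid} \sim (L/\eta)^3 \sim \mathrm{Re}^{9/4}

and the total cost is commonly quoted as scaling like Re^3 once the shrinking time step is included. Engineering flows routinely have Re in the millions, which puts DNS far out of reach for them.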
"Not-DNS" includes a variety of "turbulence modeling" approaches which basically try to reduce the computational cost to something more manageable. This can reduce the cost to hours or days on a single computer or cluster. The two most popular turbulence modeling approaches are called RANS and LES.
Instead of solving the Navier-Stokes equations directly, as in DNS, modified versions of the equations are solved. If you time-average the Navier-Stokes equations, you get the Reynolds-averaged Navier-Stokes (RANS) equations: https://en.wikipedia.org/wiki/Reynolds-averaged_Navier%E2%80...
These equations are "unclosed" in the sense that they contain more unknowns than equations. In principle, you could write a new equation for the unclosed term (which is called the Reynolds stress in the RANS equations), but you'll end up with even more unclosed terms. So, the unclosed terms are instead modeled.
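To make the closure problem concrete, here is the standard sketch for incompressible flow: decompose the velocity into a mean plus a fluctuation, u_i = \bar{u}_i + u_i', substitute into the momentum equation, and time-average. The averaged equation is

    \frac{\partial \bar{u}_i}{\partial t} + \bar{u}_j \frac{\partial \bar{u}_i}{\partial x_j}
    = -\frac{1}{\rho}\frac{\partial \bar{p}}{\partial x_i}
      + \nu \nabla^2 \bar{u}_i
      - \frac{\partial \overline{u_i' u_j'}}{\partial x_j}

The last term contains the Reynolds stress \overline{u_i' u_j'}: six new unknowns and no new equations. That's the term a RANS turbulence model has to supply (e.g., via an eddy viscosity).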
RANS is older, computationally cheaper, and usually computes the quantity that you want (e.g., a time-averaged quantity). LES is newer and has better theoretical justification (e.g., a good LES model converges to DNS as you refine the grid, but RANS will not), but it often doesn't compute precisely what you want, and the specifics of LES models are often stated in inconsistent ways. My experience is that people tend to ignore the problems with LES or be ignorant of them. (Though I do believe LES is more trustworthy.)
The problem is that modeling turbulence has proved to be rather difficult, and none of these models work particularly well. Some are better than others, but the more accurate ones typically are more computationally expensive. Personally, I don't trust any turbulence model outside of its calibration data.
Some people lately have proposed that machine learning could construct a particularly accurate turbulence model, but that seems unlikely to me. People said the same things about chaos theory and other buzzwords in the past, but we're still waiting. Many turbulence models are fitted to a lot of data, and they're still not particularly credible. Also, machine learning by itself doesn't take the governing equations into account. Methods which are similar to machine learning but do take the governing equations into account are typically called "model order reduction". If you want to do machine learning for turbulence, you actually should do model order reduction for turbulence. Otherwise, you're missing a big source of data: the governing equations themselves. (I could write more on this topic, in particular about constraints you'd want the model to satisfy which machine learning doesn't necessarily satisfy.)
Anyhow, scale models basically use the world itself as the computer. Testing at full scale is often too expensive, particularly if you want to iterate. "Similarity theory" gives a theoretical basis to scale models, so that you know how to convert between the model and reality.
One of the most important results in similarity theory is the Buckingham Pi Theorem: https://en.wikipedia.org/wiki/Buckingham_%CF%80_theorem
This theorem shows that two systems governed by the same physics are "similar" if they have the same dimensionless variables, even if the physical variables differ greatly.
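As a concrete illustration (with made-up numbers), here's the kind of calculation similarity theory buys you for low-speed aerodynamics, where the relevant dimensionless group is the Reynolds number Re = U L / nu:

    # Sketch: Reynolds-number matching for a wind-tunnel model.
    # Illustrative numbers only; Re = U * L / nu must match between
    # full scale and the scale model for dynamic similarity.

    nu_air = 1.5e-5   # kinematic viscosity of air, m^2/s (approximate)

    # Hypothetical full-scale conditions
    U_full = 30.0     # m/s
    L_full = 4.0      # m (e.g., vehicle length)
    Re_full = U_full * L_full / nu_air

    # 1:10 scale model in the same fluid
    L_model = L_full / 10.0
    U_model = Re_full * nu_air / L_model   # speed needed to match Re

    print(f"Re_full = {Re_full:.3g}")      # ~8e6
    print(f"U_model = {U_model:.0f} m/s")  # 300 m/s

The required model speed comes out near the speed of sound, where compressibility effects the full-scale flow doesn't have would appear. That's one reason scale models aren't 100% trustworthy either: you often can't match every dimensionless group at once, so you settle for partial similarity.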
If any of this is confusing, I'd be happy to answer further questions.
Wow, thanks for your well-written response. I didn't quite follow all the details; in any case, I have a slightly better idea of what is going on. Next, I look forward to learning a bit more about laminar versus turbulent flow.
I can relate to your comment: "Some people lately have proposed that machine learning could construct a particularly accurate turbulence model, but that seems unlikely to me". A healthy skepticism is important. Different inductive biases in various machine learning algorithms will have a significant effect here, I'd expect.
Here are some additional comments that you or another reader might find useful:
Dimensional homogeneity is the most important constraint I think most machine learning folks would miss. It's not really an "inductive bias" but rather something everyone agrees a model needs to satisfy, so it should be baked in from the start. It's actually trivial to meet: make sure all of the variables are dimensionless and it's automatically satisfied. (Depending on the larger model, you might have to convert back to physical variables.)
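As a sketch of what I mean (the feature choices here are just illustrative, not from any particular model): feed the learning algorithm dimensionless groups instead of raw dimensional quantities, and the result can't depend on your choice of units.

    # Sketch: enforce dimensional homogeneity by non-dimensionalizing inputs.
    # Feature choices below are illustrative, not from any published model.

    def dimensionless_features(U, L, nu, k, eps):
        """Map dimensional flow quantities to dimensionless groups.

        U   -- velocity scale [m/s]
        L   -- length scale [m]
        nu  -- kinematic viscosity [m^2/s]
        k   -- turbulent kinetic energy [m^2/s^2]
        eps -- dissipation rate [m^2/s^3]
        """
        Re = U * L / nu                         # Reynolds number
        intensity = (2.0 * k / 3.0) ** 0.5 / U  # turbulence intensity
        time_ratio = (k / eps) * (U / L)        # turbulence/mean time scale ratio
        return [Re, intensity, time_ratio]

    # A model trained on these inputs is automatically unit-independent; if the
    # target is dimensional, scale it by the appropriate combination of U, L,
    # nu, etc. on the way back out.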
In terms of "inductive biases", I'm not certain what that would entail in terms of turbulence, but I'll think about it. Might be something to figure out empirically.
Turbulence models which satisfy certain physical constraints are called "realizable". Some of these constraints are seemingly trivial but not necessarily satisfied, like requiring that a variance (a normal Reynolds stress) be non-negative. (Yes, some turbulence models might get that wrong!) The "Lumley triangle" is a more advanced example of a physical constraint that a (RANS) model needs to satisfy but often does not.
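For reference, the basic realizability conditions on a modeled Reynolds stress tensor R_ij = \overline{u_i' u_j'} are just the conditions for it to be a valid covariance (positive semi-definite) matrix:

    R_{\alpha\alpha} \ge 0, \qquad
    R_{\alpha\beta}^2 \le R_{\alpha\alpha} R_{\beta\beta} \ \text{(no summation)}, \qquad
    \det(R) \ge 0

Plenty of models can violate one of these somewhere in the flow, and a purely data-driven fit won't respect them unless you build them in.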
I'd be interested in applying machine learning type methods (combined with the model order reduction approaches to include information from the Navier-Stokes equations), but I'm not knowledgeable about them. My impression is that most people applying machine learning to turbulence are novices at machine learning. And I imagine most machine learning people applying it to turbulence are novices in turbulence and wouldn't know much, if anything, about the realizability constraints I mentioned.
Another issue worth mentioning is experimental design. I think the volume of data needed to make a truly good turbulence model is probably several orders of magnitude higher than anything done today for turbulence. Experimental design could make this more efficient. I don't think most machine learning people worry much about this. They seem to focus on problems which can be run many times without much trouble. Acquiring data for turbulence is slow and hard, so it's outside their typical experience.