Real historical data might not cover the current market state.
For example, an option's price depends on the interest rate, among other parameters. For about a decade, interest rates were around 0% in Europe and slightly higher in the US. If you train on that data only, the model has no chance to "learn" option prices in the high-interest-rate environment we have seen over the last few years. Hence, you need synthetic data to cover that region of the market space.
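A minimal sketch of what such synthetic data could look like. It assumes Black-Scholes pricing of European calls, which is my choice for illustration and not something the setup above requires. The function names, parameter ranges, and the 0% to 8% rate band are all hypothetical. Any pricing model you trust can take its place.

```python
import math
import random


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(spot: float, strike: float, rate: float,
                  vol: float, maturity: float) -> float:
    """Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def make_synthetic_samples(n: int, rate_range=(0.0, 0.08), seed: int = 0):
    """Draw n (spot, strike, rate, vol, maturity, price) tuples.

    Rates are sampled uniformly over rate_range, so the high-rate
    region is covered even if it is missing from historical data.
    All ranges here are illustrative assumptions.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        spot = 100.0
        strike = rng.uniform(80.0, 120.0)
        rate = rng.uniform(*rate_range)
        vol = rng.uniform(0.1, 0.5)
        maturity = rng.uniform(0.1, 2.0)
        price = bs_call_price(spot, strike, rate, vol, maturity)
        samples.append((spot, strike, rate, vol, maturity, price))
    return samples


if __name__ == "__main__":
    for row in make_synthetic_samples(3):
        print(row)
```

Because the rate is drawn independently of the period the historical data comes from, the training set includes high-rate examples that a history of near-zero rates alone would never provide.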