The data we use comes from relational databases and document stores operated by different departments, external APIs and third-party services, Salesforce, server log files, etc. A stats PhD does not have the training to gather this data themselves.
As for a hybrid scientist/engineer role, I don't know many software engineers who are also good at stochastic calculus or ensemble learning. Likewise, I don't know many data scientists who are also comfortable writing cron jobs to retrieve external API data or diagnosing server problems.
"In November 1997, C.F. Jeff Wu gave the inaugural lecture entitled "Statistics = Data Science?" for his appointment to the H. C. Carver Professorship at the University of Michigan. In this lecture, he characterized statistical work as a trilogy of data collection, data modeling and analysis, and decision making. In his conclusion, he initiated the modern, non-computer science, usage of the term "data science" and advocated that statistics be renamed data science and statisticians data scientists."
From the same article, a quote from Nate Silver:
"I think data-scientist is a sexed up term for a statistician....Statistics is a branch of science. Data scientist is slightly redundant in some way and people shouldn’t berate the term statistician."
If your skill set differs from a statistician's, then calling yourself a data scientist is not going to be a differentiating title in common parlance.