Having said that, there is no need to create an exact copy of Google Analytics, because most people probably use only about 20% of its features anyway. Each business has its own use case and data sources, so it is much more convenient to ingest all the raw event data into your data warehouse, using either third-party tools such as Segment or open-source tools such as Snowplow and Rakam. This is the only way to have full control over your data:
1. If you don't want to store sensitive user data, just don't send it to your servers.
2. Create the reports either with SQL or with something like Rakam, which gives you an interface similar to Amplitude / Mixpanel but on top of your own data warehouse, so you don't need to share your data with a third-party service.
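To make the two points concrete, here is a minimal sketch in Python. It is not how Segment, Snowplow or Rakam work internally; it uses an in-memory SQLite database as a stand-in for a data warehouse, and the field names, table layout and helper functions (`scrub`, `ingest`, `daily_active_users`) are all assumptions made up for illustration.

```python
import json
import sqlite3

# Hypothetical set of properties that should never be sent or stored.
SENSITIVE_FIELDS = {"email", "ip_address", "full_name"}

# Columns that get their own field; everything else goes into a JSON blob.
CORE_FIELDS = ("event_type", "user_id", "ts")


def scrub(event):
    """Point 1: drop sensitive properties before the event goes anywhere."""
    return {k: v for k, v in event.items() if k not in SENSITIVE_FIELDS}


def ingest(conn, events):
    """Store raw (scrubbed) events in a single table. SQLite stands in for the warehouse."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(event_type TEXT, user_id TEXT, ts TEXT, properties TEXT)"
    )
    rows = []
    for event in events:
        clean = scrub(event)
        props = {k: v for k, v in clean.items() if k not in CORE_FIELDS}
        rows.append(
            (clean["event_type"], clean["user_id"], clean["ts"], json.dumps(props))
        )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def daily_active_users(conn):
    """Point 2: a report is just SQL over the raw events."""
    return conn.execute(
        "SELECT substr(ts, 1, 10) AS day, COUNT(DISTINCT user_id) AS users "
        "FROM events GROUP BY day ORDER BY day"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    ingest(conn, [
        {"event_type": "pageview", "user_id": "u1",
         "ts": "2020-01-01T10:00:00", "email": "a@example.com", "url": "/"},
        {"event_type": "signup", "user_id": "u1",
         "ts": "2020-01-02T09:00:00", "ip_address": "203.0.113.5"},
    ])
    print(daily_active_users(conn))
```

Because the raw events stay in your own tables, any other report (funnels, retention, per-page counts) is another query over the same data rather than a feature you have to wait for.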
Shameless plug: I work for the company behind Rakam (https://rakam.io).