We recently moved our data warehouse to Snowflake. Along the way, we adopted an ELT paradigm instead of ETL, where you load your raw data into the warehouse and then perform any transformations in the warehouse. A tool called dbt (https://www.getdbt.com/) has made this a joy. You write SQL select queries in SQL files, and dbt creates a DAG based on this SQL so it can efficiently create tables/views in parallel. dbt also comes with some nifty testing facilities to validate your data once it is transformed.
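The load-then-transform flow described above can be sketched in a few lines. This is a minimal, hypothetical example using Python's sqlite3 as a stand-in for the warehouse (table and column names are invented): raw data is loaded first, then the transformation is expressed as a SQL select inside the database, the way a dbt model file would materialize a view.

```python
import sqlite3

# Stand-in "warehouse": an in-memory SQLite database.
conn = sqlite3.connect(":memory:")

# Load step: raw data lands in the warehouse untransformed.
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount_cents INTEGER)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)",
                 [(1, 1250), (2, 3075)])

# Transform step: done in-warehouse as a select
# (in dbt, this select would live in its own .sql model file).
conn.execute("""
    CREATE VIEW orders AS
    SELECT id, amount_cents / 100.0 AS amount_dollars
    FROM raw_orders
""")

print(conn.execute("SELECT amount_dollars FROM orders").fetchall())
# [(12.5,), (30.75,)]
```

dbt would build its DAG by parsing references between such select statements, so downstream views wait for the tables they read from.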
This is a great article, thank you for writing it!
Just because you are using a database as your workspace
doesn't mean you load your data before transformation.
Most complex data warehouses have multiple layers of transformation, and several workspaces along the way.
There is always a final step where you load the clean data into your clean schema, which is used for querying the data.
E = Fetch the data you want
T = Make all necessary changes
L = Create a clean copy
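The three steps above can be sketched as separate functions. Everything here is illustrative: the source rows, the trim-and-cast transform, and the clean table name are all made up.

```python
import sqlite3

def extract():
    # E: fetch the data you want (an in-memory stand-in for a source system)
    return [("alice", " 42 "), ("bob", "7")]

def transform(rows):
    # T: make all necessary changes (trim whitespace, cast to integer)
    return [(name, int(value.strip())) for name, value in rows]

def load(rows, conn):
    # L: create a clean copy in the schema used for querying
    conn.execute("CREATE TABLE clean_scores (name TEXT, score INTEGER)")
    conn.executemany("INSERT INTO clean_scores VALUES (?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
print(conn.execute("SELECT score FROM clean_scores ORDER BY score").fetchall())
# [(7,), (42,)]
```

The point of the ordering is that only cleaned data ever reaches the schema queried by end users.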
1. The raw data layer, where data comes directly from business logs
2. The common data layer, where one copy of the data serves multiple purposes
3. The application layer, where the final result is presented
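The three layers above can be roughly sketched as views stacked in one database. Again a hypothetical sketch with sqlite3 as the warehouse; the table names and the click-count metric are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 1. Raw layer: data straight from the business logs.
conn.execute("CREATE TABLE raw_events (user TEXT, action TEXT)")
conn.executemany("INSERT INTO raw_events VALUES (?, ?)",
                 [("alice", "click"), ("alice", "click"), ("bob", "view")])

# 2. Common layer: one cleaned, shared copy that many consumers reuse.
conn.execute("""
    CREATE VIEW common_user_actions AS
    SELECT user, action, COUNT(*) AS n
    FROM raw_events GROUP BY user, action
""")

# 3. Application layer: the final result presented to one consumer.
conn.execute("""
    CREATE VIEW app_clicks_per_user AS
    SELECT user, n FROM common_user_actions WHERE action = 'click'
""")

print(conn.execute("SELECT user, n FROM app_clicks_per_user").fetchall())
# [('alice', 2)]
```

Each layer only reads from the one below it, which is what keeps the transformations composable across workspaces.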