The case for data lakes arises in organizations large enough that some data useful to you is almost certainly maintained by people you'll never meet, in a department you don't know exists, and where it's impractical or even impossible to get them involved in the project that would consume that data.
It's less about the data itself than an attempt to solve an organizational problem of communication and coordination. You decouple the sources of data from its consumers (not the technical systems or databases, but the people and organizational units) into a 'hub-and-spoke' model, where providers just supply raw data without getting pulled into a multinational project that takes a year just to identify the potential stakeholders for that data across a distributed organization with tens of thousands of employees.