Glean are already well established in that space.

lolive · 2024-12-07T08:13:47 1733559227

Did not know that stack, thanks. From my perspective as a data architect, I am really focused on the link between the data sources and the data lake, and on the proper integration of heterogeneous data into a “single” knowledge graph. For Palantir, it is not very difficult to learn their way of working [their Pipeline Builder feeds a massive Spark cluster, and OntologyManager maintains a sync between Spark and a graph database. Their other productivity tools then rely on one of those two stores, or on both]. I wonder how Glean handles the data lake part of their stack [scalability, refresh rate, etc.].