
Amundsen – Data Discovery from Lyft - photoft
https://github.com/lyft/amundsenfrontendlibrary
======
photoft
Amundsen is a metadata driven application for improving the productivity of
data analysts, data scientists and engineers when interacting with data. It
does that today by indexing data resources (tables, dashboards, streams, etc.)
and powering a page-rank style search based on usage patterns (e.g. highly
queried tables show up earlier than less queried tables). Think of it as
Google search for data. The project is named after Norwegian explorer Roald
Amundsen, the first person to reach the South Pole.
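
To make the usage-based ranking idea concrete, here is a minimal sketch. It is not Amundsen's actual scoring code: the `Table` class, the `query_count` field, and the log-scaled boost are all assumptions chosen for illustration. It only shows how a usage signal can push heavily queried tables above rarely queried ones among the results that match a search term.

```python
import math
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    description: str
    query_count: int  # hypothetical usage signal, e.g. queries in the last 30 days


def score(table: Table, term: str) -> float:
    """Return 0 for non-matching tables, else a usage-weighted score."""
    text = f"{table.name} {table.description}".lower()
    if term.lower() not in text:
        return 0.0
    # log1p dampens the boost so one very hot table does not dwarf the rest
    return 1.0 + math.log1p(table.query_count)


def search(tables: list[Table], term: str) -> list[str]:
    """Names of matching tables, most heavily used first."""
    scored = [(score(t, term), t) for t in tables]
    hits = [(s, t) for s, t in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [t.name for _, t in hits]


tables = [
    Table("rides_raw", "ride events", 3),
    Table("rides_daily", "daily ride summary", 1200),
    Table("drivers", "driver profiles", 500),
]
results = search(tables, "ride")
```

Here `rides_daily` ranks ahead of `rides_raw` because it is queried far more often, even though both match the term equally well.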

It includes three microservices and a data ingestion library.

amundsenfrontendlibrary: Frontend service, a Flask application with a React
frontend.

amundsensearchlibrary: Search service, which uses Elasticsearch to power the
frontend's metadata search.

amundsenmetadatalibrary: Metadata service, which uses Neo4j or Apache Atlas as
the persistence layer to serve metadata.

amundsendatabuilder: Data ingestion library for building the metadata graph
and search index. Users can load data either with a Python script that uses
the library or with an Airflow DAG that imports it.

------
diehunde
They did a very interesting podcast on this tool describing it in more detail:
https://softwareengineeringdaily.com/2019/04/16/lyft-data-discovery-with-tao-feng-and-mark-grover/

