Hacker News
Ask HN: Do companies still do ETL/BigData in house?
3 points by diceduckmonk 10 months ago | 3 comments
I remember a time when pure tech and traditional companies alike were hiring to expand their data engineering teams.

This was during the Hadoop and Spark era when people would implement data transformation and business logic in code.

Has low-code/third-party SaaS since replaced this space, or have most companies realized they aren’t getting much value out of data science and abandoned these BigData initiatives entirely?




You can do quite a lot with the cloud. A lot of regular company data engineering isn't steady-state load; it's bursty, with occasional churns through ancient data for financial reasons, audit reasons, migrations to new systems, etc.

My previous company did a lot of work to move to BigQuery, which really does work quite well for data we needed to access regularly; rarer data we'd just store in GCS.

We used Apache Beam/Dataflow for the imports/exports, plus the occasional custom script for data munging when necessary.
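For a sense of scale, the "occasional custom script for data munging" in that kind of setup is often just a small stream filter. A minimal sketch in plain Python (the `ts`/field names and schema are hypothetical, not from the original comment), normalizing newline-delimited JSON before a warehouse load:

```python
import json
import sys
from datetime import datetime, timezone

def munge(record):
    """Normalize one raw record before load (hypothetical schema)."""
    # Drop null-valued fields so the loader doesn't choke on them.
    cleaned = {k: v for k, v in record.items() if v is not None}
    # Convert a Unix-epoch timestamp to ISO 8601, if present.
    if "ts" in cleaned:
        cleaned["ts"] = datetime.fromtimestamp(
            cleaned["ts"], tz=timezone.utc
        ).isoformat()
    return cleaned

def main(infile=sys.stdin, outfile=sys.stdout):
    # Stream newline-delimited JSON in, munged newline-delimited JSON out.
    for line in infile:
        if line.strip():
            outfile.write(json.dumps(munge(json.loads(line))) + "\n")

if __name__ == "__main__":
    main()
```

The same per-record function drops straight into a Beam `Map` transform when a one-off script later needs to become a distributed Dataflow job.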

At one point we needed hundreds of nodes to do some data transformation from on-prem to cloud, but on average we only needed a handful of nodes running much smaller jobs.


Neither. Although less hyped, at the enterprise level there are still plenty of Hadoop/Spark implementations. Organizations are just trying to decide whether to move these implementations to the cloud, keep some form of on-prem to retain control, or migrate to other platforms like Snowflake.


I think it's less hyped but still widely used. There are newer tools such as Flink as well.

For analytic transformations, Snowflake, BigQuery, and other modern column-based DBs fit the bill too, which is probably why they're the new hype.



