
Ask HN: What is a broke person's way to learn data engineering? - pendergast
As a broke individual living in a third world country who wants to learn data engineering, how do I go about it? AWS/similar options seem to be super expensive to me, and I don't have access to clusters. Any advice?
======
diehunde
AWS, GCE, and Azure all offer credits for new accounts to experiment with. You
might be limited in some resources, but you can learn a lot. Also, a lot of
data tools such as Spark, Airflow, and Kafka can be deployed on Kubernetes,
which can be run locally using Minikube. Do that and read a lot of blog
posts.

------
karterk
Most people (regardless of where they are in the world) are not going to be
able to afford a cluster of any meaningful size. So, the best option is to
learn these tools by running them on your local machine. You can use
VirtualBox or Docker to set up virtual hosts within your machine to simulate a
cluster.

Also, data engineering is a vast field. Pick an area that you would like to
pursue first and go deep: Machine learning (feature extraction+training),
Hadoop/Spark job processing, event streaming & aggregation etc.

Then try to find an entry-level job that would allow you to apply what you
learned and also expand your knowledge of running actual clusters.

------
usgroup
You need representative data: the sort you’ll be working with. There’s plenty
of it for free on AWS, e.g. GDELT.

You then need use cases: things to do to the data. From this you learn how to
process it using whatever tools you like.
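
As a rough illustration of what a use case can look like on a laptop, here is
a minimal sketch using only the Python standard library. The sample rows, the
column names, and the function name are made up for the example; they are not
GDELT's actual schema.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: an invented event log, NOT real GDELT rows.
SAMPLE = """date,country,events
2020-01-01,IN,12
2020-01-01,US,7
2020-01-02,IN,5
2020-01-02,US,9
"""


def events_per_country(lines):
    """Stream CSV rows and sum the `events` column for each country."""
    totals = defaultdict(int)
    for row in csv.DictReader(lines):
        totals[row["country"]] += int(row["events"])
    return dict(totals)


totals = events_per_country(io.StringIO(SAMPLE))
print(totals)
```

The function reads one row at a time, so the same code should work on an open
file handle for a much larger download without loading it all into memory.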

How to set up clusters ... worry about that less. It’s more and more
commoditised over time and it’s the admin part of data engineering anyway.

------
analognoise
We might be able to give you more targeted advice for your country.

I'm a massive advocate of community colleges over "boot camps" and other such
educational vectors, but I'm not sure what infrastructure is already available
in your country that would help you get on that path.

~~~
pendergast
Thanks for replying! I'm Indian. I only have access to a laptop at the moment.
I'm not sure that educational programs in data engineering are accessible
to me, so I was looking to go the self-learning route.

------
whb07
What is data engineering to you? You don't need a cloud provider to set up a
"cluster". Look at Kubernetes, where you can run a "cluster" on one machine
with multiple pods/services running.

------
Dikesm
Hi, I can help you with logistics and learning. Can you ping me at
cradleofdata@gmail.com?

