
Big data natively on Kubernetes? Spark scheduled by k8s, Kafka on etcd without ZK - matyix
https://banzaicloud.com/blog/spark-k8s/
======
ahm4
Is this compatible with Spark 2.2? What are the storage options if we
don't want to use HDFS?

~~~
tarokkk
All the storage options supported by Spark are available when running on
Kubernetes. You can use blob storage (S3, WASB, etc.) or volumes (via k8s
persistent volume claims backed by, for example, AWS EBS, GCE PD, Azure Disk,
or Cinder volumes). I’ve heard some folks use Minio or Rook.
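
For the blob storage route, here is a rough sketch of the kind of settings
involved, assuming the Hadoop S3A connector (the hadoop-aws jar) is on the
Spark classpath. The endpoint and credentials are made-up placeholders, and
`s3a_conf_args` is a hypothetical helper for illustration, not part of Spark:

```python
# Sketch: build spark-submit "--conf" flags for an S3-compatible object store.
# Endpoint and credentials are placeholders, not values from this thread.

def s3a_conf_args(endpoint, access_key, secret_key, path_style=True):
    """Return spark-submit arguments that configure the Hadoop S3A connector."""
    conf = {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Path-style access is usually needed for self-hosted stores such as Minio.
        "spark.hadoop.fs.s3a.path.style.access": str(path_style).lower(),
    }
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


args = s3a_conf_args("http://minio.example.local:9000", "ACCESS_KEY", "SECRET_KEY")
print(" ".join(args))
```

The printed flags would be appended to whatever spark-submit command you
already use, and the job would then read and write `s3a://` paths. In practice
you would likely pull the keys from a Kubernetes secret rather than pass them
on the command line.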

------
svetlana-palto
We are running big data workloads on Kubernetes on GKE and love that we can
reuse the same cluster for our other applications. We were waiting for YARN
support for containers, but it’s immature and still more like a roadmap, so we
have moved away and now schedule with k8s. It’s a pity that EKS is not
released yet: we were running Spark on EMR but have now moved to GKE.
Does anybody have experience with ACS?

~~~
matyix
+1 for EKS. Is Azure ACS still around? We use AKS on Azure; it's still in
preview, but apart from some limitations (and the odd cluster delete getting
stuck) it works well for us.

