We’ve been testing a few Graviton instances since before GA and have gone “all in” with our latest release targeted for EOY.
We went live with R6g (Redis) and R6gd/T4g (Elasticsearch) instances at the beginning of the month. We were coming from an ancient AWS managed ES cluster (2.3) on their i3 Elasticsearch instances, so it's been an apples-to-oranges comparison.
We are running T4g/C6g/R6g K8s clusters for web servers.
We do still have Postgres on R5 instances, but I'm interested in checking out R6g there since we can do a direct comparison.
Regarding using c6g’s instead of i3’s to power Elasticsearch machines, could you tell me more? (We’re in the same boat, considering a switch)
Specifically around hyperthreading: my understanding is that c6g instances don't have "vCPUs", just "CPUs", so the effective number of cores doubles. Did you find similar throughput (in terms of Elasticsearch search/write TPS) between a virtual and a real core?
From my POV now, the i3 instances should never be used unless you absolutely need the dedicated local storage in those amounts. The per vCPU performance is horrid in comparison to any of the modern instances (R5/M5/C5) let alone comparing those vCPUs to the actual CPU of R6g/M6g/C6g like you said.
Another advantage of the Graviton instances is 50% more dedicated storage compared to equivalent Intel/AMD instances (R6gd.metal at 16xlarge sizing vs R5d.metal at 24xlarge). To get enough storage on x86 we would have had to bump up to r5d/r5ad.24xlarge or metal, and when testing those we also saw more "jitter" in long-tail latency. Despite the x86 instances being larger, they traded blows across our tests; aggregates were one thing that x86 easily beat out ARM on, but a lot of our aggregates come from another data source, so it wasn't a deal breaker. Overall we're happy with the performance: compared to the old stack we're at around 10% of the cost, and I think our savings were more than 2x compared to x86 after locking in some rates.
Interesting, we have an Elasticsearch cluster of around 300 i3s (and maybe soon some i3ens) I think mostly because of the NVMe storage. But yeah there's not much compute to go with all that disk space.
> aggregates were one thing that x86 easily beat out ARM
This is actually a lot (most?) of our ES workload, so that's a really interesting detail.
How are you deploying your web servers? I assume that's where your custom application code lives.
What's the developer experience here? I've been considering cross-building a Docker image for ARM and deploying it, but I wasn't sure how comfortable it is.
We have a base Debian image maintained by our Platform/Security team (compliance) that we build our Flask apps off. Other than them forgetting to push an ARM-compatible image when they update, we haven't run into issues, but I don't know what, if anything, went on behind the scenes to get the image working.
I would say I'm the bridge between the infra/DevOps teams and our back-end team of 12 people, and as far as everyone else on the team is concerned it "just works". Still waiting for the first big time it doesn't.
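For anyone curious what a setup like this can look like: a minimal multi-arch Flask image sketch, assuming the base image is published for both amd64 and arm64 (the image name, app module, and port here are hypothetical, not the actual compliance image described above):

```dockerfile
# Hypothetical multi-arch Flask app image.
# debian:bookworm-slim is published for amd64 and arm64, so the build
# tool picks the right variant for the target platform automatically.
FROM debian:bookworm-slim

RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip3 install --no-cache-dir --break-system-packages -r requirements.txt
COPY . .

# Placeholder entrypoint: assumes gunicorn is in requirements.txt
# and the Flask app object is app:app.
CMD ["python3", "-m", "gunicorn", "-b", "0.0.0.0:8000", "app:app"]
```

Nothing in the Dockerfile itself is ARM-specific, which is presumably why it "just works" as long as the base image has an arm64 variant.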
1) Most of the team is on MBP 16-inch models. I know there is one, and maybe two, on 13-inch models.
2 and 3 I'll answer to the best of my knowledge, but at the professional level it's something my team doesn't handle (I do wish I knew more!). Features get merged to dev/stage/main and trigger blue-green rolling deploys based on whatever is configured. The deploy is handled through AWS CodeDeploy (not my choice, also not my team's jurisdiction), which handles ticket validation/testing/deploying to whatever k8s cluster, plus manual deployment rollover if needed.
I believe you can build ARM images with Docker on a normal x86 machine, or at least I have a vague recollection of doing so for a Raspberry Pi.
This is super interesting. It stands to reason that CodeDeploy would support Docker cross-builds.
In my team we mandate testing Docker images on local dev machines before rolling to production, so I was wondering how the cross-compiles, etc. would work. But this is helpful, thanks!
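For what it's worth, one way to do the cross-build-then-test-locally flow on an x86 machine is Docker's buildx with QEMU emulation. A sketch under those assumptions (the builder name and image tag are placeholders, and this requires a local Docker daemon):

```shell
# One-time: register QEMU binfmt handlers so arm64 binaries
# can run on this x86 host under emulation.
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Create and select a builder that can target other platforms.
docker buildx create --name xbuilder --use

# Cross-build the image for arm64 and load it into the local daemon.
docker buildx build --platform linux/arm64 -t myapp:arm64 --load .

# Smoke-test the arm64 image locally (under emulation) before deploying.
docker run --rm --platform linux/arm64 myapp:arm64 uname -m
```

Emulated runs are much slower than native, so this is fine for local smoke tests but you'd probably still want native arm64 runners (or the Graviton nodes themselves) for anything performance-sensitive.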