These Graviton2 instances are no joke. As soon as they were released, they probably became the new lowest-hanging fruit for many people (from a cost-efficiency standpoint). The larger L1 cache has an incredible effect on most workloads.
Of course, you can't compare them directly with EPYC/Xeon because of the architectural/instruction-set differences.
EDIT: meaning that a larger cache wouldn't necessarily deliver better performance (though I think in this case the architectural differences do favor Graviton making better use of its cache)
ARM cores are smaller, mostly due to simpler instruction decoders, which leaves more room in the transistor budget for a huge cache.
I am hoping Apple does this with their desktop chips: basically take the fastest phone/tablet processor, give it better thermals, and pump up the cache. That would be an easy path to beating Intel on a laptop.
My understanding was that large cache was mostly a cost tradeoff, and that there's no technical reason why x86 processors couldn't do this too. I believe Apple's A-series chips already have a huge cache which is a big part of why they are so fast.
Cost/die size is part of the consideration, especially for L3, but with L1 and L2 it is much more about latency than cost: bigger caches have higher latency than smaller ones. That's the entire reason the multi-level cache hierarchy exists in the first place.
Graviton2 doesn't even have a particularly large total amount of cache, the only part of the cache system that's bigger than both AMD and Intel is the L1D. L1I is the same as AMD, L2 is same as Intel, L3 is smaller than both (talking about current-gen server chips). That L1D can indeed make a big difference in certain workloads though.
The L3 is actually pretty small for how many cores there are: ARM recommends 1-2MB/core for the N1 cores in the Graviton2, and it has 512KB/core. AMD has 4MB per core in Zen2, albeit with a slightly weird setup where the L3 is localized to 4-core clusters.
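The per-core numbers above fall out of simple division; a quick sketch (the core counts are my assumption: a 64-core Graviton2 with 32 MB of shared L3, and Zen2's 4-core CCX sharing 16 MB of L3):

```python
# Per-core L3 arithmetic behind the comparison above.
# Assumed specs: Graviton2 = 64 N1 cores sharing 32 MB L3;
# Zen2 CCX = 4 cores sharing 16 MB L3.
graviton2_l3_kb_per_core = 32 * 1024 // 64  # 32 MB shared across 64 cores
zen2_l3_mb_per_core = 16 // 4               # 16 MB per 4-core CCX

print(graviton2_l3_kb_per_core, "KB/core")  # 512 KB/core
print(zen2_l3_mb_per_core, "MB/core")       # 4 MB/core
```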
Ryzen CPUs seem to have a decent amount of cache built in:
Level 1 cache: 12 x 32 KB 8-way set-associative instruction caches
               12 x 32 KB 8-way set-associative data caches
Level 2 cache: 12 x 512 KB 8-way set-associative unified caches
Level 3 cache: 4 x 16 MB 16-way set-associative shared caches
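If you want to check what a given box actually has, the cache topology is exposed through sysfs on Linux. A minimal sketch (Linux-only; the paths may be absent inside some containers/VMs, in which case this prints nothing):

```python
from pathlib import Path

def cache_info(cpu=0):
    """Read cache topology for one CPU from Linux sysfs.

    Returns a list of (level, type, size) string tuples, or an
    empty list if the sysfs paths are not available.
    """
    base = Path(f"/sys/devices/system/cpu/cpu{cpu}/cache")
    if not base.exists():
        return []
    entries = []
    for idx in sorted(base.glob("index*")):
        entries.append((
            (idx / "level").read_text().strip(),
            (idx / "type").read_text().strip(),  # Data / Instruction / Unified
            (idx / "size").read_text().strip(),  # e.g. "32K"
        ))
    return entries

for level, ctype, size in cache_info():
    print(f"L{level} {ctype}: {size}")
```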
I am trying to back up the Gmail emails my employees send and receive and make them searchable. This will let a new account manager on an account easily search and find previous discussions with clients and come up to speed quickly.
We were pulling in millions of emails from the past few years, so we had to use a larger RDS instance size, and we were killed on the I/O charges.
We are now just on a medium sized EC2 server.
Do you have suggestions for better ways of doing this?
I can't give you direct recommendations, but it does sound like you might have used Provisioned IOPS (io1) storage instead of General Purpose (gp2) storage, which is billed per GB with no separate per-IOPS charge.
If you're willing to post a few more details, I'm sure people could suggest more specific options. You basically switched from a fully managed database solution to a fully DIY one, and I honestly think you might have just over-provisioned the RDS instance. I'm willing to bet you can switch back to RDS, keep all the benefits, and stay within budget. You can get a 1-year reserved t3.medium RDS instance with 50 GB of storage for $40/month.
Here are the rest of the details from my developer on the project:
What was our specific server size and setup and costs with RDS?
We were using db.t3.medium (2 vCPU, 4 GB RAM), in a DB cluster with 2 db.t3.medium instances; overall cost ~$118/month.
And now with EC2?
We are using c5d.xlarge (4 vCPU, 20 ECU, 8 GB RAM), a single instance; overall cost ~$140/month.
What was the big cost savings?
Our main issue was I/O (reads and writes): we have very high I/O demand, so we were paying for I/O and storage (~$600/month). The second issue was CPU credits (~$90/month). So the total savings is ($600 + $90 + $118) - $140 = $668 per month.
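For what it's worth, the arithmetic above checks out; spelled out (all figures are the commenter's numbers, not current AWS list prices):

```python
# Monthly costs quoted in the thread (USD).
rds_instances = 118       # 2x db.t3.medium in a cluster
rds_io_and_storage = 600  # provisioned I/O + storage charges
rds_cpu_credits = 90      # t3 CPU credit overages
ec2_instance = 140        # 1x c5d.xlarge

monthly_savings = (rds_instances + rds_io_and_storage + rds_cpu_credits) - ec2_instance
print(monthly_savings)  # 668
```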
Also, EC2 performance is better for us than RDS, because we get a much more capable server for less money.
Note:
RDS is managed (no configuration needed) and comes with support, while with EC2 you need a person to configure and support the DB yourself.
Or, you know, just use a smaller RDS instance. The price difference over an EC2 instance is not 20x. It's been a while since I checked, but I remember the markup being less than 2x.