
Ask HN: Dynamic memory/CPU provisioning for VMs? - sscarduzio
When AWS EC2 (Elastic Compute Cloud) was launched, the young, inexperienced me
initially understood this was a service to hire virtual servers by the hour,
with a price that would vary (in an "Elastic" way) according to how much RAM
or CPU you actually used.

To my disappointment, this was obviously not the case.

Now, 15 years of technological development later, would such a service be
possible?

What is the closest service to a truly "elastic" VM instance to date?
======
dilyevsky
GCP e2 instances (which are like 30% cheaper) are the closest match to what
you are asking for. These VMs run on overcommitted capacity and are migrated
to a different physical host seamlessly when the resources are reclaimed.

Edit: e2 not n2 -
[https://www.google.com/amp/s/cloudblog.withgoogle.com/produc...](https://www.google.com/amp/s/cloudblog.withgoogle.com/products/compute/google-compute-engine-gets-new-e2-vm-machine-types/amp/)

------
wmf
For CPU you have hotplug and quota scheduling; for memory you have hotplug and
ballooning.

But when you say "the price would vary according to how much RAM or CPU
resources you use" you get into the real complexity: resource sharing. If your
VM temporarily gives up some RAM, can another VM use that RAM? This is very
hard to do, because the provider doesn't know when/if you'll want that RAM
back. They don't want a physical server to get into a situation where the RAM
demand is higher than the installed RAM because there is no good solution to
that scenario. If you're running hundreds of "micro" VMs/containers on one
server you can rely on statistical multiplexing and luck, but it doesn't
really work for large workloads.
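The statistical-multiplexing point can be made concrete with a toy simulation
(illustrative numbers and a made-up model, not any provider's real capacity
planning): many small bursty VMs rarely all burst at once, so their
99th-percentile aggregate demand stays close to baseline, while a few large
VMs can jump far above it in one step.

```python
import random

def peak_demand(n_vms, mean_gb, burst_gb, p_burst, trials=2000, seed=42):
    """Estimate the 99th-percentile aggregate RAM demand of n_vms
    bursty VMs, each using mean_gb normally and mean_gb + burst_gb
    with probability p_burst at any given moment."""
    rng = random.Random(seed)
    peaks = []
    for _ in range(trials):
        total = sum(
            mean_gb + (burst_gb if rng.random() < p_burst else 0)
            for _ in range(n_vms)
        )
        peaks.append(total)
    peaks.sort()
    return peaks[int(0.99 * trials)]

# 256 "micro" VMs: 1 GB baseline each, occasionally +1 GB.
micro = peak_demand(n_vms=256, mean_gb=1, burst_gb=1, p_burst=0.1)
# 4 large VMs with the same total baseline: 64 GB each, occasionally +64 GB.
large = peak_demand(n_vms=4, mean_gb=64, burst_gb=64, p_burst=0.1)
# The micro fleet's 99th percentile hugs its 256 GB baseline; the
# large fleet's is dominated by a couple of VMs bursting together.
print(micro, large)
```

With large workloads the law of large numbers stops helping, which is exactly
why overcommit works for micro VMs but not for big ones.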

A provider called NearlyFreeSpeech has been charging based on "the use of one
gigabyte of RAM for one minute, or the equivalent amount of CPU power" since
before EC2 even existed, AFAIK, but I suspect this complexity is more scary
than attractive to most people.
[https://www.nearlyfreespeech.net/services/hosting](https://www.nearlyfreespeech.net/services/hosting)
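That billing unit is easy to model. A minimal sketch (the 60-CPU-seconds-per-
unit equivalence below is my own illustration, not NearlyFreeSpeech's actual
rate):

```python
def billable_units(samples, cpu_equiv_seconds=60.0):
    """Sum usage into 'resource units': one unit is 1 GB of RAM held
    for one minute, or an equivalent amount of CPU time (here we
    assume, purely for illustration, that 60 CPU-seconds equal one
    unit -- the real provider defines its own equivalence)."""
    units = 0.0
    for ram_gb, minutes, cpu_seconds in samples:
        units += ram_gb * minutes                 # RAM-minutes
        units += cpu_seconds / cpu_equiv_seconds  # CPU equivalence
    return units

# A VM that held 0.5 GB for 120 minutes and burned 300 CPU-seconds:
print(billable_units([(0.5, 120, 300)]))  # 60 + 5 = 65.0 units
```

The accounting itself is trivial; the hard part, as described above, is the
provider actually reclaiming the RAM you are no longer paying for.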

------
soamv
This turns out to be a somewhat annoying problem at the VM level. Not
impossible, but complex enough that higher-level solutions like functions may
be better.

Consider memory usage: operating systems (and some applications) are designed
to grab all the memory they can and use it for caching, etc. So it's hard for
the VM host to know when it can reclaim memory and stop billing the user for
it.

But there is this idea called memory ballooning -- you have a little process
running on the VM guest OS that grabs lots of memory, but is actually in
cahoots with the host, and just tells the host -- "hey I got all this memory,
you can take it back and use it somewhere else".

Okay, so doesn't ballooning solve the problem? There are a few problems with
it -- you can't balloon at the moment you need the memory, because it's not
fast enough, so you have to balloon proactively. And you don't know how much
to balloon, so you have to guess: guess wrong and the guest OS will start
swapping, or it might activate its OOM killer and start killing processes.
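The "you have to guess" step might look something like this sketch (an
invented heuristic with made-up names and numbers, not any hypervisor's
actual policy):

```python
def balloon_target_mb(total_mb, in_use_mb, headroom_frac=0.25):
    """Guess how much guest memory the balloon can safely inflate over.

    We reclaim free memory but leave `headroom_frac` of total RAM as
    slack, because the guest may need memory back faster than the
    balloon can deflate. Guess too high and the guest starts swapping
    or OOM-killing; guess too low and the host reclaims very little.
    """
    free_mb = total_mb - in_use_mb
    headroom_mb = int(total_mb * headroom_frac)
    return max(0, free_mb - headroom_mb)

# 4 GB guest using 1 GB: reclaim 4096 - 1024 - 1024 = 2048 MB.
print(balloon_target_mb(4096, 1024))
# Same guest after the workload grew to 3.5 GB: reclaim nothing.
print(balloon_target_mb(4096, 3500))
```

The headroom fraction is exactly the kind of knob you can only tune by
guessing, which is the problem described above.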

So making memory usage follow the application is kinda sorta possible but
comes with hairy problems. What about CPU? CPU usage already follows the
application, so you could just measure and bill accordingly -- except that
nothing is gained if memory doesn't also follow usage.

All in all it's way simpler to get towards this goal with clearly-defined
higher level services like Lambda.

~~~
k__
And even with Lambda you have to decide the memory allocation up front. At
least power tuning helps a bit here.

~~~
Mandatum
And keep in mind the premium price that comes with cloud Functions as a
Service. Unless you're running a very intermittent batch job with strict
control and security policies, most traditional organisations really
shouldn't need to build on this type of architecture. If your ops and dev
teams total fewer than 5 people, maybe consider it to save on ops costs. But
I think it's fair to say that for some orgs this premium just isn't worth it,
and they don't know it until they've been billed millions of dollars over the
years and a CFO/CTO finally runs a cost-cutting programme.

~~~
k__
The idea is that the ops for a comparable architecture without FaaS is more
expensive than just running FaaS and not having to pay, or talk to, as many
people.

------
purpleidea
[https://github.com/purpleidea/mgmt/](https://github.com/purpleidea/mgmt/) can
dynamically add and remove vCPUs on a running VM. Each change has sub-second
precision, and a second
[https://github.com/purpleidea/mgmt/](https://github.com/purpleidea/mgmt/)
instance running in the VM can detect this and charge workloads accordingly,
if so desired.

There are videos of it happening, but no blog post yet.

------
phamilton
t3 instances on AWS have burst capacity charges if you choose unlimited mode.
It's $0.05 per vCPU-hour, charged only if you exceed the accumulated burst
credits.

So running a t3 would let you pay for a baseline and then pay only for the
CPU you end up needing beyond that baseline.
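A simplified model of that accounting (a sketch of the credit idea only;
real t3 credit mechanics have per-size baselines and accrual caps that this
ignores):

```python
def burst_charge(usage_frac_per_hour, baseline_frac=0.2,
                 price_per_vcpu_hour=0.05):
    """Toy model of 'unlimited' burst billing for one vCPU.

    Each hour the instance earns baseline_frac vCPU-hours of credit
    and spends what it actually uses; only usage beyond the accumulated
    credit balance is charged at the surplus price.
    """
    balance = 0.0
    charged_hours = 0.0
    for used in usage_frac_per_hour:
        balance += baseline_frac - used
        if balance < 0:
            charged_hours += -balance  # surplus usage is billed
            balance = 0.0
    return charged_hours * price_per_vcpu_hour

# Mostly idle for 4 hours (banking credit), then pegged for 2 hours:
print(burst_charge([0.05] * 4 + [1.0] * 2))
```

The point of the model: as long as your average stays near the baseline, the
burst charge rounds to zero; sustained 100% usage is where the $0.05/vCPU-hour
kicks in.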

------
trebligdivad
Removing RAM from VMs turns out to be quite tricky. Hot-unplug very rarely
works, because the OS tends to have allocated stuff all over the place, so
you don't have a nice large DIMM-like quantity to unplug. Ballooning kind of
works, but it's more advisory: there's nothing to stop a guest gobbling that
RAM up again (and it has other issues). David Hildenbrand's virtio-mem might
help solve this; see:
[https://www.youtube.com/watch?v=H65FDUDPu9s](https://www.youtube.com/watch?v=H65FDUDPu9s)

------
oneplane
Not for VMs, but definitely for containers and functions (like AWS Lambda).
You can configure them with soft and hard limits, but also by invocation
count and runtime.

Doing the same in a VM might be possible (the technologies exist), but it's
often the task or workload that needs to be modified to support it, and once
you're already doing that, a step to containers, functions or horizontal
scaling is just as easy (or hard). Horizontal scaling based on load is pretty
common (classic ASGs, but also bidding-based scaling on overcapacity).

------
rbanffy
It would be possible. Paying for CPU and memory was the business model of
hundreds of data processing companies that ran mainframe batch jobs and time-
sharing services on partitioned machines. Mainframe companies billed users by
machine usage (with the machine on-site). This is obviously doable.

It raises some isolation concerns, however. To make this work, the host needs
to know how much memory is allocated to a given tenant, and that's difficult
without access to the OS running inside the machine -- easy with containers,
not so much with VMs. The tenant can turn off virtual CPU cores in the VM and
the hypervisor can pick that signal up (at least on Linux), but I'm not sure
it's currently possible to do the same with virtual memory modules. I'd love
it if AWS let me do that: the boot process of some of my workloads is more
CPU-bound than the rest of the machine's lifetime, so having a bunch of extra
cores at boot would do wonders. If, under memory pressure, I could "plug in"
more virtual memory modules, that would also be quite nice.

There could be a market opportunity, but whoever does it would need to beat
the current incumbents on price or this would not fly. There would also be
some difficulty scaling up: all these extra CPUs and memory would need to
come from somewhere, and requests would necessarily fail (or need to trigger
a live VM migration, which, IIRC, nobody does in cloud environments right
now) if the host box's resources are fully allocated.

~~~
sebazzz
I have a related question: how can you tell that a VM is being limited in
terms of CPU power?

As an experiment I deployed several apps to the Azure S1 tier, and they
appear to be a good bit slower than running in a VM on our on-premises Dell
R740 server.

~~~
WrtCdEvrydy
Be aware that most cloud companies consider a "core" to be a hyperthread.

When ARM instances launched on AWS, it was a double punch because those chips
have no hyperthreading, so you get a full core at a cheaper price.

------
PaywallBuster
Possible in VMware to add additional CPUs:
[https://blogs.vmware.com/performance/2019/12/cpu-hot-add-per...](https://blogs.vmware.com/performance/2019/12/cpu-hot-add-performance-vsphere67.html)

However, we've kind of moved on to disposable servers that can be scaled on
demand, in which case you just add additional servers or adjust the server
type depending on requirements. Same with containers.

The need to add RAM or CPU to a running instance dates from the time when you
had a single long-lived instance serving an application.

------
pstrateman
Yeah, of course you can.

Simply allocate each VM one vCPU per hyperthread and record the total CPU
time used.

Nobody actually does this because it makes billing complicated, both
practically and for sales.

My guess is this would result in overall less revenue as well, AWS is for sure
making lots of money selling the same cpu time to a dozen people not actually
using it.

~~~
wmf
_AWS is for sure making lots of money selling the same cpu time to a dozen
people not actually using it._

They "for sure" aren't; their prices are so high precisely because the
resources are guaranteed (in most cases).

~~~
jrockway
I agree with that. Amazon pretends to sell you "burstable" instances, but
they know that most people are going to be bursting at the same time of day:
most applications want low latency and are hosted close to their users,
applications that are worth paying for are used by businesses, and business
users use the app 9-5 on weekdays. As a result, you don't get much bursting,
and the instances aren't much cheaper than non-burstable instances.

Good resource utilization will only happen when you have non-interactive tasks
that can be scheduled to smooth the demand. When I worked at Google, this is
something we had; plenty of people were willing to run their mapreduces
overnight, and take advantage of the purchased interactive capacity that other
services needed during the day. In the real world, I've never had anything
like that. Users use our website during the day. Employees use the internal
apps during the day. Developers send their code off to CI during the day. That
means from 9-5 there isn't enough capacity (builds have to wait) and from 5-9
the computers are just sitting there bored. I don't see a way around it, even
at cloud provider scale, just because so many users are like me. I might be
willing to send CI builds off to some part of the world where it's nighttime
-- if you'll give me the CPUs for free. Nobody appears to have written
software to do this (maybe the big CI providers do this in the background; but
I haven't noticed a lot of latency while ssh-ing to CircleCI jobs for
example), so we sit here using computers very inefficiently.

Ultimately, computers are cheap enough that it doesn't matter. I have a
10-core workstation that sits idle most of the time. Having a fast build when
I need one is nice, and the cost of those cores sitting idle is minimal. I'd
happily spend even more money on my workstation for that reason. I imagine
people treat their servers the same way, and aren't tempted to tweak things,
because their workloads are interactive and autoscaling introduces latency.

------
the8472
As far as memory goes, your OS will normally gobble up as much RAM as it can
get for caching. You'd have to go out of your way to make it memory-frugal so
it could yield RAM back to the hypervisor, and that could hurt performance.

It's easier with CPUs where you can just yield back if there's no work.

------
spullara
AWS’s Aurora Serverless is billed in this manner, albeit with a minimum
capacity level.

------
hacker_newz
Your title implies dynamic hardware provisioning, while the post is about
pricing by use. Which is it?

~~~
inetknght
I understand it to be about both. VMs can have CPUs and RAM added or removed
while running, as long as the kernel supports it -- and Linux does. I'm not
aware of a service that sells this feature, though.

------
gautamkmr89
Use AWS Fargate, Lambda, or another serverless technology.

