Amazon EBS Update – New Elastic Volumes Change Everything (amazon.com)
133 points by chrisbolt on Feb 14, 2017 | 57 comments



This is an amazing new feature... other than the lack of "shrinking a volume" support... I'm definitely going to be using this from now on and this feature is going to make a big difference in my day to day work.

That said... I'd still rather have access to Elastic File System in more regions, and in a perfect world, be able to talk to EFS from Lambda.


> other than the lack of "shrinking a volume" support

It looks like you can shrink a volume; the edit box to change this is just a "set a size" box. We have tested this and downsized one volume, and it appears to have worked.


This is a pretty swell announcement.

While I think I still might use LVM for adding capacity, the biggest change here would appear to be the ability to scale provisioned IOPS. gp2 disk IOPS are primarily tied to storage capacity, which often leads to overprovisioning for peak loads, while the alternative, the high cost of smaller and faster storage with PIOPS, often isn't worth it when peak loads only happen occasionally. That changes with this, and I am sure this will make a lot of DBAs happy.
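
To make the "IOPS tied to capacity" point concrete, here is a rough sketch of the gp2 sizing arithmetic. The figures (3 IOPS per GiB, a floor of 100, a ceiling of 10,000) are assumptions based on how gp2 was documented around the time of this thread, and the function names are made up for illustration; check the current EBS documentation before relying on them.

```python
def gp2_baseline_iops(size_gib, iops_per_gib=3, floor=100, ceiling=10_000):
    """Baseline IOPS for a gp2 volume of the given size (assumed figures)."""
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    return max(floor, min(ceiling, size_gib * iops_per_gib))


def gp2_size_for_iops(target_iops, iops_per_gib=3):
    """Smallest whole-GiB gp2 size whose baseline reaches target_iops."""
    return -(-target_iops // iops_per_gib)  # ceiling division on integers
```

Under these assumptions, a database that needs 3,000 baseline IOPS would have to sit on roughly a 1,000 GiB gp2 volume even if it only holds 100 GiB of data, which is the overprovisioning described above.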

AWS in the last year has really impressed by giving you even more knobs to turn to get exactly what you need. While it's still not quite as flexible as GCE's custom instance sizes, the GPU and now storage elasticity sort of make up for it. While it's still pretty expensive, AWS feels more and more like I can use an API to build a custom spec'ed server instead of trying to shove my workloads into only a handful of options.


There's one thing that really annoys me about EBS volumes.

You can't attach existing (unattached) persistent volumes to an instance at instance creation time.

So for instance, I can't have a machine boot and expect to see ebs vol-xyz on device /dev/xvdg at boot. I have to boot the instance, use the API to attach it and then have my instance mount it.
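
For reference, the boot-then-attach workaround described above might look roughly like this. It is a sketch, not a recommended implementation: `ec2` is assumed to behave like a boto3 EC2 client (for example `boto3.client("ec2")`), the function name is hypothetical, and mounting the device inside the instance is left out.

```python
def attach_existing_volume(ec2, volume_id, instance_id, device="/dev/xvdg"):
    """Attach an existing EBS volume to a running instance and wait for it.

    `ec2` is passed in (rather than created here) so the sketch has no hard
    dependency on boto3 and can be exercised with a stand-in object.
    """
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    # The "volume_in_use" waiter polls until the volume reports the in-use state.
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume_id])
    return device
```

Something like this would typically run from a boot script or an external orchestrator once the instance ID is known, which is exactly the extra step being complained about.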


> Spiking Demand – You are running a relational database on a Provisioned IOPS volume that is set to handle a moderate amount of traffic during the month, with a 10x spike in traffic during the final three days of each month due to month-end processing. You can use Elastic Volumes to dial up the provisioning in order to handle the spike, and then dial it down afterward.

At first I was a little dubious of the usefulness of this update, but that feature is actually really useful. I'm not sure what the granularity is, but it would be nice to crank up the IOPS on a DB when doing some large batch operations.
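
A minimal sketch of the month-end scenario quoted above, assuming a scheduled job decides each day how many IOPS the volume should have. The function names and defaults are hypothetical; `ec2` is assumed to be a boto3 EC2 client, whose `modify_volume` call accepts `VolumeId` and `Iops`.

```python
import calendar
from datetime import date


def desired_iops(today, baseline_iops, spike_factor=10, spike_days=3):
    """Return the IOPS to provision for `today`.

    The last `spike_days` days of the month get baseline * spike_factor,
    mirroring the quoted example; every other day gets the baseline.
    """
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    in_spike = today.day > days_in_month - spike_days
    return baseline_iops * spike_factor if in_spike else baseline_iops


def apply_iops(ec2, volume_id, iops):
    """Request the change; `ec2` is assumed to be a boto3 EC2 client."""
    return ec2.modify_volume(VolumeId=volume_id, Iops=iops)
```

In practice the dial-up would need to be requested ahead of the spike, since a modification takes time to complete.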


If this is possible with RDS, that's huge. It's extremely frustrating to deal with spikes of transitory data into RDS and have to have the RDS disk space be sized to the maximum amount of space always-- it's a huge cost waste.


Reading through the caveat page [1], it seems the marketing copy is not quite in line with reality in terms of ability to scale down directly:

> Decreasing the size of an EBS volume is not supported. However, you can create a smaller volume and then migrate your data to it using application-level tools such as robocopy.

[1] - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/consider...


The blurb is not misleading. PIOPS are a performance measure distinct from volume size.


I had a volume that needed resizing, so went ahead and tried moving from a 50GB gp2 to a 100GB gp2 disk online. It took 1 hour 40 minutes to do the resize. It's not a quick operation, and from looking at the docs you're limited to 1 modification per 6 hours on any given volume.


Whoa I guess that's more or less the same as downtime. Imagine if you were out of space but couldn't do anything for 100 minutes. I am watching EFS and hoping it falls in price or that EBS that backs EC2 instances becomes more automatically elastic.


To further my point: what I'm getting at is that EFS has the chance to be the answer to persistent storage for docker swarm and just persistent storage for docker in general. But at $0.30 a gig it's just too much right now.


was it unusable during resize?


I didn't explicitly test that, though I expect it's usable as the docs suggest that after you resize a live root partition you should reboot the box after modifying the filesystem extents (resize2fs). I was modifying a scratch partition -- nothing blew up while the resize happened.
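
For anyone following along, the "modify the filesystem extents" step could be scripted roughly as below. This is a sketch under assumptions: an ext filesystem, the `growpart` and `resize2fs` tools installed, and partition names formed by appending the number to the device (true for /dev/xvda1, not for NVMe-style names). The helper names are made up.

```python
import subprocess


def grow_commands(device, partition=None):
    """Build the commands to extend an ext filesystem after the volume grows.

    With a partition number, the partition is extended first (growpart) and
    the filesystem on it second; without one, the filesystem is assumed to
    sit directly on the device.
    """
    if partition is None:
        return [["resize2fs", device]]
    return [
        ["growpart", device, str(partition)],
        ["resize2fs", f"{device}{partition}"],
    ]


def run_all(commands):
    """Run each command in order, stopping on the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For a scratch volume like the one above, `grow_commands("/dev/xvdg")` yields a single `resize2fs` call; a root partition would use `grow_commands("/dev/xvda", 1)`.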


What I would really like to see (on any cloud, really) is volumes that can be bound to multiple nodes/images/vm's/whatever. I have yet to find a solid distributed FS alternative to OCFS2 that actually performs in the same league and ease of use.


> What I would really like to see (on any cloud, really) is volumes that can be bound to multiple nodes/images/vm's/whatever. I have yet to find a solid distributed FS alternative to OCFS2 that actually performs in the same league and ease of use.

Have you tried EFS? https://aws.amazon.com/efs/


No, since it is stupidly expensive in itself. Moreover, AWS services in general are ridiculously expensive for what has always turned out to be a sub-par experience. Overly complicated to set up and manage, lacklustre performance, and premium pricing. Thanks, but no thanks.


> No, since it is stupidly expensive in itself.

EFS is ridiculously cheap if you use it the way it's intended. It's billed per-byte so you only store things that you actually need shared on there.

If you're looking for an OCFS2 replacement for storing large volumes of data or to run something like Oracle RAC atop then you're probably better off doing it yourself.

> Moreover, AWS services in general are ridiculously expensive for what has always turned out to be a sub-par experience. Overly complicated to set up and manage, lacklustre performance, and premium pricing. Thanks, but no thanks.

Besides the pricing argument, I'd disagree with all of those points. It's a joy to set up once you know what you're doing and performance is more than adequate for all the services I've used (with plenty of knobs to fiddle). Outside of external bandwidth pricing I don't even consider it that far out of line pricing wise.


Hey koolba, I apologise in advance for nitpicking, but I'd like to make some counterpoints:

> It's a joy to set up once you know what you're doing

Which is (in my own long and active AWS experience) another way of saying "You need to make a significant investment in time, effort, and tooling to get to a point where it is fast and simple to bring up a reasonably complex environment. You are re-learning domain knowledge you likely already have, just so that you can use Amazon's product." This in itself isn't a bad thing, depending on what you get back for it, but the return value is "I now know what I am doing with AWS".

> performance is more than adequate

I can get more than adequate pretty much everywhere. I am paying a serious premium, so I really want to be in the amazingly fast bracket, not the adequate.

> Outside of external bandwidth pricing I don't even consider it that far out of line pricing wise.

Yeah, they are, both directly as well as in terms of value for money. There are many ways to get increased performance at lower costs with reduced complexity (complexity also costs real money). I change hosting provider roughly every 12 months to take advantage of investments in new generation hardware and good deals, and every time I do so, I check out new features, updated pricing, and do the sums to work out value for money (I offer high performance, managed, HA hosting to selected clients).


You're charging people to manage their environments (providing expertise) and complaining about needing to develop additional skills to do so efficiently.

If you have the data with price/performance comparisons, I'd think this is something the rest of us would be interested in.


> You're charging people to manage their environments (providing expertise) and complaining about needing to develop additional skills to do so efficiently.

That is a straw man argument. AWS isn't the only way to efficiently manage infrastructure, and it can be argued it isn't even efficient, full stop.


You're making value decisions based on expertise in your existing delivery methods and trying to map them 1-1 to AWS features, which may or may not map 1-1. You keep disparaging AWS' pricing (stupidly expensive, serious premium), without providing an alternative/baseline.

Again, if you have numbers and not broad statements we'd appreciate them.


> You're making value decisions based on expertise in your existing delivery methods and trying to map them 1-1 to AWS features, which may or may not map 1-1

eh, no. You have no idea about my areas of expertise, what features I require, or how I map them, or what I map them to, so you cannot make this kind of statement (well, you can, and you did, but it is a fallacy).

> if you have numbers and not broad statements we'd appreciate them.

I'm sure you do. Doing what I do today in terms of functional parity, but doing it on AWS, will work out roughly 2.5 times more expensive, will not gain me any additional features or functionality, and will leave me with roughly a two-thirds reduction in overall performance against metrics that are of interest to me. Also "we"? You speak for others? Who are the "we" that have appointed you as their speaker?


You started with general pricing arguments with hostile language without providing numbers to support - making generalizations without that data is useless, right? How is a reader supposed to take those statements? It leads to using your experience statement (I offer high performance, managed, HA hosting to selected clients) to make judgements on the comment.

The "we" should be anybody bothering to read this comment chain - who wouldn't be interested in seeing AWS' lack of value in your use case / competitors that succeed? Having done migrations to and from AWS for past employers, this would definitely interest me, to say the least.


I apologise if this comes across as an ad-hominem, but you sound like my ex-wife: you keep talking about how I say things, and keep ignoring the actual content of my comments. So, I'll do the same with you as I did with her - goodbye! :)


Here are some numbers:

AWS Internet traffic costs ~60x as much as renting bare metal servers.

No change in architecture can change that fact.


That's one example and matches a use case where AWS / "Cloud" providers may not be ideal. Hybrid setups with external CDN can mitigate some of this as well.


Except that EFS is really slow with small amounts of data, and you collect/burn credits based on throughput, so small amounts of data and load mean you will probably be better off with a tape drive.

For a 10 GiB EFS drive you get a base throughput of 0.5 MiB/s and can burst up to 100 MiB/s for 7 minutes per day
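
The figures above can be checked with a quick back-of-the-envelope calculation. This sketch assumes baseline throughput scales linearly with stored data at the rate implied by the comment (0.5 MiB/s for 10 GiB, so 0.05 MiB/s per GiB) and that a full day's worth of credits is spent bursting; it ignores credits earned during the burst and any cap on banked credits. The function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def baseline_mib_per_s(stored_gib, mib_per_s_per_gib=0.05):
    """Baseline throughput implied by the amount of data stored (assumed rate)."""
    return stored_gib * mib_per_s_per_gib


def burst_minutes_per_day(baseline, burst=100.0):
    """Minutes per day at `burst` MiB/s if one day's credits are all spent."""
    credits_mib = baseline * SECONDS_PER_DAY  # credits accrue at the baseline rate
    return credits_mib / burst / 60
```

With 10 GiB stored this gives a 0.5 MiB/s baseline and about 7.2 minutes of 100 MiB/s burst per day, which lines up with the numbers quoted above.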


Cloud is generally more expensive if you're still thinking in a non-cloud mindset. One has to rethink architectures to effectively use cloud providers.


I'd love to hear some more specifics, because this sounds like deliciously vague marketing speak. Cost is cost. $100 isn't all of a sudden $50 because I have "rethought my architectures". Personally, my mindset isn't cloud or non-cloud. It is about how to solve an engineering problem in a reliable, maintainable, repeatable, and cost-effective manner.


Can you scale up and scale down seamlessly? Can you turn off unused instances? Can you automatically and seamlessly move to the latest hardware running on-prem?

Do you want to focus some significant percentage of your time on managing infrastructure or do you want to focus on making what you sell to customers better?

I know it sounds like marketing and I know there's a point at which the cost of cloud doesn't make sense, but I do believe that that inflection point is pretty low if you're in a field that is moving quickly or are say a new startup and need to get to market with a minimum-viable-product to get an idea off the ground -- what a shame it would be to fumble that opportunity re-inventing the wheel with infrastructure.


> Can you scale up and scale down seamlessly?

Yes

> Can you turn off unused instances?

Yes

> Can you automatically and seamlessly move to the latest hardware running on-prem?

Yes

> Do you want to focus some significant percentage of your time on managing infrastructure...

We spend about 10 minutes per day looking after the stack, with about 4 to 6 hours every 6 weeks looking at if and where we can improve things. We then spend about a week or so in depth once every year roughly evaluating different providers, and we typically end up moving to a different provider, which takes a few days.

> ...or do you want to focus on making what you sell to customers better?

This (infrastructure) is a big part of what we sell to our customers. What we do (and have been doing for about 15 years) is what you kids appear to be calling "devops" now. I was an early adopter (and discarder) of CFEngine...


Well then cloud probably isn't for you.


really? Thanks for your advice. I am on cloud, just not AWS.


I think I might have misunderstood your critique of my earlier points: are you on a public cloud or a private, distributed set of datacenters that your team has built up over the last 15 years? If the latter, how does that differ from a public cloud other than on price? And wouldn't a new firm (not yours with 15 years of investment) be better off on AWS or Azure or GCE and get all the benefit (albeit at higher cost) of your investment? A new firm doesn't have 15 years to spend on infrastructure.

Secondly, none of my points were specifically about AWS but they were about cloud in general.


Google Cloud can do this, however the volumes must be mounted in read-only mode.


While that is an interesting proposition, it isn't what I am looking for.


Unfortunately, what you're looking for probably isn't possible. You can't just attach multiple writers to a block device, slap a standard filesystem on top, and expect it to work. Standard filesystem implementations are built on the assumption that they have exclusive access to the underlying block device.

The block device interface provides no primitives suitable for concurrency control (no atomic compare-and-set, much less anything more sophisticated). Even if they did, you'd have to pretty much re-write the filesystem to make proper use of it. Even read-only multiple mounts are dodgy: the RO mounts aren't going to do cache coherence, and could trip up hard over intermediate states from the writer, but at least they won't corrupt the filesystem.
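
A toy illustration of the failure mode described above, not a model of any real filesystem: two hosts each cache an allocation bitmap read from a shared device that offers only plain reads and writes, and each then allocates "the first free slot". All class and variable names are invented for the example.

```python
class BlockDevice:
    """A shared device exposing only whole-block read and write."""

    def __init__(self, block):
        self._block = list(block)

    def read(self):
        return list(self._block)

    def write(self, block):
        self._block = list(block)


class Host:
    """A host that caches the allocation bitmap, as a local filesystem would."""

    def __init__(self, device):
        self.device = device
        self.cache = device.read()  # read once at mount time, never refreshed

    def allocate(self):
        slot = self.cache.index(0)     # first slot that looks free in the cache
        self.cache[slot] = 1
        self.device.write(self.cache)  # blind write-back, no compare-and-set
        return slot


device = BlockDevice([1, 0, 0, 0])
a = Host(device)
b = Host(device)
slot_a = a.allocate()
slot_b = b.allocate()  # both hosts pick slot 1
```

Both hosts end up believing they own slot 1, and the device shows a single allocation where two were made. A shared-disk filesystem avoids this by coordinating between nodes out of band rather than through the block interface.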

Realistically, if you want multiple writers on a network block device, your application is going to have to go to the raw block device, and you get to figure out intra-client co-ordination on your own. This isn't actually terribly useful. As an ex-insider, the nonexistence of this feature is not due to technical difficulty (it should be pretty easy for EBS to let you have multiple writable attachments), it's because 0.1% of customers would find the feature actually useful, and the other 99.9% would use it to shoot themselves in the foot.

If you want multiple machines accessing the same filesystem, you want a network attached filesystem. You can easily put samba / nfs-kernel-server / windows server in front of EBS. There are a few distributed filesystems, FOSS and proprietary, if you need more scale. If you don't like any of the above, you're gonna have to build what you like yourself, or wait for somebody else to decide to build it for you.


> Unfortunately, what you're looking for probably isn't possible. You can't just attach multiple writers to a block device, slap a standard filesystem on top, and expect it to work. Standard filesystem implementations are built on the assumption that they have exclusive access to the underlying block device.

Yes, it is possible, and even in some cases (mine being one of them) desirable. I mentioned OCFS2, which is designed specifically for doing exactly that. When you do this right, it is an extremely high performance, simple way of sharing storage in a highly available manner.


Ah, somehow I missed the part where you mentioned you wanted to use a shared-disk FS. That puts you in the unenviable position of wanting to use properly a niche feature that would be misused to catastrophic effect more often than not.


Indeed! Technically, what I need is possible. I appreciate the need to keep support calls to a minimum, especially when they are of the "I thought this would be a good idea but all my data is now shredded" type. However, if there were an option to contact support to have something enabled, or some other way to enable this, that would be really cool.


GCE supports mounting one volume as read-only from multiple instances (only one instance can write to the volume).


What's your workload like?


workload in which terms? :)


This is really fantastic. Volume management was always my biggest pain point with EC2. I never understood why they had to make it so complicated.


Do you still have to stripe over multiple volumes to get good performance?


For those of us running on EBS, this is welcome news. Now you can put off swapping out EBS volumes until you hit the cap of 16TB (or thereabouts).

If you're really clever, you'd set up your AMIs to use LVM and keep adding additional EBS volumes before you reach the 16 TB per volume limit. Though I'm guessing if you have the issue where 16 TB of storage isn't enough for you then you probably can afford to find a real distributed solution to this.


There's an entire class of engineers who cooked up clever lvm/zfs partitioning schemes to work around this. It's bittersweet to throw away a working, custom solution when something more automated and standard comes around. C'est la vie!


> M3.medium instances are treated as current generation. M3.large, m3.xlarge, and m3.2xl instances are treated as previous generation.

Why? These other m3 instances are listed as current generation here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-...


LOL somebody's going to set up an alarm to automatically increase volume size on low space and one day end up with max volume size by accident.
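
If someone does wire up such an alarm, a guard along these lines might keep it from running away. This is only a sketch: the threshold, growth factor, and budget cap are made-up defaults, and the 6-hour cooldown is taken from the per-volume modification limit reported earlier in the thread.

```python
import math
from datetime import datetime, timedelta

MODIFY_COOLDOWN = timedelta(hours=6)  # limit reported earlier in the thread


def next_volume_size(size_gib, used_gib, last_modified, now,
                     threshold=0.85, growth=1.25, cap_gib=1000):
    """Return the new size in GiB, or None if no resize should be requested."""
    if used_gib / size_gib < threshold:
        return None  # enough free space
    if now - last_modified < MODIFY_COOLDOWN:
        return None  # still inside the modification cooldown
    if size_gib >= cap_gib:
        return None  # at the budget cap: alert a human instead of growing
    return min(cap_gib, math.ceil(size_gib * growth))
```

The cap is the important part: once it is reached, the alarm stops growing the volume and the problem becomes a person's to look at.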


This is excellent, too bad I had to increase the volume size on our prod stack two weeks ago.


That's still terrible performance. Your average MacBook Pro these days will provide between 100,000 and 300,000 IOPS on 4k random reads, and the average VM we deploy from our storage servers (commodity SSD-based Linux clusters that are cheap and ultra reliable) provides between 70,000 and 200,000 random 4k read (and write) IOPS per VM without breaking a sweat. Their TCO is significantly lower than that of Amazon's SSD storage, which is 3½ to 10x slower per volume.

Even if you go and provision storage on Vultr, Ramnode or Sitehost, all of which are a lot cheaper and just as reliable if not more so, you get considerably faster storage, with a minimum of 20,000 to 30,000 IOPS and up to 150,000 IOPS per volume, and you don't have the vendor lock-in or the hidden costs such as inter-zone data transfer.

Amazon really needs to pull finger on their storage performance, but it's not in their best interest. What they want you to do is scale horizontally, which means not only do you need to invest in your system / product design to do this, but you also have to put more money into Amazon's pockets for additional compute / storage nodes and, again, the associated hidden costs.

I'm not digging on Amazon / outsourced hosting providers, but they truly are pulling the wool over people's eyes while still trying to make everyone think that they're not only the best option but the only option.


EBS is network-attached storage and has its benefits and downsides. I don't think it's fair to compare it to local storage.


Amazon's options for local, high speed SSDs aren't very compelling either. You're limited to the I2 instances and have to increase the instance size in order to increase storage performance

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/storage-...

Compare this to Google Cloud, where you can attach a local SSD to any instance type and you can increase the storage performance by increasing the volume size.

https://cloud.google.com/compute/docs/disks/performance


I didn't compare it to local storage, I compared it to highly available network storage.


"Change everything"? "Game-changing"? I'll just leave this here:

https://cloudplatform.googleblog.com/2016/03/introducing-Goo... (March 2016)

(Disclaimer: Yes, I work on Google Cloud.)


Change everything is in context of this feature. Game changing is mentioned in the end in context of EBS and not this feature.

And the entire article is about AWS customers and it really is game changing for a customer like me.

Your comment is in poor taste.


Pretty sure it's a pun, since previously you couldn't change anything.



