Microservices without the Servers (amazon.com)
273 points by alexbilbie on Sept 6, 2015 | hide | past | favorite | 136 comments



This is Amazon's wet dream. Your app isn't an app at all, it's just a collection of configs on the AWS Console. When and if the time comes to migrate off of AWS, you realize you don't actually have an app to migrate.


Or, you realize that your processes are so minimalistic and well structured due to lack of options, that you only have to write a custom request router over the weekend to migrate.

Also, Lambda-like options are available with most PaaS providers now.

It is not much different, and might be easier, than migrating a web app from a custom PaaS. The only issue is, and has always been, the migration of data. And I don't see that getting solved until some startup writes a bunch of layers on top of a bunch of providers. It's very tricky, for good reasons.


> Also, Lambda-like options are available with most PaaS providers now.

Which other ones are there? I used to use PiCloud until they were bought out by Dropbox and it atrophied. Shame, it was exactly what I wanted in a service.


http://hook.io is an open-source microservice platform.

We launched a month before Amazon Lambda, and have better features like full support for streaming HTTP.


But you don't have future price information, so it's a bit hard to make an actual enterprise recommendation. This will cost X in the future, but it's free for now? Free for now is great for me as a tinkerer/developer, but I couldn't recommend it to a client.


We do offer paid accounts, and in fact already have a nice size group of paying customers.

Still in the process of establishing our service tiers, but our basic hosting plan starts at $5.00 per month.

http://hook.io/pricing


Thanks, I'll have a look into that.


IronWorker from Iron.io is similar to Lambda.


IronWorker is not per-request; it's more like jobs that you schedule from a queue.


Iron workers can be triggered by HTTP POST webhook requests.


Pivotal Web Services (which is like BlueMix, Cloud Foundry based, and you can run your own as it is open source) http://run.pivotal.io


Cheers, being able to move on if I wanted is a definite benefit, thanks.


Webtask by Auth0 https://webtask.io/


Google Cloud - Google Dataflow, Certain App Engine APIs

IBM Bluemix - IBM Workload Scheduler, RabbitMQ (this is weak, but similar functionality can be achieved)

Microsoft Azure - Apache Storm (generally available now)


Thanks for the list, I'd not looked into the google offerings for a while, they're actually much closer to what I'm after now.


I made this, so you can run a Node Express app in the AWS API Gateway. If the API Gateway ever goes away, it wouldn't be too hard to host your express app elsewhere:

https://github.com/johntitus/legerdemain


Interesting that the article talks about load tests but omits any results.

I was trying out an API Gateway + Lambda + DynamoDB setup in the hope that it would be a highly scalable data capture solution.

Sadly the marketing doesn't match the reality. The performance both in terms of reqs/sec and response time were pretty poor.

At 20 reqs/sec - no errors and majority of response times around 300ms

At 45 reqs/sec - 40% of responses took more than 1200ms, min request time was ~350ms

At 50 reqs/sec - v slow response times, lots of SSL handshake timeout errors. I think requests were throttled by Lambda but I would expect a 429 response as per the docs rather than SSL errors.

My hope was that Lambda would spin up more functions as demand increased, but if you read the FAQs carefully it looks as though there are default limits. You can ask for these to be changed, but that doesn't make scaling very realtime.


Correct. Lambda isn't designed for high data throughput. That's what Amazon Kinesis is for. Each Kinesis shard can handle 1000 KB/s ingestion rates. You would write your data to a Kinesis stream, then use Lambda to respond to the Kinesis event and write the data to your DynamoDB table.


Thanks for the info on this, I hadn't seen Kinesis before. I also tried something similar with S3 upload but Kinesis looks a much better solution for what I'm trying to do.


Kinesis isn't a good idea for low-latency queueing. It can handle high throughput, but it can often take anywhere from one to ten seconds for a message to make it through the queue.

Given that DynamoDB can reliably write in the 4-5ms range, a Kinesis queue may not be necessary. Unless the point of the Kinesis layer is to keep the cost of DynamoDB provisioning low?


Are you using Node.js or Java for your Lambda function?

If you are using "Node.js", you may be seeing slow times if you are not calling "context.done" in the correct place, or if you have code paths that don't call it.

Not calling context.done can cause Node.js to exit, either because the node event loop is empty, or because the code times out and Lambda kills it.

When node exits, the container shuts down, which means Lambda can't re-use the container for the next invoke and needs to create a new one. The "cold-start" path is much slower than the "warm-start" path. When Lambda is able to re-use containers, invokes will be much faster than when it can't.

Also, how are you initializing your connection to DDB? Is it happening inside your Lambda function? If you either move it to the module initializer (for Node), or to a static constructor (for Java), you may also see speed improvements.

If you initiate a connection to DDB inside the Lambda function, that code will run on every invoke. However, if you create it outside the Lambda function (again in the Module initializer or in a static constructor) then that code will only run once, when the container is spun up. Subsequent invokes that use the same container will be able to re-use the HTTP connection to DDB, which will improve invoke times.

Also, you may want to consider increasing the amount of RAM allocated to your Lambda function. The "memory size" option is badly named; it controls more than just the maximum RAM you are allowed to use. It also controls the proportion of CPU that your container is allowed to use. Increasing the memory size will result in a corresponding (linear) increase in CPU power.

One final thing to keep in mind when scaling a Lambda function is that Lambda mainly throttles on the number of concurrent requests, not on the transactions per second (TPS). By default Lambda will allow up to 100 concurrent requests.

If you want a maximum limit greater than that you do have to call us, but we can set the limit fairly high for you.

The scaling is still dynamic, even if we have to raise your upper limit. Lambda will spin up and spin down servers for you, as you invoke, depending on actual traffic.

The default limit of 100 is mainly meant as a safety limit. For example, unbounded recursion is a mistake we see frequently. Having default throttles in place is good protection, both for us and for you. We generally want to make sure that you really want to use a large number of servers before we go ahead and allocate them to you.

For example, a lot of folks use Lambda with S3 for generating thumbnail images, sometimes in several different sizes. A common mistake some folks make when implementing this for the first time is to write back to the same s3 bucket they are triggering off of, without filtering the generated files from the trigger. The end result is an exponential explosion of Lambda requests. Having a safety limit in place helps with that.

In any case, if you are having trouble getting Lambda to scale, I'm happy to try and help.


Thanks for detailed reply.

I'm using Node.js, this is a gist of the Lambda function: https://gist.github.com/paulspringett/ec6d3df65e977342d6ea

I'm initialising the DDB connection outside the function as you suggest. However, I'm calling context.succeed() not context.done() -- would this be problematic?

I'll try increasing the "memory size" and requesting an increased concurrent request limit too, thanks.


Your code looks correct. I would expect something closer to 50ms in the warm path (300ms in the cold path seems about right).

I'll take a look tomorrow and see if I can reproduce what you are seeing. I'm not super familiar with API gateway, so there could be some config issues over there.

If you want to discuss this more offline, feel free to contact me at "scottwis AT amazon".


The usual complaint on HN is that it is too easy to rapidly consume AWS resources. The usual solution proposed is low initial limits or caps.

You can't have it both ways.

I suggest making that limits request, then retesting and reposting; otherwise, you've sold this benchmark short.


I see a lot of people disagreeing with the overall direction of "less servers, more services". I totally get it, I used to be one of those people, but I think the shift to "less hassle development" is inevitable.

5 years ago people used to debate whether we should use a virtualized server vs. a physical one. You can still see similar discussions, but rarely; we have all more or less agreed that using AWS/Rackspace/etc. is good for a business in the majority of use cases.

I think 5 years from now we'll still be debating servers vs. services, but the prevailing wisdom will be that "services" have won.


It may well be so that companies will run their private clouds on the colocated servers. What wins in that case?


Maybe for some companies/use cases, however my feeling is that the setup time and dealing with hardware directly will always be too much hassle for the majority.


Except they will deal with hardware anyway - there are several thousand different devices in our offices already. How difficult is it to get another two admins? And two could be enough in the modern world where everything is automated.


It is pretty cool, but not really serverless: you are still handling HTTP requests via Amazon API Gateway, and in general you are relying on, and paying for, quite a lot of Amazon services. Not sure how much better this approach is than serving ImageMagick via PHP, for example; it would be good to see some numbers.


This removes nearly all of "devops". You don't have to mess around with figuring out how many ec2 instances you need (or deal with auto-scaling groups), how to secure the linux or whatever you stick on the ec2 instances, etc.

There's still a ton of creating zip file artifacts of your lambda payloads (instead of pushing to a magic git repository that amazon controls, say), so there's a bit of "build monkey"ing to do instead of "devops"ery. But I think a lot of shops will be happy to make that trade, as "build" is closer to their core experience than "devops".


Yes, you get rid of devops.

You gain vendor lock-in. You are now tied to the Amazon platform. If they shut down or suspend your account, for any reason, you are out of business. You are also paying premium for the platform, with the cost of devops built in.

I'll take an open ecosystem that gives me options to migrate my business anytime over a proprietary solution.


There's always this tension, its really interesting.

I think the open ecosystem approach will work well for apps/systems that are more established and have a predictable and sustainable user base and related revenue stream to maintain the system.

But for new ideas and early-stage startups, the open-ecosystem approach will be a lot more work than using the high-level services providers like Amazon are developing ...basically a no-stack approach can help one quickly find some product/market fit. Once some baseline level of utilization is understood, then switching to a more difficult, but less locked-in, approach would be smart.

all that said, there's probably a middle ground here too (but arguably not) -- where one deploys using Docker containers, a private registry & image repo, S3 for static immutable storage, and some open database like PostgreSQL on RDS. All that could be layered on top of AWS and other providers -- but it will still be a PITA to move the whole thing.

BTW -- take a look at the elasticbox product -- they really have an interesting approach to cloud that sort of obviates the lock-in issue by allowing one to build apps across clouds from the start using their "boxes" metaphor.


Pretty much this. Amazon can at any point shut down any of these services or nerf them. If you built everything up to this point on Amazon and they shut it down, it's more work.


To add to your point, they have done this before and are still doing it. There is no guarantee of continued service.

http://recode.net/2015/03/18/amazon-will-shut-down-amazon-we...


Amazon Webstore wasn't part of AWS, though. It was part of their commerce wing - very different.


Remember PiCloud? They shut down after I spent a quarter building around their API calls. I never want to repeat that mistake. Staying off proprietary platforms also gets you the bonus of being able to sell your source code, or deploy a local cloud if an enterprise is willing to pay extra for it.


Sorry, this is actually factually incorrect.

As shown in another thread here, this service does not infinitely auto-scale (the recommendation was to use Kinesis), so you still have to know which services to choose, which is pretty much a full-time job with the number of different services offered these days.


I'd say that most people don't need auto-scaling, and this configuration is almost definitely harder to understand and learn than a simple php script that works everywhere.


NO it's not harder to understand. If you are doing anything remotely serious, then your php script needs to be on multiple servers for redundancy, it needs to be behind a load balancer, you have to be responsible for the security of the VMs the php script runs on, and on and on and on and on.

Edit: well, unless you run your php script in a PaaS like Google App Engine.


Yet, most people probably still FTP their php files onto a shared hosting environment. And for most people, that's actually good enough.


yep -- the devops part is valuable on the app creator/dev's side and this architecture is very valuable on the amazon/cloud service provider's side too.

The smaller the scheduled unit of code is, the more densely they can pack the workloads and make more efficient use of their system ...squeezing out the pennies at their scale makes a lot of sense.

I've seen some vendor lock-in comments here -- but certainly the big 3 or 4 service providers will have these features figured out soon, and it seems to me some open source to schedule functions across compute will appear in no time that could be used by the smaller providers. Replicating all the other features is much harder -- this is a great moat Amazon has built.


Maybe this statement will date me, but I don't really like this trend of having to deal with fewer things. I mean, sure, it's simpler to get started. But when things break you are screwed, because nobody really knows what is going on.


The point is you don't provision your own servers, not that servers don't exist somewhere executing your code. People seem to be taking the marketing too literally.


No it's not taking this too literally. I re-read the architecture a few times then stepped through the source to make sure I wasn't missing something. I was thinking that maybe the servers were only used to deploy the image processing code to the client which could then run the code locally. But that's not the case.

Serverless implies that a server is not required after an installation step and that you only need your local device. Or perhaps too that you only need a collection of like clients to do something P2P.

Just because you as the developer don't have to think about the server doesn't make it serverless.


It's good ad copy and ranks high on the link-bait scale.

They say after all:

"and then unit and load tested it, all without using any servers."

...with a title of "Microservices without the Servers" when what they mean is

"without provisioning any servers yourself" which is obviously much less attractive.

My curiosity was piqued by the title; inserting "w/o provisioning" would not have gotten me anywhere near as interested in the topic.


Would you apply the same analysis if I claimed I could buy a t-shirt from a store without using any factories?


Are servers really that hard to manage these days? This seems like way more work and pretty limited in what it can really do, especially compared to a few lines of code in any decent web framework that can perform a lot faster.


I personally have an old-school ops background, so for me managing servers is fine. However, I just had to drag some of my colleagues through an explanation of the infrastructure I set up earlier this year. For them, it was about 80% irrelevant ops-ish details versus 20% stuff they actually cared about. And I get it. They shouldn't have to care.

The whole notion of a "virtual server" is basically like saying, "radio with pictures" or "horseless carriage". It only makes sense in a specific historical context. We're still figuring out what comes next, but I think it's safe to say that 20 years from now the main building block will be something different than the simulation of a 1970s university department minicomputer that we're all using now.


Love that analogy


If you're a single developer/small team with a very small product, managing servers is a chore that won't add any value to the product you're building.

So you either spend very little time on it and build servers adhoc ("snowflake" style - SSH in, install some stuff, etc), or you spend precious time doing "the right thing" - which right now is a huge universe of options (Chef/Puppet/Ansible, Docker/other containers/no containers, etc).

If you're part of a larger team, not having a properly structured infrastructure is a nightmare - especially when it comes to scaling or dealing with failures of all kinds.

TLDR; - yes, I'd say it's somewhat hard...


What I really hate about the adhoc servers and how people keep using them is that alternatives are seen as spending lots of time on the deployment itself. But actually it's so much more - scripting your setup gives you your first disaster recovery scenario. Gives you a way to keep a list of "incidental dependencies" (things your apps don't import, but you need within some process). Gives you an audit trail of "things that changed" if you keep scripts in a repository. It's not only things you should have, it's things that you'll rarely add later on, but may save your ass at any point.


> I'd say it's somewhat hard...

I don't think it's hard, it's just time consuming. And we all know that "time is money", especially for small teams or solo devs (as you pointed out).


Doing it right is hard. Scaling infrastructure throughout multiple zones while keeping data as consistent as possible, deployments as easy as possible and having as few SPOFs as possible is pretty difficult (and done differently by every single team). The range of things that can go wrong is huge...


Not every product needs multi-AZ from the start (especially for small teams or solopreneurs); in most cases you'll be over-engineering. And in the use cases where you do need multiple AZs and have to do it right, new tools like Convox* are really easy to use and can save you a lot of work. It's never been this easy to manage your own infrastructure.

Over the years I've used ssh, puppet, fabric, ansible, capistrano, CloudFormation, etc. for managing servers and infrastructure. And I think the main benefit of any PaaS, AWS Lambda, or AWS API Gateway is (obviously) that they save time and abstract the internals. In fact I use them in several small projects.

* https://www.convox.com


I think doing it right is somewhat hard. Or not hard exactly; it just takes time.


I think the biggest news here is the possibility of sharing a server with someone else, with Amazon acting as a broker of CPU time.

With virtual machines, we started to share physical resources which were otherwise wasted in dormancy. But even virtual machines waste resources, as you need someone to set them up and keep them online and working. What if you could share the cost of those people and their knowledge?

Amazon just did that for you.

All some people want is to run code somewhere on the internet.

Amazon gathers them together to share a server and an admin team.

From my point of view, Amazon Lambda is a huge improvement in efficiency for people around the world, as I (or someone else) can pay for a very tiny part of the work of a very specialized sysadmin and focus on my code.


If your scale is small (you're in the 'two people in a garage' phase of your company), then the cost of having your own server is small, even though you waste most of it. Hetzner will rent you a real physical server with an x86 processor and 32 GB of RAM for about 40 euros a month. I'm sure you can get smaller virtual servers elsewhere for less.

If your scale is large, you have enough work to make full use of however many servers you rent.

Reducing infrastructure waste at small scale seems pointless to me. The actual cash money saving is just lost in the noise.

The only way I can see that I could be wrong is if there's a suppressed demand for huge numbers of tiny-scale services. Is there?


My scale is unpredictable. I don't sell software, I sell consulting and training services. To make our business run smoothly and maximize the available time of the two of us who run the place, we can't be held down by managing servers and services.

Investing in things like API Gateway and Lambda gives me the ability to keep my programming desires satisfied, while reducing the friction of tedious, time wasting operations, like automatically encoding a video of a class, extracting thumbnails, and setting permissions via API calls back to our home-brewed DAM system to make the video available for our students. Sure, I could do it myself each time. Six times per day of teaching. And turn down the process priority so it won't interfere with the rest of my job.

The costs of using AWS at that scale, while certainly nothing like the scale some of you are dealing with, are a considerable savings over the hourly rate I can charge for the time spent in my subject domain, and would have otherwise lost. We manage VMs for classes already, and most of that is now automated by our Hubot. In the coming weeks, the services/sites/apps we do have running will be migrated over to the docker-based infrastructure I'm building (in my copious free time, ha!), and the services therein will be interconnected with API Gateway and Lambda functions. It's a beautiful thing. I'm really quite pleased with it, and proud of what we've been able to do.

I know that I'm not alone, and maybe this post will encourage others in similar situations to share their experiences too. Maybe my peers don't frequent HN - that's an unknown to me, I guess. But there are far more entrepreneurs like me who enjoy the job we do, and will do everything we can to offload the time-sucking administrivia to some other system, especially if we get to flex our programming muscles along the way.


Sounds very interesting. Are your classes webinars or some onsite classes? Do you have any online resource I could read up on? We've been looking to experiment with AWS API Gateway for some time, with something like JAWS [1] but it seemed too much to get somebody from the team to become an expert in all of this without knowing how it will turn out. It would be really cool to have a good online resource or even webinar that uses the AWS serverless stack.

[1] https://github.com/jaws-stack/JAWS


> Reducing infrastructure waste at small scale seems pointless to me

Agreed. I think what it avoids though is setting up ansible + puppet + firewall + updates + monitoring + blah blah stuff you do to secure and dev-ops a prod app (for a baby app).

You can half-arse this stuff in a day for a new company. I assume the defaults of lambda will be a little more secure... but amazon security is pretty complicated to set up right, so not totally sure what the max win will be.


> If your scale is large, you have enough work to make full use of however many servers you rent.

If this was close to true, AWS would not have existed. The thing is, when you're big, you're planning for spikes, and you're paying for your maxima per month. (Spare capacity aside, as that has to be a percentage of your desired monthly capacity.)


It doesn't matter if you are big or small, you're always wasting resources. Just count how many of your servers are using <50% CPU. By having thousands of users, Amazon can organize servers in a way that no CPU is wasted.


If they kept CPUs 100% busy, how would they deal with a surge in demand or a failure situation? You HAVE to keep spare capacity or 1 failover crashes your entire fleet of 5 billion servers.

You could correctly argue that usage increases with scale: while you, with one app, can maybe only keep a server 20% busy, Amazon can afford to keep their servers 70% busy. Which is part of the story.

But the other part is that you are paying for the convenience. You can get raw CPU way cheaper than what Lambda charges for it. Even if you only kept your CPU usage at 10-20%, Digital Ocean, for example, would still be cheaper than the Amazon Lambda version of the app. You are literally paying a surcharge to not have to set up your own servers or deal with maintenance. I find it very unlikely you will save big money with Lambda over your own servers.


Yea I can see that. Lambda is cool and there are some niche cases where it's useful...

However, when talking about microservices, any non-trivial app quickly gets extremely complicated to set up, and actually more inefficient in the amount of time it takes to execute each request. And probably more expensive, since there still have to be servers somewhere, always ready to run your app, with all the management and orchestration layer on top.

Also VM technology is sufficiently advanced that they can be multitenant on the same base hardware very efficiently and consume almost no resources while in a sleep state. With live-migration and auto-scaling advancements it's not a big deal.


That kind of server sharing is exactly what all cloud platforms (IaaS, PaaS,etc.), as opposed to dedicated servers, are about.

And running apps without having to set up virtual servers on an IaaS like EC2 isn't new; lots of people, big and small, were doing PaaS offerings long before Lambda.


For me as a mostly one-man show, yes, having a server to manage is a pain. Investing some extra time into a “serverless” solution is greatly preferable to managing a server. And if not anything else, with a server you always have at least extra security to take care of.


What about the PaaS like elastic beanstalk or heroku? Are you coding an entire app using Lambda?


To clarify, I don’t currently use Lambda at all. But generally speaking, any service that doesn’t require me to maintain a server is a plus (Beanstalk and Heroku included).


"Microservices without the Servers: the Uberization of IaaS as PaaS for SaaS"

Like when you say you have no carbon footprint because you don't own a car, even though you call a taxi every time you want to go somewhere?

Are microservices different from SOA? Or is it just a more modern, streamlined buzzword?

You say "microservices," but all I see is "omg, you realize inter-node latency isn't a trivial component to ignore when building interactive services, right?"


> Or is it just a more modern, streamlined buzzword?

You'd be surprised how many people taking the "micro service" bait have never heard of an n-tier architecture or even SOA. I pity the next generation of developers that will have to maintain all that micro-servicing mess developed today.

Instead of relying on a simple architecture with messaging and workers, micro-service evangelists are turning maintenance and deployment of applications into a nightmare for the sake of being hip.

I won't even talk about Amazon Lambda, which is the acme of vendor lock-in.


Yes, microservices are just a rebranding of a SOA subset. http://martinfowler.com/articles/microservices.html#Microser... But I think this direction is inevitable, and we'll soon see freemium microservices.

Amazon's "Lambda" page (esp scroll down to the "benefits" https://aws.amazon.com/lambda/ ) shows it's more like offloading some tasks (like worker threads in the cloud).

I had a play with the second (linked) app, SquirrelBin http://squirrelbin.com/ which can edit and run javascript snippets. The latency is awful, 2-3 seconds for me (I'm in Australia, but that should only add 200ms roundtrip or so). They seem to spin up (reuse?) an entire instance for each request - it's incredible that it's as fast as it is.

But the problem is the architecture of this specific app: the delay would be fine if you could edit-run-loop code locally, without the cloud. But they wanted to demonstrate quick development (for them) by just making a CRUD app, using AWS Lambda existing http endpoints for PUT, POST, GET, DEL. So after editing you have to save, load and run - and each one interacts with the cloud. BTW the article about SquirrelBin https://aws.amazon.com/blogs/compute/the-squirrelbin-archite...


> They seem to spin up (reuse?) an entire instance for each request

There are some clever platforms running on bare Xen (no direct OS) that can spin up an entire instance and destroy it on every request pretty quickly. http://erlangonxen.org is a great example. 100ms to boot your entire "system" for production usage.


The biggest difference I see when compared with SOA is that microservices use a dumb-pipe model, whereas SOA's use of an ESB pushed it to a smart-pipe model, which was much more complicated overall to set up, manage, and scale.


Here's how I deploy code, without having to modify it:

    cf push myapp
It figures out the language/runtime I'm using (Java, Ruby, Go, NodeJS, PHP), builds the code with a buildpack, then hands it off to a cloud controller which places it in a container. My code gets wired to traffic routing, log collection and injected services. I can deploy a 600MB Java blockbuster using 8GB of RAM per instance or I can push a 400KB Go app that needs 8MB of RAM per instance.

I don't need to read special documentation, I don't need special Java annotations.

I just push. And it just works.

I'm talking about Cloud Foundry. It runs on AWS. And vSphere. And OpenStack. It's open source and doesn't tie you to a single vendor or cloud forever.

I worked on it for a while, in the buildpacks team, so I'm a one-eyed fan.

Seriously: why are we still talking about devops? It's a solved problem. Use Heroku. Install Cloud Foundry. Install OpenShift. And get back to focusing on user value, not tinkering.

Disclaimer: I work for Pivotal Labs, part of Pivotal, which donates the largest amount of engineering effort on Cloud Foundry (followed by IBM).


As a note, I decided to look up CF based on this comment. This led me to cloudfoundry.org, which appears entirely devoid of content. Just useless talk about "heavyweights" and so on. The menu didn't appear to have any links to anything useful either. Clicking on products led to a page with three product names. Having visited the site, I'm actually now negatively disposed towards it (but your comment outweighs my experience, and I'll still attempt to check it out).

Granted I only spent a minute, but if this is a typical experience, I'm unsure how anyone would come to the conclusion that there's any software worth using there.


Frankly, I agree with you. We suck at developer outreach. It bugs me.

Unless you know where to find the docs[0], they're not obvious. There's a single master repo[1], but it's oriented at deployment and works by aggregating dozens of sub-projects[2] into a BOSH release and BOSH deployment.

... which requires you to know what the hell BOSH[3] is ...

So recently we started trying to make it easier. The best place to start tinkering is Lattice[4], which is a cut-down extract of Cloud Foundry, or Pivotal Web Services[5]. Or IBM BlueMix, I guess[6].

[0] http://docs.cloudfoundry.org/

[1] https://github.com/cloudfoundry/cf-release

[2] https://github.com/cloudfoundry and https://github.com/cloudfoundry-incubator

[3] http://bosh.io/docs

[4] http://lattice.cf/docs

[5] https://run.pivotal.io/

[6] https://console.ng.bluemix.net/


Thanks for the links, much appreciated! How does CF compare to go.cd? Will there be a lot of setup work required?


go.cd fills a different role. Funnily enough go.cd was the main CI system used for Cloud Foundry, though it's being steadily replaced by concourse.ci.

Cloud Foundry is a bear to install because you will probably wind up needing to wrap your head around BOSH, the IaaS orchestration tool. Once you get past that hump it's relatively obvious. Getting past the hump is tough.

Bear in mind that it's a complete PaaS. The kind of thing you bet your company on (and our customers do). BOSH is a heavyweight system that predates later tools like Terraform or CloudFormation. On the other hand, we use BOSH to update Pivotal Web Services to the latest cf-release every 2 weeks or so and basically nobody ever notices. It just works.

The easiest way to start is either Lattice or a public Cloud Foundry installation. The former has the advantage of being easy to install on a laptop, and it's intended for developers to tinker with. The latter has the advantage that someone else ran `bosh deploy` and is provisioning the VMs that Cloud Foundry runs on. Pivotal Web Service (based on AWS) and IBM BlueMix (based on SoftLayer, I think) are the two main ones.


> Getting past the hump is tough.

So... it's not a solved problem after all? :)


Oh you :)

You only have to install CF once, not every time you deploy. After that it's easy to upgrade. We do so on Pivotal Web Services every time cf-release is incremented, which is approximately fortnightly.


I just had exactly the same experience. I even went to Platform->Documentation and now I'm even more confused. A page describing what exactly the platform is would go a long way.


Because DevOps != CI/CD, and a PaaS doesn't cover the most important parts of DevOps (creating feedback channels between your Dev and Ops teams, and a structure for continuous process improvement).

PaaS just makes life easier by reducing the amount of communication needed. It's more like a fancy wall that makes it easier for developers to throw code over it.


I'm not sure I agree (insofar as CI is in our DNA and CD is what we're trying to make normal). To me the point of PaaS is to remove the need for a specialist gatekeeper in the feedback loop.

It makes Ops happy by isolating the damage Devs can do. It makes Devs happy by removing Ops roadblocks and allowing Devs to see, immediately and directly, what their apps are doing.

Classic ops culture emerged because computing was an expensive shared resource. Early tooling favoured utilisation over isolation because the latter is expensive. So it became possible for devs to accidentally or deliberately step on each other's toes. Ops was like the mature adult in a shared house: making sure everyone kept their junk out of the living room, restocking the bathroom, insisting that dishes get washed.

With a PaaS that changes radically. Devs can't burst out of the box they assign themselves, subject to Ops-set resource pools. Ops hands Devs keys to personal apartments that are created on demand. What they do inside is up to them, Ops needn't worry or care.


I'm playing with these exact things now and it is very enjoyable so far.

My main worry is not on the technical side but on how things are charged for. If I build something that starts to get used I am covered in terms of scalability, but not in a way that protects me from 'cost scalability' so to speak. I know I can set up billing alerts and hit a big 'shutdown' button in response to high load, but what I don't think I can do is throttle these services based on the money I want to budget/spend. With my own servers I have a hard cost limit, with a hard scalability limit, or rather I just accept that my response times will degrade, or requests will fail, once I've allocated all I can afford.

Is there something in AWS for 'cost throttling'? It may be a gap in their services, especially for people who want to build things that might get traction.


As a small user, I've bemoaned the lack of 'cost throttling' for a while. I spend minimally and don't want to worry about e.g. private key leaks that cost a fortune, or some malicious traffic hitting my s3 hard.


As a thought exercise it is interesting to think through what we would 'trade' though. For the S3 case there could be a hard cap of allocation, but once it is reached then does the service just switch off? Could it just respond to, say, 100 requests per second, as a ratio of how much money you want to spend in some time period?

For Lambda-style services it's odd, because if I was hosting them in my own container (or an AWS one, even) then they would still accept requests but their responses would start to slow (while still working). Trading throughput for cost/scalability?

The trouble with AWS services for that 'fear of success' type of budgeting (different from losing a key or malicious calls) is they are either on or off, with no in-between cost/latency/resource allocation ratio.

I'm surprised this hasn't come up more, actually (or I just missed it), considering the overlap between devs who would like less devops and devs with limited budgets.


One of the easiest mitigations is to not even create credentials that have access to do anything that could run up a bill in any short amount of time. Between the Console (access protected with an MFA token) and IAM roles, neither you nor your application ought to ever have to handle raw AWS secrets.


Yeah, I do use IAM roles heavily, 2fa, etc : )


The other thing that would be interesting to look at would be the cost of running an alternative implementation on a more standard platform. What would be the cost of running the same application on AWS VM's, for example?

Is Amazon trying to extract a profit? Unless they are using specialized hardware, can they really execute this service any cheaper than you could? Or do they pad the billing to fund Amazon?

Why would I go with a likely much more expensive option (I'd be pretty surprised if Amazon's billings for this service scale linearly with their costs) instead of the option that provides more traditional cost scaling?


You can set up an alert to notify you as soon as your bill exceeds a given threshold.

http://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/...


The alerts don't work too great. I wrote this in response to a previous story:

"I actually had a billing alert set, and I did get an alert, but it looked like this: "You are receiving this email because your estimated charges are greater than the limit you set for the alarm "awsbilling-AWS-Service-Charges-total" in AWS Account XXXXXXXX. The alarm limit you set was $ 10.00 USD. Your total estimated charges accrued for this billing period are currently $ 1050.95 USD as of Saturday 18 July, 2015 17:34:36 UTC." So, it came a bit too late to take action."

https://news.ycombinator.com/item?id=10024958


Yes, thanks - I do use those. I was thinking of a slightly different scenario though, compared to my own container, as in my reply to bpicolo below (or above).


Same worry about the cost. The API gateway seems pretty expensive.


Great to see Lambda stepping up their serverless game. We're big fans of this approach and are hacking on something similar to this at StackHut[1], but:

* Mostly OSS to avoid lock-in

* Git integration

* Full stack specification (OS, dependencies, etc.)

* Python/ES6 support (Ruby and PHP coming)

* Client libs so you can call your functions 'natively' in other languages.

It would be awesome to hear what people would like us to build for them. Here is a blog-post on how to build a PDF -> image converter: http://blog.stackhut.com/it-was-meant-to-be/

[1] https://stackhut.com


I think your pricing scheme[1] could put a lot of people off. I fall into the category where I'd be fine on the free tier (< 10 private services), and yet I don't want a free service.

I know if it's free, then it's going to be under some kind of fair usage policy, and you're going to rate limit me or have some kind of restrictions eventually. There's no way it can be sustainably free if I start to push it really hard. I'd prefer to just know the limits upfront, or have some kind of usage based pricing.

[1]. http://stackhut.com/#/pricing


Hey Jake -- thanks for your feedback, that is really helpful.

We're going to add some better pricing. How would you like this to work?

- per month, flat rate, ups w/ usage

- per request

- per compute / storage

We really like the idea of only paying for the compute you actually use a la lambda; one of my gripes with Heroku was having to pay $x when the server was only in use for short bursts. Why should I pay for downtime?

That said, we've actually had many people say they would prefer per month, as it is more predictable and they are worried it could spiral out of control.

I would be super interested to hear your thoughts.


I'm not sure that per-request would work, because requests can vary wildly in resources used and time taken. PiCloud (a somewhat similar idea) used to charge based on processing time, essentially (down to the millisecond, I believe).

I personally think that is the correct kind of pricing for something like this; but monthly plans including X time/requests would likely be a good idea.


Different person, but I'd prefer per-request with N requests packaged for free and (N+1)th request charged if charging is enabled, or returning error if I want to stay free. (so there are no surprise charges for service spamming)


The current problem with this architecture is the network cannot be used as a security layer. Databases, search engines, etc need ports opened to the public rather than to selected servers.


If you're on a private network (like your own DC), I'd argue that network-based security is a poor idea because then an attacker just needs to plug in and have pretty easy access.

If you're on the public cloud, I'd argue that this is an even bigger problem as you're then relying on VPC (or the equivalent) to always work correctly.

Why not ignore the networking and just build in robust security? Pubkey authentication where possible, random long passwords where not? Retry limits for clients, network intrusion detection, etc. To me, relying on the network to keep you secure seems a bit like a crutch.


This is an optimistic counterpoint.

However realistically nearly all persistence services such as MySQL, Postgres, MongoDB, Memcache, ElasticSearch, etc either have been insufficiently hardened as a public service or flat out are not intended to be used on a public port and depend on the network for security.

There is not currently an option to connect an RDS database instance to a Lambda function without opening said database instance up to the public. It's a problem.

You are correct that SSH tunneling could be used to provide security but such usage is not yet a standard approach.


Totally agree. It's our most requested feature on the Lambda team and a priority to enable.


Awesome, right up until you need a feature they don't want to offer, or they decide to sunset a feature you're the only one using, and you have absolutely 0 control over it.


If you are interested in a 100% open-source version of Amazon Lambda, you can check out http://hook.io/


Lambda does not work inside a VPC, nor can it connect to one. You cannot use RDS, period. This severely limits the options currently available from a database and security perspective.


AFAIK the AWS team is working on this. It's one of the most requested features.


Too much vendor lock in. Will keep my VMs thanks.


Exactly. What if amazon decides to close your account, because you know.. they can. Now you're pretty much screwed.

With traditional VPS you just point ansible/salt/puppet to new servers and you're good to go.


Ironically this happened to me due to a card expiry fuck up.


Same thing happened to me. The card expired, but they wouldn't let us add a new card (or payment method, as they call it) because the account was in some invalid state. When asked what that state was, they couldn't give details due to legal reasons.

It took about 2 months with support (Business support) and finally they chose to close the account.

We created a new account with a new card and migrated our AWS infrastructure. Unfortunately we still have to use AWS..for now.


Yeah, that's what I think too. This feels too far into the vendor-lockin, downtime death zone, as reliable as Amazon might be.

Plus, while it might be cool for some microservices, it seems like it would be a lot more unwieldy trying to do a full scale application.


Or as App engine decided to do, increase prices/decrease quotas. What now? rewrite your code or just pay the new prices.


Lambda functions are just Docker containers.

API Gateway could be replaced by something like OpenResty.

S3 can be replaced by any other file storage solution.
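
A minimal sketch of the OpenResty idea, assuming all you want is a function-shaped HTTP endpoint (the route name and response here are placeholders):

```nginx
# Hypothetical stand-in for an API Gateway route + Lambda function:
# OpenResty runs the handler in-process inside nginx.
location /hello-function {
    default_type application/json;
    content_by_lua_block {
        local cjson = require "cjson"
        ngx.say(cjson.encode({ message = "hello" }))
    }
}
```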


And language lock-in. So far Lambda supports only Node.js and Java.


Java gets you Ruby via JRuby and Python via Jython... (not sure how current those are these days). So missing C and Rust type stuff, I guess?


The AWS Lambda team wrote a blog post about how to use Python - https://aws.amazon.com/blogs/compute/using-python-in-an-aws-...

You can also call native binaries, so you can use languages like Go or C or anything else that compiles statically.


I came here expecting to read about "distributed computing in the peer to peer network" and instead found a how-to for "servers-as-a-service" from Amazon.

Check this out instead:

https://crowdprocess.com/


Folks interested in this might like to know that ContainerCon also had a session on Containers and Unikernels. http://sched.co/3YUJ

A write up and audio from that session is also available.

http://thenewstack.io/the-comparison-and-context-of-unikerne...


This is a misleading title. Managed cloud services run on servers. There has to be a better title. For a moment I thought they were proposing P2P hosting.


The implication is that you don't have to provision any servers to execute your code. That is the "serverless" part.


But the name is terrible. It's like saying you're going from A to B without a car by taking a taxi.


That's what you get when marketing relabels the cluster as "cloud" and a VPS interface as a "compute service". You start thinking it's buzzwords all the way down :-)


This article only makes sense if you don't know what servers are, and believe "the cloud" doesn't use them.


Beware, link-bait! Title should really be "Microservices without non-Amazon Services", which if you remove the double negative really says "Microservices with Amazon Services", which is, well... not that interesting IMO. I'd rather write against Cloud Foundry, which abstracts away AWS.


Also note the total absence of any reference to cost and billing. This isn't free.


Lambda has a fairly generous free tier.

Here are the details on pricing:

https://aws.amazon.com/lambda/pricing/

You get up to 1M requests / month and 400,000 GB-seconds of compute time per month.

A default lambda function uses 128 MB of ram (0.125 GB), which gives you about 3.2 M seconds of compute time (time actually spent executing requests) every month for free.

Thus if you have functions that take 500 ms on average and use the default amount of RAM, you can process about 6.4M requests in a month for a total bill of about $1.08 (the compute stays inside the free tier, so you only pay for the 5.4M requests beyond the 1M free ones).

Above the free tier limits you pay $0.20 per million requests, and $0.00001667 per GB-second of compute.

The pricing is fairly attractive.
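
A quick back-of-the-envelope calculator for those rates (the function name is mine; the constants are the free-tier limits and prices quoted above):

```javascript
// Hypothetical Lambda cost estimator using the published rates:
// 1M requests and 400,000 GB-seconds free per month, then $0.20 per
// million requests and $0.00001667 per GB-second.
const FREE_REQUESTS = 1e6;
const FREE_GB_SECONDS = 400000;
const PRICE_PER_MILLION_REQUESTS = 0.20;
const PRICE_PER_GB_SECOND = 0.00001667;

function estimateMonthlyCost(requests, avgDurationMs, memoryMb) {
  // GB-seconds = requests * seconds per request * GB of RAM allocated
  const gbSeconds = requests * (avgDurationMs / 1000) * (memoryMb / 1024);
  const requestCost =
    Math.max(0, requests - FREE_REQUESTS) / 1e6 * PRICE_PER_MILLION_REQUESTS;
  const computeCost =
    Math.max(0, gbSeconds - FREE_GB_SECONDS) * PRICE_PER_GB_SECOND;
  return requestCost + computeCost;
}

// 6.4M requests at 500 ms / 128 MB: compute is exactly the free
// 400,000 GB-seconds, so only 5.4M extra requests are billed.
console.log(estimateMonthlyCost(6.4e6, 500, 128).toFixed(2)); // → "1.08"
```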


I built http://vat.landedcost.avalara.com/ using this same architecture pattern.

The site is served up via S3, and the back-end logic is a Lambda module that wraps a SOAP API.


I made a pretty cool Lambda this week: taking Mandrill's inbound email API, processing it through Lambda, then posting the result to my Redmine Docker server. After a lot of fiddling (Lambda doesn't support x-www-form-urlencoded), it now works great.


Have you seen this? https://forums.aws.amazon.com/thread.jspa?messageID=673863

It's a mapping template for the AWS API Gateway you can use to convert both HTML form POSTed data and HTTP GET query string data to JSON.


Yeah, I used a mapping template similar to this. Sorry for the late reply


Is there a particular reason why Amazon chose JavaScript? I'm seeing more and more PaaS services going Node.js first/only and am wondering if there's an underlying reason.


AWS Lambda supports nodejs and jvm-based languages (Java, Scala, Clojure, etc.) directly, and lets you run Python, shell scripts, and arbitrary executables as well. We started with nodejs because it worked nicely for expressing our initial launch scenario, event handlers.


Yes, I am aware that you support Python, etc., but nodejs is your first-class language. For example, even your docs only mention nodejs (and Java recently) [1].

What is even more interesting is that you felt it worked nicely for expressing event handlers. Can you talk a bit more about that? Very interesting to see why not something like Python or Ruby. I know that nodejs is a callback-oriented framework... was it the fact that you can test locally on nodejs and get output consistent with what you'd see on Lambda?

[1] https://aws.amazon.com/lambda/faqs/


Popularity?


I was thinking... How can you build a serverless web app with an SEO-friendly dynamic URL structure, e.g. e-commerce, a social network, etc.? Does anyone have ideas on that?


I don't think servers have a lot to do with this -- when you go to a URL, and a page gets returned, why does it matter where the data came from?

For example: Say you've got an AngularJS app sitting in S3 or something, and your backend is a Node.js app running in Lambda. Google finds a link to "random-new-social-network.com/profile/drinchev" somewhere and tries to index it -- their request is routed to "random-new-social-network.com," where Angular recognizes "/profile/drinchev" as a route to a profile for some user named drinchev, pulls in the "profile" template, and spits out your profile, where Google can read it and call it a day.

If you're talking about search engines getting along with Javascript-reliant sites, that's a different story, but I don't think I see the problem.



