
AWS Lambda: a few years of advancement and we are back to stored procedures - frostmatthew
http://it20.info/2016/04/aws-lambda-a-few-years-of-advancement-and-we-are-back-to-stored-procedures/
======
djhworld
> _There is some truth to it. This, however, isn’t due (too much) by how you
> write the code from a syntax perspective: while coding my Python program I
> noticed that when the function is run in the context of Lambda, the platform
> expects to pass a couple of parameters to the function (“event” and
> “context”). I had to tweak my original code to include those two inputs
> (even though I make no use of them in my program)._

This is just a product of bad design

A better approach would be to decouple all your business logic and provide an
interface to it.

e.g.

    
    
        MyBusinessLogic
            doSomeBusiness
    
        MyLambdaRequestHandler
            - handleRequest
              - call myBusinessLogic.doSomeBusiness
    

That way if you need to move away from AWS Lambda, you can simply remove
MyLambdaRequestHandler and any associated unit tests and your application code
is unaffected.
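
A minimal Python sketch of that layout, assuming a made-up pricing rule as the business logic. The names `do_some_business` and `handle_request` are hypothetical; only the `(event, context)` signature comes from the quoted article.

```python
# Hypothetical example: the business rule and all names are illustrative.

def do_some_business(amount: float, rate: float) -> float:
    """Pure business logic: knows nothing about Lambda."""
    return round(amount * (1 + rate), 2)


def handle_request(event, context):
    """Thin Lambda adapter: unpack the event, delegate, wrap the result."""
    total = do_some_business(event["amount"], event["rate"])
    return {"total": total}
```

Moving off Lambda would then mean replacing only `handle_request`; `do_some_business` and its unit tests stay as they are.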

~~~
exelius
I personally hate stored procedures because they invite bad design. The code
for them rarely makes it into source control, and even when it does there are
often contextual parts of the schema that aren't included.

If you can do them properly (i.e. source control everything, proper schema
update scripts, etc) they can work ok. Just hope your marketing guys don't want
to do any A/B testing of various algorithms.

Stored procedures, IMO, are a case where DBAs should be pushing their pain
down to the software engineering teams. I think a lot of this delineation will
disappear as cross-functional product teams become more integrated across the
industry, however.

~~~
skrowl
I couldn't disagree more. Hand-tuned stored procedures are always more
performant (and sometimes even more SQL injection resistant) than anything
spit out by ORMs.

It's more minor / maybe just me, but I also like the fact that I can replace
logic in a stored procedure without having to push out a whole new software
build unless I change the inputs or outputs.

~~~
politician
> Hand-tuned T-SQL is typically more performant than ... ORMs

It's not ORMs vs Stored Procs unless your data access framework is garbage.

~~~
spdustin
Is that fair? What if the language of an application that the developer must
extend has only "garbage" (in your opinion) ORMs? Which ORMs allow me to
update an object without querying for it first? And before you say "you
shouldn't do that", remember: your domain experience is different from mine.
My domain experience includes guaranteed single-concurrent-accessor systems
where the service isn't in any danger of racing another user to the data.

Many ORMs will double the DB<->app traffic when a single column of a single
row is changed, even if only the primary key of an existing record and the NEW
column value are the only things that need to be sent, and the front-end
application is the only possible source for BOTH values. Sure, I'd need to
check for consistency - is this record the same as when I first queried it? -
but a trigger (don't laugh, that'd be a mistake) that locks a row and
increments a "row_version" value on update could allow me to optimize the
consistency check to a simple "SELECT id, row_version FROM my_table". I'm
personally not aware of any ORM that allows for that kind of optimization,
though I'm also certainly aware that I'm not an expert in all ORMs and all
things related to RDBMSs.

ORMs have their place. Many solve common problems well. Just as many miss some
optimizations that _may be relevant to one person, but not to you_.

------
snockerton
Usually Massimo's stuff is quite well thought out and practical when it comes
to adopting new models in the Enterprise, but this article has a "I'm annoyed
by this platform because it doesn't work how I want it to" sentiment
throughout. He seems to miss a lot of the potential value that the serverless
architecture model brings to the table.

It's pretty well established that Amazon's usual approach is to provide a
"toolbox" of services that can be used to build any number of app architecture
permutations, without all the typical fluff and polish expected by large
business customers. I for one, as an engineer, appreciate this model since
it's much more accessible and lightweight.

~~~
toomuchtodo
> He seems to miss a lot of the potential value that the serverless
> architecture model brings to the table.

It's a code evaluation platform, running containerized under the hood, and with
a large markup by Amazon. Companies have offered this for years before AWS,
but AWS is tooting the horn louder than people have before.

Is it useful? Yeah, sure. Is it revolutionary? Oh come on now.

Just like "the cloud" is just timeshare on someone else's computer, this is
simply code management and execution abstracted a few more layers up.

EDIT: Sorry I'm not on the hype train folks.

~~~
djhworld
The infrastructure isn't revolutionary, I believe IRON.IO and others have
similar platforms.

The innovative aspect of AWS Lambda is that its triggers hook into a variety of
popular AWS services, e.g. S3, Kinesis etc, with very little setup required
from the developer, along with an easy interface and setup process. It
abstracts away from containers and operating systems.

Google and Microsoft have recently launched beta versions offering 'Lambda'
like functionality. AWS has led on this one, I don't think you can argue
against that.

~~~
Mizza
Another major piece of this puzzle that people overlook is AWS API Gateway,
which provides a web gateway to Lambda. It's a newer product (and quite poorly
PM'd, quite frankly), but it's an important part of the puzzle that no other
cloud function provider has yet.

~~~
djhworld
I've not used API Gateway, but I believe Microsoft support invoking their
'Azure Functions' style lambda service via HTTP calls

[https://azure.microsoft.com/en-us/documentation/articles/fun...](https://azure.microsoft.com/en-us/documentation/articles/functions-reference/#bindings)

~~~
jasonlotito
You can do that with Lambda as well. The API Gateway is an abstraction in
front of that, giving you more control than connecting to Lambda directly.

~~~
nivertech
I tried to build a serverless contact form by hooking up an AWS Lambda function
to an S3 HTTP API event source, and it's hard to configure and doesn't work well.
In any case it's not an alternative to HTTP Request triggers in Google Cloud
Functions or Azure Functions.

------
Mizza
The site appears to be down.

One way to avoid that is to use web frameworks based on AWS Lambda, such as
Zappa - [https://github.com/Miserlou/Zappa](https://github.com/Miserlou/Zappa)

~~~
pweissbrod
At first I thought this was a clever joke. Thanks for sharing this link.
Serverless apps could definitely make sense in many scenarios!

------
johansch
I don't understand why Amazon doesn't launch a clone of App Engine. This is
the holy grail in NoOps. AWS Lambda + API Gateway is a bureaucratic exercise
in pain, true to AWS form, while App Engine is a pleasure to use. Anyone who
has used both services will understand me.

It seems to me (from reading HN) like most people are avoiding using App
Engine because Google tortured and killed their pet, err, I mean cancelled
Google Reader three years ago.

(And I guess, the inference is that because Google killed a tiny unprofitable
service once in the past, you can not realistically depend on them to continue
to provide services they are actually putting real money into because it is of
strategic interest. Yeah, that makes total brogrammer sense, let's go with
that line of thinking.)

~~~
jessaustin
G's habit of killing services doesn't explain the extent of AWS's dominance.
There must be some way that AWS is actually superior for popular uses.

~~~
nostrademons
AppEngine had the misfortune of being too early.

I remember checking out both in 2007, right when AppEngine first launched. EC2
was just Linux; you had a familiar virtual machine to work with, and you could
port your existing code over with virtually no effort. AppEngine required
learning a whole new API, which was completely non-portable and locked in to
Google, and that was a complete non-starter.

Also, being a PaaS, there was a lot more to get right with AppEngine, and the
API changed a lot over the years. I remember looking at it in 2007 and
thinking "There is no possible way I can get work done with this." In 2009, it
was "Well, it _might_ work, but it's way too much hassle." In 2010, it was
"Nope, still too hard." Finally in 2012, I tried it again and it was like
"Woah, finally this stuff is usable, and it's actually pretty pleasant." But
by then, the outside world had Heroku, it had Beanstalk, it had Parse &
Firebase, and there's a huge ecosystem around AWS.

It's a pretty good example of path dependence. If you're just coming to cloud
computing _now_ and have never run your own software before, AppEngine is
actually a fairly nice alternative. But in 2007-2009, when a lot of us were
first checking it out, it was an overly-complicated, vendor-lock-in mess. And
vendor lock-in mattered a lot more then; now many developers have just
accepted that they're going to be locked into certain platforms, but back
then, building on anything that didn't have a publicly-specified API was a
huge mistake.

~~~
johansch
This matches my experience. Right now, AppEngine is really nice (pleasant).

And if you use Flask/WSGI interfaces + Cloud SQL there isn't even a lock-in.

The only big issue I can see: no AppEngine in China. :/

At the moment I'm resigned to use app engine for the world-minus china, and
some AWS EC2 inside China, running the same code plus a lot more admin work.

------
avitzurel
All these years of progress and a site can't be up when it's on HN?

I worked with Lambda a lot, piping the JSON input into Go programs, and I
could not be happier with it.

I work with Go just like I am used to, testing, compiling, CI and everything
and then, I have a shell script that deploys it to Lambda (uploads the zip).

For what I need, lambda is absolutely great!

~~~
derefr
On that tangent: I've always personally wondered why social bookmarking sites
(Reddit, HN, etc.) don't cache/mirror/archive the sites people link to through
them. The original content site's ads and analytics and such are usually
client-side JS anyway, so they'd still get dynamically served even on the
mirrored page. The original author would still make money from the traffic;
they'd just avoid having to pay for the concomitant DDoS.

------
gamache
I didn't find much in the author's post that I agreed with. In particular, a
few of his points stuck out to me:

> In traditional PaaS world the code is the indisputable protagonist (oh,
> damn, and you also happen to need a persistent data service to store those
> transactions BTW). > With Lambda the data is the indisputable protagonist
> (oh and you also happen to attach code to it to build some logic around
> data). > A few years of advancement and we are back to stored procedures.

I don't agree at all. Lambda is code. It's code that is invoked upon receiving
certain kinds of data, but in no way is the Lambda code subservient to the
data, or (unlike stored procedures) even colocated with it. Lambda still needs
persistent data services to play with persistent data -- there's nothing magic
about it.

> There also have been a lot of discussions as of late re the risk of being
> locked-in by abusing Serverless architectures (like Lambda). > There is some
> truth to it. This, however, isn’t due (too much) by how you write the code
> from a syntax perspective: while coding my Python program I noticed that
> when the function is run in the context of Lambda, the platform expects to
> pass a couple of parameters to the function (“event” and “context”). I had
> to tweak my original code to include those two inputs (even though I make no
> use of them in my program).

The author describes having to write a controller method. This is not
surprising, considering he was trying to make a web service.

> So, IMO, the lock-in will not be a function of how different the syntax in
> your code will be Vs. running it on a platform you control (probably
> minimally different) but rather in how scattered and interleaved with other
> services your code will be (at scale).

This is the most cogent point in the article. API Gateway + AWS Lambda can be
used to create micro-microservices. Serverless Framework tries to wrangle this
potential complexity by allowing users to group related lambdas/endpoints as a
whole, but there is still the opportunity to create a real rat's nest of logic
if we're not careful.

> P.S. Yes, I know that it’s called “Serverless” but it doesn’t mean “there
> are no servers involved”. Are we really discussing this?

Yeah, obviously servers are involved. But the fact that I don't have to care
about those servers nearly as much (in terms of maintenance or in terms of up-
front cost -- both are big wins for most customers) is worth discussing.

------
exelius
There are different approaches that make sense for different use cases. Is
your use case a big database that you just run some commands on top of? Lambda
seems great in that case! Just build a front-end client, have it interface
directly with Lambda, and you've got a pretty quickly developed app.

Is it the right architecture for everything? Hell no. System architects are in
higher demand than ever because there are so many freaking ways to build
technology products these days, and it's their job to figure out what tech is
right for the job.

------
gamache
The site is being quite slow. Pastebin link:
[http://pastebin.com/BceUgZk1](http://pastebin.com/BceUgZk1)

------
atemerev
Not there yet. Stored procedures are seamlessly integrated with underlying
database. There are some hurdles with that for Lambda.

But the critical point is that Lambda is distributed and massively scalable,
and stored procedures weren't.

Remember that company, Sun? The one that invented Java? When it was still
alive, it had an unofficial motto, "the net is the computer". Nobody
understood it then; now we know.

All computing science achievements will now be reproduced in a distributed
environment. OS? Check (AWS/DCOS/Kubernetes). Filesystems? Check (IPFS). IPC?
Check (REST/Websocket). Perhaps even "drivers" will be a new thing (for IoT
devices).

------
philliphaydon
Lambda for node devs is terrible at the moment: the version of node is now 13
months old, and with the pace of node development it sucks that AWS can't keep
up even with major versions :( Right now I can't use Lambda.

~~~
singlow
You mean the pace at which you adopt new node features, right? Because the
pace of node development does not dictate how you write code.

------
encoderer
Honestly I think a lot of the frustration is because Lambda is not a fully
developed platform yet. I decided to use it for a new product we're building
at Cronitor and it's not a travesty but I wouldn't have used Lambda if I had
it to do over again.

They give you decent primitives: immutable versions, aliases, easy logging.
But everything else you have to build yourself: You have to figure out a
development loop, deploys, configuration management. There is nothing built-in
to help coordinate lambda deploys across regions.

I expect that you'll see this built out more this year.

~~~
spdustin
> _You have to figure out a development loop, deploys, configuration
> management. There is nothing built-in to help coordinate lambda deploys
> across regions._

[https://github.com/serverless/serverless](https://github.com/serverless/serverless)
solves (upon my initial read-through) every one of your requirements,
including multi-region deploys.

~~~
encoderer
That may be true, but when you start using something like Lambda, you don't
immediately jump to "what 3rd party framework can I use to manage this". It's
not clear what the weaknesses are until you experience them. In the end, we
solved it with a few hundred lines of fabric and bash scripts, but neither of
those solutions makes up for the blown opportunity of a better and more powerful UX.
(Lambda PMs have been on here before and have said they're working on it.)

------
mangeletti
> AWS describes Lambda (their implementation of Serverless) as the way how
> you’ll do think post containers.

Is this sentence was written in double speak or did I just have a stroke?

~~~
mreferre
Sorry that was a typo. I meant to say "how you'll do things post containers".

Corrected. Thanks.

~~~
mangeletti
Thanks. The ironic part of my comment is that my question was written as:

"Is this sentence was written...", which is even more broken.

------
iamleppert
Lambda is very limited due to the very short execution time (5 min CPU time) and limited
amount of disk space (500 MB /tmp). It's always been my thought the reason for
this is that Amazon is effectively running lambda functions on unused (but
possibly reserved) hosts, such that they can easily be shut down and don't
consume a lot of disk space or do anything so no one will notice (or even
tell) the slight performance hiccup.

~~~
djhworld
Lambda has a number of use cases, if you need something to be running for more
than 5 minutes with lots of disk space, you're probably better off running on
EC2 (or docker containers via ECS), it's designed for short lived, stateless
computations.

~~~
derefr
If you need something to be running for more than 5 minutes, you could also
(just theoretically) split your work, load it up into SQS, then have your
Lambda functions consume it, map-reduce style.
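
A toy sketch of that split-and-consume pattern. Python's in-process `queue.Queue` stands in for SQS here, and the squaring workload, `split_work`, `handler` and `run` are all hypothetical names; a real setup would use an actual queue service and separate function invocations.

```python
import queue


def split_work(items, chunk_size):
    """Cut a long job into chunks small enough for one short invocation."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def handler(event, context):
    """The 'map' step: one short-lived invocation handles a single chunk."""
    return sum(x * x for x in event["chunk"])


def run(items, chunk_size=3):
    work_queue = queue.Queue()  # stand-in for SQS (assumption)
    for chunk in split_work(items, chunk_size):
        work_queue.put({"chunk": chunk})

    partials = []
    while not work_queue.empty():
        # Each message would trigger its own function invocation.
        partials.append(handler(work_queue.get(), None))

    return sum(partials)  # the 'reduce' step
```

Each invocation only ever sees one chunk, so no single call needs to run anywhere near the time limit.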

------
derefr
To go off on a complete tangent from the article:

There's nothing _fundamentally_ wrong with stored procedures, per se. What was
wrong with SQL RDBMS stored procedures was that:

1\. each DBMS had its own stored-procedure programming language—and so
application frameworks that wanted to provide compatibility with the generic
idea of a "relational database" couldn't really use them unless they had devs
on their staff familiar with each-and-every DB†;

2\. there was—and basically still is—no concept of an RDBMS stored-procedure
"view" or "schema"—i.e., API versioning for stored procedures, where a client
can request to communicate with the set of stored procedures it was compiled
to support, rather than the single version the database is holding onto today;

3\. one major RDBMS (MySQL) never supported stored procedures at all, so many
devs learned an ossified set of "web development best-practices" without ever
being exposed to the idea of stored procedures as an option.

All of these issues are fixable. #3 is just a historical artifact of MySQL's
laziness; #1 is likewise an artifact of the proprietary, "enterprise lock-in"
nature of the first instances of stored procedures (Microsoft's and Oracle's),
evenly fracturing the ecosystem away from adopting either. Neither is likely
to repeat.

#2 is more pernicious, and to this day seems ill-addressed.

One place I've seen at least an attempt to resolve it is in the design of
Redis's Lua queries, where the "solution" is to refer to the stored procedures
by content-hash, with the database having an always-possible error case that
requires clients to be able to fall back to inserting the stored procedure
again (thus necessitating that all clients track their own copies of any
stored procedures they want to call.)
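
The call-by-hash-with-fallback flow can be sketched roughly as below. `FakeScriptServer` is an in-memory stand-in written for this example, not a Redis client, and its "scripts" are toy Python expressions rather than Lua; only the general shape (call by SHA-1 of the source, reload on a missing-script error) mirrors what Redis does.

```python
import hashlib


class NoScriptError(Exception):
    """Raised when the server has no script cached under the given hash."""


class FakeScriptServer:
    """In-memory stand-in for a server that caches scripts by content hash."""

    def __init__(self):
        self._scripts = {}

    def load(self, source):
        digest = hashlib.sha1(source.encode()).hexdigest()
        self._scripts[digest] = source
        return digest

    def call_by_hash(self, digest, *args):
        if digest not in self._scripts:
            raise NoScriptError(digest)
        # Toy evaluation: the "script" is a Python expression over `args`.
        return eval(self._scripts[digest], {"__builtins__": {}}, {"args": args})

    def flush(self):
        """Simulate the server losing its script cache (e.g. a restart)."""
        self._scripts.clear()


def call_script(server, source, *args):
    """Call by hash; if the server forgot the script, re-send it and retry."""
    digest = hashlib.sha1(source.encode()).hexdigest()
    try:
        return server.call_by_hash(digest, *args)
    except NoScriptError:
        server.load(source)
        return server.call_by_hash(digest, *args)
```

The client has to keep `source` around for the fallback path, which is exactly the "clients track their own copies" cost described above.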

Such a solution could be ported to other RDBMSes; I could imagine Postgres,
for example, having a "database view" concept††, where real databases only
contain raw tables and indexes, and all the view definitions, triggers, stored
procedures, constraints, and even typedefs are held in some
record/spec/document that can be both manipulated as data, and connected to as
a database. This is sort of equivalent to the CouchDB 'design document'
concept.

\---

† Sadly, it's really just a syntax problem. If you had several DBMSes that all
had the same syntax but different extensional semantics, it'd be very easy to
write a single code-generator into your application framework that would spit
out appropriate code to take advantage of the extensions available. That's how
regular ORM SQL-generation works, after all. But when you have disparate
syntaxes, suddenly you need disparate code-generators, which get out of sync
and lose features (or an LLVM-like intermediate-representation that you can do
the semantic-optimization steps to before finally doing the codegen step, but
I can't imagine that'd be cheap enough to slot into a webapp's hot loop.)

†† To go all the way with such a concept, the real 'data' of the database—the
tables and the indexes—can become floating objects, not contained "in"
anything or defined anywhere, merely existing because of a ref-count from
various vDB schemas. You wouldn't explicitly define tables; instead, you'd
define your views (relational projections) and then assert identity
relationships between some of the columns of those projections, causing one
"table" to exist holding the underlying data for both views. This is, AFAIK,
what
[https://en.wikipedia.org/wiki/Dabble_DB](https://en.wikipedia.org/wiki/Dabble_DB)
was working toward.

~~~
testrun
Can add a few more:

\- Depending on database, the dev tool environment can be extremely limited

\- Depending on database, debugging can be a nightmare

\- Difficult, if not impossible, to scale over more than one node

\- Another version dependency problem added

\- If you want to sell your software, or services depending on an application
that uses stored procedures, you need to be very careful how you manage
licenses.

------
jmlucjav
not sure about the other versions but working with Java, Lambda was quite a
terrible experience (at least if you needed to include a bunch of jars for
your work).

~~~
djhworld
I've worked on > 6 lambda functions now across various projects using Java,
not sure what your problem is.

If you need to include a bunch of jars, using Maven + the maven shade plugin
(or assembly plugin) to generate a 'fat jar' is very simple.

In fact, their official documentation states exactly this
[http://docs.aws.amazon.com/lambda/latest/dg/java-create-jar-...](http://docs.aws.amazon.com/lambda/latest/dg/java-create-jar-pkg-maven-no-ide.html)

~~~
jmlucjav
Issues I saw (it was months ago, maybe now it's better):

\- was not clear what was the limit on the total size, I think it was supposed
to be 50mb, but it was not exact.

\- edit the lambda function code, and wait 10 min till it is uploaded by the
eclipse plugin

\- finally not being able to attach a debugger

On the plus side, once you had all up and running, I agree it worked nicely. I
complain about the developer experience only.

~~~
namero999
I just finished extracting a module of my monolith to AWS Lambda in Java
today. I loved the experience, but first I had to configure a CI in AWS so that
the build and deployment cycle was fast.

Since my function needs phantomjs, I embedded the binaries into the deployment
package and by just doing that I used up 36 MB of the 50 allowed.
Transferring it from my local regular internet was a pain. Now it's nice, I
push code to my dev branch, it gets picked up by the CI, tested, built and
deployed. I get a notification in the IDE when the whole roundtrip is done,
and with the CI in AWS it takes seconds instead of minutes. Without binaries
you can still fit jackson, a few AWS clients, groovy runtime, guava and
httpclient in a few megs, which is manageable.

I agree about the debugger, but authoring a function is trivial and unit tests
can be written and run without considering Lambda. I miss tailing logs,
CloudWatch is nice but it's far from realtime.

