AWS Lambda: a few years of advancement and we are back to stored procedures (it20.info)
100 points by frostmatthew on April 4, 2016 | 87 comments



> There is some truth to it. This, however, isn’t due (too much) by how you write the code from a syntax perspective: while coding my Python program I noticed that when the function is run in the context of Lambda, the platform expects to pass a couple of parameters to the function (“event” and “context”). I had to tweak my original code to include those two inputs (even though I make no use of them in my program).

This is just a product of bad design

A better approach would be to decouple all your business logic and provide an interface to it.

e.g.

    MyBusinessLogic
        doSomeBusiness

    MyLambdaRequestHandler
        - handleRequest
          - call myBusinessLogic.doSomeBusiness
That way if you need to move away from AWS Lambda, you can simply remove MyLambdaRequestHandler and any associated unit tests and your application code is unaffected.
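
A minimal sketch of that layout in Python (file and function names are just illustrative; the `event`/`context` pair is the signature Lambda expects):

    # business_logic.py - plain Python, knows nothing about Lambda, unit-testable anywhere
    def do_some_business(payload):
        return {"processed": payload}

    # lambda_handler.py - the only Lambda-specific file; delete it and nothing else changes
    from business_logic import do_some_business

    def handle_request(event, context):
        # adapt Lambda's "event"/"context" calling convention to the business interface
        return do_some_business(event.get("payload"))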


Yep, exactly. We're currently running the exact same code on both AWS Lambda and IronWorker; all that differs is a simple handler file.

We're looking to add support for Microsoft's new Azure Functions and Google Cloud Functions, and this will be a matter of creating a single file for each to handle the input.

You should always abstract your dependencies, especially if it's a critical part of your infrastructure.


We have been working on a similar effort, to allow businesses and developers to avoid lock-in and be cloud-agnostic:

https://github.com/MitocGroup/deep-framework

From our experience, this is pretty hard, but not impossible!


Example code would be helpful.


I personally hate stored procedures because they invite bad design. The code for them rarely makes it into source control, and even when it does there are often contextual parts of the schema that aren't included.

If you can do them properly (i.e. source control everything, proper schema update scripts, etc) they can work ok. Just hope your marketing guys don't want to do any A/B testing of various algorithms.

Stored procedures, IMO, are a case where DBAs should be pushing their pain down to the software engineering teams. I think a lot of this delineation will disappear as cross-functional product teams become more integrated across the industry, however.


> I personally hate stored procedures because they invite bad design. The code for them rarely makes it into source control, and even when it does there are often contextual parts of the schema that aren't included.

That's kinda like saying "I personally hate <Language> because it invites bad design. The code for them rarely makes it into source control". I've never worked on a project that used stored procedures where they were not only checked into source control but provided the full necessary context so any "fresh" dev box could grab them and test them. Half the projects I've worked on with stored procedures even had unit tests!

> Stored procedures, IMO, are a case where DBAs should be pushing their pain down to the software engineering teams.

DBAs are just specialized developers or dev-ops. There is no reason stored procedures shouldn't be written by whoever is writing the application itself.

Personally, I love sprocs! I can fine-tune my DB access control so it's impossible to do anything but call a set of sprocs; the sprocs have zero dynamic code, so no injection of commands is possible; each db user can access different sprocs; and I have this nice separation of concerns where my DB layer is simply a dumb black box that takes in input and returns output. The same goes for all of the other layers of the app.


> I've never worked on a project that used stored procedures where they were not only checked into source control but provided the full necessary context so any "fresh" dev box could grab them and test them.

Did you mean to say the opposite of this?


Basically. Who knows what upvotes support my screwed up text versus my actual opinion lol. Can't go back and edit. Oh well.


I love stored procedures, but I feel the pain of source control management. I hope a system comes around to make this easy, ideally a system where you give it your current version number and branch when you're trying to update, and it generates a SQL update file, which really is just all the patches appended in order.

Ideally it would also allow you to restore to a previous point but anything that isn't purely schema related (like an UPDATE OR INSERT) probably needs to be done by hand.


Does Liquibase[1] do what you describe?

[1] http://www.liquibase.org/


Would something like the Rails database migration system work for this (i.e. a DSL for describing common db transformation tasks, with tooling to put you at the right version, rollbacks, etc.)?


Where I work we have a big directory with one file for each stored procedure.

On deploy, all the stored procedures will get updated from that directory. Rolling back would consist of deploying an earlier version of the repository with the procedures in it.
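
For illustration, a hedged sketch of that deploy step in Python (the directory layout and Postgres connection are made-up assumptions; each file is assumed to hold one CREATE OR REPLACE statement):

    import glob
    import psycopg2  # assuming Postgres; any DB-API driver works the same way

    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:
        # re-apply every stored procedure from the repository, in a stable order
        for path in sorted(glob.glob("procedures/*.sql")):
            with open(path) as f:
                cur.execute(f.read())
    conn.close()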


Years ago I wrote some custom scripts to do exactly this. Deploy consisted of having all of the sprocs migrate. At the time I also encrypted all of the sprocs because some admins had a hard time not making changes to the live system instead of in source control.


TLDR: You don't mind stored-procedures if the database schema is in source control like it should be?


I couldn't disagree more. Hand-tuned stored procedures are always more performant (and sometimes even more SQL injection resistant) than anything spit out by ORMs.

It's more minor / maybe just me, but I also like the fact that I can replace logic in a stored procedure without having to push out a whole new software build unless I change the inputs or outputs.


I think of stored procedures as an API, and I check them into source control. Especially with SQL Server and the debugging hooks available to me, there's a good chance that my stored proc will execute much faster than any other interpreted language/DSL/ORM

But I limit the ways in which I use them. Only to enforce rules intended solely for the database, for example, like shadowing an entry or changeset to another table. I also sometimes use them to optimize the retrieval of some cursors where I know from testing that SQL Server's query plan will end up returning faster than a view or (shiver) an app-side aggregation.

In short, like everything else in our world, fellow hackers, there's a time and a place for everything, and it's (in my opinion) irresponsible to say that you know better than someone else with different domain experience than you.

Edit: to my parent post, I'm agreeing with you, btw. Just happened to be a good place to hang my reply.


I don't think it's healthy to have engineers who are actually willing to sneak in invisible and unreviewed changes, and deployment shouldn't be so painful that they're even tempted to do such a reckless thing.


> Hand-tuned T-SQL is typically more performant than ... ORMs

It's not ORMs vs Stored Procs unless your data access framework is garbage.


Is that fair? What if the language of an application that the developer must extend has only "garbage" (in your opinion) ORMs? Which ORMs allow me to update an object without querying for it first? And before you say "you shouldn't do that", remember: your domain experience is different from mine. My domain experience includes guaranteed single-concurrent-accessor systems where the service isn't in any danger of racing another user to the data.

Many ORMs will double the DB<->app traffic when a single column of a single row is changed, even if the primary key of an existing record and the NEW column value are the only things that need to be sent, and the front-end application is the only possible source for BOTH values. Sure, I'd need to check for consistency - is this record the same as when I first queried it? - but a trigger (don't laugh, that'd be a mistake) that locks a row and increments a "row_version" value on update could allow me to optimize the consistency check to a simple "SELECT id, row_version FROM my_table". I'm personally not aware of any ORM that allows for that kind of optimization, though I'm also certainly aware that I'm not an expert in all ORMs and all things related to RDBMSs.

ORMs have their place. Many solve common problems well. Just as many miss some optimizations that may be relevant to one person, but not to you.
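
To make that concrete, here's a hedged sketch (made-up table/column names, any DB-API cursor) of the kind of update-without-fetching-first being described, with the row_version consistency check folded into the statement rather than done via a trigger:

    def update_name(cur, record_id, new_name, expected_version):
        # send only the primary key, the new value, and the version we last saw;
        # no prior SELECT of the full row is needed
        cur.execute(
            "UPDATE my_table SET name = %s, row_version = row_version + 1 "
            "WHERE id = %s AND row_version = %s",
            (new_name, record_id, expected_version),
        )
        if cur.rowcount == 0:
            raise RuntimeError("row changed underneath us (or does not exist)")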


I personally hate stored procedures...

Ah yes, SQL injection will be with us for many years yet.


If you're using sprocs to fix SQL injection you're Doing It Wrong.


There are corner cases in which stored procs don't totally protect the foolish, but they fix 95% of the problem. What "Right" solution are you using that eschews stored procedures completely? [EDIT:] Of course if your use case allows you can just get by with a tiny whitelist. Nice work if you can get it.


(Fan of sprocs here, but) Parameterized queries can give you the same level of protection from SQL injection as sprocs.


Haha I think we mostly agree but I doubt that's what 'bcoates had in mind...


No, he's right: parameterized queries are how you prevent SQL injection (imo, the only correct answer). Doing a stored procedure for an SQL one-liner is just bringing a world of pain.
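
For anyone unfamiliar, "parameterized" just means the values are bound by the driver instead of spliced into the SQL string. A minimal Python sketch (sqlite3 used purely for illustration):

    import sqlite3

    user_supplied = "Robert'); DROP TABLE users;--"

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    # the ? placeholder is filled in by the driver, so the input is never parsed as SQL
    rows = conn.execute("SELECT id, name FROM users WHERE name = ?", (user_supplied,)).fetchall()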


Sprocs give you finer grained access control though (a sproc for which a user has permissions can perform actions that the user can't do directly) as well as being very quick to patch in a prod emergency (useful in the case where you are using a compiled or otherwise slow to deploy language). I've made many a one-line sproc in my day; the overhead at dev time seems minimal by comparison, but that's just my opinion. I can see the other side as well.


Do you find that all your service endpoints are naturally just a one-liner? I guess tastes differ with respect to schema organization, etc. Sorry for misunderstanding!


That seems sarcastic to the point of ignorance. Don't conflate ORMs with prepared statements. Input to a stored procedure can be sanitized like any other query.


This is just a product of bad design

Perhaps, rather, a product of an infelicitous programming language? You don't have to worry about extra function arguments in javascript.

It seems common in C, python, and similarly arity-concerned languages to just accept a pointer to a struct (dict, etc.) when coding a callback for an API like this. That way the callback can examine as much or as little of the struct as it wants, and much less needs to change in future when the callback needs more.


> Perhaps, rather, a product of an infelicitous programming language? You don't have to worry about extra function arguments in javascript.

I can't even tell when these things are troll posts and when they are meant to be serious anymore.


TFA had, "I had to tweak my original code to include those two inputs!!?#\!" If that's such a severe "problem", I think I've suggested a plausible "solution". b^)


There are trade-offs for everything. The extra function arguments in Javascript cause all sorts of bugs when you alter the signature of a callback, miss a call site, and the new argument is suddenly being interpreted (without any crash or error message!) according to the semantics of a different argument.

The callback-args struct is common, but has its own set of trade-offs too: it frequently becomes a "data whale", a dumping ground for everything that could happen with no indication in the source code of which parts a subsystem actually uses. Frequently, I've seen these replaced by a dependency injection framework as the software and team gets larger (which has its own set of problems, with it becoming difficult to holistically understand the data flow within the system).

Keyword args are popular in languages that support them, too. They solve the semantic problems of missing positional args, and are semantically identical to the option struct approach but with a bunch of syntactic sugar.


It's an extremely easy change in python, and depending on how you're writing your functions, it may work without any changes (just like Javascript). If you write your functions in python to accept *args and **kwargs, it will automatically handle any number of positional arguments and keyword arguments.
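
For example (a tiny illustrative sketch):

    # works whether it is called as handler(), handler(event), or handler(event, context)
    def handler(*args, **kwargs):
        event = args[0] if args else kwargs.get("event", {})
        return {"received": event}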


> in C, python

Python actually has much better support for varargs than JavaScript used to have. See: https://docs.python.org/dev/tutorial/controlflow.html#keywor...


Usually Massimo's stuff is quite well thought out and practical when it comes to adopting new models in the Enterprise, but this article has an "I'm annoyed by this platform because it doesn't work how I want it to" sentiment throughout. He seems to miss a lot of the potential value that the serverless architecture model brings to the table.

It's pretty well established that Amazon's usual approach is to provide a "toolbox" of services that can be used to build any number of app architecture permutations, without all the typical fluff and polish expected by large business customers. I for one, as an engineer, appreciate this model since it's much more accessible and lightweight.


> He seems to miss a lot of the potential value that the serverless architecture model brings to the table.

It's a code evaluation platform, running containerized under the hood, and with a large markup by Amazon. Companies have offered this for years before AWS, but AWS is tooting the horn louder than people have before.

Is it useful? Yeah, sure. Is it revolutionary? Oh come on now.

Just like "the cloud" is just timeshare on someone else's computer, this is simply code management and execution abstracted a few more layers up.

EDIT: Sorry I'm not on the hype train folks.


The infrastructure isn't revolutionary; I believe Iron.io and others have similar platforms.

The innovative aspect of AWS Lambda is that its triggers hook into a variety of popular AWS services, e.g. S3, Kinesis, etc., with very little setup required from the developer, along with an easy interface and setup process. It abstracts away from containers and operating systems.

Google and Microsoft have recently launched beta versions offering 'Lambda' like functionality. AWS has led on this one, I don't think you can argue against that.


I'm with Iron.io, thanks for the mention. Interesting to see the architecture/patterns so front and center all of a sudden, it's what we've been doing for a while now.

Very true that a key benefit of AWS Lambda is the ability to hook into the internals, but to the article's point, that's a pretty significant level of lock-in. We recommend that customers who want a similar level of functionality hook the internal events up to SNS, at which point a job can be triggered on our end.

We operate across any cloud, standardizing through Docker images as the unit of code. It's "serverless" to the developer in that the only configuration is setting the event triggers. Of course there's compute involved, but it's outside of the development lifecycle.


Another major piece of this puzzle that people overlook is AWS API Gateway, which provides a web gateway to Lambda. It's a newer product (and quite poorly PM'd, quite frankly), but it's an important part of the puzzle that no other cloud function provider has yet.


I've not used API Gateway, but I believe Microsoft supports invoking their 'Azure Functions'-style lambda service via HTTP calls:

https://azure.microsoft.com/en-us/documentation/articles/fun...


You can do that with Lambda as well. The API Gateway is an abstraction in front of that, giving you more control than connecting to Lambda directly.


I tried to build a serverless contact form by hooking an AWS Lambda function up to an S3 HTTP API event source, and it was hard to configure and didn't work well. In any case, it's not an alternative to the HTTP request triggers in Google Cloud Functions or Azure Functions.


For some reason I can't reply directly to your comment, but from that chart it looks like you can invoke a function, but not pass it any input, which makes it vastly less useful, and means that it can't be used for serverless web frameworks.


I figured that the truly innovative aspect here was the pricing model. Other similar services expect your "containers" to run 24/7. Lambda runs your container for as long as the request takes, and only charges you for that. It's like what Heroku was originally trying to do with its Aspen Ruby stack, plugging everyone's apps as handlers into Passenger and only (quickly) spinning them up when they get requests.

You've been able to do the same thing forever by just writing everything as (non-F/non-WS)CGI scripts, or as isolated PHP servlets. But nobody ever charged for those by tracking CPU time, rather than just imposing a monthly fee + caps.


@snockerton thanks for the kind comments. No, that was NOT (by any means) how I approached the post. Arguably the attempt at humor in the article and the choice of a misleading title may have misled readers into thinking I was dismissive of the technology. But that wasn't the intent. I do find Lambda (and Serverless in general) very intriguing, and I agree about its potential.


The site appears to be down.

One way to avoid that is to use web frameworks based on AWS Lambda, such as Zappa - https://github.com/Miserlou/Zappa


At first I thought this was a clever joke. Thanks for sharing this link. Serverless apps could definitely make sense in many scenarios!


Sorry, yes it has been down for a while. My basic hosting isn't sized to handle north of 8000 hits in a couple of hours :) (I did not expect this much, as on average my site gets a few hundred hits per day).


I don't understand why Amazon doesn't launch a clone of App Engine. This is the holy grail in NoOps. AWS Lambda + API Gateway is a bureaucratic exercise in pain, true to AWS form, while App Engine is a pleasure to use. Anyone who has used both services will understand me.

It seems to me (from reading HN) like most people are avoiding using App Engine because Google tortured and killed their pet, err, I mean cancelled Google Reader three years ago.

(And I guess the inference is that because Google killed a tiny unprofitable service once in the past, you can not realistically depend on them to continue to provide services they are actually putting real money into because of strategic interest. Yeah, that makes total brogrammer sense, let's go with that line of thinking.)


I've been using App Engine for many projects since 2008. Like any tool, it's not the right tool for every job, but for its intended use cases, it's great. Being able to click Deploy, and not have to deal with sys admin is really convenient.

The one area where App Engine falls a bit short is lack of support for some widely used libraries like Numpy. It would be nice if The Google would add support for those (support for some transcoding libraries would be nice as well). Even better would be an interface to TensorFlow.


I've also been using GAE daily since 2008.

Brian - the new "Flexible Environments" (aka Docker aka Managed VMs) give you Numpy, python3, native libraries and more. I took my latest project into production with it, and it's been pretty painless.

https://groups.google.com/forum/#!searchin/google-appengine/...


G's habit of killing services doesn't explain the extent of AWS's dominance. There must be some way that AWS is actually superior for popular uses.


AppEngine had the misfortune of being too early.

I remember checking out both in 2007, right when AppEngine first launched. EC2 was just Linux; you had a familiar virtual machine to work with, and you could port your existing code over with virtually no effort. AppEngine required learning a whole new API, which was completely non-portable and locked in to Google, and that was a complete non-starter.

Also, being a PaaS, there was a lot more to get right with AppEngine, and the API changed a lot over the years. I remember looking at it in 2007 and thinking "There is no possible way I can get work done with this." In 2009, it was "Well, it might work, but it's way too much hassle." In 2010, it was "Nope, still too hard." Finally in 2012, I tried it again and it was like "Woah, finally this stuff is usable, and it's actually pretty pleasant." But by then, the outside world had Heroku, it had Beanstalk, it had Parse & Firebase, and there's a huge ecosystem around AWS.

It's a pretty good example of path dependence. If you're just coming to cloud computing now and have never run your own software before, AppEngine is actually a fairly nice alternative. But in 2007-2009, when a lot of us were first checking it out, it was an overly-complicated, vendor-lock-in mess. And vendor lock-in mattered a lot more then; now many developers have just accepted that they're going to be locked into certain platforms, but back then, building on anything that didn't have a publicly-specified API was a huge mistake.


This matches my experience. Right now, AppEngine is really nice (pleasant).

And if you use Flask/WSGI interfaces + Cloud SQL there isn't even a lock-in.

The only big issue I can see: no AppEngine in China. :/

At the moment I'm resigned to using App Engine for the world minus China, and some AWS EC2 inside China, running the same code plus a lot more admin work.


Even without an AppEngine style PaaS, AWS had a lot of cloud services available before Google (EC2 and S3 are notable services AWS had long before Google had anything comparable.)

I think a lot of the popularity advantage AWS has is a result of being first mover in many of these areas.


Or maybe Silicon Valley is a fiscally incompetent echo chamber, because, well, they don't tend to worry much about money until it's too late.


Let's stipulate that it is. That's still not a complete explanation.


Well, they had a pretty big head start. And I do remember not understanding why people were going for AWS 4-5 years ago instead of self-hosted clouds of rented actual servers at 1/5 the cost.

Besides that, when I started evaluating both services last summer, I didn't find much in favor of AWS compared to GC.

(Besides AWS China. GC does not currently operate in China. I really, really want them to fix this.)

Oh, and in case you wanted an explanation: it's called groupthink and echo chamber. In (the more frugal, because less VC) Europe we were all amazed at how much American startups would pay for hosting.


For the kinds of things medium-big spenders spend money on (mostly, largish amounts of network storage and the network to connect it to app servers) AWS was the first major provider to be decently competitive. I remember pricing this out at the time: a lot of what you're paying for on EC2 is proximity to EBS, which other providers at the time either didn't have an equivalent to or charged several multiples of AWS prices for.

If it's 2010 and you're used to the costs of a hosted or colo'd system with tens of TB of NetApp storage, starting your next project on AWS was compelling in a way the discount providers were not.


Elastic Beanstalk is the answer to App Engine and has been around for many years.


It's not really. Elastic Beanstalk is about dynamic provisioning of virtual machines from AMI images. It doesn't solve the security maintenance issue, for one.


Well, if you take it that way, sure, but the goal of EB is to provide a solution similar to GAE, and customers should be replacing instances as often as possible.


All these years of progress and a site can't be up when it's on HN?

I've worked with Lambda a lot, piping the JSON input into Go programs, and I couldn't be happier with it.

I work with Go just like I'm used to: testing, compiling, CI and everything. Then I have a shell script that deploys it to Lambda (uploads the zip).

For what I need, lambda is absolutely great!
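
(For context: Lambda had no native Go runtime then, so the usual trick was a thin shim in a supported runtime that runs the compiled binary and pipes the event to it as JSON. A hedged sketch of such a shim, here in Python with a made-up binary name; the commenter's actual setup may differ:)

    import json
    import subprocess

    def handler(event, context):
        # run the bundled Go binary, feed it the event as JSON on stdin, return its stdout
        proc = subprocess.run(
            ["./mygoapp"],
            input=json.dumps(event).encode(),
            capture_output=True,
            check=True,
        )
        return json.loads(proc.stdout)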


On that tangent: I've always personally wondered why social bookmarking sites (Reddit, HN, etc.) don't cache/mirror/archive the sites people link to through them. The original content site's ads and analytics and such are usually client-side JS anyway, so they'd still get dynamically served even on the mirrored page. The original author would still make money from the traffic; they'd just avoid having to pay for the concomitant DDoS.


Site would probably handle load bursts a lot better if it was hosted on Lambda *g*.


I didn't find much in the author's post that I agreed with. In particular, a few of his points stuck out to me:

> In traditional PaaS world the code is the indisputable protagonist (oh, damn, and you also happen to need a persistent data service to store those transactions BTW).
> With Lambda the data is the indisputable protagonist (oh and you also happen to attach code to it to build some logic around data).
> A few years of advancement and we are back to stored procedures.

I don't agree at all. Lambda is code. It's code that is invoked upon receiving certain kinds of data, but in no way is the Lambda code subservient to the data, or (unlike stored procedures) even colocated with it. Lambda still needs persistent data services to play with persistent data -- there's nothing magic about it.

> There also have been a lot of discussions as of late re the risk of being locked-in by abusing Serverless architectures (like Lambda).
> There is some truth to it. This, however, isn’t due (too much) by how you write the code from a syntax perspective: while coding my Python program I noticed that when the function is run in the context of Lambda, the platform expects to pass a couple of parameters to the function (“event” and “context”). I had to tweak my original code to include those two inputs (even though I make no use of them in my program).

The author describes having to write a controller method. This is not surprising, considering he was trying to make a web service.

> So, IMO, the lock-in will not be a function of how different the syntax in your code will be Vs. running it on a platform you control (probably minimally different) but rather in how scattered and interleaved with other services your code will be (at scale).

This is the most cogent point in the article. API Gateway + AWS Lambda can be used to create micro-microservices. Serverless Framework tries to wrangle this potential complexity by allowing users to group related lambdas/endpoints as a whole, but there is still the opportunity to create a real rat's nest of logic if we're not careful.

> P.S. Yes, I know that it’s called “Serverless” but it doesn’t mean “there are no servers involved”. Are we really discussing this?

Yeah, obviously servers are involved. But the fact that I don't have to care about those servers nearly as much (in terms of maintenance or in terms of up-front cost -- both are big wins for most customers) is worth discussing.


There are different approaches that make sense for different use cases. Is your use case a big database that you just run some commands on top of? Lambda seems great in that case! Just build a front-end client, have it interface directly with Lambda, and you've got a pretty quickly developed app.

Is it the right architecture for everything? Hell no. System architects are in higher demand than ever because there are so many freaking ways to build technology products these days, and it's their job to figure out what tech is right for the job.


The site is being quite slow. Pastebin link: http://pastebin.com/BceUgZk1


Not there yet. Stored procedures are seamlessly integrated with the underlying database. There are some hurdles with that for Lambda.

But the critical point is that Lambda is distributed and massively scalable, and stored procedures weren't.

Remember that company, Sun? The one that invented Java? When it was still alive, it had an unofficial motto, "the network is the computer". Nobody understood it then; now we know.

All computing science achievements will now be reproduced in a distributed environment. OS? Check (AWS/DCOS/Kubernetes). Filesystems? Check (IPFS). IPC? Check (REST/Websocket). Perhaps even "drivers" will be a new thing (for IoT devices).


Lambda for node devs at the moment is terrible; the version of node is now 13 months old, and with the pace of node development it sucks that AWS can't keep up even with major versions :( right now I can't use lambda.


You mean the pace at which you adopt new node features, right? Because the pace of node development does not dictate how you write code.


Honestly I think a lot of the frustration is because Lambda is not a fully developed platform yet. I decided to use it for a new product we're building at Cronitor, and it's not a travesty, but I wouldn't use Lambda if I had it to do over again.

They give you decent primitives: immutable versions, aliases, easy logging. But everything else you have to build yourself: You have to figure out a development loop, deploys, configuration management. There is nothing built-in to help coordinate lambda deploys across regions.

I expect that you'll see this built out more this year.


> You have to figure out a development loop, deploys, configuration management. There is nothing built-in to help coordinate lambda deploys across regions.

https://github.com/serverless/serverless solves (upon my initial read-through) every one of your requirements, including multi-region deploys.


That may be true, but when you start using something like Lambda, you don't immediately jump to "what 3rd party framework can I use to manage this". It's not clear what the weaknesses are until you experience them. In the end, we solved it with a few hundred lines of fabric and bash scripts, but neither of those solutions makes up for the blown opportunity of a better and more powerful UX. (Lambda PMs have been on here before and have said they're working on it.)


> AWS describes Lambda (their implementation of Serverless) as the way how you’ll do think post containers.

Is this sentence was written in double speak or did I just have a stroke?


Sorry that was a typo. I meant to say "how you'll do things post containers".

Corrected. Thanks.


Thanks. The ironic part of my comment is that my question was written as:

"Is this sentence was written...", which is even more broken.


Lambda is very limited due to the very short run time (5 min of CPU time) and the limited amount of disk space (500 MB in /tmp). It's always been my thought that the reason for this is that Amazon is effectively running lambda functions on unused (but possibly reserved) hosts, such that they can easily be shut down and don't consume a lot of disk space, so no one will notice (or even be able to tell) the slight performance hiccup.


Lambda has a number of use cases; if you need something to be running for more than 5 minutes with lots of disk space, you're probably better off running on EC2 (or Docker containers via ECS). It's designed for short-lived, stateless computations.


If you need something to be running for more than 5 minutes, you could also (just theoretically) split your work, load it up into SQS, then have your Lambda functions consume it, map-reduce style.
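
A hedged sketch of the enqueueing half with boto3 (queue name, chunk size, and the work items are made up):

    import json
    import boto3

    def enqueue_chunks(items, queue_name="work-chunks", size=100):
        # each message is one chunk, small enough to finish within a single invocation
        sqs = boto3.client("sqs")
        url = sqs.get_queue_url(QueueName=queue_name)["QueueUrl"]
        for i in range(0, len(items), size):
            sqs.send_message(QueueUrl=url, MessageBody=json.dumps(items[i:i + size]))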


To go off on a complete tangent from the article:

There's nothing fundamentally wrong with stored procedures, per se. What was wrong with SQL RDBMS stored procedures was that:

1. each DBMS had its own stored-procedure programming language—and so application frameworks that wanted to provide compatibility with the generic idea of a "relational database" couldn't really use them unless they had devs on their staff familiar with each-and-every DB†;

2. there was—and basically still is—no concept of an RDBMS stored-procedure "view" or "schema"—i.e., API versioning for stored procedures, where a client can request to communicate with the set of stored procedures it was compiled to support, rather than the single version the database is holding onto today;

3. one major RDBMS (MySQL) didn't support stored procedures at all for years (not until 5.0), so many devs learned an ossified set of "web development best-practices" without ever being exposed to the idea of stored procedures as an option.

All of these issues are fixable. #3 is just a historical artifact of MySQL's laziness; #1 is likewise an artifact of the proprietary, "enterprise lock-in" nature of the first instances of stored procedures (Microsoft's and Oracle's), evenly fracturing the ecosystem away from adopting either. Neither is likely to repeat.

#2 is more pernicious, and to this day seems ill-addressed.

One place I've seen at least an attempt to resolve it is in the design of Redis's Lua queries, where the "solution" is to refer to the stored procedures by content-hash, with the database having an always-possible error case that requires clients to be able to fall back to inserting the stored procedure again (thus necessitating that all clients track their own copies of any stored procedures they want to call.)
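
Concretely, with redis-py the dance looks roughly like this (the script body is just a placeholder):

    import redis

    r = redis.Redis()
    script = "return redis.call('GET', KEYS[1])"  # stand-in for a real "stored procedure"
    sha = r.script_load(script)  # client keeps both the source and its content-hash

    try:
        value = r.evalsha(sha, 1, "some-key")
    except redis.exceptions.NoScriptError:
        # server no longer has the cached script: fall back to re-sending the source
        value = r.eval(script, 1, "some-key")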

Such a solution could be ported to other RDBMSes; I could imagine Postgres, for example, having a "database view" concept††, where real databases only contain raw tables and indexes, and all the view definitions, triggers, stored procedures, constraints, and even typedefs are held in some record/spec/document that can be both manipulated as data, and connected to as a database. This is sort of equivalent to the CouchDB 'design document' concept.

---

† Sadly, it's really just a syntax problem. If you had several DBMSes that all had the same syntax but different extensional semantics, it'd be very easy to write a single code-generator into your application framework that would spit out appropriate code to take advantage of the extensions available. That's how regular ORM SQL-generation works, after all. But when you have disparate syntaxes, suddenly you need disparate code-generators, which get out of sync and lose features (or an LLVM-like intermediate-representation that you can do the semantic-optimization steps to before finally doing the codegen step, but I can't imagine that'd be cheap enough to slot into a webapp's hot loop.)

†† To go all the way with such a concept, the real 'data' of the database—the tables and the indexes—can become floating objects, not contained "in" anything or defined anywhere, merely existing because of a ref-count from various vDB schemas. You wouldn't explicitly define tables; instead, you'd define your views (relational projections) and then assert identity relationships between some of the columns of those projections, causing one "table" to exist holding the underlying data for both views. This is, AFAIK, what https://en.wikipedia.org/wiki/Dabble_DB was working toward.


Can add a few more:

- Depending on database, the dev tool environment can be extremely limited

- Depending on database, debugging can be a nightmare

- Difficult, if not impossible, to scale over more than one node

- Another version dependency problem added

- If you want to sell your software, or services depending on an application that uses stored procedures, you need to be very careful how you manage licenses.


Not sure about the other versions, but working with Java, Lambda was quite a terrible experience (at least if you needed to include a bunch of jars for your work).


I've worked on > 6 lambda functions now across various projects using Java, not sure what your problem is.

If you need to include a bunch of jars, using Maven + the maven shade plugin (or assembly plugin) to generate a 'fat jar' is very simple.

In fact, their official documentation states exactly this http://docs.aws.amazon.com/lambda/latest/dg/java-create-jar-...


Issues I saw (it was months ago, maybe now it's better):

- it was not clear what the limit on the total size was; I think it was supposed to be 50 MB, but it was not exact.

- edit the lambda function code, and wait 10 min till it is uploaded by the eclipse plugin

- finally not being able to attach a debugger

On the plus side, once you had it all up and running, I agree it worked nicely. My complaint is about the developer experience only.


I just finished extracting a module of my monolith to AWS Lambda in Java today. I loved the experience, but first I had to configure CI in AWS so that the build and deployment cycle was fast.

Since my function needs phantomjs, I embedded the binaries into the deployment package, and just by doing that I reached 36 MB of the 50 allowed. Transferring it over my regular local internet connection was a pain. Now it's nice: I push code to my dev branch, it gets picked up by the CI, tested, built and deployed. I get a notification in the IDE when the whole roundtrip is done, and with the CI in AWS it takes seconds instead of minutes. Without binaries you can still fit jackson, a few AWS clients, the groovy runtime, guava and httpclient in a few megs, which is manageable.

I agree about the debugger, but authoring a function is trivial and unit tests can be written and run without considering Lambda. I miss tailing logs, CloudWatch is nice but it's far from realtime.


I believe the limitation is 50mb compressed/250mb uncompressed. I've never deployed a Java Lambda that's bigger than 20mb though so I've not hit that limitation.

I'll admit that on slow internet connections, uploading a JAR to deploy can be a bit slow though. We have a Jenkins CI build pipeline that automates this process, it builds the JAR (running unit tests etc), then uploads it to S3 and uses AWS Cloudformation to make the Lambda point to the new JAR in S3.



