Hacker News new | comments | show | ask | jobs | submit login

build an api

It's a shame this comment is pithy, because I think it's dead-on.

There times to integrate at the database level. But, the default should be single-application databases.

The rationale is the same as the grandparent's rationale FOR database integration. The odds of needing to share data over time are 1.

Given that shared belief, the problem with database integration is that MANY applications need to share facets of the same data. The single database ends up having a huge surface area trying to satisfy every application's needs.

The resulting schema will have definitions relevant for applications A, B, and C but X, Y and Z.

But, worse, there are dependencies between each application's working schema. This means ensuring integrity becomes harder with every application that integrates.

Finally, integration points are the hardest to change after-the-fact. The more services that integrate, the less ability to make the inevitable model changes necessary to fix mistakes/scale/normalize/denormalize/change solutions.

Thus, "build an api" is the best solution. Well-defined APIs and data-flows between applications helps data and process locality and avoids most of the problems I just listed. The trade-off is you're now meta-programming at the process level— the complexity doesn't disappear, it's just reconceptualised.

> The single database ends up having a huge surface area trying to satisfy every application's needs.

This is, more or less, exactly what views are for.

> Thus, "build an api" is the best solution.

And you can do that within the database with stored procedures, perhaps even with the same language you would use in the front-end (depending). And look at the advantages you have:

- No implied N+1 issues because your API is too granular

- No overfetching because your API is too coarse

- No additional service layers needed

- All the information is in the right place to ensure data validity and performance

Let me be clear: I see these as two viable alternatives and different situations are going to determine the appropriate tool. I bring this up because I do think the NoSQL crowd overall has a very distorted and limited picture of what exactly it is RDBMSes provide and why. If people look underneath their ORMs, they may find an extremely powerful, capable and mature system under there that can solve lots of problems well—possibly (but I admit, not necessarily) even _their own_ problems.

> - All the information is in the right place to ensure data validity and performance

This is where we part ways.

We're talking specifically about integration. That means each system have different processes and are talking with other people.

If this is a case of three apps exposing the same process over three different channels (HTTP, UDP, morse code); then, database-level integration makes perfect sense.

But, as soon as differing behaviors comes in, then the database level doesn't— by definition— have enough information to ensure validity. One app thinks columns X and Y are dependent in one way, the other app views it another way. Now, one or the both of those apps are screwed for validity. And this problem grows with N+1.

I am certainly not arguing against good databases. Stored procedures, views, etc. are all great even for a single application. But, I am arguing database level integration should be the rare exception to the rule.

> Thus, "build an api" is the best solution.

I think there's an asymmetry in your argument.

You are describing all of the problems with data management as though they were specific to schemas in a traditional RDBMS; but glossing over how "building an API" solves those same problems, and whether that method is better or worse.

In other words, "build an API" is the problem at hand, not the solution. A traditional DBMS provides a data definition language (DDL) to help solve the problem (schema design), which isn't a solution either, but it does offer a lot of direction. Saying "build an API" really gives me no direction at all.

Yes, I agree, "build an API" is the problem.

I specifically said the problem and the complexity inherent in solving it doesn't disappear with per-application databases.

But, the application (the 'A' part) is where the most context around the information is to be found. Tying multiple applications together at the Data-side (rather than the Application-side) means you don't lose that context... and eliminate your chances of clashing (and the ensuring integrity issues).

That's a good point about the context, but not very concrete.

At some point you'll need to go into the details, but I don't think a comment is the right format. Maybe a blog post to show exactly what you mean.

Thanks for expanding on my answer, nailed it.

A schema is essentially an API over the data.

For instance, if you have an API with the functions:

  item# = add_item(description, price, color)
  record_sale(item#, quantity, customer#)
So let's see what happens when you need to make a change. For instance, you might decide that an item could be discounted (a new feature); so you need to record the price at the time of sale:

  record_sale(item#, quantity, customer#, price=NULL)
and if the price is NULL, you default to the current item price.

But you already have sales data without a price, so you need to migrate it by copying the current item prices over.

And if one of three applications doesn't support the new API, you need to offer the old API as well and it would just use the default all of the time.

That sounds pretty much like a schema to me. What's the difference?

So you want to put an API in place to access your schema-less database? So that multiple applications can access your database in a consistent manner? That makes sense.

You know what though... you may want to use multiple programming languages, and maintaining an API in multiple languages sucks. So, why not just make a service that each application can connect to, and then let it talk to your database. Then you just need to define a simple protocol that language can use to connect to your service.

Or, you could just skip a few steps and use an actual database to begin with.

Building an API is still a great choice no matter how good your database is. Why would you not have an API if you are using the data store from 5 difference languages?

Agree with this 100%. In the past the way different applications integrate with each other was through sharing the same database tables. Hence the need to keep the schema synchronized across multiple applications. Nowadays applications should integrate through well-defined APIs and service interfaces. See this Martin Fowler short essay on database styles: http://martinfowler.com/bliki/DatabaseStyles.html

> Hence the need to keep the schema synchronized across multiple applications

Views are designed to avoid that problem.

What is the specific way you handle versioning in an API that is superior to versioning in the schema?

Applications are open for YC Winter 2018

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | DMCA | Apply to YC | Contact