Inner-platform effect

jbert · on Jan 9, 2012

It seems to me that this is closely related to the idea of a domain specific language - "custom" and "embedded".

There is a well-trodden path of "little languages" evolving over time to add features. (e.g. XML build files and templating languages accreting conditionals and control flow over time). To me these are "custom" DSLs and are generally a bad thing. [1]

On the other hand, a DSL which leans on the language of it's implementation I consider a Good Thing. i.e. it is embedded in the implementation language, using it for control flow etc but also modifying it somehow (in a lisp, you'll get new special forms/structure/macros, in another language, you'll get a well-written API which allows easy expression of solutions to problems in the domain).

[1] My reasons are mostly that they tend to be ill-suited for programming (no tooling such as code highlighting, debugger support, profiling etc - actions like this must be taken in the implementing language). In addition they tend to be poorly designed/idiosyncratic, because language design is hard. Of course, if the little language becomes sufficiently popular then this becomes a solved problem.

I also note that my criticisms apply to interpreted languages implemented in C. So clearly good design and popularity can mitigate all or most of them.

prodigal_erik · on Jan 9, 2012

The drawback to leaning on the implementation language is that it encourages breaking the Rule of Least Power* . Documents using a minimal, declarative, non-Turing-complete grammar are much easier to analyze and repurpose. Once you add loops and conditionals, there's little any tool can say about what's going on other than just running it and watching what it did.

* http://www.w3.org/2001/tag/doc/leastPower

jbert · on Jan 9, 2012

Fair point.

But I'd say that such little languages exhibit feature creep over time. And once the two features you mention (looping + conditionals) are acquired, you've lost all the benefits you mention, since you're now turing complete.

Regexes, makefiles, ant build files and even sendmail cfg files (http://okmij.org/ftp/Computation/sendmail-as-turing-machine....) have grown in complexity from their first incarnations.

I guess my position is: "Your language is very likely to grow over time - you might as well get the basics from a well designed language with a good toolset rather than design+implement it yourself".

Note also the other way people attack this problem (lack of features in the little language) is to write "config file generators". i.e. programs in a general purpose lang which emit programs in the limited lang. That also tends to argue for the fact that such languages want to grow over time.

gte910h · on Jan 9, 2012

For every Sendmail, there are HUNDREDS of DSLs that never go past INI file level complexity.

Most DSLs I've seen are for people who general purpose programming is beyond their daily capacity to use. It would be inappropriate to saddle decorated lisp on that person, or even LUA.

jbert · on Jan 10, 2012

You're probably right, in that there is a useful role for pure "assign value X" languages.

But also bear in mind that simple INI style files can be represented as data structure literals in the language. So you could replace your ini-style file:

searchdir="/etc/foo"

with (python syntax):

searchdir="/etc/foo"

and then load your cfg file with 'import' from python.

This approach isn't perhaps suitable for all cases, but it also allows users to do "clever" things, e.g. replacing:

server_host_1="foo"

server_port_1=8000

server_host_1="foo"

server_port_2=8010

server_host_1="foo"

server_port_3=8020

server_host_1="foo"

server_port_4=8030

with code.

Basically, I'd urge anyone considering adding anything beyond simple assignement to use a real language, and even in the case of simple assignment, I think you can use a real language without scaring the users too much.

gte910h · on Jan 11, 2012

You would be causing some nasty security issues with that approach I think.

jacques_chester · on Jan 9, 2012

OT: "Lua". It's not an acronym.

gte910h · on Jan 9, 2012

Ahh, it's Portugese for Moon.

jacques_chester · on Jan 9, 2012

This is surprisingly common.

I once worked for a SaaS provider in the recruitment industry. I was brought on to begin working on a 2.0 product, but one day they asked me to work on their 1.0 system instead.

Inside I found horror upon horror. My personal favourite was a table called TableRow_TableRow.

It contained six fields:

    TableRowId_1
    TableRowId_2
    Info1
    Info2
    Info3
    Info4

This table, it was explained to me, allowed anything to be related to anything. Super-duper flexible. Those four "Info" fields were varchars that could store anything -- anything! -- which made them tremendously useful of course. Why didn't I get it? It was written by graduates from my own school, after all!

TableRow_TableRow had four billion rows. On commodity hardware. Without any constraints. But with an index on every field. And they wondered why it was so slow.

I was sometimes brought in to discuss the 2.0 architecture. The tech lead felt very strongly that "Everything should be called a node, and nodes should contain nodes, it will be super-duper flexible!"

My pleas that creating a graph database inside a relational database would perform horribly and be a wellspring of nightmarish bugs fell on deaf ears. My argument that perhaps we could, you know, just model the domain were dismissed as inflexible.

Then the GFC hit, the 2.0 project was cancelled and I was sacked. What a relief.

bni · on Jan 10, 2012

I have encontered this attitude a couple of times also. Everything should be as general and generic as possible. Things should be called "meta data" instead of what enteties really are in the model.

If you argument for a structured relational model, with sane names for everything, you are dismissed as some sort of dinosaur, that just doesnt get it.

In my experience, these "generic" systems are the ones that eventually end up in development hell, containing an obscene LOC in proportion to the problem being solved, and are impossible for new developers to understand and modify.

roryokane · on Jan 9, 2012

Related to this HN story, posted around the same time:

Duck Programming – http://news.ycombinator.com/item?id=3442419

_3u10 · on Jan 9, 2012

"In the database world, developers are sometimes tempted to bypass the RDBMS, for example by storing everything in one big table with two columns labelled key and value."

raganwald · on Jan 9, 2012

Database Analysts will tell you that many ORMs are themselves inner platforms, re-implementing constraints, views, joins and so on that should be encoded directly in the database.

jerf · on Jan 9, 2012

That has become my acid test for how good an ORM is.

Most fail horribly. I count myself lucky if I come across an ORM in some language that didn't visibly have joins bolted on the side in such a way that they just barely work, usually spending all the design value of the library in the project. (That is, merely by using one of these libraries you've often incurred technical debt on the spot.)

eli · on Jan 9, 2012

Just out of curiosity, have any passed this test?

jerf · on Jan 9, 2012

Based on reading the documentation, I think Hibernate for Java would qualify. I've never used it, though, so I can't be sure. SQLAlchemy for Python also looks good, but I've also never had the opportunity to use it.

nerd_in_rage · on Jan 10, 2012

hibernate will make you throw up with all the verbosity and nonsense.

sqlalchemy, on the other hand, is good.

jquery · on Jan 9, 2012

All sufficiently large CRUD apps either use an established ORM or re-invent them poorly.

_3u10 · on Jan 9, 2012

That's not true, work with some legacy perl apps from the 90s, it's raw SQL all the way down.

romaniv · on Jan 9, 2012

ORMs aren't implemented in the database, so it's less of an "inner" and more of a "parallel" platform. I think this is an important distinction, since one of the reasons people use ORMs in the first place is better integration with the programming language of their choice. Moreover, there are often practical reason to re-implement those features you've mentioned.

Let's take constraints as an example. Can you use them to display a list of user-friendly validation messages when someone fills out a web form? Will they benefit from all the tools (version control, refactoring, static verification) your language of choice provides? How reusable will they be?

If there were databases with language-friendly APIs for these things, I bet developers would be less likely to re-implement them.

swalsh · on Jan 9, 2012

I was quite alarmed the first time I used NHibernate with C#. I had some very non descriptive error, and the only solution I could find was removing a constraint I had created in the db.

ajross · on Jan 9, 2012

The flip side to that argument is that the "M" part of the ORM can be understood an artifact of the poor storage technology, and that the ORM+database system should really be understoood as an object store with transactions, not an API layer.

ExpiredLink · on Jan 9, 2012

No need to re-invent OODBMS. The had their chance - and failed.

Maro · on Jan 9, 2012

It you're trying to make a statement about NoSQL databases, then you are missing the point of NoSQL databases.

wonderercat · on Jan 9, 2012

While I totally agree that the subject of the article is a Thing (read: "notable"), this is an inflamatory essay and complaints forum more than an encyclopedia entry in its current state.

Splines · on Jan 9, 2012

As someone who hardly ever uses a RDBMS (I'm in QA and most of my "work coding" is test automation and/or tool work), what happens when you try to deal with data validity constraints? In .NET-land, do you put a try-catch around it and bubble that back up to the user "well, you did something bad, I guess you should double-check what you just entered here"?

lost-theory · on Jan 10, 2012

If you put your constraints in the database schema, then you check for errors on insert/update/etc., like you said, with try-catch.

If you put your constraints in the application code, you usually have some validation layer that the data passes through before sending it to the DB.

It's common to have validation live both in the DB and in the app code. The constraints in the app code are for catching and displaying errors up front and display them to the user (e.g. on a form), while the DB constraints are used as a last line of defense to ensure data integrity.

jarin · on Jan 9, 2012

Intuit's Quickbase is a perfect example of this:

http://quickbase.intuit.com/

I was unfortunate enough to inherit a project management system running on Quickbase, and in the end it was faster for me to re-implement it in Rails than make even small changes to the workflow.

pmr_ · on Jan 9, 2012

When I saw that the first time I had to think a lot about how such systems end up in the end-user face instead of a scriptable system. Is the fear of exposing users to actual programming so huge or is there simply no 'simple enough' language available?

heydenberk · on Jan 9, 2012

It's kind of a cheap shot but I love the dig at PHP.

funkah · on Jan 9, 2012

http://www.springframework.net/

Spring handles a lot for you, but learning to use it and (especially) debug/troubleshoot it is like learning another framework on top of .NET. It even has its own language (called SpEL).