

Thoughts on Designing a new Web Apps Language - jhferris3
http://www.heapified.com/2011/08/designing-a-language-to-write-web-apps/

======
blasdel
Ur/Web hits every single one of your bullet points out of the park:
<http://www.impredicative.com/ur/>

~~~
jhferris3
I had a friend mention this to me before I wrote this, and it does seem pretty
neat.

That said, I think its big shortcoming to me (I haven't spent much time with
it, just ~10 minutes going through some of the demos) is that it's a functional
language. Personally, I don't mind Haskell/SML/their ilk, but since I was
trying to conceive of a language that people would actually use and adopt, it
needs to be imperative and probably resemble something along the lines of
C/PHP/Python.

------
andybak
He rather lost me at the first two points.

I don't hear a lot of complaints from the Python or Ruby camps about how much
they desperately miss static typing. It would have to be via some seriously
'get out of my way' type inference for me to want to allow all that ugliness
back into the language.

And point 2: performance? Again, I don't hear much complaining about this for
the vast majority of applications. I rarely find the bottlenecks to be in my
web language constructs. Most people aren't writing another Twitter.

~~~
njs12345
Have you ever used anything in the ML family of languages (e.g. OCaml/Standard
ML/Haskell)? You barely ever have to write types in these as long as your
program makes sense.
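
A rough illustration of the feel, using TypeScript rather than an ML
(TypeScript's inference is local rather than full Hindley-Milner, so this is
only an approximation of what the ML family gives you):

```typescript
// With inference, the annotation lives at one boundary and everything
// downstream is deduced statically, with no further type noise.
const double = (x: number) => x * 2;               // one annotation at the boundary
const nums = [1, 2, 3];                            // inferred as number[]
const doubled = nums.map(double);                  // inferred as number[]
const total = doubled.reduce((a, b) => a + b, 0);  // a and b inferred as number
console.log(total); // 12
```

Pass `doubled` to something expecting strings and the compiler rejects it,
with no type ever written on `nums`, `doubled`, or `total`.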

~~~
true_religion
It'd be better to simply have a JIT with type inference for performance.

~~~
jhferris3
Perhaps? It's unclear that a JIT with type inference will give you better
performance than ahead-of-time compilation (especially if you were to do
profile-guided optimization). (Side thought: not sure if LLVM has support for
PGO.)

Also, if most of your types are static anyway (which, in my middling amount of
experience, they tend to be), I personally would rather get compile-time
errors than runtime errors. But maybe that's just me.

------
BerislavLopac
What are "Web apps", actually? Each Web app is either a) a bunch of more or
less tightly connected marked-up text documents, delivered by an app running
on some server somewhere, or b) two applications, one running on a server and
one delivered to be executed in the visitors' browsers (or, most often, some
combination of the two). There is no single thing called a "Web app", even
though many vendors have been trying (and failing) to make it look that way.

What we need is a better way to use Internet as a platform on a large scale;
sadly, none has been widely accepted so far.

------
ventu
I would suggest Opa: <http://opalang.org>

------
thurn
Facebook's XHP is a magnificent tool that you need to use to appreciate. It
lets Facebook build a website out of reusable components that know how to load
their own data. It's very different from the MVC paradigm, but very
lightweight. The XML components are full PHP classes, allowing for methods,
subclassing, etc.

The emphasis on _components_ instead of _pages_ is not strong enough in many
other web frameworks like Rails.

~~~
thumper
I had the same experience -- I didn't fully grasp how useful XHP was until I
had to use it. Not only are the components reusable, but the type system they
create and enforce is a powerful aid in sussing out how they were meant to be
used. (Now if only XHP::render could detect when it's already been called, to
avoid weird validation bugs from the side effects!)

And when I had to include some multi-line javascript in my code, I found
myself feeling a huge loss. First, heredocs seemed to be the only way to make
it readable. Second, I'd have to actually run the code and interact with the
page to find out if I got the syntax right. It would be awesome if there were a
way to make JS into an object in PHP, the way that XHP is done, and have it
support some simple sanity checks and easily import JS components (which I
suppose Javelin tries to do).

------
lucianof
For web apps I would like a language that works both compiled and not
compiled. Either I just copy my scripts to the web server when I'm lazy (or
developing), or I compile it (and run unit and integration tests and whatever)
when I'm done developing. Is there any language/platform that works like that?

~~~
tmhedberg
The Snap framework (<http://snapframework.com/>) for Haskell works this way.
Your application (including the web server itself) gets compiled down to a
single binary, but during development you can make changes and see them
reflected on the fly without manually recompiling. This is my understanding
based on an interview with one of the framework's developers; I haven't used
it myself for any significant project, though I plan to.

Haskell in general meets a lot (though probably not all) of the author's
requirements. Static type inference is great, and Haskell's system makes
explicit type declarations completely optional except for in a handful of rare
cases, though you will find yourself wanting to use them on most functions
anyway because of how they clarify and improve the readability of your code.
Testing is also dead simple with tools like QuickCheck--it essentially
manufactures test cases for you based on invariants that you specify about
your code.
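
For a sense of what that property-based style does, here's a hand-rolled
sketch in TypeScript (not the real QuickCheck library; the function names and
the generator are made up for illustration): the checker manufactures random
inputs and tests an invariant against each one.

```typescript
// Property: reversing a list twice yields the original list.
function reverseTwiceIsIdentity(xs: number[]): boolean {
  const twice = [...xs].reverse().reverse();
  return twice.length === xs.length && twice.every((x, i) => x === xs[i]);
}

// Manufacture random test cases, the way QuickCheck does from a generator.
function checkProperty(prop: (xs: number[]) => boolean, runs = 100): boolean {
  for (let i = 0; i < runs; i++) {
    const len = Math.floor(Math.random() * 20);
    const xs = Array.from({ length: len }, () => Math.floor(Math.random() * 100));
    if (!prop(xs)) return false; // a real tool would also shrink the counterexample
  }
  return true;
}

console.log(checkProperty(reverseTwiceIsIdentity)); // true
```

The real QuickCheck adds generators for arbitrary types and shrinking of
failing cases, but the core loop is this simple.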

~~~
danieldk
I use Snap on one small project; it's pretty awesome. Snap applications have
the terseness and simplicity of, say, Sinatra or Rails, but with type
checking. As is often the case in Haskell, if your program compiles it is
usually correct.

What I also like about Snap (and Yesod) is that it integrates well with the
enumerator package. Simply put, the enumerator package lets you implement
composable data sources, manipulators, and sinks. Since many web applications
consist of extracting data from a source, manipulating it, and sending it on,
this lets you keep applications short and simple.
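
In case that's abstract, here's a rough TypeScript analogue of the
source → manipulator → sink idea, using generators (names are illustrative;
the actual enumerator package is Haskell and considerably more general, with
resource handling the sketch omits):

```typescript
// A data source: yields values one at a time.
function* source(): Generator<number> {
  for (const n of [1, 2, 3, 4, 5]) yield n;
}

// A manipulator: transforms a stream into another stream, lazily.
function* mapStream<A, B>(f: (a: A) => B, s: Iterable<A>): Generator<B> {
  for (const a of s) yield f(a);
}

// A sink: consumes a stream down to a final value.
function sink(s: Iterable<number>): number {
  let total = 0;
  for (const n of s) total += n;
  return total;
}

// Compose: extract, manipulate, send -- data flows through each stage.
const result = sink(mapStream(n => n * n, source()));
console.log(result); // 55
```

Each stage is independently reusable, which is the "composable" part of the
claim.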

------
micrypt
Scala ticks all those boxes. <http://www.scala-lang.org/>

------
scriptproof
Looks like a description of the Scriptol programming language, except for the
last sentence about JavaScript.

------
angerman
How close would clojure/clojurescript and a webframework like noir be to what
he wants?

~~~
St-Clock
Clojure is not statically typed.

------
radious
Sounds like Go with web framework to me.

~~~
franksalim
Go differs on points 3, 4, 6, and 8. That's only 50%. I don't think it sounds
much like Go.

------
nirvana
I'm dealing with this very question right now, only I'm coming from a
different angle. I've already picked the language, but I'm attempting to build
a framework in it that makes it work in a very different domain.

What I'm working on is sort of an answer to node.js. It is a coffeescript
platform (use js if you prefer) built on top of erlang. So, the coffeescript
runs in an erlang environment. This means, when you call into the DB, this
spawns off an erlang process. Your collection of coffeescript functions can be
executed on any number of cores, or any number of hosts. In fact, your handler
for a web request can be spread out all over a cluster, with each function
running on the node that has the data... or it can all run on a single node,
but across many processes. (to some extent the amount of distribution will be
controllable as a configuration parameter-- so if you're doing processing that
analyzes big data, you can move your code to the data and run it there,
lowering the cluster communications load, but if the data is small, it may
make sense to keep handling a request constrained to a single node where
everything is conveniently in RAM.)

This is accomplished by compiling the coffeescript into javascript and
running the javascript on a vm, specifically erlang_js, though I'm looking at
going with V8 via erlv8. Your code and the libraries are all rendered into a
single ball of javascript that we'll call the "application", which is handed
off to various nodes.

How do I plan to get sequential code to work in a fundamentally distributed
environment? That's the $64,000 question and why I'm bringing this up here-- I
could be doing it wrong.

The plan is simple:

1. The developer needs to know that their application is not running in a
single environment, and account for that.

2. Each entry point the developer provides to the platform's API is assumed
to potentially be running in isolation in a separate process.

3. There's a shared context that all the processes have access to (an in-RAM
Riak database where the bucket is unique to a given request, but the keys are
up to the developer).

4. The APIs let the developer supply callback functions which will be called
when the data is available. (E.g.: "go fetch a list of blog posts" could have
a callback that is invoked when the list is returned from Riak.)

5. There's a set of known phases that each request goes through, in a known
sequence, and we don't move on to the next phase until the processes spawned
by the previous phase are finished. All of the phases are optional, so the
developer can implement as many as they want or only a single one. The phases
are: init, setup, start, query, evaluate, render, composite, and finish. The
assumption is that you can get your app to work with 9 opportunities to do a
bunch of DB queries and get the results.

6. Init will be called when the request comes in. Init can cause any number
of processes to be started (DB queries, map reduce, etc.). They will all be
finished, and their callbacks called (if any), before setup is called. Setup
can also spin up any number of processes, and so on. All of these are
optional, and a hello world app might just implement one (it doesn't matter
which).

So the developer can write in a sequential style: the phases are called
reliably in sequence, and in each phase they know the previous phase's queries
will have data. Each phase can cause more queries, or even spin up other apps,
which will be resolved before the next phase. And they get the results from a
context that is always available.
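
A minimal sketch of that phase discipline in TypeScript (all names and APIs
here are hypothetical stand-ins for the platform described; a Map stands in
for the in-RAM Riak context):

```typescript
type Context = Map<string, unknown>;
// A phase gets the shared context plus a way to spawn async work.
type Phase = (ctx: Context, spawn: (task: Promise<void>) => void) => void;

async function runRequest(phases: Phase[]): Promise<Context> {
  const ctx: Context = new Map(); // shared context for this request
  for (const phase of phases) {
    const pending: Promise<void>[] = [];
    phase(ctx, p => pending.push(p)); // phase may spawn any number of tasks
    await Promise.all(pending);       // barrier: all spawned work finishes
  }                                   // before the next phase starts
  return ctx;
}

// Usage: "query" fetches data asynchronously; "render" can rely on it.
const query: Phase = (ctx, spawn) =>
  spawn(Promise.resolve(["post1", "post2"])  // stand-in for a Riak lookup
    .then(posts => { ctx.set("posts", posts); }));
const render: Phase = (ctx) =>
  ctx.set("page", `posts: ${(ctx.get("posts") as string[]).join(", ")}`);

runRequest([query, render]).then(ctx => console.log(ctx.get("page")));
// posts: post1, post2
```

The barrier between phases is what lets sequential-looking code tolerate the
queries actually running in parallel, possibly on other nodes.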

This way, init, start, query, and render could all run on different nodes,
though they would run in sequence and each one would have access to the shared
context for the query.

Another way of looking at this, and the way it _might_ be implemented, is that
each of those phases is a long running process that lives on, and is invoked
with different contexts each time to handle its part of handling a query. (So
this lets us, or the developer, experiment with the right way to arrange
things for best resource utilization, since the results can be dramatically
different depending on the kind of work the application needs to do.)

That's how I'm running a sequential language in a genuinely distributed
manner...you can think in callbacks, or in phases, or both, and your
coffeescript really can run in parallel.

A downside of this, though, is that you couldn't write a request handler that,
say, generated a random key, did a lookup on the database, and then looped and
did that again until it got a result it liked. You have your 9 phases, and
that's it, for a given request. However, there is an API to invoke another
application (e.g. you could have a login application that is responsible for
part of your page; rather than implement a login/logged-in area on each page,
you write it once and include it as a sub-application). Conceivably you could
do recursion, but I haven't thought about the consequences of that yet. This
does sort of lock you into a specific way of doing things, which is why there
are 9 phases: if you only need 3, implement only 3... but if you need all 9,
you have them.

I'm sure I've managed to make something that is not so complicated sound
muddy... This works for me, since coffeescript is convenient and it is easy
for me to think in terms of erlang concurrency... but it might be an
adjustment for js programmers who are used to setting variables and expecting
them to be there later on... (you'd just have an API that stores the values
under a key.)

If you're interested in this project, you can find periodic announcements on
twitter @nirvanacore. I expect to have an alpha sometime in late September,
and a beta sometime after Riak 1.0 (on which this is based) ships.

Apologies if it seems like I'm hijacking a thread here... obviously my
thoughts are about concurrency, but I differ from the author in assuming JSON
for common data structures and in programming directly in
coffeescript/javascript. I'm not too worried about compiled speed; I'm more
interested in concurrency than raw performance. I'd rather add another node
and have a homogeneous server infrastructure, with no thinking about server
architecture, than try to optimize for single-CPU performance, etc.

~~~
justincormack
Are you going to be able to pass data between the phases other than through
the DB? It doesn't sound like it from your description, but living without a
closure equivalent would be painful. Maybe some way to attach data that gets
message-passed to the next phase?

Sounds interesting.

~~~
nirvana
My "through the DB" solution is not as good as a heap or stack would be, but
it's not as bad as it might sound, because the DB lives in memory. If, in a
given phase, you have some data, you add it to the context, it will be there
in the next phase.

It would be easy to have an API along the lines of "in the next phase, call
this function, pass it this data". I could make an API that does that, or you
could put the data under a key in the context and then call that function at
the beginning of the next phase. If the set of functions you'd like to have
called that way varies from request to request, they could be stuffed in a
list under a key, and you'd just process each of the functions in that list.

I think it will be quite possible to provide something equivalent to closures
via an API. I can't yet say how syntactically convenient they will be, but
really not too bad, I don't think.

On further thought, I think it would be quite possible to do actor-style
message passing... I'm focusing a bit much on the mechanics of implementation
right now rather than on making this transparent, but the context could easily
be used to manage a set of mailboxes and "processes", where, in each phase, or
even between phases, whenever a message is available in a mailbox, the
function it was sent to gets woken up and executed. In fact, not a function,
but a process.

So, I can add an API that provides an actor-model interface. The actors can be
identified by a process ID, they can send messages to each other (addressed by
PID) and include arbitrary data, and this can happen concurrently in
coffeescript.
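
A toy version of that actor layer, sketched in TypeScript (spawn, send, and
the scheduler are all hypothetical names for illustration; the drain loop
mirrors the "no more messages waiting" rule between phases):

```typescript
// Each "process" is just a function plus a mailbox of pending messages.
type Actor = { fn: (msg: unknown) => void; mailbox: unknown[] };

const actors = new Map<number, Actor>();
let nextPid = 1;

// Allocate a pid for a function; no OS or Erlang process is involved.
function spawn(fn: (msg: unknown) => void): number {
  const pid = nextPid++;
  actors.set(pid, { fn, mailbox: [] });
  return pid;
}

// Sending just enqueues; delivery happens when the scheduler runs.
function send(pid: number, msg: unknown): void {
  actors.get(pid)!.mailbox.push(msg);
}

// Drain every mailbox until nothing is waiting; a real scheduler would
// interleave this with the request phases.
function runUntilQuiet(): void {
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const actor of actors.values()) {
      while (actor.mailbox.length > 0) {
        actor.fn(actor.mailbox.shift());
        progressed = true;
      }
    }
  }
}

// Usage: an actor receives a message sent to its pid.
const log: string[] = [];
const pidOne = spawn(msg => log.push(`pidOne got ${msg}`));
send(pidOne, "hello");
runUntilQuiet();
console.log(log); // [ 'pidOne got hello' ]
```

Because a pid is only a key into a table, the mailboxes themselves could live
in the shared context and be drained from any node.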

~~~
justincormack
Wouldn't it be cleaner if you sent messages to a computation state (this
request in a future phase) as an indirection, since the pid might not be
allocated yet?

~~~
nirvana
I think the pids are getting confused. When I say pid, I mean an id for a
combination of a given function and some data: an instance, a fake sort of
process that is facilitated by my code invoking the function with the data
from its mailbox whenever there is a message sent to the function by another
"process". I'm not talking about erlang processes or "real" processes. So, you
wouldn't have the problem of the "pid might not be allocated yet", because you
would allocate it yourself.

example in pseudo coffeerlangscript:

init ->
    pidOne = spawn(functionA, argumentlist),
    pidTwo = spawn(functionA, differentarguments),
    contextSet("pidOne", pidOne),
    contextSet("pidTwo", pidTwo),
    lookupData(bucket, key, pidOne),
    lookupData(bucket, key, functionB).

functionA(message) -> doStuff().

So, here you're "spawning" two processes. For a function to act like a
process it is written such that it takes any messages it gets as arguments. I
could set up their own contexts too, so "contextSet" in pidOne and pidTwo
would be unique namespaces. lookupData, instead of taking a function to
invoke, takes a process, and sends it a message when it has retrieved the
data off of the disk.

functionB could send a message to pidOne and pidTwo (which it can find in
the context).

So, the init phase is here, and later the start phase will be called. But the
thread of execution would be: init runs, then the database queries happen in
parallel; when they succeed, pidOne gets a message and functionB is called
(possibly running in different environments). functionB sends a message to
pidOne and pidTwo, both of which are invoked with these new messages. When
there are no more messages waiting for any of these pseudo-processes, and no
more database queries or other long-running processes running in parallel,
the next phase is called.

If you're saying there's a better way to do this, my ears are open, I just
need a little more explanation.

~~~
justincormack
Ah, ok. By pid I took you to mean a unixy pid or an erlang mailbox. What you
are saying is what I was thinking...

------
snorkel
Static typing isn't worth all the visual noise it adds to code.

~~~
Sandman
Type inference helps with that: <http://en.wikipedia.org/wiki/Type_inference>

