

Ask HN: Universal libraries? - JoelMcCracken

I have been somewhat fascinated recently with the idea of creating a universal library system. Essentially, the idea is to write library code in an extremely simple programming language (a high-level, assembly-like language?), and then write an interpreter for that language in each language that wants to use the libraries.

I realize that there may be quite a bit of naivety in this.

Problems I see:

1) Speed vs. writing a custom library. I am not sure exactly how to characterize this, but I can imagine (possibly) the hit being a constant factor, and not terribly high.

2) Interpreter-writing ability. How hard would it actually be to write these interpreters in various languages?

3) Algorithm creation. People don't like writing things in assembly, and algorithms, by their nature, are hard.

As an alternative to the assembly, a very simple lispy language could be created -- eval is widely known, and this does provide at least some support for higher-level constructs.

Is this a realistic project idea? Would it be useful? Does such a thing already exist? Am I insane? Any feedback is welcome.
======
beagle3
What do you envision as the use case for this?

You are not insane, but if you don't have a specific use in mind, you're
bordering on being an architecture astronaut.

Simple Scheme implementations sans closures/call/cc exist in ridiculously
short amounts of code (100 lines or so), for just about any underlying
language. But none of them is very useful. An interpreter such as you describe
is practically guaranteed to make the libraries too slow for any real use.
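
To give a sense of scale, here is a minimal sketch of such an evaluator in
Python. It is an illustration only, not any existing implementation: the
builtin set, the `let` form, and the names `tokenize`, `parse`, `evaluate` and
`run` are all assumptions made for the example. It handles integers, symbols,
`if`, `let` and builtin calls, and deliberately leaves out lambdas, closures
and call/cc.

```python
import operator

# Hypothetical builtin set; a real system would have to fix this in a spec.
BUILTINS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "<": operator.lt,
    "=": operator.eq,
}


def tokenize(src):
    """Split source text into parentheses and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return one nested-list expression."""
    tok = tokens.pop(0)
    if tok == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return expr
    try:
        return int(tok)
    except ValueError:
        return tok  # anything that is not an integer is a symbol


def evaluate(expr, env):
    """Evaluate a parsed expression in a flat environment (no closures)."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    head, *args = expr
    if head == "if":  # (if cond then else)
        cond, then, other = args
        return evaluate(then if evaluate(cond, env) else other, env)
    if head == "let":  # (let name value body)
        name, value, body = args
        return evaluate(body, {**env, name: evaluate(value, env)})
    fn = evaluate(head, env)
    return fn(*[evaluate(a, env) for a in args])


def run(src):
    return evaluate(parse(tokenize(src)), dict(BUILTINS))
```

For example, `run("(let x 4 (if (< x 5) (* x 10) 0))")` evaluates to 40. Every
operation goes through several layers of host-language dispatch, which is
where the slowdown comes from.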

~~~
JoelMcCracken
I do have a few, which I listed a bit earlier, but, for example:

email address validation

xml parsing

json parsing

"other" parsing

stripping html

email format manipulation

binary file manipulation (networking stuff, for example)

Essentially, any task that takes an input via memory and produces an output
via memory, where speed is not critical but the correctness of the solution
is.

This wouldn't actually replace the need for libraries in other languages, but
it would allow languages that are somewhat devoid of libraries to access a
large body of methods of accomplishing common tasks that, while slow, may work
well, and help give better languages a fighting chance.

~~~
beagle3
It is my opinion that none of the "memory-to-memory" tasks is holding any
better language back; if anything, it's accessing the real world and/or
accessing other systems.

The tasks you listed are usually available with a new language as part of its
standard library, albeit not feature complete or with a standard interface. An
implementation such as you suggested may provide a feature-complete,
standardised solution -- however, it will not be convenient due to "impedance
mismatch" (differing object models / storage models / needed conversions back
and forth, etc).

Validating an email address is all good, but without shelling out to
/usr/bin/mail or tcp access, you can't actually send it -- and whatever you
use for actually sending it probably has a validator already.

Parsing, specifically (most of your examples) has relatively good language
agnostic support in such things as the GOLD parser --- and yet, it always
loses in convenience to a system that lets you embed implementation language
constructs, such as yacc / lex / ometa / antlr / lemon and friends.

Sorry for being a little non-constructive, but I think that without a
concrete, specific use case that clearly demonstrates an advantage, this
project may be interesting philosophically, but useless in practice. When you
work out an example, compare (cost of implementing said feature from scratch
in a new language) to (cost of implementing the interpreter + conversions from
the language + conversion of results back to language idioms + ..., all in
each language, + 1/6 of the feature, amortized over 6 languages). Take into
account Unicode and charset handling, because you value correctness above all.
I think you'll find it hard to justify.

What are those languages somewhat devoid of libraries, and what is the large
body of methods that can actually fit in this list of yours that you believe
are holding language adoption back? I surely lack breadth of knowledge and
imagination, but I can't come up with these lists.

------
makecheck
I believe that what you're looking for is Parrot: <http://www.parrot.org/>

~~~
JoelMcCracken
Parrot is a very different idea. The idea is simplicity for ease of
implementation. I'm pretty certain that almost any other system that didn't
have this goal in mind would be too hard to implement for practicality.

An interpreter for this code should be extremely small, really.

~~~
makecheck
I may just not understand the abstraction. Do you have an example of what a
library function would look like in this environment, and how you would want
it to translate to other things? Try implementing "print", for instance.

~~~
JoelMcCracken
Print wouldn't be a good example, but think more along the lines of a
Turing-machine-esque definition, which reads from and writes to a "tape".
"print" wouldn't work, but, for example, something that takes an email address
as input and either accepts it if it is validly formatted or rejects it if not
would be a good example. It doesn't require very much to get working, but it
is hard to do correctly (how many places don't accept "+" in emails, for
example).

So, possibly, it would be more useful for yes-or-no questions. Also,
algorithms that are intended to do things quickly wouldn't work very well,
since everything on this platform is naturally very slow. It would be a bit
silly to write a quicksort with this, for example.

Other examples that I can see working well:

xml parser

json parser

.... parser

html stripping

email format manipulation

basic binary file manipulation (untarring, possibly, reading/manipulation of
networking packets...)

diff
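
To make the "tape" idea a bit more concrete, here is a rough sketch in Python
of what such an interpreter and one library routine could look like. The
instruction set (`load`, `read`, `add`, `jeq`, `jmp`, `accept`, `reject`), the
names `run_program` and `ONE_AT_PROGRAM`, and the step limit are all invented
for this example. The routine is also nowhere near real email validation: it
only accepts input that contains exactly one "@" that is neither the first nor
the last byte.

```python
def run_program(program, tape, max_steps=100_000):
    """Run a list of instruction tuples over a read-only byte tape.

    Returns True on "accept" and False on "reject".
    """
    regs = {}
    pc = 0
    for _ in range(max_steps):
        op, *args = program[pc]
        pc += 1
        if op == "load":      # ("load", reg, value): reg = value
            regs[args[0]] = args[1]
        elif op == "read":    # ("read", reg, idx_reg): reg = tape[idx] or -1
            idx = regs[args[1]]
            regs[args[0]] = tape[idx] if 0 <= idx < len(tape) else -1
        elif op == "add":     # ("add", reg, value): reg += value
            regs[args[0]] += args[1]
        elif op == "jeq":     # ("jeq", reg, value, target): jump if equal
            if regs[args[0]] == args[1]:
                pc = args[2]
        elif op == "jmp":     # ("jmp", target): unconditional jump
            pc = args[0]
        elif op == "accept":
            return True
        elif op == "reject":
            return False
        else:
            raise ValueError(f"unknown instruction: {op}")
    raise RuntimeError("step limit exceeded")


AT = ord("@")

# Hypothetical library routine, written directly in the toy instruction set.
ONE_AT_PROGRAM = [
    ("load", "i", 0),        # 0: i = index into the tape
    ("load", "n", 0),        # 1: n = number of "@" seen so far
    ("read", "c", "i"),      # 2: c = tape[i], or -1 past the end
    ("jeq", "c", -1, 10),    # 3: end of input -> final checks
    ("jeq", "c", AT, 6),     # 4: found an "@"
    ("jmp", 8),              # 5: any other byte -> advance
    ("jeq", "i", 0, 11),     # 6: "@" as first byte -> reject
    ("add", "n", 1),         # 7: count the "@"
    ("add", "i", 1),         # 8: advance
    ("jmp", 2),              # 9: loop
    ("jeq", "n", 1, 12),     # 10: exactly one "@"? then check last byte
    ("reject",),             # 11
    ("add", "i", -1),        # 12: step back to the last byte
    ("read", "c", "i"),      # 13
    ("jeq", "c", AT, 11),    # 14: "@" as last byte -> reject
    ("accept",),             # 15
]
```

With this, `run_program(ONE_AT_PROGRAM, b"a+b@c")` accepts, while `b"@c"`,
`b"a@"` and `b"a@b@c"` are rejected. The interpreter loop is the only part
each host language would have to reimplement; the program itself is just data.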

------
sofal
Imagine trying to write a Java bytecode interpreter in several different
programming languages. This would be much easier than what you're thinking,
because the libraries already exist and they didn't have to be written in
straight up Java bytecode either. Also, you could skip all the languages that
have already been fully implemented on the JVM anyway.

~~~
JoelMcCracken
I suppose much of the point is the emphasis on simplicity. I very much imagine
that a Java bytecode interpreter would be much harder to implement than an
intentionally simplistic assembly language.

------
JabavuAdams
What about a wizard-style code generator, instead?

I'm thinking of a machine-readable algorithm collection (with options and
trade-offs defined) that rather than being executed, generates source code.
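
A very rough sketch of what I mean, in Python. Everything here is made up for
illustration: the `TEMPLATES` catalogue, the `generate` function, the spec
format, and the single `not_found` option. A real collection would need far
richer descriptions than string templates.

```python
# Hypothetical catalogue: one algorithm, described once per target language.
TEMPLATES = {
    ("linear_search", "python"): (
        "def {name}(items, target):\n"
        "    for i, item in enumerate(items):\n"
        "        if item == target:\n"
        "            return i\n"
        "    return {not_found}\n"
    ),
    ("linear_search", "c"): (
        "int {name}(const int *items, int n, int target) {{\n"
        "    for (int i = 0; i < n; i++) {{\n"
        "        if (items[i] == target) return i;\n"
        "    }}\n"
        "    return {not_found};\n"
        "}}\n"
    ),
}


def generate(spec, target_language):
    """Return source code for the algorithm named in spec, as a string."""
    template = TEMPLATES[(spec["algorithm"], target_language)]
    return template.format(name=spec["name"], **spec["options"])


# Machine-readable request: which algorithm, under what name, which options.
SPEC = {
    "name": "index_of",
    "algorithm": "linear_search",
    "options": {"not_found": -1},
}
```

Calling `generate(SPEC, "c")` or `generate(SPEC, "python")` returns source
text to paste into a project, so nothing is interpreted at run time.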

------
Raphael
Just use JavaScript.

------
wmf
Why is this better than C + SWIG?

~~~
JoelMcCracken
I don't know that it isn't, exactly, however I do know that there aren't many
languages out there (that I know of?) which are viable because they have
access to this massive library of code via C + SWIG, and I wonder why.

I thought of this idea one day, and did a bit of checking into C + SWIG,
without much success toward figuring out why it isn't used like this. My
intuition is that adding these things as standard libs is for some reason a
little unclean. This solution allows language implementors to have more
control over how things are implemented, and, for example, does not depend on
something external like a C compiler. My intuition may be very wrong.

------
NonEUCitizen
forth

