
Crema: A Sub-Turing Programming Language - omnibrain
http://ainfosec.github.io/crema/
======
schoen
This is comparable to a more practically usable version of Douglas
Hofstadter's BlooP.

[https://en.wikipedia.org/wiki/BlooP_and_FlooP](https://en.wikipedia.org/wiki/BlooP_and_FlooP)

The inspiration for claiming this more restricted language is a security
benefit comes directly from Sassaman, Patterson, Bratus, and Shubina:

[http://langsec.org/papers/Sassaman.pdf](http://langsec.org/papers/Sassaman.pdf)

(See "Principle 1". "Computational power is an important and heretofore
neglected dimension of the attack surface. Avoid exposing unnecessary
computational power to the attacker. An input language should only be as
computationally complex as absolutely needed, so that the computational power
of the parser necessary for it can be minimized. For example, if recursive
data structures are not needed, they should not be specified in the input
language.")

Further research on these ideas has continued under the name of the "language-
theoretic security research program".

[http://www.langsec.org/](http://www.langsec.org/)

This language is surely intended as a contribution to that project. (The
behavior of programs in Crema, as in BlooP, is decidable, which should help
with correctness verification.)

~~~
kazinator
You can easily put a cap on computational power without dumbing down the
language, via external resource limits: how many CPU cycles can be executed
(or abstract processor time), how much memory can be allocated, and how deep
the stack can go.

Such parameters are easily administered and easy to understand for IT people.

The limitations of a less-than-Turing language don't translate nicely into a
hard cap on running time. Attackers can find convoluted programs that
maximize running time while skirting every available limit.
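
The external-limit approach can be sketched with a trace-based step budget, a rough stand-in for the "abstract processor time" cap described above. This is my own CPython-specific illustration (the function names and budget sizes are made up), not any particular sandboxing library:

```python
import sys

def run_with_budget(fn, max_steps):
    """Run fn() but abort once it has executed max_steps line events.

    A rough stand-in for an 'abstract processor time' cap; relies on
    CPython's sys.settrace and would not port to other runtimes.
    """
    steps = 0

    def tracer(frame, event, arg):
        nonlocal steps
        if event == "line":
            steps += 1
            if steps > max_steps:
                raise RuntimeError("step budget exhausted")
        return tracer

    sys.settrace(tracer)
    try:
        return fn()
    finally:
        sys.settrace(None)

def small():
    return 2 + 2

def bomb():
    # would loop forever without an external cap
    n = 0
    while True:
        n += 1

print(run_with_budget(small, 1000))   # prints 4
try:
    run_with_budget(bomb, 1000)
except RuntimeError as e:
    print(e)                          # prints "step budget exhausted"
```

The point is that the cap lives entirely outside the guest program: the language stays fully expressive, and the limit is a single number an administrator can tune.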

~~~
madflame991
Agreed. Regular expressions are not Turing-complete, and a carefully crafted
one can still make your browser grind for quite a while. Try

    
    
      /(x+x+)+y/.test('xxxxxxxxxxxxxxxxxxxxxxxxxxxxx')
    

for instance. This regex doesn't even use non-regular extensions; it's
expressible in a very limited computational model and it fits in a tweet, yet
it renders my browser unresponsive.
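
The same blow-up is easy to reproduce outside the browser. A Python sketch (the backtracking counts in the comments are rough estimates, not measurements):

```python
import re

# A backtracking engine tries roughly 2^n ways to split a run of n x's
# between the two x+ groups before concluding there is no trailing 'y'.
pattern = re.compile(r'(x+x+)+y')

print(pattern.search('xxy') is not None)  # prints True: matches quickly
print(pattern.search('x' * 18))           # prints None, after ~2^18 failing paths
# With 29 x's, as in the snippet above, Python's re module also hangs
# for a very long time.
```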

I can do even more damage with a nasty shader; GLSL is not Turing-complete
either (in fact it looks a lot like Crema).

~~~
black_knight
Well, there _are_ faster regexp implementations than the one your browser is
most likely using [1].

[1]
[http://swtch.com/~rsc/regexp/regexp1.html](http://swtch.com/~rsc/regexp/regexp1.html)
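
The trick described in [1] is to simulate every reachable NFA state at once instead of backtracking. A minimal hand-built sketch for this particular pattern (the NFA here is written out by hand for illustration, not compiled from the regex):

```python
# States for a full match of (x+x+)+y, i.e. two-or-more x's then y:
#   0 = start, 1 = seen one x, 2 = seen two or more x's, 3 = accept
NFA = {
    (0, 'x'): {1},
    (1, 'x'): {2},
    (2, 'x'): {2},
    (2, 'y'): {3},
}

def matches(text, nfa=NFA, start=0, accepting=frozenset({3})):
    """Track the whole set of reachable states in one left-to-right pass.

    Runs in O(len(text) * states) on every input -- no backtracking,
    so no exponential worst case.
    """
    frontier = {start}
    for ch in text:
        frontier = {t for s in frontier for t in nfa.get((s, ch), ())}
        if not frontier:
            return False
    return bool(frontier & accepting)

print(matches('xxy'))     # prints True
print(matches('x' * 29))  # prints False, instantly -- no blow-up
```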

------
dgreensp
There's no clear value proposition given. What are we sacrificing Turing-
completeness in order to gain? For example, one thing a programming language
might gain by sacrificing Turing-completeness is that every program provably
halts; no infinite loops! This is called "total functional programming." There
are probably other interesting properties you can gain by sacrificing Turing-
completeness as well.
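
As a toy illustration of the "every program halts" property (in Python rather than a total language, and with `a + b` as a deliberately loose over-approximation of the real bound): any loop whose progress you can bound can be rewritten over a fixed range, at which point termination is evident by inspection.

```python
def gcd_total(a, b):
    """Euclid's algorithm with an explicit iteration bound.

    Euclid needs far fewer than a + b iterations, so the fixed-range
    loop always reaches the answer -- and trivially always terminates,
    which is the guarantee a total language gives for every program.
    """
    for _ in range(a + b):
        if b == 0:
            break
        a, b = b, a % b
    return a

print(gcd_total(12, 18))  # prints 6
```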

 _Crema can restrict the computational complexity of the program to the
minimum needed to improve security_

"Computational complexity" is a technical term that is related to performance,
but not security.

~~~
otabdeveloper
> There are probably other interesting properties you can gain by sacrificing
> Turing-completeness as well.

Not the author, but I wrote another sub-Turing language. In my opinion,
there are two useful properties: (a) type inference -- you can infer the type
of the whole program; (b) you can infer memory use, and thus avoid the need
for garbage collection or manual memory management. Both properties are very
useful for performance, of course. :)

~~~
Ono-Sendai
Hi, I'm writing a total functional language as well. I think garbage
collection is still useful, however, as it can collect garbage as the program
runs, not just at the end.

~~~
otabdeveloper
I've managed to make it so my memory use is completely deterministic -- no
garbage at all.

~~~
Ono-Sendai
That sounds interesting. Any references or notes on that I can read? (or
source code :) )

I would say that just because your memory use is deterministic doesn't mean
you don't get garbage. Garbage is just memory that isn't needed any more.

for example, in

    
    
      let
        a = f(x)
        b = g(y)
      in
       a + b
    

If there is working memory used in the body of f(), it will be garbage when
f() has finished executing, because it won't be referred to any more. So it
could be freed after f is executed and before g is executed, which may reduce
the maximum amount of memory required by the program.
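
In a strict, reference-counted setting this is exactly what happens. A CPython-specific sketch (the `Work` class and its size are made up for illustration; the behavior relies on CPython's reference counting and may differ on other runtimes):

```python
import weakref

class Work:
    """Stand-in for the working memory allocated inside f()."""
    def __init__(self, n):
        self.data = list(range(n))

probe = None  # lets us observe when f's working memory is reclaimed

def f(x):
    global probe
    work = Work(1000)            # temporary working memory of f
    probe = weakref.ref(work)
    return x + sum(work.data)

a = f(3)
# CPython's reference counting frees f's working memory the moment
# f returns -- before g (or anything else) runs:
print(probe() is None)  # prints True
```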

------
otabdeveloper
Cool!

Another sub-Turing language is my own 'tab',
([https://bitbucket.org/tkatchev/tab](https://bitbucket.org/tkatchev/tab)), a
text processing language/utility.

'Tab' was born out of a practical need to process text files in a manner
similar to SQL statements, so the focus is different from research languages.

Hope the trend catches on; sub-Turing languages are cool.

------
tromp
Instead of Crema's

    
    
        int i = 0
        int values[] = [6, 3, 8, 7, 2, 1, 4, 9, 0, 5]
        foreach (values as dummy) {
          int_print(values[i])
          i = i + 1
        }
    

it would be more straightforward to define loops over array indexing ranges,
as in

    
    
        foreach (int i indexing values) {
          int_print(values[i])
        }
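
This is essentially the construct most languages already provide; in Python, for instance, the index-carrying loop is spelled `enumerate`:

```python
values = [6, 3, 8, 7, 2, 1, 4, 9, 0, 5]

# The loop variable i tracks the index automatically; no manual counter
# to initialize or increment.
for i, v in enumerate(values):
    print(values[i])  # same as print(v)
```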

~~~
adrusi
It would seem that you could implement that abstraction within the language

    
    
        def int int_indexing(int values[])[] {
            int i = 0
            int result[] = list_create(list_length(values))
            foreach (values as dummy) {
                result[i] = i
                i = i + 1
            }
            return result
        }
    

I haven't tested this, and I had to look in the source code to find the
list_create function from the standard library. Of course, since the language
lacks type parameterization, you'd need a different indexing function for each
type of array.

------
capitalsigma
Do they forbid recursion? I don't see anything about it on the wiki, but I
assume they must.

~~~
swolchok
crema/tests/fail/ includes several tests relating to recursion, including the
obvious mutual recursion workaround attempt. Given that they're in the fail
subdirectory, I conclude that it's testing that they're rejected.
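
Rejecting recursion, including the mutual-recursion workaround, amounts to checking the call graph for cycles. A sketch of how such a check might work (this is my own illustration, not Crema's actual implementation):

```python
def has_recursion(call_graph):
    """Return True if any function can (transitively) call itself.

    call_graph maps a function name to the names it calls; any cycle
    in the graph means direct or mutual recursion.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited / on current path / finished
    color = {}

    def dfs(f):
        color[f] = GRAY
        for g in call_graph.get(f, ()):
            c = color.get(g, WHITE)
            if c == GRAY:              # back edge: f is on its own call path
                return True
            if c == WHITE and dfs(g):
                return True
        color[f] = BLACK
        return False

    return any(color.get(f, WHITE) == WHITE and dfs(f) for f in call_graph)

print(has_recursion({'f': ['f']}))              # prints True: direct recursion
print(has_recursion({'f': ['g'], 'g': ['f']}))  # prints True: mutual recursion
print(has_recursion({'f': ['g'], 'g': []}))     # prints False: acyclic, accepted
```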

------
black_knight
As I understand it, Crema's fixed-length loops give it the computational
power of primitive recursive functions, which is enough for most programming
applications. However, that assumes a program is something which runs,
computes a result and is done. A lot of programs fall outside that scope,
even if they do not require more computational complexity per se.

A daemon or an operating system are typical examples of programs without
bounded computation times. Not because of their computational complexity, but
because they continuously (or on demand) produce new results. The interesting
property for them is not termination, but _productivity_. They must keep being
productive, and _not become unresponsive_. This is a much more subtle notion
than termination, but I think the programming language Agda has come some way
with its co-recursive data types.
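
Python generators give a rough feel for productivity (only an analogy to Agda's co-recursive types, not the real, checked thing): the program as a whole never terminates, yet producing each next result does.

```python
from itertools import islice

def clock_daemon():
    """An 'unbounded' program: an infinite stream of ticks.

    The stream never ends, but each individual tick is produced in
    bounded time -- the loop body between yields always terminates.
    That per-step guarantee is what productivity asks for.
    """
    tick = 0
    while True:
        yield tick
        tick += 1

print(list(islice(clock_daemon(), 5)))  # prints [0, 1, 2, 3, 4]
```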

