
The memory models that underlie programming languages (2016) - bshanks
http://canonical.org/~kragen/memory-models/
======
AlexanderDhoore
The older I become, the more I realize how influential Lisp is/was. The Lisp
memory model presented here looks like object-oriented data structures, if you
squint a little. Java dragged C++ programmers 'halfway to Lisp' [1]. I now do
a lot of programming in Python, which is so obviously like Lisp that MIT
teaches it instead of Scheme. We don't have continuations and macros in
mainstream languages yet, but we'll get there.

[1] [http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/msg04045.html](http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/msg04045.html)

~~~
Bromskloss
> Python, which is so obviously like Lisp

Hang on! Do you mean some similarity other than what is talked about in the
article?

~~~
Tarq0n
The main dissimilarity between Python and Lisp is that you can't use code as
data in Python, so there are no macros. In most other ways they're pretty
similar.

~~~
SeanLuke
Serious lambdas and closures? Sophisticated array handling? A full-featured
numerical tower? Conditions? Python's namespaces seem primitive compared to
Lisp's. Etc.

~~~
bogomipz
>"A full-featured numerical tower"

What is a numerical tower?

~~~
dkersten
[https://en.m.wikipedia.org/wiki/Numerical_tower](https://en.m.wikipedia.org/wiki/Numerical_tower)

------
vardump
Correct me if I'm wrong, but isn't this article talking about data layout in
memory, and not memory models at all?

[https://en.wikipedia.org/wiki/Memory_model_(programming)](https://en.wikipedia.org/wiki/Memory_model_(programming))

Regardless, it's interesting to read about "old" languages such as Cobol. The
way data was laid out in Cobol programs made sense given the resource
limitations of old computers.

~~~
jasone
There's a footnote for the first sentence of TFA that addresses this.

~~~
vardump
You mean this?

"² I’m calling these six conceptualizations “memory models”, even though that
term used to mean a specific thing that is related to 8086 addressing modes in
C but not really related to this. (I’d love to have a better term for this.)"

I think that's even more wrong. The Wikipedia article gives a pretty good idea
what the term means.

I think a good term for what he's talking about is "data [structure] memory
layout".

~~~
monocasa
He's just using the term archaically. Before caches were combined with SMP on
consumer-level systems (which is remarkably recent), the term 'memory model'
did mean various forms of data-layout decisions.

The 8086 thing was legitimately called a memory model: choosing how the
division into 64K segments would be treated logically by your program.
[https://en.wikipedia.org/wiki/Intel_Memory_Model](https://en.wikipedia.org/wiki/Intel_Memory_Model)
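
Concretely, real-mode addressing is just this arithmetic; a minimal C sketch
(the function name is mine):

    #include <inttypes.h>
    #include <stdio.h>

    /* 8086 real mode: a 20-bit physical address is formed from a 16-bit
       segment and a 16-bit offset. The memory models (tiny, small,
       large, ...) just decided which of these parts your pointers
       carried around. */
    static uint32_t physical(uint16_t segment, uint16_t offset) {
        return ((uint32_t)segment << 4) + offset;
    }

    int main(void) {
        /* Different segment:offset pairs can alias the same byte. */
        printf("%05" PRIX32 "\n", physical(0x1234, 0x0010)); /* 12350 */
        printf("%05" PRIX32 "\n", physical(0x1235, 0x0000)); /* 12350 */
        return 0;
    }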

~~~
vardump
> The 8086 thing was legitimately called a memory model...

Damn, completely forgot about that despite using many of those segmentation
rules like "tiny", "large" and so on a long long time ago.

------
devit
This is the best way to categorize programming languages, but applying it only
to relatively unpopular languages seems a strange choice.

I'd categorize the popular ones like this:

- One sparse byte array: C, C++

- GC heap with class instances with fields: Java, C#, etc.

- Affine structs and enums on the stack, plus library support for heap and
other models: Rust

- Dictionaries and primitives: JavaScript, Python, Ruby, etc.

- Immutable structs and enums on the heap: (safe) Haskell

- Textual/array/dictionary variables, global and local: bash and other shells

~~~
dingo_bat
> One sparse byte array: C, C++

What does sparse mean in this context?

~~~
slrz
Thinly populated. The whole address space is the array, and addresses/pointers
are indices into it. This point of view isn't necessarily supported by the
actual language specifications, though.
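
A minimal C sketch of that view (the pointer-to-integer round trip is
implementation-defined per the standard, but behaves as expected on common
platforms):

    #include <stdint.h>
    #include <stdio.h>

    /* The "one sparse byte array" view: every object lives somewhere in
       a single flat array of bytes, and a pointer is just an index into
       it. The C standard is stricter than this picture, but it's the
       model most C programmers carry around. */
    int main(void) {
        unsigned char buf[8] = {0};
        uintptr_t base = (uintptr_t)&buf[0];  /* pointer as integer index */
        uintptr_t idx5 = (uintptr_t)&buf[5];
        printf("buf[5] sits %ju bytes past buf[0]\n",
               (uintmax_t)(idx5 - base));     /* prints 5 */
        return 0;
    }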

------
garmaine
Really disappointed with the treatment of Forth and other unconventional
languages. Forth doesn't have anything really resembling a von Neumann
machine's random-access memory. Neither, for the record, do Turing machines.
But the author dismissed these as oddities when in fact they are the most
distinctive examples of something different.

~~~
zokier
Well, many Forths provide some form of access to a heap [1][2] which isn't
altogether that different from e.g. C malloc.

[1] [https://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Heap-Allocation.html](https://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Heap-Allocation.html)

[2] [http://www.mosaic-industries.com/embedded-systems/legacy-products/qed2-68hc11-microcontroller/software/chapter_05_heap_memory_manager](http://www.mosaic-industries.com/embedded-systems/legacy-products/qed2-68hc11-microcontroller/software/chapter_05_heap_memory_manager)

------
imglorp
I'm interested in human-like storage layers and wondering if anything has been
done here. Content-addressable memory is one idea, but it doesn't go far
enough. Neural nets are a lot closer, but at the moment they seem limited to
pattern matching rather than storage and exploration, although I guess you
could say supervised training is equivalent to storage.

I'm thinking of a DSL on top of a pure associative memory, which would
remember Things with weighted connections to other Things. Matching would
surface not only the idea you asked for but also related, more distant ones.
Is there anything like this?
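
As a toy C sketch of the kind of store I mean (all names and weights
invented):

    #include <stdio.h>
    #include <string.h>

    #define MAX_THINGS 16
    #define MAX_LINKS  8

    /* Toy associative store: Things connected by weighted links. A
       query returns not just the matching Thing but its neighbors,
       filtered by a weight threshold -- the "related, more distant"
       part. */
    typedef struct {
        const char *name;
        int link[MAX_LINKS];      /* indices of related Things */
        float weight[MAX_LINKS];  /* strength of each association */
        int nlinks;
    } Thing;

    static Thing things[MAX_THINGS];
    static int nthings;

    static int remember(const char *name) {
        things[nthings].name = name;
        return nthings++;
    }

    static void associate(int a, int b, float w) {
        Thing *t = &things[a];
        t->link[t->nlinks] = b;
        t->weight[t->nlinks++] = w;
    }

    static void recall(const char *name, float min_weight) {
        for (int i = 0; i < nthings; i++) {
            if (strcmp(things[i].name, name) != 0) continue;
            printf("%s ->\n", name);
            for (int j = 0; j < things[i].nlinks; j++)
                if (things[i].weight[j] >= min_weight)
                    printf("  %s (%.1f)\n",
                           things[things[i].link[j]].name,
                           things[i].weight[j]);
        }
    }

    int main(void) {
        int dog = remember("dog"), bark = remember("bark");
        int wolf = remember("wolf");
        associate(dog, bark, 0.9f);   /* strongly related */
        associate(dog, wolf, 0.4f);   /* weakly related   */
        recall("dog", 0.3f);          /* prints both, weakest last */
        return 0;
    }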

------
mmjaa
The last time this was posted, the author had me at "Interlude: why is there
no Lua, Erlang, or Forth memory model?" .. and .. a year later, it still holds
true.

Lua doesn't get quite the cred it should in the language wars. It's one of
those "just going under the radar, getting things done" languages..

EDIT: like, isn't everything a _finite map_ eventually, or at least partially
expressible as one? idktb......

------
lispm
The LISP model is/was slightly different. There were generally a lot of
slightly different LISP models, but the first few LISPs were organized around
cons cells and atoms. Actually, the latter were 'atomic symbols'. An atomic
symbol was a kind of object with some attributes stored in a property list:
typically things like its print name, value, and function. Even numbers were
stored as a kind of pseudo atomic symbol. See the Lisp 1.5 manual from 1962,
section 7.3, where the memory model is described.
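
A very rough C rendering of that picture (a sketch of the idea, not the
actual Lisp 1.5 word layout):

    /* Memory is made of cons cells, and an "atomic symbol" is itself
       list structure: a cell whose car holds a special atom marker and
       whose cdr points at the property list (print name, value,
       function...). The marker value here is hypothetical. */
    struct cell {
        struct cell *car;   /* "address" part   */
        struct cell *cdr;   /* "decrement" part */
    };

    #define ATOM_MARKER ((struct cell *)-1)  /* hypothetical atom tag */

    static int is_atomic_symbol(const struct cell *c) {
        return c->car == ATOM_MARKER;  /* property list hangs off c->cdr */
    }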

LISP models then evolved: they still used cons cells, but there now were a
bunch of primitive objects (which were not introduced by Java, but existed in
LISP long before) like certain numbers and characters. LISP then also no
longer organized everything as an atomic symbol. Atoms were now symbols and a
bunch of other data types. For example, numbers were no longer (pseudo)
atomic symbols. There are still atoms, but 'atom' then only meant: any data
type which is not a cons cell. Whereas originally there were only cons cells
and atomic symbols.

LISP tries to avoid storing pointers to primitive numbers/characters and can
store them directly. Most objects are tagged, and tags for primitive types are
stored without added words. Thus a cons cell holding two numbers can be just
two words, each word being a primitive number. The implementation may also
avoid tagging cons cells; instead, the pointer to a cons cell will be tagged.
The same is true for vectors. Vectors also need not only be vectors of
pointers, but can store primitive objects directly and may be specialized for
some primitive types: for example, a vector of bits is just a vector header
and a one-dimensional packed array of bits. The header can't be avoided, but
the bits will be stored directly and not as an array of pointers to bits.
Some objects may be allocated on the stack or in registers. For example,
primitive numbers may exist multiple times in memory, but other data objects
may exist only once, like a string, which is a vector of characters.
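
The specialized vector case might look roughly like this in C (layout
invented for illustration):

    #include <stddef.h>
    #include <stdint.h>

    /* Sketch of a specialized bit vector: one header word plus packed
       bits, instead of a pointer per element. */
    typedef struct {
        size_t length;       /* header: number of bits */
        uint64_t bits[];     /* packed payload, 64 bits per word */
    } bit_vector;

    static int bit_ref(const bit_vector *v, size_t i) {
        return (v->bits[i / 64] >> (i % 64)) & 1;
    }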

So for primitive data types (some numbers, characters, ...), LISP tries to
avoid pointers to them, makes them as small as possible, has the tags
integrated into the word representation (thus on a 64-bit machine a fixnum
integer typically cannot use all 64 bits, because the tags need to be
represented), and arranges the tag bits so that they can be efficiently set
and checked, possibly even by processor instructions, as on a SPARC
processor, which has 'some' primitive support for that.
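
A minimal sketch of such a low-tag scheme in C (tag values and widths
invented; real Lisps differ, and the signed right shift relies on the
arithmetic-shift behavior of mainstream compilers):

    #include <assert.h>
    #include <stdint.h>

    /* Generic low-tag scheme: the bottom bits of each word carry the
       type. With a 3-bit tag, a fixnum keeps only the top 61 bits of a
       64-bit word -- which is why Lisp fixnums are narrower than the
       machine word. */
    enum { TAG_BITS = 3, TAG_MASK = (1 << TAG_BITS) - 1,
           TAG_FIXNUM = 0, TAG_CONS = 1, TAG_CHAR = 2 };

    typedef uintptr_t word;

    static word make_fixnum(intptr_t n)  { return ((word)n << TAG_BITS) | TAG_FIXNUM; }
    static intptr_t fixnum_value(word w) { return (intptr_t)w >> TAG_BITS; }
    static int is_fixnum(word w)         { return (w & TAG_MASK) == TAG_FIXNUM; }

    int main(void) {
        word w = make_fixnum(-42);
        assert(is_fixnum(w) && fixnum_value(w) == -42);
        /* A cons pointer would carry TAG_CONS in its low bits, which
           works because cells are aligned to 8 bytes anyway. */
        return 0;
    }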

Other types of things in memory then are symbols, functions (including
machine-code functions) and record-like objects: structures and instances of
classes. Those records usually use a vector to store their slots. Even more
data types exist with a low-level representation, like hashtables and
multi-dimensional arrays. The organization of these objects in memory can be
relatively simple (one big pool) or complex (multiple type-sorted pools with
generations).

