
The Power of Interoperability: Why Objects Are Inevitable (2013) [pdf] - brianhempel
http://www.cs.cmu.edu/~aldrich/papers/objects-essay.pdf
======
weavejester
The author makes arguments for the functionality that OOP encompasses, but
doesn't attempt to explain why these should be grouped under one abstraction.
Instead the author assumes that this should be the case, and rather naturally
ends up with the answer of objects.

The author also has some curious opinions about FP approaches. For example, he
points out that Haskell's type system distinguishes between homogeneous and
heterogeneous lists, but then considers this to be a disadvantage. I get the
strong impression that the paper is arguing backward; assuming that the
properties of common OOP languages are the ideal, and then deriving from that
assumption the answer that OOP is the best paradigm.

------
hyp0
The author contrasts "objects" and "ADTs" (abstract data types), but uses - to
my mind - an impoverished definition of ADTs. I've always thought of _On the
criteria to be used in decomposing systems into modules_ (Parnas) as the key
paper for ADTs, which the author cites (at [26]) but doesn't use for this
purpose. Of course, different definitions are available.

The distinction the author uses is that objects can have different
implementations, but ADTs can't.

From this follows a stream of motherhood statements about the advantages of
having different implementations.

Curiously, from my scan, he doesn't seem to mention evolution _of interfaces_.
That is, instead of evolution being just a different implementation of an
identical interface, it is much more common for something to be added: extra
data, extra services, extra features. The typical approach is to leave
everything the same (so that old clients still work with this new API), and
strictly only _add_ new methods (and/or types with superset values).
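
For concreteness, a minimal Java sketch of what I mean by add-only evolution
(the interface and class names are invented): a later version keeps every
existing method untouched and only gains new ones, so old clients keep working
unchanged.

    import java.util.List;

    // Hypothetical v1 interface: old clients are written against exactly this.
    interface UserStore {
        void add(String username);
        List<String> all();
    }

    // Hypothetical v2: everything from v1 is still there, unchanged; evolution
    // happens strictly by addition. A default method means even existing
    // implementations keep compiling without modification.
    interface UserStoreV2 extends UserStore {
        default boolean contains(String username) {      // newly added service
            return all().contains(username);
        }
    }

    public class AddOnlyEvolution {
        // An old client keeps working because it only sees the v1 surface.
        static void oldClient(UserStore store) {
            store.add("alice");
            System.out.println(store.all());
        }

        public static void main(String[] args) {
            UserStoreV2 store = new UserStoreV2() {
                private final List<String> users = new java.util.ArrayList<>();
                public void add(String username) { users.add(username); }
                public List<String> all() { return List.copyOf(users); }
            };
            oldClient(store);                             // unchanged v1 client
            System.out.println(store.contains("alice"));  // new v2 feature
        }
    }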

My reasonably involved studies of the evolution of Java classes in the
standard packages for the purposes of serialization (which must store and
recreate them), showed me that in practice, it's pretty rare for an
implementation to change _without the interface also changing_. That is, I'm
claiming:

    
    
      Interface evolution is more common than implementation evolution alone.
    

To be clear: although implementation evolution can and _does_ occur (e.g. the
interning aspects of String recently), it's pretty rare for it to occur without
the interface also changing. Not counting simple bug-fixes, which don't
substantially change the implementation, and also not taking into account
interfaces that were initially designed to have multiple implementations (e.g.
collections, swing/awt, some networking classes etc), as these aren't
evolution. (Though, I suppose adding a new implementation of existing
interfaces e.g. LinkedHashMap (ordered map) counts as "evolution".)

This higher frequency of interface evolution may be a reflection that we get
paid for features, not refactoring.

~~~
discreteevent
You can't have large flexible systems of any kind without some sort of fixed
interfaces or protocols. There must be millions of examples. Sockets, ODBC,
drivers, USB, nuts and bolts etc etc.

~~~
hyp0
you skipped parts of my comment: you can have those benefits through backwards
compatibility, which you get by evolving only by adding, not by changing the
part of the interface that is already there. All those examples you mention
evolve, with later versions adding features that require the interface to be
added to. They aren't "fixed", they are "backwards compatible".

This also applies implicitly to any client for interop: you don't have to use
the whole interface, just the features you want. In effect, you are using a
different, smaller interface.

------
fsloth
Thank you, the term "service abstraction" indeed describes the - to my mind -
optimal use case for the object model.

That is, as an easily discoverable interface between an established and
supported platform and client code.

This is the word I had been missing when discussing the ups and downs of the
object model within our product's codebase.

Suddenly I have a slightly clearer abstraction model of software in my head.

This is the sort of nugget of gold I come to HN for.

------
chipsy
I agree with the idea that the abstractions provided are "inevitable." I
disagree with the idea that the way we're using them is optimal.

I've just spent time in my code reducing polymorphic classes into simpler type
identifier + record combinations. The rationale I am going by is that every
class I introduce proliferates new symbols: the class definition, the
instances, the methods, the values, the references to yet further symbols.
Each time I introduce new symbols I bulk up the code and diffuse its meaning;
the meaning of these symbols exists in the class, not at the call site. Thus,
everything in a class definition is doing its job correctly as long as you
want the class to be a true "black box" that hides behind the symbols it
provides.

But in many situations, the algorithm is basically "business logic" and
conveys the most meaning when inlined at the call site: "Given this type of
data, do this. Otherwise, do that." This type of algorithm imposes a schema
on the data, which is usually our true intention; having determined the
schema, we can cease further indirection and begin our actual computation. And
in this code, ADTs become less important, because we've already used those to
query for the data; when we have the queries abstracted, we can stop. We don't
need to abstract the meaning any further. Relational database modelling is
analogous; when the data is normalized, it's very indirect, yet also very
simple. And this pattern can be followed from within a program as well.
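
To make that concrete, here's roughly the kind of thing I mean, sketched in
Java with made-up names: the data carries a plain type tag, and the branching
"business logic" sits inline at the call site instead of behind virtual
methods.

    // A plain record plus a type tag, instead of a class hierarchy with
    // virtual methods. All names here are invented for illustration.
    enum ShapeKind { CIRCLE, RECT }

    record Shape(ShapeKind kind, double a, double b) {}  // a,b: radius or w/h

    public class InlineLogic {
        public static void main(String[] args) {
            Shape[] shapes = {
                new Shape(ShapeKind.CIRCLE, 2.0, 0.0),
                new Shape(ShapeKind.RECT, 3.0, 4.0),
            };

            // The logic lives at the call site: given this kind of data, do
            // this; otherwise, do that. No class hides the decision.
            double total = 0;
            for (Shape s : shapes) {
                total += switch (s.kind()) {
                    case CIRCLE -> Math.PI * s.a() * s.a();
                    case RECT   -> s.a() * s.b();
                };
            }
            System.out.println(total);
        }
    }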

Consider the "composable component-entity" model that has become somewhat of a
cargo cult fad. These systems generally understand the entity as a record of
"which components are used/where to find them/how they fit together". The
implication is that a definite schema of "types of components and their
properties" exists at or near the top level, and entities do not hide
arbitrary blobs of data and functionality in deep recesses, as tends to happen
when building from class inheritance. Since the concept is poorly defined,
actual implementations are all over the place, of course, but the general
direction of it is towards flat/primitive/normalized.

The result I got from killing off some classes and inlining their
functionality was that I found some lurking unused variables, clarified the
top-level, improved the code's flexibility, and reduced total lines of code.
So at least in this instance I seem to be on the right track.

~~~
jamii
Bitsquid recently had an interesting 3-part post on the design of the entity
system for their game engine:

[http://bitsquid.blogspot.com/2014/08/building-data-
oriented-...](http://bitsquid.blogspot.com/2014/08/building-data-oriented-
entity-system.html)

[http://bitsquid.blogspot.com/2014/09/building-data-
oriented-...](http://bitsquid.blogspot.com/2014/09/building-data-oriented-
entity-system.html)

[http://bitsquid.blogspot.com/2014/10/building-data-
oriented-...](http://bitsquid.blogspot.com/2014/10/building-data-oriented-
entity-system.html)

Data is stored in flat, homogeneous blobs and linked together by integer ids
instead of pointers.
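
The gist, as a toy Java sketch (names invented): each component is a set of
flat parallel arrays, and an entity is nothing more than an integer index used
to look rows up across them.

    // A toy data-oriented entity store: no pointers between objects, just
    // flat homogeneous arrays indexed by an integer entity id.
    public class EntityBlobs {
        static final int MAX = 1024;

        // Position component, stored as flat arrays.
        static double[] posX = new double[MAX];
        static double[] posY = new double[MAX];

        // Velocity component.
        static double[] velX = new double[MAX];
        static double[] velY = new double[MAX];

        static int nextId = 0;

        static int createEntity(double x, double y, double vx, double vy) {
            int id = nextId++;            // the entity *is* this integer
            posX[id] = x;  posY[id] = y;
            velX[id] = vx; velY[id] = vy;
            return id;
        }

        // A "system" walks the flat data in order; no virtual dispatch.
        static void step(double dt) {
            for (int id = 0; id < nextId; id++) {
                posX[id] += velX[id] * dt;
                posY[id] += velY[id] * dt;
            }
        }

        public static void main(String[] args) {
            int player = createEntity(0, 0, 1, 2);
            step(0.5);
            System.out.println(posX[player] + ", " + posY[player]);
        }
    }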

------
al2o3cr
"Some of the advantages of object-oriented programming may be psychological in
nature. For example, Schwill argues that “the object-oriented paradigm...is
consistent with the natural way of human thinking” [28]. Such explanations may
be important, but they are out of scope in this inquiry; I am instead
interested in whether there might be significant technical advantages of
object-oriented programming."

This seems to be throwing the rhetorical baby out with the bathwater -
programming is not only a technical activity, but in large part a technical
activity performed BY HUMANS, and generally to satisfy requirements generated
by other humans.

~~~
eternalban
But it is a specialization.

------
stcredzero
_While there has unquestionably been some hype about objects over the years, I
have too much respect for the many brilliant developers I have met in industry
to believe they have been hoodwinked_

A careful student of psychology and history will soon realize that almost
everyone is at least partly hoodwinked about something.

[http://www.paulgraham.com/say.html](http://www.paulgraham.com/say.html)

That said, what's popular most often has at least some merit.

[https://www.youtube.com/watch?v=qu99MegHzgE](https://www.youtube.com/watch?v=qu99MegHzgE)

------
DanielBMarkham
Reading about this stuff reminds me of watching a grass fire: there's a huge
amount of smoke and activity going on, but I'm really not sure where the
flames are -- or if it should concern me.

Working with OCaml and F# over the past few years, in small to medium-sized
systems, I've been fortunate to have an environment where I can
use any paradigm I want in a natural setting. And while I've gone from an OO-
diehard to an FP fan during that time, I haven't given up on objects.

I think systems "mature into" objects. I do not believe they should start with
them. (Which is completely opposite of what I thought several years ago)
Functionality gets created, tested in various scenarios, and then -- and only
after a good amount of exercising by various other systems -- does it get
grouped into ADTs, objects, or whatnot.

Note that I am describing how these things are created, not what they _are_.
For practical purposes I could care less whether this chunk of bytes is
exposing a service or providing a type. Of much more immediate concern is _why
should it be in this shape?_ I think if you start functionally, you are free
to let the problem domain determine the shape of the mature code. Whereas if
you start with some kind of preconceived notions of where you're ending up and
why, you're making assumptions about the optimized solution that may have no
justification.

tl;dr -- you're asking the wrong question.

------
jph
The author is accidentally promoting functional programming, without realizing
it. He emphasizes "interoperable extensions" and "service abstractions" but
what he's really describing are functional type signatures and interfaces.

He writes: "I now propose a candidate for the leverage provided by object-
oriented service abstractions in design: The key design leverage provided by
objects is the ability to define nontrivial abstractions that are modularly
extensible, where instances of those extensions can interoperate in a first-
class way." But he describes is not "object-oriented" in any sense of
encapsulated data, internal state, and methods. Instead he describes the fold
function, map function, and higher order functions.

The author touches on Alan Kay. Kay wrote this: "I'm sorry that I long ago
coined the term "objects" for this topic because it gets many people to focus
on the lesser idea. The big idea is "messaging" ... The Japanese have a small
word -- ma -- for "that which is in between" -- perhaps the nearest English
equivalent is "interstitial". The key in making great and growable systems is
much more to design how its modules communicate rather than what their
internal properties and behaviors should be."

~~~
glurgh
Do you really believe the author of a paper explicitly about investigating the
technical advantages of OO which also happens to touch on Haskell and ML is
simply clueless about functional programming? And this ignoramus has somehow
faked his way to the position of director of CMU's Software Engineering PhD
program.

This seems like a stubbornly obtuse way to avoid engaging with the paper's
arguments.

~~~
sesteel
Indeed; I have had the pleasure of corresponding with Dr. Aldrich in
the past. I had interest in implementing a static compiler for CMU's Plaid
programming language, but found Go sufficient at the time. He also became
involved with development of the Wyvern programming language. Nonetheless, he
is well versed in programming language design and theory as that is his area
of interest. It would be hard to imagine any related ignorance on his part.

~~~
seanmcdirmid
Jonathan Aldrich does come from the UW side of PL, which leans toward objects
(a bias I share as well). But he knows his FP. Still, I'm not much for
arguments by authority; either you see a proper comparison in the paper or not
(I haven't looked closely enough). What William and Jonathan both seem to
miss, however, is more a discussion on object thinking, which I think would
make the difference between OOP and fp more clear from a design perspective.

~~~
glurgh
I don't think this is really an argument from authority since I'm not saying
'he's right because he has a PhD and teaches at a renowned university'. I'm
saying that assuming he's ignorant of FP given both what he says in the paper
and his background is silly and shallow.

It's not really fair to say that Aldrich 'misses' a discussion of object
thinking, he just chooses to put the focus of that particular paper elsewhere
- this is from the intro

 _Some of the advantages of object-oriented programming may be psychological
in nature. For example, Schwill argues that “the object-oriented paradigm...is
consistent with the natural way of human thinking” [28]. Such explanations may
be important, but they are out of scope in this inquiry; I am instead
interested in whether there might be significant technical advantages of object-
oriented programming._

and

 _This success raises a natural question: Why has object-oriented programming
been successful in practice?_

~~~
seanmcdirmid
Everyone reads these papers from different perspectives, so it's quite easy to
say someone missed something (fair or not).

My criticism is that an analysis of OOP is incomplete if you avoid looking
at...objects. It's like saying, we are going to ignore the objects themselves,
and just focus on the technical features of the objects to see what the
technical advantages of these features are. It is very reductionist...while
objects favor a more holistic manner of thinking.

~~~
glurgh
We probably (broadly) agree more than disagree. One reason FP/OOP comparisons
are difficult is that FP is closely related to a mathematical formalism while
OOP isn't and can't be. I think the tack he's taking is 'can we explain the
popularity of OOP in terms of "technical" or really, "practical
programming/software engineering" advantages'. It's a tricky needle to
thread.

The objection 'that approach can't lead to useful insight' is a reasonable one
but I don't think he's taking the approach out of ignorance or because he
spaced out on something while typing it up - it's a deliberate choice,
whatever its merits.

~~~
seanmcdirmid
Ah, I never said he took this approach out of ignorance. I've talked about
this with him before, and never found his arguments lacking.

It is nice that something is still going on in OOP.

------
jamii
From a discussion we had about this paper, in the context of the relational
logic language we are working on:

Data abstraction is an interesting topic in a language that doesn't really
have data structures. ADTs are usually used to describe the allowable
operations on either a data structure or some side-effectful system.

For describing data _models_ , we have tables and (soon) integrity
constraints. An integrity constraint is a rule which aborts the transaction if
it produces any output. Simple integrity constraints like column types and
foreign keys can potentially be removed when the compiler can prove that they
will not be violated.

For derived data, rather than providing a method which returns the desired
information, we define a view on the data eg instead of a getTotalSalary
method we would define a rule that inserts data into the totalSalary table
which is then either lazily evaluated on query or materialised and
incrementally updated.
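
As a rough analogy in plain Java (names invented, and not how our language
implements it): totalSalary is a derived view recomputed from the base table
when queried, rather than a method hanging off some object.

    import java.util.Map;

    public class DerivedView {
        // The base "table" of salaries.
        static Map<String, Integer> salaries = new java.util.HashMap<>();

        // The "rule": totalSalary is entirely determined by the base data and
        // is evaluated lazily, on query.
        static int totalSalary() {
            return salaries.values().stream().mapToInt(Integer::intValue).sum();
        }

        public static void main(String[] args) {
            salaries.put("alice", 100);
            salaries.put("bob", 120);
            System.out.println(totalSalary());
        }
    }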

For modifying data we currently have no alternatives to directly working on
the base data. There is some amount of work on bidirectional relational views
but it seems like it adds a large amount of complexity for a problem we don't
currently have.

We do intend to make scalar values and functions extensible, similar to
[http://www.postgresql.org/docs/8.3/static/xtypes.html](http://www.postgresql.org/docs/8.3/static/xtypes.html)

For side effects, we request actions by adding a row to a table that has a
watcher attached. The interface is described by the set of tables available eg

    
    
        jobs started (id, input data...)
        jobs to be cancelled (id)
        jobs completed (id, output data...)
    

[http://db.cs.berkeley.edu/papers/eurosys10-boom.pdf](http://db.cs.berkeley.edu/papers/eurosys10-boom.pdf)
and
[http://db.cs.berkeley.edu/papers/vldb14-edelweiss.pdf](http://db.cs.berkeley.edu/papers/vldb14-edelweiss.pdf)
describe variations on this approach. Our current editor prototype uses the
Edelweiss model, which is my preferred choice if we can make it work smoothly.
The main difficulty so far is in constructing stable ids to ensure that
actions are idempotent.

In any case, for most protocols you need to constrain not just the types of
individual messages but the allowable sequences. ADTs don't help here. Where
the protocol can be described by a state machine we can simply encode that
state machine in a table and have a rule that verifies message histories and
provides a list of currently allowable actions.
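
To make the last point concrete, a small Java sketch (states and messages
invented): the allowed protocol is just a transition table, and replaying a
message history against it checks that the sequence, not only the individual
messages, is legal.

    import java.util.List;
    import java.util.Map;

    // The protocol for a job, encoded as a transition table rather than an ADT.
    public class ProtocolTable {
        // state -> (message -> next state)
        static final Map<String, Map<String, String>> TRANSITIONS = Map.of(
            "idle",    Map.of("start", "running"),
            "running", Map.of("cancel", "cancelled", "complete", "done")
        );

        // Replay a message history and report whether the sequence is allowed.
        static boolean valid(List<String> history) {
            String state = "idle";
            for (String msg : history) {
                Map<String, String> next = TRANSITIONS.get(state);
                if (next == null || !next.containsKey(msg)) return false;
                state = next.get(msg);
            }
            return true;
        }

        public static void main(String[] args) {
            System.out.println(valid(List.of("start", "complete")));  // true
            System.out.println(valid(List.of("complete", "start")));  // false
        }
    }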

~~~
seanmcdirmid
Why would stable ids be a problem? Is it because immutability doesn't give you
any stable references to latch on to?

~~~
jamii
Have a look at the Edelweiss paper I linked. Stable ids are needed for that
model. We would probably be fine with eg random uuids if we stuck with Bloom.

EDIT: Thinking about it, the issue is definitely not that it is immutable but
that it is declarative. In the absence of control flow, `new` is not
sufficient to provide unique identity because the rule in question may be run
zero or more times.

~~~
seanmcdirmid
A lot of work in Glitch goes into solving this problem. In the case of `new`,
we are able to form stable (reproducible) IDs through a memoized lexical stream (re-
lex will produce the same tokens if possible) and by using the call stack (so
the id is a path with special ids for loop indices). For a declarative (rule
based?) language, perhaps rule firing context could be used for a similar
purpose; I'll try to look at that paper and see if something clicks. I guess
it depends on what makes a unique object.

Though if you are completely declarative, why would object identities be
necessary?

~~~
jamii
For largely the same reason that relational databases need identity. The
classic example is the users table. Everything about the user, from their
username to their email address, can be changed. So you need some way to
stably reference that user. In the timeless Tarpit/Edelweiss world, the
current state of the program is a pure function of its inputs. That means that
the id has to be deterministically derived from the inputs.

For now we synthesise ids out of contextual information eg "the user created
at time T by event X". This works for now but it feels a little awkward. Once
we have written more code in Eve we will have a better set of examples to
guide thinking about a solution.
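
Something like the following Java sketch (fields invented): the id is a pure
function of the contextual inputs, so replaying the same inputs always
reproduces the same identity and the action stays idempotent.

    // A stable id derived purely from contextual inputs; re-running the same
    // rule on the same inputs reproduces the same id.
    public class StableId {
        static String userId(long createdAtMillis, String creatingEventId) {
            // Any deterministic derivation works; here, just a composite key.
            return "user:" + createdAtMillis + ":" + creatingEventId;
        }

        public static void main(String[] args) {
            String a = userId(1418000000000L, "event-42");
            String b = userId(1418000000000L, "event-42");
            System.out.println(a.equals(b));  // true: same inputs, same identity
        }
    }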

~~~
seanmcdirmid
Isn't that effectively mutation then? You have an ID with information related
to that ID that changes, or what we would call mutable fields in the object
world. Anyways, this is why I don't play with declarative programming models
anymore: I would wind up hacking in necessary things like identity and
mutation in very weird ways. This is why I decided instead to try applying
declarative time semantics to an imperative programming model.

Your solution is quite similar to mine: all blocks that continue in after
clauses underneath an event are indexed by the event instance (we memoize
event instances so they have identifiers). One problem that we had to deal
with is the discrete nature of an event handling context; it made sense to
have an additional clause that continues after the event (to create an object
like a new user) vs. code that executes discretely (like a physics step). The
former gets indexed by the event instance, the latter is shared across all
event instances (so you can't really create an object in anything other than
the top-level block of an event-after block).

~~~
jamii
I'm not sure where you got the idea that we are proposing a language where
nothing ever changes :p

Both Bloom and Edelweiss are ideas about how to model time/change
declaratively ie without having to specify control flow or incidental
ordering. That doesn't seem to be too different to Glitch. Whether you think
of imperative code pushing updates to data structures or declarative code
pulling updates from a stream is a matter of perspective. Either way, you still
have to figure out how to order and combine those updates.

Bloom takes the approach that each fact is true at an explicit point in time,
so rather than mutating in place you simply define the state at time T+1 in
terms of the state at time T. Both states are simultaneously accessible so you
don't have to worry about what order the changes happen in.

Edelweiss takes the approach discussed in Out Of The Tarpit where the true
state is just the list of input events and everything else is an incrementally
maintained view over these events. Edelweiss makes this practical by analysing
datalog code to determine when an input event can no longer contribute to the
state of the system. This is roughly analogous to GC, except that instead of
relying on pointers to determine unreachability it uses static analysis. The
appeal of this model is that you can always explain the state of the system to the
programmer by tracing backwards through the views.
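
As a crude Java analogy (names invented, and far simpler than the real thing):
the only true state is the append-only event log, and anything like a current
balance is a view maintained incrementally over it, so it can always be
explained by tracing back through the log.

    import java.util.ArrayList;
    import java.util.List;

    public class EventLogViews {
        record Deposit(String account, int amount) {}

        static final List<Deposit> eventLog = new ArrayList<>();  // true state
        static int aliceBalanceView = 0;                          // derived view

        static void input(Deposit d) {
            eventLog.add(d);
            // Incremental maintenance instead of recomputing from scratch.
            if (d.account().equals("alice")) aliceBalanceView += d.amount();
        }

        public static void main(String[] args) {
            input(new Deposit("alice", 100));
            input(new Deposit("bob", 50));
            input(new Deposit("alice", 25));
            System.out.println(aliceBalanceView);  // 125
        }
    }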

~~~
seanmcdirmid
I don't think you are, and neither am I. What we both want is controlled
change, we are just going about it from different directions.

I'm still trying to get a handle on Bloom and this discussion is useful,
thanks! Glitch is based on the same GC analogy but does a pure dynamic
analysis: input events generally never wash out of the system since they are
often clock ticks that step a physics simulation :)

------
hyp0
html version
[http://webcache.googleusercontent.com/search?q=cache%3Ahttp%...](http://webcache.googleusercontent.com/search?q=cache%3Ahttp%3A%2F%2Fwww.cs.cmu.edu%2F~aldrich%2Fpapers%2Fobjects-
essay.pdf)

I'd almost forgotten this trick, just google
_cache:[http://www.cs.cmu.edu/~aldrich/papers/objects-
essay.pdf](http://www.cs.cmu.edu/~aldrich/papers/objects-essay.pdf) _

request: could HN do this automatically pls? (similar to how it used to
provide a second link to scribd for pdf) I _always_ prefer an HTML version, to
assess it. If it's interesting, and diagrams/formulae are garbled, I can
always get the pdf later.

~~~
sitkack
PDFs render inline wonderfully for me in both browsers I use without plugins.

~~~
hyp0
I'm on a phone (but maybe that would be fast enough... I'm using stock: no
plugins)

------
galaxyLogic
The big idea with OOP is "messaging". Right. What makes messaging useful is
that the 'messagee' typically understands more than one message. If it only
understood one, we couldn't take advantage of composing the interaction with it
in different ways, as needed by the client.

When we can do that we can also replace the messagee with an instance of some
other conformant class, and thus we can reuse the client-side implementation
of the interaction-protocol with many different types of recipients.

Using the word "replace" here is easily misleading. A better word is "reuse".
We don't typically DELETE the original service object at all. More likely we
just REUSE the same client-code with different types of "service objects". The
point is it can interact with many differently implemented service-objects
that offer the same interface.

If each class defined only one message it responds to, that wouldn't look like
OOP, would it? It would look like FP where functions basically understand just
one message, "Tell me your value!".

Of course FP languages can simulate OOP and vice versa. Smalltalk for instance
defines the class BlockClosure which understands the message #value: (among
others). That means - "Tell me your value for this argument". But BlockClosure
also understands other messages, like #asString, #class etc. That is OOP.

We could say that OOP is a generalization of FP in this sense. Objects can do
what functions do, calculate their "value" for a given argument. But this
operation is explicitly declared as just one of the operations that object can
perform. There is NO SPECIAL syntax for it, like "()" in FP languages.

JavaScript can be characterized as a functional-object language because its
functions are first-class citizens but they can also have other operations
than calculating their value. We can for instance "send the message"
.toString() to them to get their source-code.

You could have a "messaging conversation" with a JavaScript function (-object)
like this:

Me to function: "Hey tell me your source-code". The function returns it

Me to my Helper -object: "Hey helper, does this source-code look secure?".
Helper returns "Yes".

Me to function: "OK, I trust you. Tell me your value for argument 42!".
Function returns it.

The conversation / messaging above could be reused with some other type of
object without having to modify the client-side code at all.

Summary: The essence of OOP is that objects can declare MULTIPLE messages they
respond to, thus defining the protocol through which they can be interacted with. Many
objects can support the same protocol of interacting with them, but provide a
different implementation as to how they actually calculate some (or all)
responses to the messages they get.
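
To illustrate with a small Java sketch (all names invented): a functional
interface answers just one message, "tell me your value", while an object
interface declares several, and any class implementing the same protocol can be
used by the same client code.

    import java.util.function.IntUnaryOperator;

    // An object interface declares several messages...
    interface Greeter {
        String greet(String name);      // message 1
        String description();           // message 2
    }

    class PoliteGreeter implements Greeter {
        public String greet(String name) { return "Good day, " + name + "."; }
        public String description()      { return "a polite greeter"; }
    }

    class CasualGreeter implements Greeter {
        public String greet(String name) { return "hey " + name; }
        public String description()      { return "a casual greeter"; }
    }

    public class Messages {
        // The client-side "conversation" is reused with any conformant object.
        static void converse(Greeter g) {
            System.out.println(g.description());
            System.out.println(g.greet("world"));
        }

        public static void main(String[] args) {
            converse(new PoliteGreeter());
            converse(new CasualGreeter());

            // ...while the FP-style case answers one message only.
            IntUnaryOperator square = x -> x * x;
            System.out.println(square.applyAsInt(42));
        }
    }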

------
rurban
This was 2013 and he had never heard of the CLOS arguments against traditional
objects as he describes them, and he credits Alan Kay with generics and
method-based OO dispatch, the functional approach?

All these arguments were already settled in the early 1990s, and people still
don't get it. CS scholars shouldn't care about the popularity of certain
languages, or do they wanna teach VB or PHP?

Objects are of course evitable, functions can do more, can do it better, and
functions can represent not only scoped blocks (variable hiding), but also
objects easily.

[http://www.gigamonkeys.com/book/object-reorientation-
generic...](http://www.gigamonkeys.com/book/object-reorientation-generic-
functions.html) talks a bit about this mindset, mentioning also Kay's simple
dynamic message passing approach.

~~~
Confusion

      Objects are a poor man's closures.
      Closures are a poor man's objects.
      -- Guy Steele

