Hacker News
Rich Hickey: The Value of Values (infoq.com)
186 points by dmuino on Aug 14, 2012 | 42 comments

This talk needs a different title, because it's way more important than the "Value of Values".

It's a call to stop writing object oriented software. He gives a convincing argument. You can probably find a thing or two to disagree with, but like he says, this is something we all know to be true. It's just that place oriented programming was a necessary limitation due to hardware. Eventually, that limitation will no longer exist, or cease to be relevant. At that point, the only thing that makes sense is "value oriented programming" and by extension immutable functional programming.

Datomic takes this same argument and applies it to databases.

Edit: And this might be crazy, but perhaps this is the answer to "why can a building be built in a week, but a software project will be a year late and broken?" What if you started on the first floor of the building, came in the next day, and the dimensions had changed? What if, when you needed two similar walls, you took a copy of one, but when you put a light switch on the copy, you accidentally put one on the wall you copied? Buildings are made up of values. A wall of a certain length. A staircase of 42 steps. These values don't change, and if they did, constructing buildings would be a hell of a lot harder.
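The copied-wall scenario corresponds closely to aliasing of mutable objects. A minimal Python sketch, assuming walls are modelled as collections of fixtures (the names are made up for illustration):

```python
# Hypothetical illustration: walls modelled as collections of fixtures.

# Mutable "place": assignment does not copy, both names refer to one list.
wall_a = ["outlet"]
wall_b = wall_a                # intended as a copy, actually an alias
wall_b.append("light switch")  # also changes wall_a

# Immutable value: "changing" the copy builds a new tuple instead.
value_wall_a = ("outlet",)
value_wall_b = value_wall_a + ("light switch",)  # value_wall_a is untouched

print(wall_a)        # ['outlet', 'light switch']
print(value_wall_a)  # ('outlet',)
```

With the tuple there is no way for the second wall to leak a change back into the first, which is the property the building analogy relies on.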

> [This talk is] a call to stop writing object oriented software. He gives a convincing argument. You can probably find a thing or two to disagree with, but like he says, this is something we all know to be true.

I agree very much with the underlying theme of the talk. But changing the focus to working with immutable values is only one step in a direction away from the dominant, imperative style of programming typified by OOP. There are (at least) two more big steps that need to be taken before I can see this more functional style of programming having any chance of going mainstream.

Firstly, we have to deal with the time dimension. The real world is stateful. All useful programs interact with other parts of the world in some form, and the timing of those interactions is often important. While programming with pure functions has advantages and lends itself very well to expressing some ideas, sooner or later we have to model time. There are plenty of relevant ideas with potential, but I don’t think we’re anywhere near getting this right yet.

Secondly, there are some algorithms that you simply can’t implement efficiently without in-place modification of data. If programs are to be expressed in a way that pretends this doesn’t happen, then the compilers and interpreters and VMs need to be capable of optimising the implementation to do it behind the scenes. At best, this is a sufficiently smart compiler argument, and even as those optimisations develop, I suspect that programmers will still have to understand the implications of what they are doing at a higher level to some extent so that they can avoid backing the optimiser into a corner.

We know from research into program comprehension that as we work on code we’re simultaneously forming several different mental models of that code. One of these is what we might call control flow, the order of evaluating expressions and executing statements. Another is data flow, where values come from and how they are used. Often the data flow is what we really care about, but imperative code tends to mix these two different views together and to emphasize control flow even when it’s merely an implementation detail. Moving to a more functional, value-based programming style is surely a step in the right direction, since it helps us understand the data flow without getting lost in irrelevant details.

To really get somewhere, though, I suspect we’ll need to move up another level. I’d like to be able to treat causes and effects (when a program interacts with the world around it or an explicitly stateful component) as first class concepts, because ultimately modelling of time matters exactly to the extent that it constrains those causes and effects. Within that framework, everything else can be timeless, at least in principle, and all those lovely functional programming tools can be applied.

Sometimes, for efficiency, I suspect we’ll still want to introduce internal concepts of time/state, but I’m hoping that however we come to model these ideas will let us keep such code safely isolated so it can still have controlled, timeless interactions with the rest of the program. In other words, I think we need to be able to isolate time not only at the outside edge of our program but also around the edge of any internal pieces where stateful programming remains the most useful way to solve a particular problem but that state is just an implementation detail.

So, I agree with you that this idea is about much more than just programming with immutable values. But I don’t think we can ever do away with a time dimension (or, if you prefer, place-oriented programming) completely. Rather, we need to learn to model interactions using both “external time” and “internal time” with the same kind of precision that modern type systems have for modelling relationships between data types. And whatever model we come up with had better not rely on scary words like “monad”, at least not to the general programming population rather than the guys designing programming languages. In fact, ironically (or not), it starts to sound a lot like the original ideas behind OOP in some respects. :-)

> a time dimension (or, if you prefer, place-oriented programming)

Place oriented is the opposite of having a time dimension. It means that at any time, the thing at that place might be different. This was done for your second argument, efficiency. The talk argues that if functional programming with values is efficient enough for you, then you shouldn't be writing object oriented software.

Values, on the other hand, absolutely have a time dimension. His "Facts" slide says it, as do Datomic and his revision control example.

He has another great talk that touches a bit more on time:


For the avoidance of doubt, when I’m talking about (not) doing away with a time dimension here, I mean from the perspective of the world changing over time as our program runs, not of associating values that effectively last forever with a certain point in time (as in purely functional data structures, version control systems, etc.).

That is, even if we follow the current trend of adopting more ideas from functional programming in mainstream programming languages, I’m saying that I doubt we will ever completely remove variable state, which is what I understand Rich to mean by “place-oriented programming”, or events that must happen in a certain order.

Instead, I think we will learn to control these aspects of our programs better. When we model time-dependent things, we want to have well-specified behaviour based on a clean underlying model, so we can easily understand what our code will do. Today, we have functions and variables, and we have type systems that can stop us passing the colour orange into a function eat(food). Tomorrow, I think we’ll promote some of these time-related ideas to first-class entities in our programming languages too, and we’ll have rules to stop you doing time-dependent things without specifying valid relationships to other time-dependent things. Some of the ideas in that second talk you linked to, like recognising that we’re often modelling a process, are very much what I’m talking about here.

As an aside, it’s possible that instead of adding first-class entities for things like effects, we will instead develop some really flexible first-class concepts that let us implement effects as just another type of second-class citizen. However, given the experience to date with monads in Haskell and with Lisps in general, I’m doubtful that anything short of first-class language support is going to cut it for a mainstream audience. It seems that for new programming styles to achieve mainstream acceptance, some concepts have to be special.

In any case, my hope is that if we make time-related ideas explicit when we care about them, it will mean that when we don’t need to keep track of time, we needn’t clutter our designs/code with unnecessary details. That contrasts with typical imperative programming today, where you’re always effectively specifying things about timing and order of execution whether you actually care about them or not, but when it comes to things like concurrency and resource management the underlying models of how things interact usually aren’t very powerful and allow many classes of timing/synchronisation bug to get into production.

I'm far from an expert on this topic, but it seems that you're still missing the main point--this talk is exactly about how to model time dependent data (immutable data), and how not to (mutable state, oo). Hickey definitely isn't advocating a system that can't change with time. Such a system would be pointless. He wants changes in state (which, naturally, occur on a time axis) to be represented by new values, not as in place, destructive updates of old values, as it's done in oo and currently popular databases.

If you look at the results of this approach in Datomic, I think you actually do see a design that treats time as much like a first-class citizen as it's ever been treated, in the sense that time essentially acts as a primary key, and developers are provided with a time machine that allows them easy and efficient access to any state that has existed in their data in the history of the application. (In theory, at least--I haven't personally tried Datomic).

I’m pretty sure I understand where Rich is coming from. I’m just arguing that while moving to persistent, immutable values is a big step in what could be a good direction, it’s not sufficient by itself to justify or cause a shift in mainstream programming styles on the scale of abandoning OOP (as suggested in the original post I replied to).

You lose things in that transition, very useful things that are widely applicable. We’re not going to just give those up without having something good enough to replace them, and I thought that in this specific talk those two areas I mentioned were glossed over far too readily.

For example, although Rich said very clearly that he thought it was OK to manipulate data in-place as an implementation detail of how a new value is built, he then argued that once the finished result was ready it should become an immutable value, and that we no longer need to use abstractions that are based on or tied to that kind of underlying behaviour. I contend that there are many cases where it is not so simple even with today’s technology, and that the idea of constraining in-place mutation to the initial creation of a value is a leaky abstraction that will not survive a lot of practical applications.
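For reference, the pattern being debated (mutate privately while building, then publish an immutable value) looks roughly like this in Python. This is only a sketch of the simple case; the function name is hypothetical, and it says nothing about the harder cases the comment above has in mind:

```python
def sorted_unique(items):
    """Return a new immutable tuple of the distinct items in sorted order."""
    scratch = list(set(items))  # private, mutable working copy
    scratch.sort()              # in-place mutation nobody else can observe
    return tuple(scratch)       # published result is an immutable value

original = (3, 1, 3, 2)
result = sorted_unique(original)
print(result)    # (1, 2, 3)
print(original)  # (3, 1, 3, 2)
```

The in-place sort is invisible to callers because `scratch` never escapes the function before it is frozen into a tuple.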

Later on, processes are briefly mentioned, but that part of the talk is about information systems, which are mostly concerned with pure data analysis. That makes it rather easy to dismiss the idea of modelling interactive processes in context, but unfortunately, very many real world programs do need to be concerned with the wider time-related concepts like effects.

I’m sure Rich himself is well aware of all of these issues. He’s discussed related ideas in far more detail on other occasions, including in the talk that lrenn cited above. But I find his challenge near the end of this talk, “If you can afford to do this, why would you do anything else? What’s a really good reason for doing something else?” to be rather unconvincing. For one thing, that’s a mighty big “if”, whether you interpret “afford” in terms of performance or dollars. For another thing, the answer to those questions could simply be “Because ‘this’ isn’t capable of modelling my real world, interactive system effectively.”

Why would you want to use software that hasn't changed since your house was built? I wish my house could be updated as frequently and easily as my software.

The fact that you can't update your house easily is because it is a physical object. Using values doesn't change how easy it is to change your software. If anything, it makes it easier because you know your changes can't screw anyone else up.

Great talk. Love the phrase "Information Technology not Technology Technology".

But I do think he has been a bit unfair to databases (and primary keys) generally, in characterizing them as "place oriented". The relational model is actually a brilliantly successful example of a value-oriented information technology.

The very foundation of the relational model is the information principle, in which the only way information is encoded is as tuples of attribute values.

As a consequence, the relational model provides a technology that is imbued with all of the virtues of values he discusses:

* language independence
* values can be shared
* don't need methods
* can send values without code
* are semantically transparent
* composition, etc.

It's true that we can think of the database itself as a place, but that's a consequence of having a shared data bank in which we try to settle a representation of what we believe to be true. Isolation gives the perception of a particular value. In some ways, this is just like a CDN "origin".

Also regarding using primary key as "place". Because capturing the information model is the primary task in designing a relational database schema, the designer wants to be fairly ruthless by discarding information that's not pertinent. For example, in recording student attendance, we don't record the name of the attending student - just their ID. This is not bad. We just decided that in the case of a name change, it's not important to know the name of the student as at the time of their attendance. If we decide otherwise, then we change the schema.

It wasn't a knock against relational databases. The issue is update in place. If you have a relational database that is append-only, there is no problem. He actually wrote one (Datomic).

The criticism of a primary key is again not anything against having primary keys, but that in a database that allows updates in place a primary key is meaningless. It is meaningless because it doesn't specify a value -- you pass a primary key and it could be anything by the time the receiver gets around to using it. If instead the value was immutable passing a primary key would be fine.

I've done work with ERP systems and having the ability to query against arbitrary points in time would be amazing. What was the value of inventory on this date? There are other ways to go about this (event sourcing) but it moves all the complexity to application code. The goal would be for the database itself to do the work for us.
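To make the "value of inventory on this date" question concrete, here is a small Python sketch of the application-level approach: an append-only list of facts that is folded up to a cutoff time. The data, field layout, and function name are all assumptions for illustration; this is not how Datomic or any particular ERP system stores things.

```python
from bisect import bisect_right

# Hypothetical append-only log of (timestamp, item, quantity_change) facts.
# Facts are only ever appended, in timestamp order; none is updated or deleted.
facts = [
    (1, "widget", +10),
    (3, "widget", -4),
    (5, "gadget", +7),
    (8, "widget", -1),
]

def inventory_as_of(log, when):
    """Fold every fact recorded at or before `when` into a stock level per item."""
    timestamps = [t for t, _, _ in log]
    cutoff = bisect_right(timestamps, when)  # index just past the last fact <= when
    stock = {}
    for _, item, change in log[:cutoff]:
        stock[item] = stock.get(item, 0) + change
    return stock

print(inventory_as_of(facts, 4))  # {'widget': 6}
print(inventory_as_of(facts, 8))  # {'widget': 5, 'gadget': 7}
```

Because nothing is ever overwritten, any past state can be recomputed; the cost is that this logic lives in application code rather than in the database.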

> you pass a primary key and it could be anything by the time the receiver gets around to using it. If instead the value was immutable passing a primary key would be fine.

Not sure what you mean by receiver here -- receiver as in the database or another component in your software hierarchy? The best way to ensure that your data goes unchanged across atomically disparate events is to insist that the database (Oracle, which is what I use in enterprise) lock the row. The easiest way is to use SELECT ... FOR UPDATE. The cursor you receive will have all the SELECTed rows locked until you close the cursor -- by commit or rollback. This will ensure that nobody can change your data whilst you're messing around with it, even if your goal is never to actually modify it, but merely to capture a snapshot of the data. Obviously, if you have a lot of different processes hitting the same data they will block and wait for the lock to free (though this behaviour can be changed), so depending on what you're doing this may not be the most efficient way, though it certainly is the most transactionally safe way. Another way is to use Oracle's ORA_ROWSCN which is, greatly simplified, incremented for a row when that row is changed. So: read your data incl. its ORA_ROWSCN, and when you update, only update if the ORA_ROWSCN is the same. A similar approach could be done with Oracle's auditing or a simple timestamp mechanism, but you obviously lose some of the atomicity from doing it that way.
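The "only update if the version is unchanged" idea can be sketched without Oracle. The example below uses Python's built-in sqlite3 module and an explicit `version` column as a stand-in for ORA_ROWSCN; the table, column names, and helper function are assumptions for illustration, not Oracle's API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, amount INTEGER, version INTEGER)")
conn.execute("INSERT INTO invoice VALUES (1, 100, 1)")

def update_amount(conn, invoice_id, new_amount, seen_version):
    """Apply the update only if the row still has the version we read earlier."""
    cur = conn.execute(
        "UPDATE invoice SET amount = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_amount, invoice_id, seen_version),
    )
    return cur.rowcount == 1  # False means someone else changed the row first

# Two readers both see version 1.
_, seen_by_a = conn.execute("SELECT amount, version FROM invoice WHERE id = 1").fetchone()
seen_by_b = seen_by_a

first = update_amount(conn, 1, 120, seen_by_a)   # succeeds, version becomes 2
second = update_amount(conn, 1, 90, seen_by_b)   # rejected, version no longer matches
print(first, second)  # True False
```

Unlike SELECT ... FOR UPDATE, nobody blocks here; the losing writer simply finds out its snapshot was stale and has to re-read and retry.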

> I've done work with ERP systems and having the ability to query against arbitrary points in time would be amazing.

You can do that in Oracle. You can SELECT temporally; so you could tell the DB to give you the state of the select query 5 minutes ago, subject to 1) flashback being enabled; and 2) the logs still having the temporal data.

Another way is to use an audit log table to store changes to data. We use this all the time in accounting/finance apps as people fiddling with invoices and bank account numbers must be logged; you can either CREATE TRIGGER your way to a decent solution, or use Oracle's built-in auditing version which is actually REALLY advanced and feature rich!

N.B.: I do not use other databases so my knowledge of them is limited, but it should give you some ideas at least!

I think the parent meant the value vs. reference separation.

Consider that you create a record, give it a primary key N, then start referring to that value by the primary key and at some point make an update to the record, the same primary key now refers to another value. The primary key is just a reference pointer to a placeholder (=record) and depending on whatnot the value in the placeholder can change to anything. So, you have to be careful of what you mean by the primary key because it's just a reference, not a value. In the value paradigm your primary key would be a hash, like in git, that would forever be that one value instead of referring to some value.
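A rough Python sketch of the "key is a hash of the value" idea. It is in the spirit of git's content addressing but does not reproduce git's actual object format; the store, the record fields, and the choice of SHA-256 over canonical JSON are assumptions for illustration.

```python
import hashlib
import json

store = {}  # hypothetical content-addressed store: key -> serialized record

def put(record):
    """Store a record under the SHA-256 hash of its canonical JSON form."""
    blob = json.dumps(record, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(blob).hexdigest()
    store[key] = blob
    return key

def get(key):
    """Return the record stored under `key`."""
    return json.loads(store[key])

k1 = put({"name": "Alice", "city": "Oslo"})
k2 = put({"name": "Alice", "city": "Bergen"})  # an "update" yields a new key

print(k1 != k2)         # True
print(get(k1)["city"])  # Oslo
```

Anyone holding `k1` keeps seeing the Oslo record no matter what is stored later, which is exactly what a conventional primary key into an updatable row cannot promise.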

Exactly. Primary key is a subset of attributes - thus necessarily populated by values.

It's worth considering the correspondence between the concept of functional dependency in the relational model, and the concept of a pure function in functional programming. The issue under discussion, then, is whether referential transparency is afforded in the database.

While referential transparency in a database is achieved momentarily at the right isolation level, it is not achieved in the eternal sense of a pure function.

This is because the functional dependency in FP encodes an intensional definition, whereas the functional dependency captured in a relation is extensional, usually modelling the state of the knowledge of the relevant world, and therefore being subject to change.

Great talk, and without having any experience with FP, it really makes sense on many levels. I love data, and how transparent it is, and how objects seem to get in the way a lot of the time. I like queues, and shipping data from one process to another rather than sharing objects. RESTful interfaces, etc. Those concepts and tools are powerful.

The only thing I'm not too comfortable with is that space isn't really infinite. Yes, it's much cheaper, but still not infinite. If we stored all our logs in an ever growing database, and expect to be able to access it all the time, this is really very expensive. This is why we rotate logs, archive them and trash them eventually. Sure, we can afford this expense for source control, because this data (source code) is amazingly small in comparison. I'm not sure how it translates to real data on our systems, which is immensely bigger.

Also thinking about it in the context of technologies like Redis. Redis manifests a lot of the advancement in technology in how memory is used. It's so vastly bigger and cheaper than before that we can afford to store our whole database in it, instead of on much slower disks. But then this super-fast in-memory database definitely faces storage size constraints that need to be considered...

Just a few random thoughts. Wish I could have a chat to Rich Hickey one day. Even if I could, I have a lot more to learn until then, so I'd make the most of this chat.

I think the notion is like garbage collection (calling "new", as he mentioned) — the illusion of infinite space. (http://mitpress.mit.edu/sicp/full-text/sicp/book/node119.htm...)

> This is why we rotate logs, archive them and trash them eventually.

I think an organization trashes old logs for out-of-band reasons - acquiring more disk space requires following an organization's procurement process which imposes tons of friction, or because compliance with applicable regulations requires saving, e.g., emails, for three months and saving them for longer is a legal risk.

Or... because I don't really care what time Postgres started up 2 years ago.

Old logs may be a privacy liability; and depending on the type of log and the load on the system, the fully loaded cost of keeping logs indefinitely may be too high to justify based on the revenue for a customer (I'm thinking of ISP logs in particular, for example).

If you found this interesting and have not tried Clojure yet, you should really give it a go. Learning Clojure teaches a lot about programming just because it is very well-designed.

Is there any convenient way to get notified when Rich Hickey pushes a new talk or article? I can't seem to find a RSS feed, mailing list or Twitter account to follow. Any advice appreciated!

You could try a Google Alert [ http://www.google.com/alerts ] for "Rich Hickey (talk | article)"

I've never used it, but it seems to be a very nice way to keep track of such things. Thanks a lot for the suggestion!

You could follow @richhickey on Twitter or just read Hacker News ;)

I think there is another great example of value based programming we use every day even on small scale: unix pipes.

cat file | grep .... | wc

There are no complex protocols involved between cat, grep and wc - just passing around the value (now I am not talking about mutable files, directories etc).

I have seen very few systems which are as simple yet as flexible and versatile. Conventional wisdom says it is because Unix is a set of small utilities where each program does just one thing right. After watching the talk we should note that these utilities pass around text values.

If you want to build something as powerful and flexible as the Unix command line, you should think about the value of decomposition as well as the value of values :)
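The same shape can be imitated in Python with small functions that only pass lines of text along. This is just an analogy to the pipeline above, with a made-up search pattern and made-up input standing in for the file; `wc` here counts lines only, like `wc -l`.

```python
def cat(lines):
    """Yield each line unchanged, standing in for `cat file`."""
    yield from lines

def grep(pattern, lines):
    """Yield only the lines containing `pattern` (plain substring match)."""
    return (line for line in lines if pattern in line)

def wc(lines):
    """Count lines, like `wc -l`."""
    return sum(1 for _ in lines)

text = ["alpha value", "beta place", "gamma value"]  # hypothetical file contents
print(wc(grep("value", cat(text))))  # 2
```

Each stage knows nothing about its neighbours beyond "a sequence of text lines", which is what makes the stages freely recombinable.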

Great talk. Most of these talks on functional programming make perfect sense. They also look ideologically superior.

My only problem is that object oriented programming looks more pragmatic in the real world. There are libraries, tools, tutorials, help forums and a lot of other stuff out there which helps anybody who wants to start learning OO to go from nothing to far places.

You can't say the same thing about functional programming. The community is too elitist, the tutorials are math heavy. And the tools are too ancient. Having to use simple text editors and selling them as technologies used to build large applications is a contradictory philosophy.

> The community is too elitist, the tutorials are math heavy. And the tools are too ancient. Having to use simple text editors and selling them as technologies used to build large applications is a contradictory philosophy.

Careful. Statements like this only create more "elitists" by insulting people. Have you seen Leiningen, Counterclockwise, or La-Clojure? Part of the reason you need all that tooling is because of the objects. If you haven't become proficient in a functional language, can you really say the tooling is insufficient? It's like telling someone their movie sucked without seeing it, or only staying for the first 5 minutes. When I get rid of my Foo, FooImpl, JPAFoo, FooService, FooServiceImpl, FooDao, Bar, BarImpl, etc, the requirements for my editor and tooling suddenly change. If I'm not using big heavy frameworks, I no longer need all those plugins. I don't need to be able to click on my Spring configuration and jump to the implementation. When I'm working in the REPL, I don't need heavy integration with Jetty (for example) because I never need to restart the server. If my restful webservice just needs to be a function that returns a map (Ring), then I don't need annotation support, or some framework plugin. If my code is data, my editor already supports both.

I need to move around functions, and compile my code. Code completion? Navigation? Sure, but Emacs, CC, La-Clojure can all do that. I hope you aren't insinuating that Emacs/Slime is a "simple" text editor ;)

Tutorials are their own issue. A new object oriented programming language only needs to teach you their syntax and their API. A Clojure tutorial targeted at someone who has only ever done serious work in an OOPL is going to have to explain not only Clojure, but fundamental concepts related to functional programming. Once you really learn one, the rest all make sense in the same way it's relatively easy to jump around OOPLs.

If you've accepted the technical argument, don't let those other things hold you back. The Clojure community is great, and the Eclipse and IntelliJ stuff has really come a long way.

+1 the requirement for tooling is less. I switch between IntelliJ and Emacs for Clojure development, and tools don't hold me back.

I did a fair amount of Java GWT/SmartGWT development last year and having both client side Java code and server side code running in twin debuggers was handy, but really only crucial because of the complexity of the whole setup. That said, I only write simple web apps and web services in Clojure and Noir and perhaps that is why I don't feel that I need complex tools.

I won't claim any level of tooling/IDE parity, but FPs are quickly getting up there.

As an example, Haskell has a very helpful community, lots of stimulating content (yes, some of it is math heavy but most is not), over 3000 packages on Hackage (many of which are really excellent), 700+ people on IRC at any time constantly talking/answering questions, at least 3 major web frameworks, many concurrency libraries, database drivers/libraries for almost anything, an astounding number of utility libraries and a real world-class, top of the line compiler (GHC) that produces blazing fast, robust code. Many companies are building commercial/proprietary tools with it for mission critical applications.

I just love it when people try to juxtapose "pragmatic" with "math-heavy". Obviously, in the real world no one ever uses math, all they do is print "Hello World" to the screen.

Unless you use Haskell, in which even printing "Hello World" to the screen is math heavy.

Clojure in Emacs with Slime is hardly a simple text editor. Emacs might be ancient, but that doesn't make it less powerful.

There is also plenty of great documentation for Clojure; Programming Clojure is one example, and the online documentation is excellent.

Well now, I love Clojure and write it for my hobby projects, but I have to disagree about the online documentation for it. Every command is documented, sure, but that doesn't make it excellent. For example, here's the doc for the `with-open' macro:

"bindings => [name init ...]

Evaluates body in a try expression with names bound to the values of the inits, and a finally clause that calls (.close name) on each name in reverse order."

I deliberately chose a bit of code that's actually pretty simple and straightforward; also, one that I know intimately. Now, knowing what `with-open' does as well as I do, this doc string almost makes sense on the first pass. But to the layman this is almost impenetrable. I've written a couple variants of this macro and I STILL have to read the docstring a couple of times to understand what it's saying.

The online docs are comprehensive but I would never use the term "excellent" to describe them.

Good point. The clojuredocs version[1] is a bit better, but you're right, it could be better.

[1] http://clojuredocs.org/clojure_core/clojure.core/with-open

This video, along with many others I've watched from him, espouses the purity of data, and talks about not tangling data with functions (object orientation).

In this video, he seems to go further and say that values are great for software projects that use multiple languages, in part because values are universal, meaning one doesn't need to translate classes into multiple languages.

However, regardless of whether you use an object oriented design or not, don't you usually have a set of functions you tend to perform on a set of data or values? For instance, you may not wrap your customer data behind a class and methods, but there are still going to be some rules related to all that data you're passing around. So in the multiple languages scenario, wouldn't you still have to translate those rules from language to language?

You would. The thing is that it's supposed to be easier and involve less code, since you don't need to port the interface boilerplate. At least that's how I understood it.

Has anyone seen a Sales CRM implemented with a temporal+value approach? Seems quite useful for tracking movement through a funnel.

You can use a stock RDBMS and still keep track of changes. Just keep a separate "Updates" table which consists of the tuple {class, object, change-description, changed-when, changed-by}

You don't need both class and object, but I prefer to log both object type and id.
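A minimal sketch of such an "Updates" table, using Python's built-in sqlite3 as the stock RDBMS. The `customer` table, the funnel stages, and the `set_stage` helper are hypothetical; only the five logged columns come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, stage TEXT);
CREATE TABLE updates (
    class TEXT, object INTEGER, change_description TEXT,
    changed_when TEXT, changed_by TEXT
);
""")

def set_stage(conn, customer_id, stage, when, who):
    """Update the current row and append a matching entry to the updates log."""
    with conn:  # one transaction: both statements commit or neither does
        conn.execute("UPDATE customer SET stage = ? WHERE id = ?", (stage, customer_id))
        conn.execute(
            "INSERT INTO updates VALUES (?, ?, ?, ?, ?)",
            ("customer", customer_id, f"stage -> {stage}", when, who),
        )

conn.execute("INSERT INTO customer VALUES (1, 'lead')")
set_stage(conn, 1, "qualified", "2012-08-14T10:00", "alice")
set_stage(conn, 1, "won", "2012-08-20T16:30", "bob")

history = conn.execute(
    "SELECT change_description, changed_by FROM updates "
    "WHERE class = 'customer' AND object = 1 ORDER BY changed_when"
).fetchall()
print(history)  # [('stage -> qualified', 'alice'), ('stage -> won', 'bob')]
```

The `customer` row still only holds the latest stage; the movement through the funnel is recovered from the log, so the history is only as complete as the discipline of always going through `set_stage`.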

Anything that touches financial records-keeping would be ripe for this kind of software, too. I spent quite some time on a financial services system and our number one enemy was mutable state.

Rich Hickey is a great thinker.

Is there anywhere I can go to get a collection of useful programming videos? Somewhere that aggregates videos like these after they are uploaded?

InfoQ is actually a pretty good place. Just scroll through their backlog and you're sure to find something.

You should be clearer about what you want; programming videos can be anything from printing hello world in Java to theoretical computer science where they don't even know how large the problems are.

Are you interested in languages, compilers, algorithms, data structures, parallelism, concurrency, databases, operating systems, virtual machines, garbage collectors, graphics, or something more meta like this 'Value of Values' video?

I can provide you with links to almost everything, but I'm not going to do the work until you tell me what you want.
