It's a call to stop writing object oriented software. He gives a convincing argument. You can probably find a thing or two to disagree with, but like he says, this is something we all know to be true. It's just that place oriented programming was necessary limitation due to hardware. Eventually, that limitation will no longer exist, or cease to be relevant. At that point, the only thing that makes sense is "value oriented programming" and by extension immutable functional programming.
Datatomic takes this same argument and applies it to databases.
Edit: And this might be crazy, but perhaps this is the answer to the "why can a building be built in a week, but a software project will be a year late and broken". What if you started on the first floor of the building, came in the next day and the dimensions had changed. What if when you needed two similar walls, you took a copy of one. But when you put a light switch on the copy, you accidentally put one on the wall you copied. Buildings are made up of values. A wall of a certain length. A staircase of 42 steps. These values don't change, and if they did, constructing buildings would be a hell of a lot harder.
I agree very much with the underlying theme of the talk. But changing the focus to working with immutable values is only one step in a direction away from the dominant, imperative style of programming typified by OOP. There are (at least) two more big steps that need to be taken before I can see this more functional style of programming having any chance of going mainstream.
Firstly, we have to deal with the time dimension. The real world is stateful. All useful programs interact with other parts of the world in some form, and the timing of those interactions is often important. While programming with pure functions has advantages and lends itself very well to expressing some ideas, sooner or later we have to model time. There are plenty of relevant ideas with potential, but I don’t think we’re anywhere near getting this right yet.
Secondly, there are some algorithms that you simply can’t implement efficiently without in-place modification of data. If programs are to be expressed in a way that pretends this doesn’t happen, then the compilers and interpreters and VMs need to be capable of optimising the implementation to do it behind the scenes. At best, this is a sufficiently smart compiler argument, and even as those optimisations develop, I suspect that programmers will still have to understand the implications of what they are doing at a higher level to some extent so that they can avoid backing the optimiser into a corner.
We know from research into program comprehension that as we work on code we’re simultaneously forming several different mental models of that code. One of these is what we might call control flow, the order of evaluating expressions and executing statements. Another is data flow, where values come from and how they are used. Often the data flow is what we really care about, but imperative code tends to mix these two different views together and to emphasize control flow even when it’s merely an implementation detail. Moving to a more functional, value-based programming style is surely a step in the right direction, since it helps us understand the data flow without getting lost in irrelevant details.
To really get somewhere, though, I suspect we’ll need to move up another level. I’d like to be able to treat causes and effects (when a program interacts with the world around it or an explicitly stateful component) as first class concepts, because ultimately modelling of time matters exactly to the extent that it constrains those causes and effects. Within that framework, everything else can be timeless, at least in principle, and all those lovely functional programming tools can be applied.
Sometimes, for efficiency, I suspect we’ll still want to introduce internal concepts of time/state, but I’m hoping that however we come to model these ideas will let us keep such code safely isolated so it can still have controlled, timeless interactions with the rest of the program. In other words, I think we need to be able to isolate time not only at the outside edge of our program but also around the edge of any internal pieces where stateful programming remains the most useful way to solve a particular problem but that state is just an implementation detail.
So, I agree with you that this idea is about much more than just programming with immutable values. But I don’t think we can ever do away with a time dimension (or, if you prefer, place-oriented programming) completely. Rather, we need to learn to model interactions using both “external time” and “internal time” with the same kind of precision that modern type systems have for modelling relationships between data types. And whatever model we come up with had better not rely on scary words like “monad”, at least not to the general programming population rather than the guys designing programming languages. In fact, ironically (or not), it starts to sound a lot like the original ideas behind OOP in some respects. :-)
Place oriented is the opposite of having a time dimension. It means that at any time, the thing at that place might be different. This was done for your second argument, efficiency. The talk argues that if functional programming with values is efficient enough for you, than you shouldn't be writing object oriented software.
Values on the other hand absolutely have a time dimension. His "Facts" slide says it as does datomic and his revision control example.
He has another great talk that touches a bit more on time:
That is, even if we follow the current trend of adopting more ideas from functional programming in mainstream programming languages, I’m saying that I doubt we will ever completely remove variable state, which is what I understand Rich to mean by “place-oriented programming”, or events that must happen in a certain order.
Instead, I think we will learn to control these aspects of our programs better. When we model time-dependent things, we want to have well-specified behaviour based on a clean underlying model, so we can easily understand what our code will do. Today, we have functions and variables, and we have type systems that can stop us passing the colour orange into a function eat(food). Tomorrow, I think we’ll promote some of these time-related ideas to first-class entities in our programming languages too, and we’ll have rules to stop you doing time-dependent things without specifying valid relationships to other time-dependent things. Some of the ideas in that second talk you linked to, like recognising that we’re often modelling a process, are very much what I’m talking about here.
As an aside, it’s possible that instead of adding first-class entities for things like effects, we will instead develop some really flexible first-class concepts that let us implement effects as just another type of second-class citizen. However, given the experience to date with monads in Haskell and with Lisps in general, I’m doubtful that anything short of first-class language support is going to cut it for a mainstream audience. It seems that for new programming styles to achieve mainstream acceptance, some concepts have to be special.
In any case, my hope is that if we make time-related ideas explicit when we care about them, it will mean that when we don’t need to keep track of time, we needn’t clutter our designs/code with unnecessary details. That contrasts with typical imperative programming today, where you’re always effectively specifying things about timing and order of execution whether you actually care about them or not, but when it comes to things like concurrency and resource management the underlying models of how things interact usually aren’t very powerful and allow many classes of timing/synchronisation bug to get into production.
If you look at the results of this approach in Datomic, I think you actually do see a design that treats time as much like a first-class citizen as it's ever been treated, in the sense that time essentially acts as a primary key, and developers are provided with a time machine that allows them easy and efficient access to any state that has existed in their data in the history of the application. (In theory, at least--I haven't personally tried Datomic).
You lose things in that transition, very useful things that are widely applicable. We’re not going to just give those up without having something good enough to replace them, and I thought that in this specific talk those two areas I mentioned were glossed over far too readily.
For example, although Rich said very clearly that he thought it was OK to manipulate data in-place as an implementation detail of how a new value is built, he then argued that once the finished result was ready it should become an immutable value, and that we no longer need to use abstractions that are based on or tied to that kind of underlying behaviour. I contend that there are many cases where it is not so simple even with today’s technology, and that the idea of constraining in-place mutation to the initial creation of a value is a leaky abstraction that will not survive a lot of practical applications.
Later on, processes are briefly mentioned, but that part of the talk is about information systems, which are mostly concerned with pure data analysis. That makes it is rather easy to dismiss the idea of modelling interactive processes in context, but unfortunately, very many real world programs do need to be concerned with the wider time-related concepts like effects.
I’m sure Rich himself is well aware of all of these issues. He’s discussed related ideas in far more detail on other occasions, including in the talk that lrenn cited above. But I find his challenge near the end of this talk, “If you can afford to do this, why would you do anything else? What’s a really good reason for doing something else?” to be rather unconvincing. For one thing, that’s a mighty big “if”, whether you interpret “afford” in terms of performance or dollars. For another thing, the answer to those questions could simply be “Because ‘this’ isn’t capable of modelling my real world, interactive system effectively.”
But I do think he has been a bit unfair to databases (and primary keys) generally, in characterizing them as "place oriented". The relational model is actually a brilliantly successful example of a value-oriented information technology.
The very foundation of the relational model is the information principle, in which the only way information is encoded is as tuples of attribute values.
As a consequence, the relational model provides a technology that is imbued with all of the the virtues of values he discusses.
* language independence
* values can be shared
* don't need methods
* can send values without code
* are semantically transparent
* composition, etc.
It's true that we can think of the database itself as a place, but that's a consequence of having a shared data bank in which we try to settle a representation of what we believe to be true. Isolation gives the perception of a particular value. In some ways, this is just like a CDN "origin".
Also regarding using primary key as "place". Because capturing the information model is the primary task in designing a relational database schema, the designer wants to be fairly ruthless by discarding information that's not pertinent. For example, in recording student attendance, we don't record the name of the attending student - just their ID. This is not bad. We just decided that in the case of a name change, it's not important to know the name of the student as at the time of their attendance. If we decide otherwise, then we change the schema.
The criticism of a primary key is again not anything against having primary keys, but that in a database that allows updates in place a primary key is meaningless. It is meaningless because it doesn't specify a value -- you pass a primary key and it could be anything by the time the receiver gets around to using it. If instead the value was immutable passing a primary key would be fine.
I've done work with ERP systems and having the ability to query against arbitrary points in time would be amazing. What was the value of inventory on this date? There are other ways to go about this (event sourcing) but it moves all the complexity to application code. The goal would be for the database itself to do the work for us.
Not sure what you mean by receiver here -- receiver as in the database or another component in your software hierarchy? The best way to ensure that your data goes unchanged across atomically disparate events is to insist that (Oracle, which is what I use in enterprise) lock the row. The easiest way is to use SELECT ... FOR UPDATE. The cursor you receive will have all the SELECTed rows locked until you close the cursor -- by commit or rollback. This will ensure that nobody can change your data whilst you're messing around with it, even if your goal is never to actually modify it, but merely capture a snapshot of the data. Obviously, if you have a lot of different processes hitting the same data they will block and wait for the lock to free (though this behaviour can be changed) so depending on what you're doing this may not be the most efficient way, though it certainly is the most transactionally safe way. Another way is to use Oracle's ORA_ROWSCN which is, greatly simplified, incremented for a row when that row is changed. So: read your data incl. its ORA_ROWSCN and when you update only update if the ORA_ROWSCN is the same. A similar approach could be done with Oracle's auditing or a simple timestamp mechanicm, but you obviously lose some of the atomicity from doing it that way.
> I've done work with ERP systems and having the ability to query against arbitrary points in time would be amazing.
You can do that in Oracle. You can SELECT temporally; so you could tell the DB to give you the state of the select query 5 minutes ago, subject to 1) flashback being enabled; and 2) the logs still having the temporal data.
Another way is to use an audit log table to store changes to data. We use this all the time in accounting/finance apps as people fiddling with invoices and bank account numbers must be logged; you can either CREATE TRIGGER your way to a decent solution, or use Oracle's built-in auditing version which is actually REALLY advanced and feature rich!
N.B.: I do not use other databases so my knowledge of them is limited, but it should give you some ideas at least!
Consider that you create a record, give it a primary key N, then start referring to that value by the primary key and at some point make an update to the record, the same primary key now refers to another value. The primary key is just a reference pointer to a placeholder (=record) and depending on whatnot the value in the placeholder can change to anything. So, you have to be careful of what you mean by the primary key because it's just a reference, not a value. In the value paradigm your primary key would be a hash, like in git, that would forever be that one value instead of referring to some value.
It's worth considering the correspondence between the concept of functional dependency in the relational model, and the concept of a pure function in functional programming. The issue under discussion, then, is whether referential transparency is afforded in the database.
While referential transparency in a database is achieved momentarily at the right isolation level, it is not achieved in the eternal sense of a pure function.
This is because the functional dependency in FP encodes an intensional definition, whereas the functional dependency captured in a relation is extensional, usually modelling the state of the knowledge of the relevant world, and therefore being subject to change.
The only thing I'm not too comfortable with is that space isn't really infinite. Yes, it's much cheaper, but still not infinite. If we stored all our logs in an ever growing database, and expect to be able to access it all the time, this is really very expensive. This is why we rotate logs, archive them and trash them eventually. Sure, we can afford this expense for source control, because this data (source code) is amazingly small in comparison. I'm not sure how it translates to real data on our systems, which is immensely bigger.
Also thinking about it in context of technologies like
redis. redis manifets a lot of the advancement in technology in how memory is used. It's so vastly bigger and cheaper than before that we can afford to store our whole database in it, instead of on much slower disks. But then this super-fast in-memory database definitely faces storage size constraints that needs to be considered...
Just a few random thoughts. Wish I could have a chat to Rich Hickey one day. Even if I could, I have a lot more to learn until then, so I'd make the most of this chat.
i think an organization trashes old logs for out-of-band reasons - acquiring more disk space requires following an organization's procurement process which imposes tons of friction, or because compliance with applicable regulations requires saving, e.g., emails, for three months and saving them for longer is a legal risk.
cat file | grep .... | wc
There are no complex protocols involved between cat, grep and wc - just passing around the value (now I am not talking about mutable files, directories etc).
I have seen very few systems which are as simple yet as flexible and versatile. Conventional wisdom says it is because unix is set of small utilities where each program does just one thing right. After watching the talk we should note that these utilities pass around text values.
If you want to build something as powerful and flexible as unix command line, you should think about value of decomposition as well as value of values :)
My only problem is Object oriented programming looks more pragmatic in the real world. There are libraries, tools, tutorials, help forums and a lot of other stuff out there which helps anybody who wants to start learning OO from to go from nothing to far places.
You can't say the same thing about functional programming. The community is too elitist, the tutorials are math heavy. And the tools are too ancient. Having to use simple text editors and selling them as technologies used to build large applications is a contradictory philosophy.
Careful. Statements like this only create more "elitists" by insulting people. Have you seen leiningen, Counter Clockwise, or La-Clojure? Part of the reason you need all that tooling is because of the objects. If you haven't become proficient in a functional language, can you really say the tooling is insufficient? It's like telling someone their movie sucked without seeing it, or only staying for the first 5 minutes. When I get rid of my Foo, FooImpl, JPAFoo, FooService, FooServiceImpl, FooDao, Bar, BarImpl, etc, the requirements for my editor and tooling suddenly change. If I'm not using big heavy frameworks, I no longer need all those plugins. I don't need to be able to click on my spring configuration and jump to the implementation. When I'm working in repl, I don't need heavy integration with Jetty (for example) because I never need to restart the server. If my restful webservice just needs to be a function that returns a map (ring), then I don't need annotation support, or some framework plugin. If my code is data, my editor already supports both.
I need to move around functions, and compile my code. Code completion? Navigation? Sure, but Emacs, CC, La-Clojure can all do that. I hope you aren't insinuating that Emacs/Slime is a "simple" text editor ;)
Tutorials are their own issue. A new object oriented programming language only needs to teach you their syntax and their API. A Clojure tutorial targeted at someone who has only ever done serious work in an OOPL is going to have to explain not only Clojure, but fundamental concepts related to functional programming. Once you really learn one, the rest all make sense in the same way it's relatively easy to jump around OOPLs.
If you've accepted the technical argument, don't let those other things hold you back. The Clojure community is great, and the Eclipse and IntelliJ stuff has really come a long way.
I did a fair amount of Java GWT/SmartGWT development last year and having both client side Java code and server side code running in twin debuggers was handy, but really only crucial because of the complexity of the whole setup. That said, I only write simple web apps and web services in Clojure and Noir and perhaps that is why I don't feel that I need complex tools.
As an example, Haskell has a very helpful community, lots of stimulating content (yes, some are math heavy but many/most are not), over 3000 packages on Hackage (many of which are really excellent), 700+ people on IRC at anytime constantly talking/answering questions, at least 3 major web frameworks, many concurrency libraries, database drivers/libraries for almost anything, an astounding number of utility libraries and a real world-class, top of the line compiler (GHC) that produces blazing fast, robust code. Many companies are building commercial/proprietary tools with it for mission critical applications.
There are also plenty of great documentation for Clojure, Programming Clojure is one, and the online documentation is excellent.
"bindings => [name init ...]
Evaluates body in a try expression with names bound to the values of the inits, and a finally clause that calls (.close name) on each name in reverse order."
I deliberately chose a bit of code that's actually pretty simple and straightforward; also, one that I know intimately. Now, knowing what `with-open' does as well as I do, this doc string almost makes sense on the first pass. But to the layman this is almost impenetrable. I've written a couple variants of this macro and I STILL have to read the docstring a couple of times to understand what it's saying.
The online docs are comprehensive but I would never use the term "excellent" to describe them.
In this video, he seems to go along and say that values are great for software projects that use multiple languages, in part because values are universal meaning one doesn't need to translate classes into multiple languages.
However, regardless of whether you use an object oriented design or not, don't you usually have a set of functions you tend to perform on a set of data or values? For instance, you may not wrap your customer data behind a class and methods, but there are still going to be some rules related to all that data you're passing around. So in the multiple languages scenario, wouldn't you still have to translate those rules from language to language?
You don't need both class and object, but I prefer to log both object type and id.
Are you intressted in languages, compilers, algorithms, data structurs, parallism, concurency, databases, operating system, virtual maschines, garbage colleters, grafics or something more meta like the this 'value of value' video.
I can provide you links almost everything but im not going to do the work until you tell me what you want.