Hacker News

I love the Purely Functional Data Structures book (so many cool data structures and ideas!) but does it really "solve" it? For example, I double-checked, and the PFDS structure for an array (the author's example) would be a skew-binary random access list with O(log n) cost for random access, whereas an imperative random-access array is O(1). Of course, the skew-binary random access list is persistent (you can keep references to old versions of the data if you wish), while the imperative array is not, but the author's use case sounds like he doesn't need persistence. So there is still some gap between this particular approach and the "non-functional" one.
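For the curious, the structure mentioned above can be sketched in a few lines. This is a minimal illustration in the spirit of Okasaki's book, not production code: the spine is a chain of ((size, tree), rest) pairs, and trees are complete binary trees stored as (value, left, right) tuples. cons is O(1), lookup is O(log n), and every older version of the list stays valid.

```python
# Minimal sketch of a skew-binary random access list (after Okasaki).
# The spine is a chain of ((size, tree), rest) pairs; None is the
# empty list. Trees are (value, left, right) tuples.

def cons(x, xs):
    # merge the first two spine trees when they have equal size
    if xs is not None and xs[1] is not None:
        (w1, t1), rest = xs
        (w2, t2), rest2 = rest
        if w1 == w2:
            return ((2 * w1 + 1, (x, t1, t2)), rest2)  # O(1)
    return ((1, (x, None, None)), xs)                  # O(1)

def lookup(i, xs):
    while xs is not None:
        (w, t), xs = xs
        if i < w:
            return tree_lookup(w, t, i)
        i -= w            # skip a whole tree: O(log n) trees on the spine
    raise IndexError(i)

def tree_lookup(w, t, i):
    value, left, right = t
    if i == 0:
        return value
    half = w // 2         # each subtree holds (w - 1) / 2 elements
    if i <= half:
        return tree_lookup(half, left, i - 1)
    return tree_lookup(half, right, i - 1 - half)

xs = None
for x in reversed(range(10)):
    xs = cons(x, xs)
old = xs
xs = cons(-1, xs)         # "old" is still a valid 10-element list
# lookup(0, xs) == -1; lookup(3, old) == 3
```

The persistence is the whole point: after `cons(-1, xs)`, the old version is untouched and shares almost all of its structure with the new one.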



Yes and no.

Ultimately, is the asymptotic cost going to be larger in certain cases? Yes. Is it going to matter? Maybe, maybe not. The point the original comment is making is that the author seems unaware these data structures even exist; the fact that he found his O(n) operations too slow tells us nothing about whether the O(log n) option is reasonable for his use case. He mentioned using a 30k-element byte array; that's the difference between 30k operations and 15. Cutting it from 15 to 1 is probably not going to be that meaningful, certainly not compared with the first jump.
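The 30k-vs-15 figure above is just base-2 logarithm arithmetic; a quick check:

```python
# Checking the arithmetic above: an O(n) operation over ~30,000
# elements costs about 30,000 steps, while an O(log n) lookup in a
# balanced persistent structure costs about ceil(log2(30000)) steps.
import math

linear_steps = 30_000
log_steps = math.ceil(math.log2(30_000))
# log_steps == 15
```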


That's a good point.

I wonder if the problem is that efficient persistent arrays aren't in Racket's standard library (there are some third-party libraries implementing them, though). Since Racket isn't a "pure FP" language, it includes cons lists and mutable vectors, and I imagine the designers felt there wasn't a need to include efficient persistent arrays in addition to those.


Sort of. Consider when Lisp originated: its heritage dates from before the research on persistent data structures happened, and before the problems with mutability were fully understood. Because of that, idiomatic Lisp tends to be mutable, and no one writes a new Lisp who isn't already familiar with another one. That's not to say immutability isn't worthwhile there, or that no Lisp standard library supports it, but mutability tends to be the default assumption from what I've seen.


I don't know about that; Clojure was specifically created with persistent data structures in mind, and it's definitely as much a Lisp as Racket is.


I inserted 'tends' everywhere intentionally.


In addition, the O(log n) cost only comes into play if you are truly accessing a single item. If you're accessing more, you are either reading the whole list (O(n) in total, O(1) per item) or a sublist (in which case it's O(log n) once to reach the k-item sublist and O(k) to operate on it).

If you are accessing just one item, the O(log n) cost is likely dominated by some other cost (e.g., the user clicking a button, or making a web request).


The thing is, functional and imperative programming are computationally equivalent, to the point that you can express imperative code in a pure functional language if that's what you really need. If you want O(1) random access in a functional language, you can formally define it; the problem is that you may end up embedding a whole computer in your program to do it, if your program depends heavily on side effects like those in the article. So, if you really need O(1) mutation for a large part of your program, maybe functional programming is not the tool for the job? Being too tied to a paradigm prevents you from finding the best tool at hand.

Functional programs are best used when your problem domain benefits from referential transparency[1]. If you want to represent mutable state in a functional, transparent way, you use a subset of temporal logic[2]: each function takes an additional state-of-the-world input parameter and returns an additional updated state-of-the-world value containing all the mutated objects. The State monad is a more legible version of this (as long as you don't combine many different sources of mutation).
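The state-of-the-world threading described above can be sketched without any monadic machinery. A hypothetical example (the world, `set_var`, `get_var`, and `increment` are all invented names for illustration):

```python
# Explicit state threading: every "mutating" function takes the world
# as an argument and returns a new world, leaving the old one
# untouched. The State monad packages exactly this plumbing.

def set_var(name, value, world):
    # build a new world sharing every binding except one
    return {**world, name: value}

def get_var(name, world):
    return world[name]

def increment(name, world):
    return set_var(name, get_var(name, world) + 1, world)

w0 = {"counter": 0}
w1 = increment("counter", w0)
w2 = increment("counter", w1)
# w0["counter"] is still 0; w2["counter"] == 2
```

Every version of the world remains accessible, which is both the appeal (referential transparency, easy undo) and the cost (you pay to build new worlds instead of mutating one).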

If your need for mutation is limited to a few objects in each module, you represent those as streams (the generalized version of futures, able to be repeatedly updated with new values); a stream represents the history of all values that a mutable variable takes during the program's execution. This is the basis of the functional reactive paradigm[3] on which modern web frameworks like React and Vue are converging, as it turns out that modeling asynchronous state mutations of multiple loosely connected elements is easier in the functional paradigm than in the imperative one.
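The "history of all values" reading of a stream can be sketched minimally (class and function names here are invented for illustration):

```python
# A stream as the history of a mutable variable: each push creates a
# new cell linked to the previous one, so a consumer holding an old
# cell can still walk forward to the present. Old cells never change
# value, only gain a successor.

class Cell:
    def __init__(self, value):
        self.value = value
        self.next = None      # filled in when the next value arrives

class Stream:
    def __init__(self, initial):
        self.head = Cell(initial)

    def push(self, value):
        cell = Cell(value)
        self.head.next = cell
        self.head = cell

def history(cell):
    # walk the stream from some (possibly old) cell to the present
    out = []
    while cell is not None:
        out.append(cell.value)
        cell = cell.next
    return out

s = Stream(1)
old = s.head
s.push(2)
s.push(3)
# history(old) == [1, 2, 3]
```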

[1] https://en.wikipedia.org/wiki/Referential_transparency

[2] https://en.wikipedia.org/wiki/Temporal_logic

[3] https://en.wikipedia.org/wiki/Functional_reactive_programmin...


If you used the State monad for the author's interpreter example, and you did it in Racket (with your "state of the world" as a cons list), what would the run-time of the resulting interpreter be? Similar question for streams (although I'm not that familiar with streams, so I don't know how they would apply here, or whether they can be implemented without changes to the base language).


A stream is in essence the same as the Observer pattern: every function that depends on the values of a mutable variable must either use accessor methods to extract the most recent value when needed, or be passed 'hook' callback functions that represent the next steps to take on new values.
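The callback ('hook') reading is just the Observer pattern with functional dressing; a minimal sketch, with all names invented for illustration:

```python
# Observer-style reading of a stream: consumers register a hook that
# runs on each new value, instead of polling an accessor.

class Observable:
    def __init__(self):
        self._hooks = []

    def subscribe(self, hook):
        self._hooks.append(hook)

    def push(self, value):
        for hook in self._hooks:
            hook(value)

seen = []
clicks = Observable()
clicks.subscribe(seen.append)      # hook: record every value
clicks.push("click-1")
clicks.push("click-2")
# seen == ["click-1", "click-2"]
```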

So, in principle, it should have the same performance as the 'Children can issue notification events' method in the article, since it uses what is in essence the same structure (just with much nicer syntax, better adapted to functional programs).

The advantage of using functional streams instead of OOP observer callbacks is that it tends to be easier to write the program's large-scale structure as compositions of functions over streams, i.e. it is easy to build modular functions without the bugs introduced by side effects.

I learned this style of programming through the online 'Paradigms'[1] course by Peter Van Roy, which you can take for free if you don't need credit. It uses the multi-paradigm language Oz to explain the essence of each paradigm and the differences between them. The course distills the primitives of different programming languages in terms of how they treat state (see Van Roy's diagram in [2]), and shows how choosing to incorporate and combine some of those primitives and not others gives rise to the different programming styles. I found it enlightening, in the way that programmers' koans are.

[1] https://www.edx.org/es/course/paradigms-of-computer-programm...

[2] https://en.wikipedia.org/wiki/Programming_paradigm#Overview


There are cost tradeoffs involved, so you have to determine, based on your application, whether they solve the problem or not.

In a case like this you have several options:

1. Stick with mutation: reuse the original array and don't worry about being purely functional. This will probably be the optimal solution (as opposed to explicitly creating a copy and mutating just the relevant bit). However, if you need the original for some reason, this is a non-solution.

2. Imitate the purely functional style with explicit copies, and accept the massive memory and performance costs that carries. You keep the cheap read cost.

3. Use a better data structure, losing some read performance in particular, but largely mitigating the creation and "mutation" costs ("mutation" in quotes because the original is, from the perspective of the user, unchanged).

Assuming you're in a language that doesn't provide (3) for free, and that (1) is a non-solution, you need to profile your system to decide between (2) and (3). With a large enough data structure or frequent enough updates, (3) is probably what you want. If your data structure is small or updates are rare, you can stick with (2). And if you have no idea which will occur in the real world (some users' workloads favor (2) and others (3)), you have to make an explicit choice about which one to give the better performance.
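Options (1) and (2) above can be contrasted in a few lines; a sketch with invented function names:

```python
# Option (1): mutate in place. O(1) per update, old version lost.
def update_mutating(arr, i, x):
    arr[i] = x
    return arr

# Option (2): explicit copy per update. O(n) time and memory per
# update, but the old version survives.
def update_copying(arr, i, x):
    new = list(arr)
    new[i] = x
    return new

a = [0] * 5
b = update_copying(a, 2, 9)
# a is still [0, 0, 0, 0, 0]; b is [0, 0, 9, 0, 0]
```

Option (3), a persistent structure, sits between the two: updates cost O(log n) rather than O(1) or O(n), while old versions remain valid.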



