If you want an intro to JSP, you might find helpful an annotated version [0] of Hoare's explanation of JSP that I edited for a Michael Jackson festschrift in 2009.
For those who don't know JSP, I’d point to these ideas as worth knowing:
- There’s a class of programming problems that involve traversing context-free structures and can be solved very systematically. HTDP addresses this class, but bases code structure only on input structure; JSP synthesized input and output structures.
- There are some archetypal problems that, however you code, can't be pushed under the rug—most notably structure clashes—and just recognizing them helps.
- Coroutines (or code transformation) let you structure code more cleanly when you need to read or write more than one structure. It’s why real iterators (with yield), which offer a limited form of this, are (in my view) better than Java-style iterators with a next method; see the sketch after this list.
- The idea of viewing a system as a collection of asynchronous processes (Ch. 11 in the JSP book; this idea later became JSD), with a long-running process for each real-world entity. This was a notable contrast to OOP, and led to a strategy (now seeing a resurgence with event storming for DDD) that begins with events rather than objects.
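To make the iterator point concrete, here's a minimal JS sketch (my own illustration, not anything from Jackson): the generator suspends at each yield, so the caller pulls words one at a time instead of driving a hasNext()/next() pair.

// Generator-based word iterator: yield suspends mid-loop, so reading
// one structure interleaves cleanly with whatever the caller is doing.
function* words(s) {
  let word = "";
  for (const ch of s) {
    if (/\s/.test(ch)) {
      if (word) { yield word; word = ""; }  // suspend once per word
    } else {
      word += ch;
    }
  }
  if (word) yield word;  // flush the final word
}

for (const w of words("two  structures")) console.log(w);  // two, structures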
... this brings back memories! In the late eighties, as a teenager, I found a Jackson Structured Programming book at the town library. I remember being amazed by the text and wondering why I hadn't heard of the method before.
If I remember correctly, the book clearly presented backtracking as a standard technique, while noting that most languages lacked it, so it had to be implemented manually.
This is referenced(1) as a core inspiration in the preface to “How to Design Programs”, but I never researched it further because I’ve found the “design recipes” approach in HtDP to be pretty solid on real-life problems.
1. Analyse the problem and derive appropriate data representations; write out illustrative examples.
2. Write a signature for the function: what does it consume and what does it produce? Spend time getting a concise definition of the computation it performs.
3. Create some illustrative examples of what the function does
4. Outline the function
5. Fill in the gaps to complete the definition of the function
6. Tidy up by converting some of those illustrative examples into tests
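To make the recipe concrete, here's a minimal sketch (my own example in JS; HtDP itself uses Racket), with comments marking the steps:

// Steps 1-2: data is an array of numbers; signature and purpose below.
// sumOfSquares : number[] -> number
// Computes the sum of the squares of the given numbers.
//
// Step 3: illustrative examples.
//   sumOfSquares([])     === 0
//   sumOfSquares([3, 4]) === 25
//
// Steps 4-5: outline (a fold over the list), then fill in the gaps.
function sumOfSquares(ns) {
  return ns.reduce((acc, n) => acc + n * n, 0);
}

// Step 6: turn the examples into tests.
console.assert(sumOfSquares([]) === 0);
console.assert(sumOfSquares([3, 4]) === 25);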
There’s a further practice of iterative refinement on top of this: taking what you learn as you apply the process and using that info to refine what you did.
This is a pretty solid way to go about things, and it’s utterly trivial to bring a colleague up to speed with what you’re doing at any step when you need a hand. The biggest gap I’ve found in practising this method is a lack of concern for efficiency: it’d be fairly easy to produce an O(n^3) solution by following these steps when an O(n) solution exists.
I was always swayed by the “make it work, make it right, then make it fast” approach before.
(1) … Actually, I just went and opened the book; the actual reference is to Michael Jackson’s method for creating COBOL programs, which is the progenitor of JSP.
In the "Fundamentals"[1] course they use a table format for the design recipe that makes the correspondence between data and functions more obvious, and perhaps the JSP inspiration more evident.
Heh. I remember when the concept of subroutines was considered dangerously subversive.
Structured Programming was thought of as revolutionary. Most folks were doing either COBOL or Assembly at the time. C was just beginning to feel its oats (it was still thought of as a mostly academic language; the "workhorse" languages of the day were things like PL/1).
I did start using Pascal, in the 1980s, because that was Apple's native language. It was a very strange language, coming from Assembly, FORTRAN, BASIC, and PL/1.
> In Chapter 3 of Principles of Program Design[1] Jackson presents two versions of a program, one designed using JSP, the other using the traditional single-loop structure.
I think I've come back around to seeing some merit in the "traditional" version. All the state is declared before the main loop. You could stop the program after any iteration of the single loop, save the explicitly declared state, and restart where you left off. The logic inside the loop can be applied to any record and state.
In the double-loop version, your place in the control flow is part of the state. If you stopped, you wouldn't know where to start up again. There's also a state variable (firstLineOfGroup) declared inside the loop.
I know why I prefer the "traditional" version: it can easily be refactored to be functional, and making the state explicit means it doesn't have to be refactored if I want to store the state externally or switch from a batch architecture to a streaming architecture. The JSP version is inherently imperative and needs to be refactored before it can be used in a different architecture. Funny how ideas like that alter your taste.
I think coroutines, generators, and async allow the compiler to transform the two-loop version into a state machine that can be suspended at a readLine() or println() call.
A compiler that could take the JSP version and transform it into a streaming job with managed state would be pretty cool. Otherwise, a programmer has to construct the streaming architecture around the code, whether they integrate it into an existing framework like Flink or write all the state handling and management themselves. In that case the processing code has to be structured so that the state is explicitly passed in and out for each record.
That's exactly how JSP was used. In its first application to COBOL, the JSP-COBOL tool turned structured code into coroutining code with suspend/resume, and tools built later for JSD generated code to save state in a database.
Or use a language with generators and coroutines? I believe this would be natural to express in Python and (maybe) with C++20's coroutines. Java’s stream API could also probably accomplish something similar.
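In JS terms it might look like the following (a minimal sketch of my own; the { key, value } record shape is just an assumption): the two-loop shape becomes a generator, and the yields are the suspension points.

// The JSP "process groups of records" double loop as a generator.
// Each yield suspends inside the control flow; the state machine the
// compiler generates is exactly what remembers "where to start up again".
function* groupSums() {
  const groups = [];
  let record = yield;                                // wait for the first record
  while (record !== null) {                          // outer loop: once per group
    const key = record.key;
    let sum = 0;
    while (record !== null && record.key === key) {  // inner loop: once per record
      sum += record.value;
      record = yield;                                // suspend mid-group
    }
    groups.push({ key, sum });
  }
  return groups;
}

const g = groupSums();
g.next();  // run to the first yield
[{ key: 'a', value: 1 }, { key: 'a', value: 2 }, { key: 'b', value: 3 }]
  .forEach(r => g.next(r));
console.log(g.next(null).value);  // [ { key: 'a', sum: 3 }, { key: 'b', sum: 3 } ]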
There's an old Rob Pike talk where he asks the audience:
"So how many of you are familiar with the work of Michael Jackson"
Most of the crowd raises their hand to which he replies
"Really? Quite a few! Well as you know, Michael Jackson developed Jackson Structural Programming".
He was seemingly unaware of the popstar of the same name.
It's been a long time since I watched that talk, but I seem to remember that about 40 minutes in he has a moment where he remembers that there's another Michael Jackson and apologizes for the confusion.
Disclaimer: I have done exactly 0 research on JSP before today and come from python/lisp land.
Being in the middle of developing an OpenAPI application with Python's Pydantic (a data-validation library built on type annotations), I can't help but feel Python is this close to being a combination of JSP and data-model definitions.
To go one step further: if you write your modules in Hy (a Lisp that interops with and roughly transpiles to Python), you get a pretty darn elegant codebase that is a rather direct mapping of your domain problem to expressions which look a lot like these JSP examples.
Maybe I'm only just discovering for myself the "power of typed languages" (feel free to rib on duck-typed devs here), but it feels like huge progress in terms of productivity, bug count, and readability. I love modern Python.
I've just got "Structured Program Design Using JSP" off the bookshelf. Haven't looked at it in about 30 years!
I was once sent on a work course to learn how to write COBOL using Jackson Structured Programming. I _loathed_ it. I remember thinking that
a) it tried to reduce the programmer's role to little more than that of an automaton
b) it was completely at odds with my view (30 years ago and still today) that good software development is a blend of the technical and the creative/artistic.
Happily very little of my working life involved actually using JSP (well Java Server Pages excluded).
The biggest irk and put-off many in COBOL had with JSP was exception handling: without using GOTO (which was banned by hard-core JSP mentalities) you would end up setting flags and adding checks, putting a lot of extra processing into the code base, and this at a time before the IBM PC had even come about. Back then, on large mainframes, CPU cycles and storage cost far more than they would in later years.
JSP used the "quit" statement to handle errors and backtracking; it was a primitive form of exception before exceptions were common. There's a lengthy defense of this in the JSP book (on p.282) arguing that eliminating GOTOs dogmatically is a mistake.
Is that really a debate? Gosh, I'd love to see the "separatist" arguments, because ever since I've discovered Lisp, I've automatically assumed that separating statements and expressions is one of those stupid historical baggage things, a stepping stone in early PL design that we can't just get rid of.
I mean, how many libraries with weird APIs were created just because you can't write the following in most Algol-like languages:
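Something like this, say, where a block is used directly as an expression:

// const sign = if (x >= 0) { "+" } else { "-" };   // not valid JS
//
// in an expression-oriented language the if-form simply is a value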
In many languages, people are resorting to "immediately invoked function expressions" to simulate this pattern (whether for returning or assigning). And those that can't, well, here goes another pointless little function to encapsulate it[0].
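A minimal sketch of the IIFE workaround, continuing the sign example:

function sign(x) {
  // simulate the if-expression with an immediately invoked arrow function
  return (() => {
    if (x >= 0) return "+";
    return "-";
  })();
}
console.log(sign(3), sign(-2));  // + -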
--
[0] - Which transitions into what is a real debate - "lots of small functions" vs. "fewer but larger functions". It's one of those holy wars that can't die, because both sides have good arguments. But it's a false choice - a limitation of the tools we're using to write programs. 'emilprogviz has a nice summary of that last point here: https://emilprogviz.com/ep05/ep05-transcript.html
Linking to transcript, in the spirit of "text with screenshots is almost always better than a video" - (nice job providing it Emil!) - but the video itself is good too, as are others on that site.
Ok it may not exist as a debate outside my own head :)
To offer one argument: the problem with expression oriented languages is their generality, in that you can write "weird" expressions, e.g.:
let x =
  [giant block of code with multiple nested lets, ifs, etc.]
in
  f(x)
I've definitely done this a number of times. Languages that separate statements and expressions force you to break things down further and prevent the code from going too far to the right.
Also, for low level languages, the mapping between code and assembly is clearer in languages that separate statements and expressions (I need a more succinct term).
EDIT: Also, by preventing statements from being used as expressions, you encourage breaking up long and complex statements into smaller functions.
Oh, I see what you mean. I discovered Lisp after working in C++ and Java, so I avoided this pitfall in my code, but if I had a dollar for every instance of:
(let ((some-variable (progn
                       (do stuff)
                       (do other stuff)
                       (let ((some-helper-var ...))
                         (some more code with some-helper-var)
                         (more-of-the-same (progn
                                             ...))))))
  ;; 50-100 lines later, just return one of the values
  some-variable)
that I saw in Lisp codebases, particularly in Emacs, well... I could feed my family for a month or two from that money.
There's plenty of abuse potential here (and for the love of god, if your language has 'let', it probably also has lambdas or 'flet', use that to create local functions...). But mitigating this, arguably, is a problem for style guides - the overall feature of "everything is an expression" is powerful. It has nice simplicity to it, and reduces boilerplate :).
> Also, for low level languages, the mapping between code and assembly is clearer in languages that separate statements and expressions
I'm guessing this is where the separation originally came from. Assembly is essentially statement-only. But at this point, I think all programming languages in use crossed the threshold where we're actually programming to an "abstract machine", and being expression-oriented seems to confer greater expressiveness.
> (I need a more succinct term).
"Separatist"? :).
> EDIT: Also, by preventing statements from being used as expressions, you encourage breaking up long and complex statements into smaller functions.
Ah yes. The real debate. I edited my comment to mention it before I saw your edit :).
Yes, thank you, your code example is much better than mine.
The simplifying potential of expression-oriented languages is huge. Alan Perlis said[0]: "symmetry is a complexity-reducing concept; seek it everywhere", and this is a great example of that.
For example, languages like C and Ada have both if statements and if expressions (the ternary ?: in C; if expressions in Ada 2012). That duplication can be eliminated by making the language expression-oriented, so there is only the if expression, as in Haskell or ML.
But, interestingly, there is one historical case of a language going from expression-oriented to statement/expression separation: ALGOL-W[1] was expression-oriented; Pascal[2], its successor, separates statements from expressions. Wirth designed both.
I don't know what the motivation was, but I suspect it's because Pascal was designed to be an educational language, and Wirth must have thought that separating statements and expressions made didactic sense when teaching programming as a recipe, a list of things to do, as opposed to the more mathematical formulation of expression-oriented languages (an evaluation function from expressions to values).
The successors of Pascal (Ada, Modula and its sequels) retain the statement/expression separation.
I can't say for sure, but I suspect it was a reaction to ALGOL-68. Creators of later Algolish languages usually made a point of rejecting features of Algol-68 they saw as prone to abuse.
And the block expressions in Algol-68 were certainly used and abused. Basically, the type and value of a block (BEGIN ... END or ( ... )) are those of its last expression, so you can put them anywhere. For example, Algol-68 has the looping construct WHILE <condition> DO <body> OD, but not C's `do <body> while (<condition>)`. So what do you do if you want to have the test at the end of the loop body? Simply
WHILE (
  <body>;
  <condition>
)
DO SKIP OD
I guess you'd get used to those idioms eventually, or some coding conventions would have arisen if the language had been successful. But I can understand language creators looking at that and seeing how getting rid of it simplifies not only their compilers but also the programs written in their languages.
That introduces a variable that has to be initialised outside the control block. It's not terrible as hacks go, but I guess the readability cost of the additional variable is higher than having the body in the condition.
> Oh, I see what you mean. I discovered Lisp after working in C++ and Java, so I avoided this pitfall in my code, but if I had a dollar for every instance of: [...]
This is one of those anti-patterns I sometimes find myself falling into (I don't code Lisp or its descendants often). It's kind of discouraging, to be honest, because it feels like there ought to be some more efficient way to do this, and my inner critic comes along and complains that I'd be better off writing Python or C than learning to do it the Right Way in Lisp.
Is there some kind of idiomatic way to avoid this and write cleaner code? Is the solution to simply extract the progn into another function? Does that violate some rules of function encapsulation in Lisp?
I think extracting functions is the way to go. I like it when a function's body reads like a sentence, when it's been broken down to the atoms of that conceptual level of abstraction. Forth code that achieves this can look very satisfying.
The problem is that the toplevel of a module is full of functions at various levels of granularity.
It might be nice if programming languages had a concept of "code sections" (you could implement this with literate programming), where modules are organized hierarchically into sections, and declarations can be public or private within a section. So you might have:
section foo
  // Accessible from outside this section
  public important_function()

  // Only accessible from inside this section
  private utility_function_1()
  private utility_function_2()
end section
> The problem is that the toplevel of a module is full of functions at various levels of granularity.
Exactly--this is what I was hinting at with "rules of function encapsulation." I took a course on LISP in college (apropos of nothing, it featured a lab called, "Isn't this just a one credit course?") in which it seemed like the LISP Way was to use helper functions. It's always felt like something was missing in my ability to "translate" between paradigms because this rubs me the wrong way (although as I went through some examples I realized I do this with some regularity in other languages--but it doesn't "feel" as bad).
In retrospect probably part of the problem was that we ALGOL-adjacent undergrads didn't have the scaffolding to understand more generic approaches like fold and cousins. It was a different way of thinking.
I like the idea of literate programming as a potential solution, especially as it can be used to guide newbies through the code and identify areas where idioms are much different between one's existing paradigms and that of the codebase.
In Common Lisp you can use FLET or LABELS to declare local functions. This works, but there really is no good solution: utility functions often just take up space and belong nowhere.
The issue with these debates is that they tend to be driven by people with strong opinions, usually rooted in a misunderstanding of the other side of the debate.
I like to think that there is no OOP vs FP and no strong typing vs no typing; there should just be a shelf of tools, each with applications in some situations and constraints on when it can be used effectively. Your job as a developer is to understand those bounds of application and effectiveness.
The road to mastery of development, then, should be learning and understanding those various tools (in the broadest possible sense) rather than forming strong opinions and shunning the other side of the debate. People who cut themselves off from OOP will never learn its benefits, just like people who cut themselves off from FP.
Well, I'm not sure I'm on one side or the other of the statements-or-no-statements debate, but in general I think the argument in favor of separating statements from expressions is that it adds redundancy to your programs, which makes them easier to read and enables the compiler to produce better error messages when you have a syntax error. Sometimes you can trade this off against other aspects of syntax you'd like to improve; Lua, for example, omits semicolons and isn't white-space sensitive, but the compiler can still emit reasonable syntax error messages because of its strict separation of statements and expressions.
You can transform this to the following in even the most limited ALGOL-like languages, which is less clear but not nearly as heinous as the alternatives you mention:
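Presumably like so (my reconstruction, reusing the sign example from upthread): hoist the declaration and assign in each branch.

function sign(x) {
  let s;          // declared up front
  if (x >= 0) {
    s = "+";
  } else {
    s = "-";
  }
  return s;
}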
The more popular ALGOL-derived programming languages like C, Java, and post-walrus Python have conditional expressions and assignments inside expressions, which means that in this case you don't need to resort to declaring a variable. In Golang and Pascal you can just assign to the named return value rather than declaring it as a normal variable and then explicitly returning it.
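E.g. with a conditional expression the extra variable disappears entirely:

const sign = x => x >= 0 ? "+" : "-";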
The argument against separating expressions from statements is also that it adds redundancy to your programs, as in the declare-then-assign example above.
I think the "separatist" side (yep I'm adopting your term) has already accepted that the war is lost, but I still see battles being waged often in programming language spec committees, whenever there's a suggestion for allowing if to be treated as an expression.
The arguments against it are always that it requires keywords that behave differently depending on whether they appear as expressions or statements.
The alternatives and compromises proposed often have the same issues, such as JavaScript repurposing the do keyword for enclosing expression-ifs.
The intuition that I've built up is that most of the time that you have a debate that goes on for more than about 5 minutes, there's a problem at a different abstraction level.
For instance with Imperial vs Metric: Metric optimizes for multiplication, while Imperial optimizes for division. You can never unify these under the current system. But you can if you change the number base to 12; then they suddenly merge.
With CLI vs GUI, we've realized that we needed a mixture: a GUI that runs through a CLI. And now we have that; it's called a website. I think tabs vs spaces was solved similarly: indent with spaces, and let editor config treat each run of n spaces as a tab.
I'm firmly on the expression-only and lots-of-small-functions sides, but that transcript is very interesting. You can already do local functions in C#; it seems to imply that we should strive for that as our mixture.
> The intuition that I've built up is that most of the time that you have a debate that goes on for more than about 5 minutes, there's a problem at a different abstraction level.
I strongly agree. Also, that's poetically put. I'm saving this in my quotes file.
No, I don't think that's the equivalent debate today. Structured programming vs "the old way" was about clarity of control flow. Structured programming was clearly superior once you had machines with enough muscle to handle the overhead. (And once you realized that you were trying to optimize for programmer time rather than for machine time.)
I don't see expression-oriented as being similar to that at all. Expression-oriented may provide a boost to programmer efficiency, but it's not nearly on the order of structured programming vs. unstructured. It doesn't require changing peoples' mindset to realizing that programmer time is more valuable than machine time; they already know that. And, the parallel problem would be clarity of data flow, and I'm not sure that expression-oriented is a huge win in that area.
I mean, it sounds weird now, but back then the assembler would indeed have eaten up precious CPU time.
Imagine if, nowadays, a grad student took a node in a scientific cluster to run Dreamweaver in a Windows VM instead of writing HTML by hand. (Sorry, it's the closest analogy I could come up with)
Agreed! I also think that von Neumann (and some of his contemporaries as well) were likely the last to be able to understand the entire breadth of math, physics, and computer science at the time. These days we’re all far too specialized…
If you take into account that the particular humans whose time was (not) valued here were graduate students, that time may not have been so brief after all.
I started out from a BTEC (it changed to BECTEC, and from one year to two, shortly after I did mine) and was taught COBOL. In my first job doing COBOL I wrote a program that really upset the analyst: being hard-core JSP, I handled exceptions very differently from just using GOTO (a COBOL verb I'd never used or even been aware of). The whole efficiency angle was explained to me: what looks neat and nice today will get modified over time, and it was easier to modify something built with GOTOs than some elegant JSP layout. That, and the raw performance of the code.
So I learned the usual lesson many do when moving from educational ideals into working reality. Real-world businesses don't chase the cutting edge as soon as it changes; legacy/stability and historical factors play out. So whilst you may know a better way, many factors can make it impracticable. Sure, if rewriting your code base and hardware to the cutting edge of the time were viable, people would do it, but testing and verification of code, when done properly, takes longer than building the latest cutting-edge system. Let alone the whole cost factor.
I imagine many have comparable stories about their move from the realm of education into work. Do share, as it would be nice to read what walls today's first-time workers run into.
For those not familiar with the JSP style alluded to in the comments, and its contrast to the "traditional style," here's an example of a function that splits a string into words. The "traditional" version was shown by a famous computer scientist in a conference talk a few years ago; I translated it to JS to provide some anonymity :-).
JSP's claim to greater clarity is based on the correspondence between the code blocks and the structure being processed: the body of the outer loop processes a group (a gap plus a word); the bodies of the inner loops process a gap and a word, respectively. The traditional style, in contrast, is like a state machine and can't be read structurally: you have to think about what's happening at each point. That makes the code harder to modify (e.g., adding something that happens once per word, which is now handled in two places) and often leads to bugs.
This same structural clarity is why, in my view, code that uses list functionals like map/reduce is usually more comprehensible than traditional imperative code (see the reduce-based variant after the code below).
// JSP version of a split function.
// (Assumes every character is either whitespace or alphabetic; an
// unexpected character, e.g. a digit, would stall the outer loop.)
var split_jsp = function (s) {
  var words = [];
  var i = 0;
  while (i < s.length) {          // outer loop: one gap-word group per pass
    while (is_white (s[i]))       // inner loop 1: the gap
      i++;
    var word = "";
    while (is_alpha (s[i]))       // inner loop 2: the word
      word += s[i++];
    if (word.length > 0) words.push(word);
  }
  return words;
}
// "traditional" version of a split function
split_traditional = function (s) {
words = [];
word = "";
for (i = 0; i < s.length; i++) {
ch = s[i];
if (is_white (ch)) {
if (word != "") words.push (word)
word = "";
}
else
word += ch;
}
if (word != "")
words.push (word);
return words;
}
var is_white = ch => (ch == ' ' || ch == '\t' || ch == '\n')
var is_alpha = ch => ch != undefined && (/[a-zA-Z]/).test(ch)
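A quick check that the two versions agree, plus, for the map/reduce point above, a reduce-based variant (my addition, not part of the original comparison):

console.log(split_jsp("foo  bar"));          // ["foo", "bar"]
console.log(split_traditional("foo  bar"));  // ["foo", "bar"]

// Fold each character into {words, word} state; a trailing sentinel
// blank flushes the final word.
var split_reduce = s =>
  [...s, ' '].reduce(
    (acc, ch) => is_white(ch)
      ? { words: acc.word ? acc.words.concat([acc.word]) : acc.words, word: "" }
      : { words: acc.words, word: acc.word + ch },
    { words: [], word: "" }
  ).words;

console.log(split_reduce("foo  bar"));       // ["foo", "bar"]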
Read the book, which I obtained from the British Council library, when I was around 15. I have to say that the book made quite an impression on me at the time.
I believe this is equivalent to a modern functional effect system, which compiles an AST (typically a hosted monadic sub-language, or equivalently an s-expression with control flow) into a DAG intermediate language. The DAG is abstract and is then further interpreted or compiled into an execution target, which evaluates the DAG with late-bound runtime properties like asynchrony, exceptions, side effects, garbage collection, and structured concurrency. The key idea is that the DAG is abstract and has no notion of runtime behaviors like asynchrony; structured programming is about capturing the computation's essential structure as a DAG and then using that structure to reflect the target runtime behavior, e.g. interpreting edges of the DAG as callbacks. Structure is also about constraints (what you can't or shouldn't do): a DAG can't express goto, for example, but it can express stack frames and lexical closures (by nesting).
I was taught this when working at British Gas in the mid-80s. I remember it being followed by the Delta 3GL, which let you allocate code to the pre- and post-hooks etc. of each node. Your function ‘places’ were already there and you just assigned code to them. It was a very methodical way of working and IMO left less room for error.
Did you also have to endure the era of `4GL` code generators that produced COBOL code and were marketed as about to make COBOL coders redundant?
I worked at the Eastern Electricity Board in the early 80s. No JSP there, but I did work on a new project that was object-focused, using COBOL. It wasn't a data-dictionary-overkill project, but what would normally be one program was broken down into a main body that linked in lots of functions, which were themselves small, self-contained COBOL programs. So one data-input screen would have a program for each field type instead of one large program, and the main code was in effect a skeleton that linked in the screen template and the programs needed to handle those input/output data fields. Certainly the way forward in many ways, though COBOL was perhaps not the language best suited to it.
I liked JSP, but then that was how I was taught COBOL in education. I never really got to use it in anger, due to legacy standards and other factors which, alas, made sense unless you were doing greenfield work at the time.
Actually I think it was Delta 4GL. BACS built their payments system using it. They put us in a warehouse basement in Dunstable. That was a huge project. I wonder if they still use it.
I tried to follow Jackson’s book on Problem Frames; I didn't complete it, though I’d consider trying to read it again. It's a non-formal or semi-formal methodology that emphasizes understanding the structure of problems and domains before designing a solution. It's a technique of architecture and systems engineering.
This is how I was first taught to design software. I found it quite unnatural and didn't really use it. But it did help me start thinking about how to structure software.
[0] https://groups.csail.mit.edu/sdg/pubs/2009/hoare-jsp-3-29-09...