This blog post proposes a "data" sanity solution (alternative external oracles, or "scoring" on probabilities, etc) but that's not really what the comment by Samuel Falvo II was about.
His problem was syntax getting broken and not about "questionable data". In the context of dynamic vs static compile, he's worried about scenarios like this:
x = customer.last_name // worked on Friday and x=="Smith"
- remove field customer.last_name // Saturday refactor
+ add field customer.full_name // Saturday refactor
x = customer.last_name // broken on Monday with a runtime error
With a static type system, the changes on Saturday would have told them immediately at compile time that the last_name field access by other client code was broken. (And yes, one typical answer for handling errors like that in dynamic type systems is unit tests -- but that's veering off into a different conversation.)
This essay is a "solution" to a different problem.
> ...would have told them immediately at compile time that the last_name field access by other client code was broken.
What's the proposed fix in this case? Should all clients accessing that field switch to `customer.full_name`? Do you control all of the call-sites? If so, that change is pretty easy to automate in dynamic languages by marking the field as deprecated, adding the new field, and forwarding `customer.last_name` to `customer.full_name`. That way, running systems don't break but you get the new behavior. You can add logging to track when the deprecated field is accessed and, if necessary, fix call-sites after a reasonable period of monitoring, reducing risk (assuming there aren't "once-in-a-century" paths in the code that invoke the call site).
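To make that concrete, here's a minimal sketch in Python of the deprecate-and-forward idea (the Customer class and the split-on-whitespace forwarding rule are made up for illustration, not taken from the article):

import warnings

class Customer:
    def __init__(self, full_name):
        self.full_name = full_name

    @property
    def last_name(self):
        # Old field kept alive: warn (or log) on access and forward to the new field.
        warnings.warn("Customer.last_name is deprecated; use full_name",
                      DeprecationWarning, stacklevel=2)
        # Crude forwarding rule purely for illustration: take the final token.
        return self.full_name.split()[-1]

c = Customer("Jane Smith")
print(c.last_name)  # still "Smith" on Monday, and the access is recorded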
It seems reasonably likely that this is not the desired behavior - you may have external clients, and some callers may be relying on this field to actually return only a last name. In that case, deprecation is still the correct path, I think, since it's unreasonable to expect external clients to immediately fix systems relying on that field being available.
Static typing can tell clients where they need to make changes, but it's not realistic to demand that they immediately update every call site, every method depending on that behavior, and every integration with their external dependencies that relied on the interaction between your code and another system. This is why we have (semantic) versioning, deprecation warnings, and support agreements for larger systems. APIs should evolve slowly and gracefully over time, not shift suddenly overnight.
The proposed fix is that you don't deploy code that the type system can prove is broken, and the person performing the refactoring needs to do additional work rather than pushing out something that will fail in production.
My reading of jsode's example is in a shared codebase, where it would be reasonable for the person performing the refactoring to update all call sites, and where checking in a change that does not update all call sites to whatever branch is released to production should be unacceptable, as it will provably cause problems in production.
If the change was made in a standalone library that other teams depend on, the type-checker error would be produced when they update the library to the new refactored version, and it's the same story: prohibit deploying known-invalid code to production until you've updated all call-sites to match the changes in the library API.
If updating all call-sites immediately is implausible for whatever reason, leaving the old field in place with a deprecated annotation would be quite reasonable, but the purpose of a type checker in this context is to prevent you from failing to handle this in at least some way. As you say, responsible maintainership of an API should facilitate gradual transition; the use of a type-checker here is to prevent deploying code that will fail at runtime when someone has made a mistake, and can help facilitate responsible maintainership.
I don't disagree that static type checkers and analysis software can be useful for catching many kinds of errors. In fact, at the end of my comment I emphasize exactly their strength: in automatically detecting when and where clients need to make changes. My intent in making the comment was to highlight that static analysis is only one part of a larger solution to the problem of keeping call sites in sync with the code they depend on, and to discuss other techniques that don't rely on static analysis. I think articles and comments online focus too much on the debate between static and dynamic typing, instead of the engineering principles that apply to both camps, which can help mitigate the risks associated with change.
For the purposes of my reply to the author, the type of fix doesn't matter and its discussion would be a distraction away from my main point: the blog author Curtis "Ovid" Poe misinterpreted the comment by Samuel Falvo II and therefore wrote the wrong "solution" as a response. Ovid's health data example of obj.fetchDataFromServer() error and reconstructing data from other sources with a threshold score is not at the same abstraction level as the commenter's example of obj.last_name getting broken because the plumbing API was renamed/refactored.
(My guess is that Samuel Falvo II's hostile tone makes it easy to miss that he was actually describing a syntax error and not a data interpretation error.)
Regardless of whether the fix is:
- stringsplit(full_name) to extract a last_name
- or change the client code's usage of last_name and only use full_name
- or re-architect the upstream customer object to include both last_name and full_name so there's a gradual migration
...it isn't the issue.
What the angry commenter was trying to communicate was, "tell me about the last_name refactoring causing a syntax error _immediately_ instead of letting it sit there as a ticking time bomb that will blow up and wake me up at 4am".
He doesn't necessarily have to immediately fix it, but he does want to immediately know of the error's existence before a crash at runtime.
I apologize for being unclear. My intent was not to suggest that static type checking (and more broadly, static analysis) were not useful tools in catching errors early. Rather, I wanted to emphasize that other techniques could be used to mitigate the risks associated with change, such that the reliance on static analysis could be minimized. In that sense, both static analysis and the mitigation techniques help prevent the angry commenter from waking up to 4am alarms in prod.
So you marked the field deprecated and stated it would be removed in release "potato-wedge". The "potato-wedge" release rolls around, you delete the field, and Ted, who didn't care to pay attention to the deprecation warnings, ... does what?
In a statically typed world, Ted tries to compile his project and gets an error saying, "last_name doesn't exist". In a dynamically typed world, Ted ships his code into production (or possibly testing) and gets a call at 4:00a.m. saying it threw an exception or just died. (The testers are in Bangladesh.)
In Alan Kay's world (or at least that of the author), the code goes on robustly producing whatever results it produces in the case where no one has a last name. Better?
I agree with you! Static typing and more broadly static analysis can help catch some types of backwards incompatibilities. The intent of my comment was to highlight complementary techniques that can minimize the reliance on static analysis as the primary tool for keeping code in sync with dependencies. Alone, any one technique is not particularly fool-proof.
This is about message passing at an abstract level, which is what Alan Kay imagined with Smalltalk. You are the one missing the point....
When you are dealing with external systems, such as a third party API, the message passing paradigm is what you are dealing with. There is no static typing for someone else's API (even if you are using their SDK).
>This is about message passing at an abstract level, which is what Alan Kay imagined with Smalltalk.
The merits of Alan Kay's message passing are fine but it is not relevant to this case of the author writing an inappropriate solution because he misunderstood a commenter's feedback.
The "this" in your "_this_ is about message passing at an abstract level" comment seems to only refer to blog article #2 in isolation -- instead of considering how it was provoked by the author's previous misunderstanding.
>You are the one missing the point
You want to lecture me on being wrong but it seems that you are the one jumping into the middle of a conversation without realizing what communications have transpired so far. If you didn't completely read both blog articles and the user comment, you're missing the full context. Let's recap:
The blog author wrote 2 articles.
Blog article #1[1] didn't say anything about "external systems" or third-party APIs. The blog author himself brought up the issue of dynamic vs static in regards to "hot swapping" code without restarting the system. It was not about "external systems". It included an example:
print customers.fetch(customer_id).last_name
At the bottom[2][3] of that blog, a commenter named Samuel Falvo II asks what to do when a broken "customer.last_name" message fails because it wasn't statically checked for syntax errors. The commenter also wasn't talking about "external APIs". He was saying static compile checking helps him find errors early instead of errors later at runtime with an unexpected emergency at 4am.
Blog article #2[4] then responds with an elaborate solution to reconstruct missing data which is a totally different abstraction level from the error scenario the commenter was asking about. This 2nd article introduces an "ETL workflow" that possibly comes from servers of external partners. The 1st article did not.
My top-level comment was explaining to the blog author that he misinterpreted what the angry commenter was asking.
I think there's a big difference between not understanding the message, and not understanding the data (a message parameter).
When they get an unrecognized message, Objective C objects call their doesNotRecognizeSelector method, and Smalltalk objects call their doesNotUnderstand method.
And the object sending the message can first check with respondsToSelector in Objective C or respondsTo in Smalltalk, before sending the message.
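A rough Python analogy, in case it helps: hasattr stands in for respondsToSelector:/respondsTo:, and __getattr__ stands in for doesNotRecognizeSelector:/doesNotUnderstand:. The Customer class here is made up for illustration.

class Customer:
    def __init__(self, full_name):
        self.full_name = full_name

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. the object
        # "doesn't understand" the message.
        raise AttributeError("Customer does not understand %r" % name)

c = Customer("Ada Lovelace")
if hasattr(c, "last_name"):           # check before sending the message
    print(c.last_name)
else:
    print("no last_name handler; falling back to", c.full_name)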
But validating and sanitizing input parameters is a totally different thing than handling unknown messages, orthogonal to object oriented programming.
Hm, I don't see how they are totally different and orthogonal.
Objects and messages are kinds of types. Types can have constraints and conditions and there is all sorts of nuance.
Different, yes. But completely different? No... when designing a system there is a lot of freedom in where to draw the shapes of the system: how much information we contain in the objects themselves, in a hierarchy, or inside parameters.
"Stringy interfaces" is one extreme. There are many others.
I think you're barking up the wrong class hierarchy, trying to reformulate objects, messages and parameters as abstract data types. Alan Kay has been quite clear about his opinion that “Abstract Data Types” is not OOP.
>One of the things I should have mentioned is that there were two main paths that were catalysed by Simula. The early one (just by accident) was the bio/net non-data-procedure route that I took. The other one, which came a little later as an object of study was abstract data types, and this got much more play.
>If we look at the whole history, we see that the proto-OOP stuff started with ADT, had a little fork towards what I called "objects" -- that led to Smalltalk, etc.,-- but after the little fork, the CS establishment pretty much did ADT and wanted to stick with the data-procedure paradigm. [...]
>(I'm not against types, but I don't know of any type systems that aren't a complete pain, so I still like dynamic typing.)
>OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them.
>If you are “setting” values from the outside of an object, you are doing “simulated data structure programming” rather than object oriented programming. One of my original motivations for trying to invent OOP was to eliminate imperative assignment (at least as a global unprotected action). “Real OOP” is much more about “requests”, and the more the requests invoke goals the object knows how to accomplish, the better. “Abstract Data Types” is not OOP!
>An interesting historical note is that the two inventors of Simula had completely different views of what they were doing and how it should be used for programming. Dahl was brilliant and conservative, and later wrote papers about using class definitions to make Abstract Data Types (and that is how a lot of so-called OOP programming is done today). Nygaard on the other hand was quite a wonderful wild man and visionary -- beyond brilliant -- and was into the abstract "simulate the meaningful structures" idea. Dahl was trying to fix the past and Nygaard was trying to invent the future.
>This is probably a good place to comment on the difference between what we thought of as OOP-style and the superficial encapsulation called "abstract data types" that was just starting to be investigated in academic circles. Our early "LISP-pair" definition is an example of an abstract data type because it preserves the "field access" and "field rebinding" that is the hallmark of a data structure. Considerable work in the 60s was concerned with generalizing such structures [DSP *]. The "official" computer science world started to regard Simula as a possible vehicle for defining abstract data types (even by one of its inventors [Dahl 1970]), and it formed much of the later backbone of ADA. This led to the ubiquitous stack data-type example in hundreds of papers. To put it mildly, we were quite amazed at this, since to us, what Simula had whispered was something much stronger than simply reimplementing a weak and ad hoc idea. What I got from Simula was that you could now replace bindings and assignment with goals. The last thing you wanted any programmer to do is mess with internal state even if presented figuratively. Instead, the objects should be presented as sites of higher level behaviors more appropriate for use as dynamic components.
>GOOS is looking at it from the perspective of Abstract Data Types. In Alan Kay’s conception of OOP, instead of static structures that are easy to reason, aliasing gives you dynamic systems of collaborating objects that are endlessly flexible and scalable, just like in nature’s biological systems of cells or the Internet of web servers. Proponents of ADT-style thinking, who use languages like C++ and Java, can’t imagine such complex systems, or they’re afraid of them.
>Where the big missing piece lacking in mainstream typed OO languages today is:
>The big idea is “messaging”
>He advocates focus should instead be on messaging and the loose-coupling and interactions of modules rather than their internal object composition:
>The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be.
>And finds static type systems too crippling:
>I’m not against types, but I don’t know of any type systems that aren’t a complete pain, so I still like dynamic typing.
>Other popular languages embracing Smalltalk’s message-passing and late-binding and having implemented its message-based doesNotUnderstand construct include: forwardInvocation in Objective-C, method_missing in Ruby and more recently noSuchMethod in Google’s Dart.
>Some people are completely religious about type systems and as a mathematician I love the idea of type systems, but nobody has ever come up with one that has enough scope. If you combine Simula and Lisp—Lisp didn’t have data structures, it had instances of objects—you would have a dynamic type system that would give you the range of expression you need.
>It would allow you to think the kinds of thoughts you need to think without worrying about what type something is, because you have a much, much wider range of things. What you’re paying for is some of the checks that can be done at runtime, and, especially in the old days, you paid for it in some efficiencies. Now we get around the efficiency stuff the same way Barton did on the B5000: by just saying, “Screw it, we’re going to execute this important stuff as directly as we possibly can.” We’re not going to worry about whether we can compile it into a von Neumann computer or not, and we will make the microcode do whatever we need to get around these inefficiencies because a lot of the inefficiencies are just putting stuff on obsolete hardware architectures.
> This is the software equivalent of "Not My Problem." ... This is how you deal with unanswered messages. You think about your response instead of just letting your software crash.
A terrific 2014 paper, Simple Testing Can Prevent Most Critical Failures: An Analysis of Production Failures in Distributed Data-Intensive Systems [1] found that most catastrophic crashes in distributed systems are a result of catching exceptions and not paying thought to how to handle them.
I have C programming experience but not Java. I would greatly appreciate it if you could tell me where to catch an exception. When there are nested method calls I can't tell where to handle it correctly - should I handle it in the top-most function or the one where the current method was called from? Can you point me to some resources so that I can understand the correct way of handling exceptions?
As the paper and the article say, what matters is not the technical aspect of where to handle an exception, but to give some serious thought to what should be done when it occurs (as opposed to just logging it). Serious problems happen when people don't pay enough attention to that.
You should handle it in the most immediate context which can do something productive with it- if there's nothing you can do but swallow it, you should probably let it continue to pass up the stack. If you can handle it and log a warning and return something else, you should handle it there.
For example, if you're writing an HTTP handler for an API, you should catch any exceptions at that point in your handler and return a 500 with an appropriate response to the client, since returning 500 is something productive we can do (vs letting it pass and crashing the server).
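Something like this, roughly (plain Python with hypothetical handle_get_customer and fetch_customer names, not tied to any particular framework):

import json
import logging

def fetch_customer(customer_id):
    # Stand-in for a call that might fail: database down, bad data, etc.
    raise RuntimeError("backend unavailable")

def handle_get_customer(customer_id):
    # The HTTP boundary is the most immediate context where "return a 500"
    # is still a productive thing to do with the exception.
    try:
        customer = fetch_customer(customer_id)
        return 200, json.dumps({"last_name": customer["last_name"]})
    except Exception:
        logging.exception("unhandled error while serving request")
        return 500, json.dumps({"error": "internal server error"})

print(handle_get_customer(42))  # (500, '{"error": "internal server error"}')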
I can't shake the feeling that we're talking about different classes of problems here.
Data importers and sanitizers are de rigueur when dealing with real-world data, of course. You have to expect to find crazy things inside, and you're expected to just cope with them. That's not a problem.
But when you're dealing with purely internal systems, the calculus is different. If you had to keep chasing down fopen(), printf(), and malloc() calls, you'd spend more time in administration than you would getting actual work done.
So if I'm understanding this correctly, we're talking about purely internal code vs code that traverses domains (much like in DDD), which require different styles of solutions.
You're not missing the point. I apparently didn't do a good job of making it clear.
I'm primarily talking about external data/services. Anything which you would ordinarily consider "suspect", such as "reading from a third-party API", is the target here.
The odds of printf() failing are so ridiculously low that I am not going to write tons of code to handle this case. It's not worth the money.
The odds of putMoneyInCustomerAccount() failing are much higher, and the consequences are much graver, so that's a perfect candidate for saying "maybe just throwing an exception ain't the best solution here."
In the case of OOP, particularly in Kay's vision, these objects might be written by someone else, using code we don't see, and thanks to isolation, might be connecting to third-party services which do all sorts of interesting things we don't know about or need to care about. Thus, they're not trusted.
If I wrote the object or I can read its code, and I know what it's doing, I'm not going to sweat it.
I'm also not arguing that we should always make the code that robust. Some things you can recover from or your budget might not allow you to make the system more robust.
"After a few months, I had rewritten the ETL system to be a fault-tolerant, rules-driven system that could handle the radically different data formats that pharmaceutical companies kept sending us."
Oddly enough, I'm currently working on exactly this sort of thing, for financial data. And I've been cursing the guy who wrote the initial pass at it, the state of "software engineering" education, and the Pacific Ocean for good measure. It turns out that Mr. Guy is an expert at modern software engineering best practices, dependency injection, unit testing, and ORMs, but had apparently never written a parser. Being an unpleasant and perverse excuse for a human being at this stage of my career, I decided to mostly keep his parser and wrap it in enough error handling to do something useful if some transaction record doesn't have a supplier. That is, rather than rewriting it as a normal-human-being parser or even (heaven forfend) breaking out and learning to use ANTLR. (Whew. I feel better now.)
Anyway, so, yeah, I've been there. I know what you're talking about.
But...
Samuel Falvo II is commenting on a completely different topic. He's not talking about the problem of making a system robust against crazy-pants input or even crazy-pants programmers (well, indirectly...). He's not talking about
"So when fetchDataFromServer() fails, or when an object doesn't respond to the message you sent, what do you? There's a simple way to approach this: ask yourself what you would do in real life."
He's talking about "when an object doesn't respond to the message you sent," how do you know? Don't have a static type system of some sort? If you type "entrie_name" when the message handler is called "entire_name", you won't know until some process tries executing that message send. And if you don't have something like an exception system, you won't even know then. (SIGSEGV, SIGILL, or SIGBUS and the like are really indications that you have just made some grievous error in how you are living your life.)
Falvo, with all his vitriol, is pointing out that the systems of Kay's vision are actively trying to prevent you from writing robust systems. And you have done exactly what he's accused you of doing:
"Don't resort to changing the topic either [...]; the issue is, right then when the error happened... What happens right then and there?
"Until you can answer that question, you have done nothing but bloviate about what is possible in some utopian system...."
I think some of the confusion here is that code you can't see or change is typically running on a different server, while ordinary method calls are typically about in-process communication. Compile-time checking makes sense for avoiding problems within code that's all going to be compiled together. Runtime error recovery makes sense for remote procedure calls. And there are also situations where you have untrusted code loaded dynamically (JavaScript in a browser).
It seems pretty important to be clear about static assumptions you can trust (because it's compiled in) versus things that can change.
mythz> On RPC, and how it distorts developers mindsets in architecting and building systems:
Kay> The people who liked objects as non-data were smaller in number, and included myself, Carl Hewitt, Dave Reed and a few others – pretty much all of this group were from the ARPA community and were involved in one way or another with the design of ARPAnet->Internet in which the basic unit of computation was a whole computer. But just to show how stubbornly an idea can hang on, all through the seventies and eighties, there were many people who tried to get by with “Remote Procedure Call” instead of thinking about objects and messages. Sic transit gloria mundi.
mythz> Carl Hewitt being the inventor of the Actor Model and Dave Reed who was involved in the early development of TCP/IP and the designer of UDP.
mythz> The last latin phrase translates to “Thus passes the glory of the world” - expressing his dismay on what might have been.
He also doesn't consider accessing a data field called "entire_name" to be "object oriented programming". He calls that "simulated data structure programming".
Kay> If you are “setting” values from the outside of an object, you are doing “simulated data structure programming” rather than object oriented programming. One of my original motivations for trying to invent OOP was to eliminate imperative assignment (at least as a global unprotected action). “Real OOP” is much more about “requests”, and the more the requests invoke goals the object knows how to accomplish, the better. “Abstract Data Types” is not OOP!
For RPC, substitute "sending a message to a remote machine" if that's what you prefer. Network requests and responses are pretty universal and can be represented in different ways at the language level, but they are pretty different from local requests.
I'm wondering if this stuff could be better explained without going back to what Alan Kay supposedly wanted? If a software architecture idea is good then it should live on its own and others should have elaborations on it.
As an aside, if you like Erlang, take a look at Pony (https://www.ponylang.io/); it's another take at an actor-based, message-passing, concurrent language with both similarities to and differences from Erlang. It doesn't (yet) have the good robustness story, but there are a lot of good ideas already in place.
I was asked to do something very similar at a film post production facility for credits. Essentially I was given a day and a half to reconcile spreadsheets of crew member names (compiled by department HODs) with their "credit name" on our employee database in order to create a master for the production company.
I had all kinds of issues - variations of names, misspelled names, nicknames, names with the middle name used as a surname (and vice versa) and a few lacking even that. I recall leaning heavily on tables of common names, various Python string normalisation methods as well as soundex. In the end I was left with a dozen or so names which needed follow-up, but it was pretty good for the ~1000 I had started with initially.
The most harrowing part of it all was attending the crew screening - it is actually possible to address issues on an end crawl (provided you catch it early enough) but I really didn't want to be the guy who screwed up the credit of someone who had just spent the previous three months in crunch to get the picture out the door.
I've watched one of Rich Hickey's talks and at some point he brings that issue to the table: what happens when the receiver is not responding to the message.
He advocates queues, and one of his arguments is that queues decouple the requester from the receiver. So you don't really have to worry about things like this by using a queue.
That's abstracting in time more than in implementation; and it's primarily useful with stateful objects that are best reasoned about as a single, synchronous timeline (which most objects are). You'll see this pattern used in, say, Actor systems, as a queue-backed inbox.
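A toy version of that queue-backed inbox in Python, just to illustrate the decoupling (the names are mine, not from any real actor library):

import queue
import threading

class CounterActor:
    def __init__(self):
        self.inbox = queue.Queue()
        self.count = 0
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # The receiver drains its inbox on its own single, synchronous timeline.
        while True:
            msg = self.inbox.get()
            if msg == "increment":
                self.count += 1
            elif msg == "stop":
                break

    def send(self, msg):
        # The sender just enqueues and moves on; it never waits for the receiver.
        self.inbox.put(msg)

actor = CounterActor()
for _ in range(3):
    actor.send("increment")
actor.send("stop")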
Note that "...how you might handle objects that don't respond to messages" refers specifically to what happens when you send a message (or call a method, in the parlance of most non-Smalltalk languages)
thingy.spamford()
and the thingy doesn't implement a handler (or a method) for spamford. It has to do with this paragraph from the original, original post:
"Kay argues (correctly, IMHO), that the computer revolution hasn't happened yet because while our bodies are massively scalable meat computers, our silicon computers generally don't scale in the slightest. This isn't just because silicon is slow; it's because of things like print customers.fetch(customer_id).last_name not actually having a last_name method, throwing an exception (assuming it compiled in the first place) and programmers getting frantic late-night calls to bring the batch system back up. The only real upside is that it offers job security."
Having worked in healthcare IT, I'm less astonished than I should be about approximate matches and score thresholds being used to confirm "same entity" for pharma data. What could go wrong?
(Not knocking the author...the practice is unfortunately needed because providers won't provide good data)
It wasn't quite like that. It was to help the pharmaceuticals work together to "pool" talented researchers who had the required training to proceed in the clinical trials. If a bunch of companies don't share this data, they drive up their costs tremendously in trying to find and recruit qualified doctors for their trials, pay to train them, and then have them conduct studies, only to find out that the doctor in question doesn't recruit any candidates themselves, or fails to report results.
If, however, they pool their resources, they can find qualified doctors who have already taken the training and are known to participate well in trials. They can then correlate that with populations who might benefit from the drugs in question and save a fortune (and possibly many lives), by reducing the cost-to-market.
The matching algorithm was actually very strong, but there was still a human step, from the pharmaceutical companies, to verify that these were the researchers they were seeking.
There's something I don't understand about the "messaging" metaphor as opposed to the more pedestrian method calling system that most people are used to...
What about private methods? Decently written objects in the wild tend to have some private methods that they call on themselves. The highfalutin metaphor of a message goes out the window there, because it's an individual object doing something to itself, not a communication.
That's true if there is no composition going on within the object. But, once you have a class hierarchy, some mixins, some dependency injection, or some such... you are "privately" talking to parts of yourself that come from elsewhere.
But when it's the exact same syntax to call a statically-dispatched method on the current object as it is to send a message to something else, it breaks the illusion of messaging being meaningfully different while writing code.
Whether some actual bonafide communication ends up being done one level removed (e.g. inside such a private method, or chained off of a field access) is immaterial because when appraising the conceptual integrity we cannot continue after seeing a fail.
Am I crazy about this? You should have (at least) two types of message guarantees - "don't care" and "must respond", kind of like UDP and TCP. Obviously other messaging systems have other types of guarantees, but these seem like a reasonable baseline.
If the agent that sends a "must respond" message doesn't get a response within some customizable time, then it itself fails loudly.
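One way to sketch those two guarantees in Python with asyncio (all names hypothetical; "don't care" is fire-and-forget, "must respond" fails loudly after a configurable timeout):

import asyncio

async def receiver(msg):
    await asyncio.sleep(5)              # a receiver that never answers in time
    return "ack"

async def send_must_respond(msg, timeout=1.0):
    try:
        return await asyncio.wait_for(receiver(msg), timeout)
    except asyncio.TimeoutError:
        # The sending agent itself fails loudly.
        raise RuntimeError("no response to %r within %ss" % (msg, timeout))

async def send_dont_care(msg):
    asyncio.create_task(receiver(msg))  # fire and forget, like UDP

asyncio.run(send_must_respond("ping"))  # blows up loudly after ~1 second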
The idea is that rather than try/catch (which tries to do something and can report low-level failures but cannot resume them) one uses something more like Lisp's condition system. In a condition system, signalling an error preserves the stack, and it means for example that a low-level loop can detect an error (say, a poorly-formatted CSV cell), signal that to higher-level code, the high-level code can make a decision (e.g. to replace the contents of the cell or skip the cell or send an email to the CEO with the contents of the cell and wait for his reply) and the low-level loop can continue processing as if there had never been a problem at all. There's a really great explanation in Practical Common Lisp:
In the context of the example, I can easily imagine how the system might have processed most rows without issue, called out to a scoring system for most exceptional rows, and raised a few rows to human attention when even the scoring system couldn't figure it out.
This sounds like something that could be introduced into a lot of modern functional-ish VM based languages, in the form of restartable exceptions, it just .. hasn't.
There are two big things lacking in current languages: dynamic scope and macros.
Dynamic scope gives callers the ability to provide values to their callee's callees, without polluting the argument list. It's a tremendously valuable option, which is ignored by just about every language these days. Fortunately you can fake it, e.g. with Go's context.Context.
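For what it's worth, you can fake the same thing in Python with contextvars (this sketch and its names are mine, not from any library): a caller binds a value, and its callee's callees can see it without it being threaded through every argument list.

import contextvars

current_error_handler = contextvars.ContextVar("current_error_handler", default=None)

def low_level_parse(cell):
    # Consult the dynamically-scoped handler instead of taking it as an argument.
    handler = current_error_handler.get()
    if not cell and handler is not None:
        return handler(cell)
    return cell

def mid_level(cells):
    # Never mentions the handler at all.
    return [low_level_parse(c) for c in cells]

def top_level(cells):
    token = current_error_handler.set(lambda cell: "<missing>")
    try:
        return mid_level(cells)
    finally:
        current_error_handler.reset(token)

print(top_level(["a", "", "c"]))  # ['a', '<missing>', 'c']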
Macros help clean up the verbose mess of catch/throw or try/except or panic/defer/recover blocks and make the logic explicit rather than implicit. Without macros, you have to spell out all the boilerplate all the time. You can get part of the way there with anonymous functions, but they are also imperfect and verbose.
I think there's a lot of argument right now between folks who recognise the absolute necessity for macros and folks who don't understand it yet; so far as I can tell there's very little understanding of or appreciation for dynamic scope. This strikes me as unfortunate: while I definitely don't want it by default, it's wonderful in a lot of situations.
Here's a stupidly-simple example in Python. So very not ready for production use:
def with_handler(ctx, h):
    # Extend the dynamic context with an error handler.
    return {'handler': h, 'next': ctx}

def with_restart(ctx, name, r):
    # Extend the dynamic context with a named restart.
    return {'restart': (name, r), 'next': ctx}

def find_handler(ctx, err):
    # Walk the context chain looking for the nearest handler.
    if 'handler' in ctx:
        return ctx['handler']
    elif 'next' in ctx:
        return find_handler(ctx['next'], err)
    else:
        raise Exception('Unhandled error %s' % err)

def find_restart(ctx, name):
    # Walk the context chain looking for a restart with the given name.
    if 'restart' in ctx and ctx['restart'][0] == name:
        return ctx['restart'][1]
    elif 'next' in ctx:
        return find_restart(ctx['next'], name)
    else:
        return error(ctx, 'No restart named %s' % name)

def error(ctx, err):
    # Signal a condition: the nearest handler decides what to do, and is
    # expected to transfer control (e.g. by invoking a restart).
    find_handler(ctx, err)(ctx, err)
    raise Exception('No handler found for %s in %s' % (err, ctx))

def do_stuff(ctx):
    # Mid-level code: provides no handler of its own, just loops.
    results = []
    for i in range(100):
        results.append(do_little_stuff(ctx, i))
    return results

class alpha(Exception):
    # Used as the control-transfer mechanism for the 'use-alpha' restart.
    def __init__(self, text):
        self.text = text

def do_little_stuff(ctx, i):
    # Low-level code: establishes a restart, signals a condition, and then
    # carries on as if nothing had gone wrong.
    def restart(ctx):
        raise alpha('text: %d' % i)
    ctx = with_restart(ctx, 'use-alpha', restart)
    try:
        if i % 3 == 0:
            error(ctx, '%d %% 3 == 0' % i)
        return i
    except alpha as e:
        return e.text

def handler(ctx, err):
    # Top-level policy: when an error is signalled, invoke the 'use-alpha' restart.
    find_restart(ctx, 'use-alpha')(ctx)

ctx = with_handler({}, handler)
print(do_stuff(ctx))
Note how the top level provides a contextual handler which chooses to use the alpha restart, while the restart itself is provided by the mid-level do_stuff, while the low-level do_little_stuff both provides the restart & signals the condition.
Add in some type detection &c. and this could start to be useful. But it'd still be hellaciously verbose compared to the Lisp:
To have this used by libraries and library users, won't you need some standardization in the language, though? Otherwise you get various implementations.
Yup, that's another reason I love Lisp — it's been standardised since 1994, and everyone supports it (and since it's standardised and provided by the implementation, there are efficiency hacks possible which you wouldn't want in normal library code).
Lisp has been doing things newer languages still don't do since before a lot of modern programmers were even born. And there are relatively few warts for that age (upcasing and the pathname functions are the warts which leap to mind)!
One way to implement that is that error-event-raising is just a function which examines the dynamic context for error handlers, and calls them until one transfers control (e.g. by throwing an actual exception). That's pretty simple to implement in any language which has exceptions, panics or similar control-transfer structures.
It gets more complex when you have restarts. The nice thing about Lisp is that all the complexity can be hidden with macros; with other languages you have to be explicit every time.
I agree this could be a solution for some problems, but it seems to me that there are problems where it may not work. So you probably use a mix of approaches. How would you document a library function that uses the mechanism you mention? If the user does not handle the error (forgets about a type, or a new type is added in an update), will the user notice the error happened?
Same way that you'd document exceptions in a library.
> If the user does not handle the error (forgets about a type, or a new type is added in an update), will the user notice the error happened?
In Lisp, ERROR invokes the debugger if the condition is not handled. So an unhandled error condition pauses execution. One cool thing is that the user can use the debugger to invoke restarts. It's extremely nice.
There's also CERROR, which establishes a restart to return from itself — so some errors can be continued on from — WARN, which offers automatic, mufflable warning output, and SIGNAL, which offers non-erroneous condition signalling (i.e., while ERROR never returns, SIGNAL can).
It's really, really full-featured but also easy to use.
Right, so this is basically the Go approach: explicit error checking on everything. No exceptions, because exceptions are a weird sort of non-local control flow, and can escape and take down your program as a whole.
The "if you don't know the answer, find an approximation from another route" is .. situational. In this case it's exactly what the customer wants. In other cases (finance, security, aerospace) it could be a disaster waiting to happen.
I worked on a point-of-sale system where it was a matter of design philosophy that any inconsistency should be an immediate crash-and-reboot; since it also operated as a save-everywhere and audit-everywhere system, if you did manage to crash it then within 30 seconds you'd be back at the screen just before the last button press with no data loss other than the button press that caused the crash. I believe this crash-and-recover approach is very Erlang: https://ninenines.eu/articles/dont-let-it-crash/
Thinking of exceptions and message validation also makes me think of "in band" versus "out of band" signalling and errors. Exceptions are "out of band" - outside the normal function return process. Internet communication is all "in band" and the whole approach of separate control and data planes has almost entirely gone away, apart from SIP and (nearly dead) FTP.
Sum types arguably permit a form of out-of-band return value. A function that returns a sum type has the choice of returning the usual type (an integer value, say) or an error value of some other type.
IMO this counts as a kind of out-of-band arrangement because it doesn't involve using one of the normal return values to signal error.
In Haskell one applicable type is actually called "Either a b".
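A rough Python analogue of the same idea, using dataclasses and structural pattern matching (Python 3.10+; Ok/Err and parse_age are illustrative names, not from any library):

from dataclasses import dataclass
from typing import Union

@dataclass
class Ok:
    value: int

@dataclass
class Err:
    reason: str

Result = Union[Ok, Err]  # the sum type: either a normal value or an error value

def parse_age(text: str) -> Result:
    try:
        return Ok(int(text))
    except ValueError:
        return Err("not a number: %r" % text)

match parse_age("forty-two"):
    case Ok(value):
        print("age is", value)
    case Err(reason):
        print("could not parse:", reason)  # the caller has to deal with this case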
I'm not sure what your point is. If a function has a "unitary" return type (not, say, a list of possible return types out of which one is selected) then obviously it always returns a value of that type.
You could probably invent a kind of function that had a list of possible return types. Then anyone who called it would have to deal with those possibilities.
It all comes to the same thing in the end. Sum types and pattern matching are a very good system for expressing this functionality.
It's conceivable that a compiler would throw away the Either type information and just generate separate code for the different ways the function might return. Ultimately, though, I suppose it would end up calling a continuation, since the function has to "return" somehow.
> IMO this counts as a kind of out-of-band arrangement because it doesn't involve using one of the normal return values to signal error.
My point is that if you're returning an algebraic data type Result that encodes Ok(val) or NotOk(_), you are using one of the normal return values, because the function always returns a Result. The whole purpose of the construct seems to me to be bringing the error case handling into band.
The way I see it, each of the data types in a sum type is analogous to a channel. The type of the 'message' conveyed by a value is effectively "Either Control Data".
In-band signalling is messier than out-of-band signalling, so I see out-of-band as the goal. If you look at the various approaches to the semipredicate problem on that Wikipedia page, it looks like, historically, we started out with awkward in-band solutions and the evolution is towards the clarity offered by expanding the return type to send errors out-of-band.
Go has panics, which is essentially exceptions with a different name. Failure to handle all panics will lead to process termination, as with unhandled exceptions.
To be fair, their general use is discouraged unless you want the process to terminate. Not that you can't do that in other languages, but the culture in Go seems to prefer returning error codes.