It really depends on the project, or to put it another way, the ROI. The core protocols of a storage system or an OS kernel will benefit from formal methods, but I'm not sure a CRUD app will.
Besides, TLA+ is not a panacea within formal methods. It's a specification language, which means its users will still need to master temporal logic, first-order predicate logic, formal specification, and a slew of concepts such as spurious counterexamples. And remember the claim that the essential complexity is in specifying a system? Now try to specify something as simple as quicksort. Let's just say specifying a system with mathematical logic is beyond 99.99% of software engineers. Not that people are incapable, but I doubt they have enough incentive to swallow the cost. For one, how much does it cost to debug a spec? Don't believe me? Try out the book by Hehner: http://www.cs.utoronto.ca/~hehner/FMSD/, or an introductory book like this: https://www.amazon.com/Verification-Reactive-Systems-Algorit..., and you'll get the idea. And yes, I thoroughly studied the mentioned books cover to cover and still think they are not for the majority of software engineers.
Considering most corporate websites are CRUD apps, and they keep getting pwned via some bug or vulnerability, with millions of private user records stolen and sold off on the black market, I don't think that's an assumption we can make anymore.
That said, full-on formal methods may still be overkill. A web framework built in a language with a strong type system, with input sanitization enforced via type-checking and similar safeguards, may suffice.
But that's closer to formal methods than most of the web dev industry is accustomed to.
I'm not sure type safety is the key here. Rather, better defaults in web and SQL frameworks, even in a dynamic language, would get you most of the way there. You can even use dynamic types to mandate that input is sanitized - a crash is much louder than SQL injection or XSS.
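To make that concrete, here's a minimal sketch of the idea in Python (the SafeText wrapper and run_query function are hypothetical, purely for illustration): a dynamic check that refuses anything not explicitly marked as sanitized, so mistakes crash instead of silently injecting.

    import html

    class SafeText(str):
        """Marker type: only construct this via sanitize()."""

    def sanitize(raw: str) -> SafeText:
        # Hypothetical sanitizer; real code would use escaping appropriate
        # to the sink (SQL placeholders, HTML escaping, etc.).
        return SafeText(html.escape(raw))

    def run_query(fragment):
        # The dynamic check: refuse anything that wasn't sanitized.
        if not isinstance(fragment, SafeText):
            raise TypeError("unsanitized input reached the query layer")
        print("querying with", fragment)

    run_query(sanitize("<script>alert(1)</script>"))  # fine
    run_query("<script>alert(1)</script>")            # crashes, loudly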
>Rather, better defaults in web and SQL frameworks, even in a dynamic language, would get you most of the way there.
We’ve had 20+ years of building websites, and either we have sane defaults by now but they’re not enough, or we still don’t have sane defaults in our web frameworks.
Either way, the fact that this problem persists, given the damage it's enabling, suggests we haven't addressed the root cause yet. Web development is still a tarpit, which strong type systems can mitigate in unique ways.
Most of the recent egregious hacks I can think of came down to things like out-of-date software, poor access control, stuff at the configuration level. Static analysis might help here, but I don't think formal methods are the lowest-hanging fruit.
The biggest pitfall wrt. TLA+ is that it's not intended as an end-to-end system. You're making up a simplified "toy" model of your software architecture and verifying claims about your toy model, that may or may not apply to your actual code. This is sometimes justifiable when the software architecture is the most interesting thing you're dealing with, but that's not a very common case. Most programs are designed in relatively simple ways, and would benefit more from an end-to-end system that establishes valuable properties (even comparatively trivial ones like type and memory safety, or simple pre- and post-conditions) about the actual code one will be executing.
> The biggest pitfall wrt. TLA+ is that it's not intended as an end-to-end system.
That's a huge benefit, rather than a pitfall at all. End-to-end verification of real-world systems ranges from the impractical through the infeasible to the downright impossible. The biggest software ever verified end-to-end at the same depth as TLA+ was 5x smaller than jQuery, required extraordinary simplification, and still took world experts years. People have found serious bugs in software ten times the size of the largest program ever verified end-to-end within days of starting to learn TLA+.
The question is not whether TLA+ can achieve a feat nothing else can, anyway, but whether it has a good ROI.
> It really depends on the project, or to put it another way, the ROI. The core protocols of a storage system or an OS kernel will benefit from formal methods, but I'm not sure a CRUD app will.
I've seen people successfully use formal methods on CRUD apps! The trick is to use more lightweight stuff that gets you some benefit without taking as much time. For example, modeling out the problem domain elements and seeing if there's any pathological cases you need to worry about. This is a particularly good use case for Alloy, since it can make visualizations.
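As a rough illustration of that kind of lightweight modeling (this is plain Python, not Alloy, and the document-approval domain is entirely made up): enumerate all small worlds allowed by the stated rules and look for pathological ones.

    from itertools import product

    users = ["alice", "bob"]  # hypothetical toy domain

    def worlds():
        """Every small world: who owns d1, who approves d1, who is a manager."""
        for owner, approver, bits in product(users, users, range(4)):
            is_manager = {u: bool(bits >> i & 1) for i, u in enumerate(users)}
            yield owner, approver, is_manager

    def satisfies_rules(owner, approver, is_manager):
        # The only rule in the (hypothetical) requirements: managers approve.
        return is_manager[approver]

    # Pathological case we hoped the rules exclude: approving your own document.
    bad = [w for w in worlds() if satisfies_rules(*w) and w[0] == w[1]]
    print("self-approval allowed in", len(bad), "small worlds")  # > 0: rules too weak

Alloy does this kind of search symbolically and draws the counterexample for you, but even the brute-force version surfaces the missing rule before any production code exists.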
Yep. Anyone who tries to do that in a competitive space will have their lunch eaten by people who slap something together and get it in front of customers.
Of course it isn't, but neither is testing, and we all still do it. The only question that matters is, does it have a good ROI or not? TLA+ does in a wide range of problems.
> which means its users will still need to master temporal logic, first-order predicate logic, formal specification, and a slew of concepts
That all sounds imposing, but TLA+ users usually don't "master" any of those, and reaching a level of high utility takes far less time than achieving the same with any programming language. Writing useful specifications that can save real money takes something like 3-4 full days of a workshop or 2-4 weeks of after-hours self study. Learning how to read specifications well enough to implement them takes much less (a few hours to one day), and doesn't require any temporal logic at all.
It's certainly possible that it's still not for the majority of engineers, but most teams will have at least one person who can do it, and that might be enough.
In my opinion, formal methods for software engineering have been heavily advocated for by people and organizations who want to sell books, seminars, and conferences. People who want to act as gatekeepers while extracting money from the system and reframe history in their own terms.
That's not to say formal methods are bad per se, there is just a lot of reason to doubt motives and outcomes behind concerted overarching initiatives.
The notion that software engineering should be grouped under civil engineering is misguided. Part of the reason why we're looking to other fields is to profit from their experience (which is good) but part of it is also because we're desiring a kind of legitimacy we feel is absent. It's a reasonable impulse to a degree, for smart people dedicating their productive years to a discipline. We want to do a good job.
The problem is that we have more in common with architects than we do with civil engineers. There are some goals that can be measured with a boolean outcome: does the code compile, is a certain algorithm provably correct given certain conditions, can a building be built. But it gets murky very very quickly: how well does a thing we create serve its purpose? Does it use resources efficiently? How long will a structure last? What kind of adverse conditions will bring it down? Does the thing make people's lives better and if so by what measure? These are a small hint at the questions where formal methods tend to fall down or even work against us.
So do we need overarching formal methods or do we need to be more scientific about specific objectives and measurement of outcomes that are important to us subjectively?
> In my opinion, formal methods for software engineering have been heavily advocated for by people and organizations who want to sell books, seminars, and conferences. People who want to act as gatekeepers while extracting money from the system and reframe history in their own terms.
Is there really that much of an FM consulting-industrial complex? There's me, Galois, Atelier B, and Adacore, those are the only independent FM groups I can think of off the top of my head. And two of those are tied to specific verification languages.
Most of the FM advocates are academics, which is one reason why there's so much friction between industry and academia wrt FM.
I think it depends on what you are building. For example, if you are building a database that houses user data and user credentials, this is a very well understood thing that experienced professionals can set up quickly and efficiently, giving you both a lot of security and enough flexibility to build whatever business you want behind it.
And yet, it seems that most companies and websites that handle user credentials do a poor job at it. So there does seem to be an argument in favor of adding formal methods in certain areas of software design and engineering.
One thing that very much distinguishes the software world from every other type of engineering is the complexity and rate of innovation. In software, I can invent and deploy a thing on the same day and the total R&D cost to my company is almost exactly equal to my salary for that day. Whereas for something like a bridge, the R&D cycle for a new bridge invention could be several months and hundreds of thousands of dollars for a similar idea.
Overall, I think the discussion about putting standards in place for certain common activities (network communications, password databases, encryption standards, and similarly well understood areas) is worth having. Though I also would be concerned that state actors would use such standardization to insert weaponized requirements like backdoors, or that regulation would prevent new innovations from gaining a foothold.
The software industry (SWIND for short) is a mixture of unsafe tooling, high-risk practices, sloppy execution, and total delegation of responsibility. So I agree that the SWIND will never be civil engineering. ^_^
A buyer might accept to buy a product without understanding its fitness for purpose, but the vendor should be selling a product that a) does what it claims, and b) can be run safely and securely.
(a) is the art of software development, which you describe. If the buyer and vendor agree on "robustness", there will be design choices and acceptance testing. But (b) does involve some of (a) if we start to question the shop tools of the vendor.
Given that (b) is fundamentally a problem with bad shop tools (a.k.a. language "foot-guns") and education, a major shift needs to happen in SWIND.
> advocated for by people and organizations who want to sell books, seminars, and conferences.
The publishing business is just one of several economies that live on software. Some of it is good. I think it is beneficial to raise awareness of the need for better shop tools. The ambulance-chasing economy of software security is even bigger than publishing. There are some big shops using bad tools. ^_^
> The software industry (SWIND for short) is a mixture of unsafe tooling, high-risk practices, sloppy execution, and total delegation of responsibility. So I agree that the SWIND will never be civil engineering. ^_^
I interviewed a bunch of civil engineers for a journalism project earlier this year, and my main takeaway was never walk over a bridge. We're not that much worse than other branches of engineering in our tooling and practices.
Formal methods may work for civil engineering where the usual workflow is gathering requirements, developing a detailed project and building the thing, with the implicit understanding that any little change will mean recalculating costs and deadlines.
When your customer decides to pivot the fintech app you were developing into a cryptocurrency investment tool, it makes no sense.
In Civil Engineering it would seem crazy for someone to change their mind and decide that the Tower they had requested, obtained designs for, budgeted tens or hundreds of people's work for, and planned for, must now suddenly become a Bridge. They themselves would discard the idea because of how absurdly expensive it would be to essentially start from scratch. All that money and those resources would basically be wasted.
Maybe it wouldn't be a bad thing if the same thing happened, to a lesser extent, in Software Engineering. Say goodbye to stupid last-minute changes, caused by the fact that they are not perceived as expensive, and to all the bad incentives that creates in the people writing that software.
Software engineering is still very far from engineering. It's a young field and everything is still very experimental. Its practitioners are still debating whether a hammer or a saw is the best tool for inserting a screw, and whether to insert the head first or the tip first. Don't forget that the art of building bridges has a head start of several thousand years :)
Flexibility and Standardization is a tradeoff and how useful each one is depends on the context.
Should you put a project on hold if you only have 1/2 inch nails and the plans ask for 3/4 inch nails? I don't know, are you building a nuclear reactor or a decorative birdhouse?
No it's not. Engineering does not mean proving a design against a model, although that can be part of engineering (and software has formal proving methods, I might add).
Physical engineering projects are not mathematically proven either. They are verified against some accepted set of tests using models that use various empirically measured properties for materials and the world around the project as proxies for how this will do, and include various factors like 1-in-N year events and some acceptable error margin or chance of failure.
In something like microprocessor design, the physical design and manufacturing can of course be wildly variable and difficult to characterize, as Intel has recently been finding out -- even on a very nicely running process line, variations can result in many devices not working at all, and those that do work can vary by tens of percent in major performance metrics (frequency, power consumption).
Go up a level to digital logic design. As with software, there are formal methods for proving logic, but they are absolutely not used to "prove" the entire digital logic part of a CPU. That is much more like software than you might think: thousands of bugs get found and fixed. Many bugs (from performance to correctness) can even get found after a chip has come back from the factory. Chips go through intensive verification cycles, and even then it's not at all uncommon for bugs to escape. Many "errata" are just noted and you live with them, some necessitate varying degrees of crippling performance or features (Spectre, Meltdown, transactional memory, etc.), and some actually require new revisions of the product when the errors can't be corrected in firmware.
The only real difference between software and logic design is the cost of "recompiling" means they spend vast efforts with many angles and layers of ways to modify the behavior of a device after it is manufactured. This isn't something you can see, but there are thousands and thousands of dials and switches that can be set to change behaviour, from very low level (oops this bit of logic has a clock-gating bug, we need to flip the switch to disable clock gating for this particular group of gates it will cost 0.01mW of dynamic power), through logical behavior of various algorithms and structures in the chip, right up to higher level big hammers like microcode and micro traps (oops we didn't implement this instruction correctly, switch it to trap to microcode and we'll write a handler for it at a cost of 1% performance).
Are electrical engineering, microprocessor design, and circuit design considered to be "not engineering"? Not by many.
Engineering is about applying structured methods to create something that solves a problem. Software in some areas is ahead of the game here, using advanced version control tools and methodologies and continuous integration, although digital logic design has been starting to cotton on.
Engineering is about the application of science, about safeguarding the public welfare, the duty owed to your clients, and, through liability, standing behind the work that an engineer does. The modern software industry doesn't fit that pattern: EULAs shirk all responsibility and dark patterns are applied to trick users.
Until people writing software are taking ethics courses and are part of a professional organization to which they are accountable for being held to some code, there is no software engineering.
And I disagree, it does fit the pattern which is what my whole rant was about. You seem to have a romanticized idea of what "real" engineers do as well. A _lot_ of it involves software, a lot of it is prototyping and finding bugs and hacking around them and refining.
If you don't like the digital electrical engineering because it's too close to software for you, try mechanical engineering in automobiles. If you've spent much time around cars, you'll know that they have a lot of quirks like this where it's obvious a problem has been found and fixed. For example an engine of a particular vintage might be known to be prone to head cracking, then a few years later they might come out with another engine with a lot of the same part numbers but revised head design or material.
The engineers who designed it were not gazing at idealized otto cycle equations or out polling the public about its welfare, and the company behind it was doing its level best to play down the problem and coming up with ways their dealers could try to nurse cars through their warranty periods.
I think a lot of computer programming goes on by people who call themselves software engineers but don't follow engineering methods and ethics, such as prototyping, offering warranties, and accepting liability for the work.
I have known some old school engineers who were inspiring in their depth of knowledge, ability to do research and solve new problems, and management of customer and employee relationships, and definitely that has colored my view of what a professional engineer should be.
So even if we ignore for a minute that software can have warranties... having a warranty? Really? That's what you're going to try to say means software development is not an engineering discipline? You could have just accepted my arguments with some grace. You don't expect me to dignify this with a response, surely.
As an end user of software I don’t have SLAs or warranties, just EULAs that guarantee nothing and the ground constantly shifting underneath me with updates that take away features or turn me in to a subscription revenue stream as some functionality is unnecessarily moved to the cloud.
I am sure there is software created by an engineering process by software engineers, but it is probably less visible to me in my day to day computing. The effort you detailed that is required to make sure a chip works is a good one.
And I agree electrical and computer engineering are disciplines.
This is some absolutely amazing writing. Good heavens, your line about debating whether a hammer or a saw is the right tool for inserting a screw?? Beautiful!!
Do you blog? If so, I’d love to read more of you. If not, again, this is some seriously good writing. Big, end of a long drunken night “I really love you man, woman or whatever” ups!!!
Building construction is full of last minute changes too, it's just that the scope is usually more limited. But a lot of that also boils down to the cost to make changes.
You better get the foundation of your 100 story tower right, because it's a big, costly pain to fix it later. When update costs and consequences of errors are similarly high in software, you see real effort spent on getting it right before shipping.
When costs to update are low and costs from error are low, less effort is spent on getting it right first.
Personally, I've rarely seen a specification in my career, let alone anything rigorous enough to write a formal specification from. And, even if there were a rigorous specification, everything is subject to redesign at any moment. Not a whole lot of value in formal methods in that kind of environment. You don't tend to get a structural engineer to certify your plans for an overnight camp in a two-person tent.
I've seen plenty of specifications, and they were almost all wrong to a significant degree. We didn't even know what was possible, much less which algorithm was right, until we tried it and saw. We rewrote it three times, using previous iterations only as copy-paste fodder, before we liked how it handled real inputs. Then we updated the specification. I like this method.
It would be a bad thing though. The fact that software is changed with so little effort is a feature and a positive aspect. The only thing that should factor into whether a more rigorous approach should be taken is the use case.
Bridges stay the same, software changes all the time and _needs_ to change all the time because the business (or whatever) domain that the software is in is also constantly changing. The reason the 'waterfall' method of "gather requirements, design, build, test" fails so often is that the requirements gathered at the start are often out of date by the time you get to the test stage. Or, the users that gave the requirements actually meant something slightly different, and that only becomes apparent once the software is in front of them for user testing.
> reason the 'waterfall' method of "gather requirements, design, build, test" fails so often is that the requirements gathered at the start are often out of date by the time you get to the test stage.
True, but there is no excuse for that anymore. Very little software is really new anymore, so we should have a good handle on requirements by now. (I don't either.)
What do you mean by 'bridges change all the time too'? Some of the oldest bridges in the world were built nearly 2000 years ago. That's zero change besides basic upkeep.
What is the oldest unchanging software project you can think of? Typical software projects have a constant accrual of features. Some of the older projects like the Linux kernel are more like a mansion that keeps getting more rooms while others are demolished. Not exactly something that happens to a bridge.
Those are a feature of Software Engineering, as opposed to Civil Engineering.
If there was a cheap way to turn Towers into Bridges (and the cost of failure was comparable to that of an app having an outage), our infrastructure would look a lot different.
I know quickly changing requirements are a pain for those who have to do the implementation, but, commercially, that flexibility has significant value.
Everything you can do in software someone could do with a few ASICs. Software is several orders of magnitude cheaper, and that is before you account for how easy it is to change.
That's a bit extreme. In cases where things move quite quickly, we should still be able to provide some formal guarantees. There's a whole spectrum of formal methods ranging from formal proofs to rich type systems and design by contract.
I have personally had a lot of success with some techniques, and I'm quite hopeful ideas such as those in Liquid Haskell will eventually become mainstream.
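For instance, here's a minimal design-by-contract sketch in Python (the contract decorator and the conditions are hypothetical, not Liquid Haskell, but they give the flavor of machine-checked pre- and post-conditions on ordinary code):

    import functools

    def contract(pre=None, post=None):
        """Attach a precondition and postcondition to a function and check them."""
        def wrap(fn):
            @functools.wraps(fn)
            def checked(*args, **kwargs):
                if pre is not None:
                    assert pre(*args, **kwargs), f"precondition failed for {fn.__name__}"
                result = fn(*args, **kwargs)
                if post is not None:
                    assert post(result, *args, **kwargs), f"postcondition failed for {fn.__name__}"
                return result
            return checked
        return wrap

    @contract(pre=lambda xs: len(xs) > 0,
              post=lambda r, xs: r in xs and all(r <= x for x in xs))
    def minimum(xs):
        return min(xs)

    print(minimum([3, 1, 2]))   # 1
    # minimum([])               # would fail the precondition, loudly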
This doesn't really take away from the point. The life cycle of most software projects is tiny compared to the life cycle of most buildings. And in most all failures, the warnings were clear way in advance.
Formal methods come from a need for form, repeatability. There is nothing in software development that comes close to the process and considerations that go into choosing to build with M30 concrete, or use #5 rebar for that matter.
Sure there is! The most obvious example to me would be a server-side database for storing user credentials. It's a very common problem, it's well understood by the experts in the industry, and also most everyone makes the wrong choices anyway.
There are plenty of places inside of software that are pure-innovation and can't use a formal process to design and implement, but there are also plenty of places where a corporation just wants a website that does customer loyalty, and you often don't need any innovation at all.
Formal methods could be useful when requirements are clear, can be strictly defined, aren't going to change suddenly, and the stakes of building it wrong are very high. Avionics. Medical devices. Cars. Some kinds of embedded systems. Core parts of OS Kernels, databases, network stacks, cryptography libraries, etc.
Most software in the world is not like this. It exists to support businesses trying to make a buck in a changing world, either from consumers or from other businesses. Requirements are messy, shift frequently, and agreeing on what they are at all is more than half the battle. Nobody really knows the minutiae of quite how the whole product works but it doesn't really matter. Money is still coming in. The risk to those businesses of even rather bad bugs is generally low. A few customers have a bad day because they see a 500 error in their browser when they're trying to order food, or they can't summon their ride-share for 5 minutes. The issues are fixed and life goes on. The perfection of formal methods is the enemy of good, and would be a net negative return on investment, much as programmers who (understandably) love perfection and clarity and rigor might want it to be otherwise.
Some bugs can't be fixed: once you've leaked millions of people's personal information, you can't really take it back. The "stakes of building it wrong" are very high in most software, because businesses are all too willing to hoover up all your info.
> The risk to those businesses of even rather bad bugs is generally low.
That is unfortunately true, but hopefully we'll eventually manage to hold businesses accountable for their screwups.
We all make use of those components though, regardless of how CRUD the application is. We'd all benefit from them being validated. The fact that "fully validated" versions of those components don't exist is simultaneously an indication of how difficult the problem is and how little our industry truly cares about product reliability and quality. These issues are not rare or even particularly uncommon in the wild.
I imagine the pursuit of formal methods in software that doesn't strictly require it would go a long way to fixing other systemic issues like tech debt, security, and performance.
Not yet, but I think we’re getting to the point where software will fractionalize and those pieces will formalize.
TCP/IP stack, crypto libs, message busses, networking libraries are all prime candidates for this kind of formalization.
Right now we’re just waiting on the first mass incident where a component/service is recognized by a government as essential and should be regulated then you’ll quickly see those structures/organizations form around that area.
Whether or not formal methods can be applied is almost entirely an economic decision since that kind of rigor drives up cost. Makes sense in certain high value systems when the organization wrapped around them has the resources or the level of risk associated with a less rigorous approach is too high (e.g. at NASA).
What the author seems to miss completely is that there are many systems developed using typical methodologies that generate billions in revenue. What this tells you is that ideas like TLA+ are effectively superfluous in a wide variety of software contexts.
It's not easy to quantify but you can't just dismiss formal methods because we don't know how much development and maintenance overhead they would eliminate.
How much revenue something generates is a poor metric. Otherwise, we all would probably be programming in PHP.
The programming language itself has very little to do with the revenue something generates.
Revenue isn't generated by technical details, those are just the implementation of a product that itself is just an implementation of the goals of the business.
The reason people are and aren't programming in PHP is plain and simple: taste. Maybe if 10 developers in a team want to use PHP they can just do that, as long as at the end of the line the product delivers. If 3/10 want to use PHP and 7/10 want to use C#, then no, it will not likely run on PHP. And if as a developer you prefer certain features or tools over others, you are likely to select the technologies that suit that.
None of the technical details had anything to do with the value that was created.
If your requirements say that 100 people per minute need to be able to find, basket and purchase product on a website, and those requirements are met, does it really matter if the codebase is a combination of AppleScript and an ActiveX control? (I know, that technically doesn't exist, not the point)
Now, if those requirements also include scaling needs, long-term support needs, knowledge sharing, consultancy options, then you might not select those technologies.
I don't think the OP dismisses formal methods, only the idea that they have pervasive utility that outweighs costs in most scenarios, at least for certain kinds of formal methods.
But we can't forget the range of formal methods either. Type systems technically qualify, as the article notes, even if they are lightweight, and they're not new. Given the pervasiveness of Java, we're already making widespread use of at least that formal method and have been for a long time.
> How much revenue something generates is a poor metric. Otherwise, we all would probably be programming in PHP.
The OP wasn't arguing that revenue is driving or should drive language selection, only that the revenue shows the lack of formal methods does not seem to prevent companies from making enormous sums of money.
> you can't just dismiss formal methods because we don't know how much development and maintenance overhead they would eliminate.
That by itself is not much of a positive argument for their use. And do you know how much maintenance overhead they would introduce? In either case, it might be useful to compare how a switch between similar languages (e.g. JS and Typescript) affected code quality (in terms of the number of bug reports filed, for example) and feature delivery time. Maybe there's a point where the trade off makes sense. In which case, languages that support a gradual adoption of formal methods might begin to look attractive.
I mean, it depends on criticality. If something is critical, pushing further along the formal methods spectrum makes sense (see also: why AWS has done a lot with TLA+). Sure, we probably won’t verify all of Instagram, but super critical portions of the platform underlying it are likely worth it and would save money in the long run by reducing outages, bugs, etc.
I'm fond of formal methods, but I'll offer the usual counter-example:
Formal methods struggle to scale up to real-world problems. There is no formally verified TCP/IP stack. There is no formally verified TLS or SSH implementation. That's despite how immensely useful and reusable such implementations could be (that is to say, high demand, and yet no supply).
As I understand it we're not even close to being able to develop a formally verified TLS implementation, but if we were, it would likely take the form of an internationally acclaimed PhD thesis, akin to CompCert. It still wouldn't mean formal methods are cheap and easy.
It's astounding that core security infrastructure isn't formally designed and certified, and isn't available as standardised - certified - drop-in elements.
A top decile PhD thesis is not a high bar in engineering. It's how most disciplines evolve. "Not invented here" is not a credible explanation for why CS feels a need to be different.
Also, they don't mean a thing if the business side of the equation isn't behind them... following formal methods means that the business also follows formal business methods (which would reduce sudden changes, mistaken product designs, etc.).
Too many times "dev" is treated like a roving circus that can just change its routine on a dime with little consequence...
There is a lot of formal method usage in smart contracts. Maker DAO for example did a full formal verification of their stuff, and there is a lot of talk within the security community about it.
There are plenty of shops that apply rigorous, formal methods to software design. Military contractors providing software for mission critical systems do this (or used to). Some will cut corners, but they are supposed to follow the methods and procedures the contracts specify.
It isn’t common in private industry because it’s typically deemed unimportant.
There’s a spectrum of formal methods and with the push toward static types in Python coming directly out of industry I would argue that industry is definitely deeming it important in some degree; the question is to what degree. Right now a lot of the time it’s only worth it for the most critical systems (S3, for example, but not my side project webapp) but if tooling improved, it would be worth it for more cases.
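Python's gradual typing is exactly that lightweight end of the spectrum: the annotations below do nothing at runtime, but a checker like mypy rejects a whole class of bugs before deployment (the discount function is just a made-up example).

    from decimal import Decimal

    def apply_discount(price: Decimal, percent: int) -> Decimal:
        """Return price reduced by `percent` percent."""
        return price * (Decimal(100) - percent) / Decimal(100)

    total = apply_discount(Decimal("19.99"), 15)   # fine
    # apply_discount("19.99", 15)                  # mypy: incompatible type "str"
    print(total)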
I once had this fun contract to fix the code in one giant application written in PHP. The decent-sized org that wrote said code in the first place had code reviews, unit tests, and whatnot.
The end result was that they had to hire an outside person to fix what they had produced. I looked through the software and noticed a whole bunch of iffy patterns. I ended up writing a giant Python script that searched for said patterns and fixed them where possible, marking the fixes with appropriate comments, and issued warning comments where it could not.
Not sure if this is a regular occurrence in companies that write and maintain their internal software as this was the only time I did job like this.
Does your opinion only apply to unit testing or to automatic testing in general?
I can't imagine the kind of software that wouldn't benefit from automatic testing (unless you would be writing the simplest of CRUD web apps or doing it completely wrong).
From personal experience I see both better code design, from being forced to make the code testable, and fewer bugs, because writing tests reveals problems I wouldn't have found with manual testing before they go into production.
> Does your opinion only apply to unit testing or to automatic testing in general?
In my experience once people learn about tests, they want to see 1. tests per commit and 2. high code coverage. This leads to unit tests, which don't really test much of anything, and what they do test (the layers under you like the compiler) is by accident.
The most useful testing would be integration tests on the side, and then regression tests because they're guaranteed to have already found at least one bug.
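A concrete example of the regression-test habit (the parse_port function and the bug are invented for illustration): once a bug is found and fixed, pin the offending input in a test so it can never silently come back.

    import pytest

    def parse_port(value: str) -> int:
        """Parse a TCP port, rejecting anything out of range."""
        port = int(value)
        if not 0 < port < 65536:          # the range check was the actual fix
            raise ValueError(f"port out of range: {port}")
        return port

    def test_regression_out_of_range_port():
        # Regression test: an earlier version happily returned 70000,
        # which then failed much later inside the socket layer.
        with pytest.raises(ValueError):
            parse_port("70000")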
I agree, automatic testing would be nice. Unfortunately, I'm the one at work having to write said automatic testing for a component that makes parallel DNS requests [1] in an event-driven server [2], and I need to cover the cases where A returns then B; B returns then A; A returns, then B returns after our timeout [3]; B returns, then A returns after our timeout; A returns and B never does; B returns and A never does; all while making sure the ping tests (the component sends a known query to each server to ensure each server is still available [4]) go through, because if they don't, that particular server will be taken out of rotation. God, I would love to have that automatically tested. What I don't like is having to automate such tests. (A rough sketch of enumerating these orderings follows the footnotes.)
[1] To different servers that return different information for a given query. The protocol happens to be DNS.
[2] Based upon poll(), epoll() or kqueue(), depending upon the platform.
[3] We're under some pretty tight time constraints for our product.
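The sketch promised above, in Python with asyncio rather than the real poll()/epoll()/kqueue() loop, and with all names, delays, and the timeout invented: it just enumerates the "fast / slow / after timeout / never" combinations against a fake resolver.

    import asyncio
    import itertools

    TIMEOUT = 0.05          # hypothetical product timeout, in seconds

    async def fake_server(name, delay, answer):
        """Simulates one upstream DNS server with a fixed response delay."""
        await asyncio.sleep(delay)
        return name, answer

    async def query_both(delay_a, delay_b):
        """Query servers A and B in parallel; late answers are dropped."""
        tasks = [
            asyncio.create_task(fake_server("A", delay_a, "1.2.3.4")),
            asyncio.create_task(fake_server("B", delay_b, "5.6.7.8")),
        ]
        done, pending = await asyncio.wait(tasks, timeout=TIMEOUT)
        for t in pending:
            t.cancel()
        await asyncio.gather(*pending, return_exceptions=True)
        return sorted(name for name, _ in (t.result() for t in done))

    async def main():
        # fast, slow-but-in-time, after-timeout, and "never answers"
        delays = {"fast": 0.01, "slow": 0.03, "late": 0.09, "never": 10.0}
        for (la, da), (lb, db) in itertools.product(delays.items(), repeat=2):
            answered = await query_both(da, db)
            print(f"A={la:5} B={lb:5} -> answered: {answered}")

    asyncio.run(main())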
It's pretty easy to find various papers/studies about TDD, unit testing, and the like by searching online but I would not call any of them scientifically or empirically rigorous. Given the lack of any objective or agreeable standards regarding what "code quality" means, these papers can draw conclusions to support any view.
You'd be surprised. Often, the tests are more work than the "working code" itself. At companies that have a test heavy focus, I'd say roughly 50% or 60% of the effort is spent on tests (this includes future work to maintaining tests as they are affected by other areas of the code, etc.) You may not be able to get your PR approved without enough tests.
How do you know it's working if it doesn't have unit tests that exercise edge cases? Or even basic functionality? Also, good unit testing can allow you to reduce the time it takes to write software by giving you an isolated environment to run your code.
With interpreted languages it is trivial, you just try parts of your code. For python there's for example ipython which makes the experience really great.
Also, I don't agree with GP; there are places that require unit tests, especially if multiple people are working on a project. You could be a developer who has never made a bug in your life, but others might not be as great :) Besides, many bugs also happen when changing other parts of the code. Some of your assumptions might have been 100% correct when you wrote the code, but someone can change a different part of the code and now your function is called with values that were previously impossible.
The great thing about statically typed languages is that they might detect a large part of these bugs, but you're out of luck with dynamically typed languages.
Example of this was the Python 2 -> Python 3 migration. Having a large test suite helped tremendously.
No. Failed tests can only tell you that you've found a problem; passed tests can't tell you there aren't any problems present. Provably correct code can be proven correct.
Hmm. What if your unit tests are statements about your code? I view them as a way of locking in functionality and saying that the code does or doesn't do things for certain inputs. That's what unit tests are in my mind. I find it strange to view tests as only a way of catching bugs. You just use them to say what your code is/isn't capable of doing.
Using tests as a way of finding problems is not a great way of saying code has issues x, y, and z, but they are a great way of saying something does a, b, and c. If you are going through a refactor, you might not be implementing a, b, or c correctly anymore, and if your tests are designed well this will be caught.
If we are only talking about bug catching, you are right there are better tools. But I'll say if somebody says code does something and there are no tests demonstrating that fact, I don't trust it generally unless it's something like a game where it's hard to test certain things.
Actually, what could be really helpful would be a "simple" precondition-postcondition-invariant calculator...
The principle is quite simple: for every method you write, the "calculator" tells you which "outside" variables are modified. Obviously it seems quite simple with FP... but not so simple with object-oriented/imperative languages as soon as you start to call other procedures, subsystems, or syscalls!
Of course, it would be nice if this "calculator" worked inline (while you write your code) and not afterwards (when the program is compiled). It should be "smart enough", for example, to tell you that your code is calling an INSERT on a database in a sub-sub-sub procedure, or that it is effectively writing to some file somewhere in the call path.
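A very rough sketch of that idea in Python (the EFFECTFUL_CALLS list and the example function are made up, and a real "calculator" would have to follow calls across modules, which is exactly the hard part): statically list the calls inside a function whose names look like outside effects.

    import ast
    import inspect

    # Hypothetical set of call names we treat as "outside effects".
    EFFECTFUL_CALLS = {"execute", "write", "send", "insert"}

    def outside_effects(func):
        """List (name, line) of effect-looking calls inside `func`."""
        tree = ast.parse(inspect.getsource(func))
        found = []
        for node in ast.walk(tree):
            if isinstance(node, ast.Call):
                name = getattr(node.func, "attr", getattr(node.func, "id", ""))
                if name in EFFECTFUL_CALLS:
                    found.append((name, node.lineno))
        return found

    def save_user(db, user):
        db.execute("INSERT INTO users VALUES (?)", (user,))   # flagged
        return user.upper()                                    # pure

    print(outside_effects(save_user))   # [('execute', 2)]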
Code is already a formal specification of sorts. I'm sure more rigor would lead to fewer errors, but it would also increase development time, which is a poor tradeoff in many domains. Plus, a source of many bugs is that the specification itself is wrong, which will remain true even if rigor is increased.
Edit: Of course there is low hanging fruit that can be incorporated into languages. I write a lot fewer bugs in C++ than I do in C and a lot fewer in Swift than I do in C++.
> But as a goal, that's really what you want: we want to solve the problem by saying "this is the solution" and then poof the solution appears!
Actually, that is very often not the case. Many SW projects start with an idea of what the software should do, but there is no easy way to describe it in full detail. Writing down the complete specification right from the start would be very hard and prone to wasted work.
For example, say you want to write a messaging system. You could try to imagine the whole thing and describe it in some formal language, but this will most often produce a huge pile of ideas that weren't tested and aren't part of the desired solution in the end. It's too much "building in the head".
Instead, we should prototype. We write the simplest possible thing that roughly does what we want, and then add and remove features as convenient/needed and improve various characteristics.
We handle corner cases as they appear, sometimes learn that something new has to be studied / understood first, and some parts that have been written have to be rewritten and then we do so.
Writing the spec for the result may be done in the end, after the software is done and works and the ideas are proven to work.
I think the way software will evolve into a more traditional engineering role will be in layers of abstraction starting closest to the hardware. Which will become certified services/systems to build more loosely on top of or using as components.
We’re starting to see this with commoditization/formalization of platforms. OS, storage, cloud, infrastructure, e-commerce are getting to the point where they’re pretty well explored. In another 20 years it’ll be so specialized and optimized that you’d need a Ph.D to make meaningful improvements in those areas.
So people won’t feel the need to try and recreate it themselves except for learning and as that process continues we’ll see those things formalize best practices which would eventually turn into engineering standards, etc…
You’ll have licensed and bonded engineers building low level highly formalized systems/services which meet certain guarantees. Similar to how I imagine civil engineering works.
Then for the rest of the industry you’ll have what we currently have which is like home construction where almost anything goes, but there are pieces that need to be validated by actual engineers or licensed professionals similar to how electricians and plumbers are licensed.
Everyone knows that a software engineer is not an "engineer in the traditional sense". There is no substantial chance of confusing HR or a manager hiring for software development when you say to them "I'm a software engineer".
It's a protected job title in most state/provincial governments
For example, "In Ontario, the titles 'engineer' and 'professional engineer' are restricted by law ... When individuals are found to be using the restricted title 'engineer,' the courts can impose fines, restraining orders and jail time" [1].
It's just hardly reported/enforced in the software industry because most "software engineers" do not work on things that affect public safety or can result in loss of life.
These days, coding is the quickest way to find out what's needed. That is: implementation precedes requirements.
This is the real reason software is eating the world. Incumbents lose because they aren't able to reorganize themselves around a software core, and so can't try things.
Formal methods are intrinsically satisfying, as are all quests for perfection and truth.
"Wouldn't it just be grand if we could write a spec for some piece of code, then let the machines do the rest?"
This is quite possible, if you write the "spec" in a programming language: i.e. write a test case. The problem is that people imagine a formal spec in, for example, English, being simpler or easier to think about than one in, say, Python or C. In general this is not true. In general, the moment you expect your "spec" (code) to be sufficiently precise for a machine to act upon it, the simplest way to describe it is a programming language.
And if you can invent a better way to describe it, and it is unambiguous enough for a machine to act on it, you've invented a new programming language.
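A small example of that "spec as test case" view, tying back to the quicksort challenge upthread (this uses the hypothesis library for property-based testing; the two asserted properties are one reasonable executable spec of "sorted", not the only one):

    from hypothesis import given, strategies as st

    def quicksort(xs):
        if len(xs) <= 1:
            return list(xs)
        pivot, rest = xs[0], xs[1:]
        return (quicksort([x for x in rest if x < pivot])
                + [pivot]
                + quicksort([x for x in rest if x >= pivot]))

    @given(st.lists(st.integers()))
    def test_quicksort_meets_spec(xs):
        out = quicksort(xs)
        # Property 1: output is ordered.
        assert all(a <= b for a, b in zip(out, out[1:]))
        # Property 2: output is a permutation of the input.
        assert sorted(out) == sorted(xs)

Run it under pytest and hypothesis will hammer the function with generated lists and shrink any counterexample it finds.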
Software isn’t a special unique snowflake as some developers, who have never worked in other fields, will tell you. Software certainly seems unique though when you have no other professional employment experience to compare it to.
The primary problem with missing formal methods is supplemental guidance bolted on artificially, a canned solution in search of a problem. The reason for that is that people tend to focus on their personal needs instead of business needs. For example, many developers will suggest an approach of personal familiarity even if it costs more, performs poorly, and lowers product quality. Furthermore, many developers tend to focus on composition and tools first, as opposed to saving composition for the final step and using either no tools or a more appropriate tool for the job.
Formal methods, on the other hand, provide guidance for a planning process that articulates the work to be performed, as opposed to how that work should be done.
I once thought this way, but over the years I've come to decide that such formalization will never outpace frameworks. Frameworks are easier to grok and scale better than rigorous education.
I wrote a (dummy) paper for a class about automatically generating a formal specification based on a given program using a GAN. I don't really know how feasible it would be, but I feel like that would circumvent some of the issues being brought up in the comments about how difficult it is for organizations to use formal methods in their workflows.
Program verification is an expensive activity. Navigating the cost-benefit trade-off of program verification is ultimately a business decision. https://t.co/IUCLtDLROC
I am a research intern working in a group studying and developing formal methods. The tech we are working with has been used to verify planes and trains, and has been used at NASA. AMA if anyone here is curious about how the world of FM works.
For business applications, formal methods are really limited to core algorithms.
The real problem is that data entry and validation should be a solved problem. Sadly this is not the case. Try using .Net WPF or React and your only recourse is to Stack Overflow. These frameworks are over complicated and are full of weird edge cases that make everyday programming tedious and error prone.
Let's get our house in order before we break out the temporal logic.
We could start by no longer calling six-month coding bootcamps "software engineering", reserving the term for an actual five years of being taught correct engineering practices, with the respective skill assessment.
Then we can start talking about adopting what is still missing from real engineering practices into the field.
Ah, and in the process make companies liable for skipping quality control when everything goes wrong.
I think a sweet spot is "specification by example" also called behavior driven development (BDD). You take a more or less standardized, but human readable language to write down the specification and implement test adapters for it.
Having done BDD at work, I can only say it's not the right approach.
The "human-readableness" of the test cases produced with BDD is debatable - at least at our company only technical people familiar with the code could understand the spec.
At the same time BDD was very cumbersome to do, while not really providing us any benefits.
The proper place for these is compilers, linters, fuzzers, unit tests, etc.
Rust does a pretty damn good job of preventing memory and threading errors for example. If it compiles and there is no unsafe it will run. It may not do precisely what you want but it will run and not crash.
Rust programs can fail gracefully when they hit unexpected conditions - Safe Rust guarantees only extend to memory- and data-race safety. However, these tend to be table-stakes if one wants to make any kind of interesting statement about the semantics of programs, and there's a lot of ongoing work wrt. enabling this kind of use of Rust.
The implication there is that you may have other bugs (i.e. logic bugs, cache invalidation errors, off by one errors) but your program will not crash due to memory issues. It removes a class of errors.
It's not that it will silently provide wrong results in cases where it would otherwise have crashed due to memory errors.
It's not about not crashing. C lacks memory safety, which leads to many bugs, many of which are security vulnerabilities. Memory safe languages such as python or rust just don't have those problems. The vast majority of languages are memory safe but only C is to-the-metal fast. Well, until Rust came along.
The really bad thing about C's memory unsafety is that crashing is pretty random.
Unless you use a plethora of tooling that patches over these shortcomings, wrong C code doesn't crash reliably (and even then it's not finding everything, only most bugs).
You like security vulnerabilities? I don’t think we are talking about the same thing at all. It’s perfectly easy to make Rust “crash” or log errors when it does the wrong thing, and to write unit tests for it.
Memory errors and threading bugs are never ever something you want.
Erlang addresses the memory safety problem to my satisfaction. I'm not eager to embrace the verbosity of a language like Rust until we get closer to a guarantee that if code compiles, it's logically correct as well as safe.
Erlang is my go-to language when I have the freedom to choose, and it is explicitly designed to crash when there's a mismatch between the expected and current state.
So, I was indeed missing the point of the statement.
Programmers need to come to terms with their jobs being Art and Science rather than Science.
Unless you are keeping track of transistors, you are not doing science/engineering.
Sure, we gladly accept the title SWE because it pays well, but in the physical world, engineering has a significantly higher qualification for truth.
"Should we write this in x or y? Should we save this as an array or directly in the database?"
Subjective, and impossible to come up with a single, undebatable correct answer.
As opposed to, "What metal should we choose for this wire harness?" With cost and quality metrics, there's only 1 correct answer.
Making a clear cut distinction between engineering and programming would do literal wonders for programming. Anything deemed SWE should be objectively correct. Anything else would be free to be scrutinized by Art, Tradition, Science, and Authority.