Software Engineering ≠ Computer Science (2009) (drdobbs.com)
482 points by nreece 164 days ago | 305 comments



Software engineers need to know how to recognize and classify problems in CS. You need to know what algorithms and data structures exist, what their properties are, and what they are called. The areas that come up will come from math and computer science (which are closely related). A solid computer scientist knows how to derive Dijkstra's algorithm from first principles. A good software engineer recognizes the problem at hand and recalls which algorithm to pick when presented with it.

What is that problem in front of you? Gradient descent? Tree traversal? Multiple dispatch? Path finding? What structure represents the data or algorithm? Ring buffer? Blocking queue? Bloom filter?

You rarely need to remember a pathfinding algorithm or trie implementation by heart. What's important is that you recognize the problem at hand as "path finding", "bin packing" or whatever. Terminology is important here. The good software engineer needs to know the proper names for a LOT of things. Recognizing and labeling problems means you can basically look up the solution in no time.

So CS is definitely very relevant for software engineering - but you need a broad understanding instead of a deep one.

There is always the argument that a lot of devs basically do monotonous work with SQL and some web thing in node and rarely even reach for a structure beyond a list or map. That's true - but sooner or later even they run into a performance or reliability issue that's basically always due to an incorrect choice of data structure or algorithm. I'm only half joking when I suggest that most of today's "scaling" is compensating for CS mistakes in software.


Sometimes you're just forced to accept that you need to take some shortcuts. There are a few fields for which my general approach is to just try and maintain a mental index of "when might I want to use this".

I'd have a hard time implementing my own crypto, but I've learned enough to know how to use it to secure communications, hide or protect information, ensure no alterations have been made to some arbitrary asset, identify an asset's source, etc.

I love working with a well understood and boring RDBMS. It's predictable and it lets you quickly move on to other problems. But you still need to have a good understanding of how it's implemented in order to store and query your data efficiently. If you have a poor understanding of how indexing works, you'll probably have a hard time selecting the right data model.

There's actually lots of fun problems in the frontend world. Try to write a multi-touch gesture responder, it's very tricky to get things right. How about a natural animation system that allows interruptions? CSS animations tend to look unnatural because they're largely time-based, and they don't handle interruptions very well. (Spoiler alert: springs are the magic sauce.)
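To make the spring point concrete, here's a minimal sketch (plain Python rather than any real animation API; the stiffness/damping values are arbitrary assumptions): because the state is just position and velocity, the target can change mid-flight and the motion stays continuous, which is what makes interruptions look natural.

    # Minimal damped-spring integrator (illustrative toy, not a production
    # animation system). Stiffness/damping are placeholder values.
    def spring_step(position, velocity, target, dt, stiffness=170.0, damping=26.0):
        force = stiffness * (target - position) - damping * velocity
        velocity += force * dt
        position += velocity * dt
        return position, velocity

    # Retargeting mid-animation just changes `target`; position and velocity
    # carry over, so there is no visible jump.
    pos, vel = 0.0, 0.0
    for frame in range(120):
        target = 100.0 if frame < 60 else 40.0  # "interrupted" halfway through
        pos, vel = spring_step(pos, vel, target, dt=1 / 60)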

Learning about compilers unlocks lots of powerful skills too. You can implement your own syntax highlighting, linter, refactoring tools, autocomplete, etc.


> I'd have a hard time implementing my own crypto,

Yes. Or rather, it's actually easy - but it's extremely hard to do well. That's why one shouldn't do it (or you should, but for toying with and not for production). See, you obviously know enough about the topic to know this! This is exactly one of those nuggets of "broad knowledge" you need as a software developer. How much do you need to know about crypto?

- You need to know when to hash and when to encrypt.

- You need to know the basic properties of symmetric/asymmetric crypto and key exchange.

- You must know that you never implement any of these algorithms yourself, you only choose from them.

That's about it. You need to know what you don't know (how to write reliable crypto) in this case.
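To make that concrete, here's a hedged sketch of the "choose, don't implement" point using only Python's standard library for the hashing side; the salt size and iteration count are illustrative placeholders, not recommendations.

    import hashlib
    import hmac
    import os

    # Hashing (one-way): derive a key from a password for storage; you never
    # need the original back. Parameters here are illustrative only.
    salt = os.urandom(16)
    stored = hashlib.pbkdf2_hmac("sha256", b"correct horse battery staple", salt, 200_000)

    # Verification: re-derive and compare in constant time.
    candidate = hashlib.pbkdf2_hmac("sha256", b"correct horse battery staple", salt, 200_000)
    print(hmac.compare_digest(stored, candidate))  # True

    # Encryption (two-way, when the plaintext must be recoverable) would come
    # from a vetted library such as `cryptography` (e.g. Fernet) - again
    # something you choose, never something you write yourself.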

> If you have a poor understanding of how indexing works, you'll probably have a hard time selecting the right data model.

Right. Basically, the understanding of a database you need is about the amount it would take to implement a toy database. You know the difference between an index lookup and a scan and so on. Lacking that understanding, the database is some oracle (haha) you feed SQL and it spits out data. If you do know a bit more, you might have a vague idea of how data sits on pages that are organized into a B-tree. You might know how the on-disk tree is magically updated at both ends to stay consistent even if power is cut mid-write, and so on (I don't know, this really isn't my area - I have coded 13 years without a DB). You didn't invent or even deeply understand any of these algorithms to the point where you could write them on a whiteboard. But it does help when someone asks "what happens to users' bookings if I cut the power?" or "will it be faster to join in the DB or later in the language?".
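A toy way to internalize the lookup-vs-scan distinction (pure Python standing in for the planner's choice; a real database with B-tree pages is obviously far more involved):

    import bisect

    rows = [(i, f"user{i}") for i in range(1_000_000)]  # "table", sorted by id

    def seq_scan(table, wanted_id):
        # Sequential scan: touch every row until the predicate matches, O(n).
        for row in table:
            if row[0] == wanted_id:
                return row

    def index_lookup(table, wanted_id):
        # Index lookup: binary search over the sorted key, O(log n).
        i = bisect.bisect_left(table, (wanted_id,))
        if i < len(table) and table[i][0] == wanted_id:
            return table[i]

    print(seq_scan(rows, 987_654) == index_lookup(rows, 987_654))  # True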

Another pet peeve of mine is people who can't identify NP/exponential problems. It happens several times per year that junior colleagues of mine develop solutions that are exponential in time/space, because that's what the problem is.

Them: "Look, I optimized the order in which we pick the X from the list"

Me: "That will take the remaining life of the universe already with 30 items!"

Them: "Dang :( that took me two days to write"

Me: "Do it the dumb way and get off my lawn"


> Yes. Or, it's actually rather easy - but it's extremely hard to do well. That's why one shouldn't do it (or, you should, but perhaps for toying with and not for production). See, you obviously know enough about the topic to know this! This is exactly one of those nuggets of "broad knowledge" you need as a software developer.

Everyone should write their own crypto library at least once. Nobody should ever use their own crypto library for anything. :)


While I was an admin for hackthissite.org I created some basic encryption algorithms for people to break.

One person gave me a step by step guide to how he broke it. It was amazing, and incredibly enlightening how hard encryption truly is.

Mathematical analysis of the encrypted data makes poor encryption easy to break.


Eh, I'd much rather "software engineers" had good product and business sense, since most product managers and CEOs sure don't. No point in building the wrong thing well.


> No point in building the wrong thing well.

This is validation. In systems engineering (which software engineering both came from and fed back into, before we in software forgot about it), it's part of the V&V (verification and validation) portion of system development. Verification means ensuring the system is correct with respect to the specifications and requirements. Validation means ensuring that what's being made is actually the correct, desired thing.

We need our software engineers to study systems engineering, where you will find formal methods being applied to the task of developing complex systems, and rather effectively at that.


That's of course another factor that goes into software engineering skills: collaboration, domain knowledge, design skill, customer knowledge, some ops knowledge and so on.


If the product managers and CEOs don't have it, how do you expect the software engineers to do better? That isn't our area of strength.


Problem solving is our area of strength.


> No point in building the wrong thing well

Very powerful statement there.


I don't have a formal CS background and have indeed run into the issues you've described.

I generally resort to Google and then go find the best approach and implement it if necessary.

I taught myself the basics of the underlying stuff (and it helps that I'm an older developer who grew up on Turbo Pascal and C since I do have a working knowledge of what the machine is doing underneath).

Those are rare cases though.


Same here. I'm completely self-taught; I know when the appropriate data structure for the problem at hand might be a binary tree, yet I never learned how to implement a tree data structure.

I've read plenty of implementations, but I never "studied" it thoroughly. And that's one of the reasons I never went through an interview with a Big 5 company, where you're normally expected to implement, e.g., a tree insertion algorithm on a whiteboard.


I've been programming since I was a kid, and I've implemented a linked list exactly once in all that time - and that was when I went back to uni to do a foundation degree in software engineering.

In reality if I need a particular data structure I just pull it in from the standard library.

You don't have to be a structural engineer to build a house ;)


> You don't have to be a structural engineer to build a house

But you need to know exactly what parts actually require a structural engineer or structural calculations. You'll be much faster at building your house if you don't have to think about when you call your engineer and when you can just guesstimate. And you are obviously re-using a lot of structural engineer work when you do (because you buy a prefab door where someone already did the calculations on the hinges etc.) Same with not needing to be a computer scientist to do most software engineering, but you need to use a lot of CS work done by others, and it speeds up your work immensely if you know what term to google.

Also, implementing these deep things (trees, linked lists, hash tables, etc.) means you have a much better understanding of the tradeoffs you make when you use them. Trying to remember some O(N) numbers for various structures is much harder than just spending 1h making a toy linked list, 1h making an array-backed list, and 3h making a hash table just ONCE - and then you are set for life understanding the complexity of those things.
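In that spirit, here's roughly what the one-hour toy version looks like (a sketch, not a library); once you've written it, "why is indexing O(n) here but O(1) in an array-backed list" stops being a table to memorize.

    class Node:
        def __init__(self, value, next=None):
            self.value = value
            self.next = next

    class LinkedList:
        def __init__(self):
            self.head = None

        def prepend(self, value):          # O(1): just repoint the head
            self.head = Node(value, self.head)

        def get(self, index):              # O(n): must walk node by node
            node = self.head
            for _ in range(index):
                node = node.next
            return node.value

    lst = LinkedList()
    for v in [3, 2, 1]:
        lst.prepend(v)
    print(lst.get(2))   # 3 - reached only after walking the chain
    # Contrast with a Python list (array-backed): indexing is O(1), but
    # inserting at the front is O(n) because every element shifts.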


> You don't have to be a structural engineer to build a house ;)

No, but you're going to need one on call.

I'm reminded of when I was still living at home and my parents had an extension and garage conversion done.

Two builders did the whole thing, one in his late 40s and one in his 60s, and for the most part everything they did was just grunt work with very little need for craftsmanship. It's just banging together stud timbers, pouring concrete, digging holes, laying and packing store-bought materials, etc. Sure, there's a lot of experience behind doing that safely and efficiently, but it's not rocket science, and nothing a confident DIY enthusiast couldn't read up on as they went along.

However, there were three times when they had to call in experts: 1) a bricklayer (a surprisingly impressive craft if you don't want your house to look like shit), 2) roofers (you definitely don't want a roof laid by amateurs), and 3) a structural engineer to advise on (and to sign off on) reinforcing a supporting wall that held up part of the new roof.

Software isn't that different. You need someone who really knows their shit for maybe 20%, and the rest you can sort of palm off - it can be done by any of our peers with just a few years' experience.


Nothing to study. Trees are recursively understood and the methods are recursive. Understand recursion and you understand recursion.
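For what it's worth, the whiteboard version really is just the recursion written down; a minimal sketch of binary-search-tree insertion and in-order traversal:

    class TreeNode:
        def __init__(self, value):
            self.value = value
            self.left = None
            self.right = None

    def insert(node, value):
        # Base case: an empty subtree becomes a leaf; otherwise recurse left or right.
        if node is None:
            return TreeNode(value)
        if value < node.value:
            node.left = insert(node.left, value)
        else:
            node.right = insert(node.right, value)
        return node

    def in_order(node):
        # Recursive traversal: left subtree, node, right subtree.
        if node is None:
            return []
        return in_order(node.left) + [node.value] + in_order(node.right)

    root = None
    for v in [5, 2, 8, 1, 3]:
        root = insert(root, v)
    print(in_order(root))   # [1, 2, 3, 5, 8]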


>> A good software engineer recognizes the problem at hand, and recalls the algorithm to pick when presented with the problem.

In my experience, people able to pick the right algorithm straight away are extremely rare.


What the right algorithm is might not even have a clear answer, but even just identifying something as needing a tree-like structure (rather than, say, a linear brute-force search) could be a huge leap in the right direction. Being able to do rudimentary complexity calculations in your head helps.

Broad knowledge could even identify several different approaches to the same problem, which can be compared on strengths and weaknesses even before trying one.

E.g. for a nonlinear optimization problem you might consider simulated annealing, genetic programming, gradient descent, etc., and you need to know which is easy/hard to write, that gradient descent is good if you have (monotone) gradients, etc.
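For instance, when gradients are available, plain gradient descent is only a few lines, which is often reason enough to try it before heavier machinery like simulated annealing. A toy sketch (1-D objective, arbitrary step size):

    # Minimize f(x) = (x - 3)^2 by gradient descent. Learning rate and
    # iteration count are arbitrary illustrative choices.
    def grad(x):
        return 2 * (x - 3)          # derivative of (x - 3)^2

    x = 0.0
    for _ in range(200):
        x -= 0.1 * grad(x)

    print(round(x, 4))              # ~3.0, the minimum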


Everything you suggested software engineers need to know was covered in two courses at my university - one called Data Structures, the other called Algorithms.

And those didn't really have any prerequisites apart from basic math.

There is absolutely no reason for a software engineer to learn abstract algebra, calculus, or any of the other dozen courses that you'll never ever use.

And even then, throughout my 10 years now, I can count on one hand the number of times I actually needed to use these things.


To me the whole point of a university course is to learn the part of the theoretical knowledge you would not pick up spontaneously by practicing. That includes way more than just algorithms and data structures.

For example, a good knowledge of language theory will prevent engineers from creating scripting languages that cannot be parsed. Probabilities are everywhere in machine learning. Many engineers work with by-products of operational research and need to understand the theory behind them to make them efficient. Complex numbers and trigonometry (quaternions etc.) are needed to build even basic 3D engines. Recent probabilistic data structures such as hyperloglog are being integrated into modern database systems. Good understanding of operating systems is useful for security and parallel programming...


It depends on your area. I am a software engineer and write software for biomedical signal analysis and finance/trading, and I promise you it is very math intensive. There is a pretty wide gap between a SE creating websites and a SE designing trading systems or medical devices.


Can the math involved in these be reduced, for the most part, to swapping out operations/calculations/procedures in existing algorithms (or not)? If you need to devise new algorithms for stock trading or for pattern recognition, signal inhibition or enhancement, etc., then depending on the domain you might be in an area I don't recognize.


As others have pointed out, some math is CS that is almost independent of the problem domain, and then there is obviously lots of math that is domain dependent. I use tons of linear algebra, trig etc. in my field (CAD).


Well done! These are the two topic areas that are really interesting for the practically minded as well. For a software engineer it does help to like math rather than hate it, but advanced math is mostly out of scope for most programmers.


Where could we find a comprehensive list of CS problems and algorithms, each associated with the real problems it can solve in practice?

You mentioned a few here; I suppose there could be a full list somewhere already. :)


Excellent question. I'm sure this already exists, but if it doesn't, it would be fantastic to have an archive with toy examples/challenges similar to Project Euler (https://projecteuler.net/archives) but with more "high level" algorithms and concrete, concise examples. For lack of a better place, these are a good starting point (with examples for each):

https://en.wikipedia.org/wiki/List_of_data_structures

https://en.wikipedia.org/wiki/List_of_algorithms


> So CS is definitely very relevant for software engineering - but you need a broad understanding instead of a deep one.

Absolutely. A general understanding of CS is necessary to be a competent software engineer. Just like a general understanding of physics is necessary to be a civil/mechanical engineer.

If there is a Venn diagram, there is definitely an overlap of "theory" and "engineering". But theory != engineering.


I don't understand why the author is so suspicious of formal methods. Other engineering disciplines are based on the application of solid, well-understood principles from the natural sciences to practical problem domains. There are few solid, well-understood principles in computer science that are directly and obviously applicable to software engineering so far.

I vigorously contest the idea that software engineering cannot be rigorous and so shouldn't try.


Because there are what, six types of bridges? (EDIT: 36 according to wikipedia.)

There are six thousand types of programs (as a wild guess), and they all interact with each other in an exponential explosion of complexity.

For a formal method to work, it has to be generally applicable across a wide range of situations. There are methods like that in software engineering, and you see them in situations where the program is potentially life-threatening. But most programs would be hindered by this rigor.


I'll put it this way. In mathematics, the atomic unit of progress is the proof. Proofs are constructed from axioms and other, smaller proofs. The Curry-Howard correspondence is the direct correspondence between computer programs and mathematical proofs.

When programming in a well-typed language, everything we do is a degenerate proof of something. I think it's entirely reasonable to try to make it easier to write less degenerate proofs, and to encourage re-centering proofs as the atomic unit in programming, because they give programming (as they give math) a well-defined, solid formal system that larger systems can be built on, in a way that nothing else in software can, at least thus far.
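As a small illustration of "everything we do is a degenerate proof of something" (sketched here with Python type hints rather than a real proof assistant): the type of compose reads as the proposition "if B implies C and A implies B, then A implies C", and any total implementation of that signature witnesses it.

    from typing import Callable, TypeVar

    A = TypeVar("A")
    B = TypeVar("B")
    C = TypeVar("C")

    # Type:        (B -> C) -> (A -> B) -> (A -> C)
    # Proposition: (B => C) => (A => B) => (A => C), via Curry-Howard
    def compose(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
        return lambda a: f(g(a))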

In a world where software flows more freely than water, the correctness and reliability of software systems must be taken with utmost seriousness. This applies to basically any tool that contacts end users, and a lot of ones that don't as well.

What do you propose as the theoretical basis for the engineering science of software development?


I think it's self-evident that there is no theoretical basis for software development in the way that would satisfy your criteria. Software dev is generally a scattershot haphazard endeavor that involves trying dozens of angles until one of them works. That's an unsatisfying answer, but if we try to hide from the implications just because we don't like it, we'll end up creating companies that fail to startups without these constraints.

It's also self-evident that in most situations, correctness and reliability aren't a concern. The counterexamples account for maybe 1% of the field. 99% of the time, if your program breaks, you can pay someone to go fix it and no serious harm was done. Even Github outages, which affect almost all of us, hardly matter.


>I think it's self-evident that there is no theoretical basis for software development in the way that would satisfy your criteria.

Yes. Writing software is not math. Does it use math concepts? Of course, but so does just about everything else. While we're at it, writing software is also not engineering, even though there are engineering concepts that can be applied.

Writing software is its own thing. Pretending that it's something else (like math or engineering) invariably leads to category errors.

Think about food recipes. They certainly use measurements and timing, and engineering principles can certainly be applied (especially when executing them on an industrial scale) but they are not math (or engineering) either. Examining the measurements, timing, and the production chain doesn't tell you anything about whether the recipe is delicious or inedible.

Arguing that a piece of software should be "proven correct" makes about as much sense as arguing that a recipe should be "proven correct". You might as well judge the recipe by the standards of poetry ("Does it have evocative imagery? Does it rhyme or alliterate well?").

People have been chasing the unicorn of software correctness proofs for 60 years, with a notable lack of generalizable success (there are plenty of toy examples, of course). What usually happens is that the "programming is math" people come up with some bizarre academic language that no real-world programmer would use unless forced to do so at gunpoint (followed by the new language sinking without a trace). Alternatively, the "programming is engineering" people come up with some baroque formal process that requires you to write a 500 page document and get six committees to sign off on it before you can write "Hello, world". I'm old enough that I've seen these things happen multiple times.

>It's also self-evident that in most situations, correctness and reliability aren't a concern.

I wouldn't say they aren't a concern at all, but if you're wasting six months screwing around with formal proofs, UML diagrams, or things of that nature, while in the meantime your competitor is iterating three or four times, that is definitely a concern. Operate that way and your milkshake is going to be drunk, yo.


>Writing software is not math.

You're wrong, and I've explained this upthread.

>Think about food recipes.

Do you have experience with non-imperative programming paradigms? I'm sorry to say that the comparison to recipes in this context seems fairly naive.

>People have been chasing the unicorn of software correctness proofs for 60 years, with a notable lack of generalizable success (there are plenty of toy examples, of course).

Static type systems are arguably a product of this, especially advanced ones like Haskell's.

>What usually happens is that the "programming is math" people come up with some bizarre academic language that no real-world programmer would use unless forced to do so at gunpoint (followed by the new language sinking without a trace).

Now you're just being anti-intellectual. The whole point is that the real-world programmers are stumbling into all this crap constantly without realizing it. It's completely fucking unavoidable and very much tied up with the fact that programming is math. The only question is what you choose to do about that: learn the math, or stay ignorant.

I don't mean to be condescending but I find comparing UML diagrams and proof assistants a little offensive (among other things), and it suggests you don't know what you're talking about. Modern proof assistants and other formal techniques like dependent typing happen while you're programming (Idris/Agda) or generate programs themselves (Coq), they aren't some sort of Waterfall-ish thing where you have to deal with all the ceremony before you start to get shit done. On the contrary, you get shit done, and it works better when you're done with it too.


Your post contains a lot of unnecessary aggression. Rather than suggesting someone "doesn't know what they're talking about", just provide evidence to the contrary.

That aside, there seems to be something missing in your analysis, or there would be a lot of successful startups stealing the market using proof assistants and formal techniques. It's not anti-intellectual to point out that programmers don't want to use academic languages that have poor usability, steep learning curves, and garbage for standard libraries.

Consider when a regular house is being built. There are many problems that could be avoided if an engineering firm spent an entire year analyzing the designs and their interactions with the target site. However, that's not done because it takes significantly longer and costs obscene amounts of money.

It's easier for the construction crew to fix problems as they encounter them and for the owner to do repairs on the house 40 years later. Yeah, the house isn't as reliable, but it cost 100,000 instead of 1,000,000.

If you want developers and managers on your side, you need to show them benefits over the quick development of python/php/Javascript other than bugs in edge cases being less frequent. Don't just bemoan how unenlightened everyone is.


Well, all the naysayers could also do some research instead of being as dismissive as they are. Here's an example of actual use of formal methods: http://lamport.azurewebsites.net/tla/amazon.html Why aren't they used more widely? Because when people hear "math", they immediately think of it as academic. It would also be overkill for most projects.


TLA is one of the systems that are actually very practical. You can go up or down as many abstraction layers as you need. There's a great introduction written by @pron on his blog [1].

However, it's worth pointing out that you don't magically get the final program out of the TLA proof. The proof only works for the abstraction level that you chose to write it at.

[1] https://pron.github.io/


While it is true that a full formal proof of, for example, a C++ program can take significantly longer and cost obscene amounts of money, that is not true when starting with this approach from the ground up.

There is also a certain level of compromise. By relaxing the requirements a tad, you can still gain many of the benefits while maintaining the light and nimble feel.

So why don't startups use this? Well because there are entrenched technologies that make it very difficult that have nothing to do with the merits of the approach itself.

It is up to us, as developers, to take charge and push for these techniques through open source development, advocacy, and training.

But to claim these techniques are a failure simply because startups aren't using them is pretty ridiculous.


Higher assurance from the start requires something functionally similar to Ada's SPARK.

I've never seen that used in the wild (though I'm in the wrong domain (web) these days).


>>Writing software is not math.

> You're wrong, and I've explained this upthread.

I suspect there is a difference in semantics, here. Software is inherently mathematical, yes. But the practice of writing software is not the practice of doing math.

The output of doing math is proofs. The output of writing software is...something that does something when run on a computer, hopefully interesting, meaningful, useful, or entertaining. In the vast majority of cases, we have not and will not need formal proofs of correctness for software to achieve these things.

If I want a blur effect on some portion of a UI, and choose to implement that with a Gaussian blur, what value is there in formally proving that a specifically Gaussian blur has been applied? All of this is inherently mathematical, but that doesn't imply a need for mathematical proof.


I agree. It's kind of a focus thing. When I'm writing code, I do it because I want stuff to happen - in real life, in our physical reality. Any concept of code-as-a-proof is not even on my radar, unless I'd be writing life-critical software for NASA or a hospital, or something.

Here I also think that 'Turing_Machine is both wrong in the details and correct in the general point with their recipe example:

> Examining the measurements, timing, and the production chain doesn't tell you anything about whether the recipe is delicious or inedible.

You could, in principle, apply the knowledge of medicine, chemistry and biology, coupled with process engineering and wide-scale people studies, to construct a theory of tasty foods, which could lead to the situation in which you could evaluate any recipe on a theoretical basis. But getting to that state would require tons of up-front work to be done (some of which is being done for unrelated reasons, so maybe in the future a "food theory" will assemble itself) - and in the meantime, getting a piece of tasty food is done much faster and cheaper by finding the solution instead of deriving it. This search is done through iteration.

Similarly, in software, 99% of the time we find a solution, not derive it from first principles - because the former is much cheaper when we care about the solution, and not solving the entire general class of a problem at the same time.


The same is true for math. It is the same thing, because of the Curry-Howard isomorphism.

The reason for all the confusion is that programmers are already doing math. They just don't realize it and reinvent the wheels invented by the math community in the past century. It's a matter of semiotics.


Curry-Howard isomorphism does not mean the same practices that are performed in mathematics research must be applied to writing software.

Some aspects of language design are reinventing wheels invented by the math community. This is far from constituting the set of "writing software" or "software engineering".


Nah, I still think it's about different goals.

But assuming you're right, I'd like to know - what mathematical wheels am I reinventing in my dayjob of building UIs that let people click up some stuff that later gets put in business-specific XMLs?


Math, at least applied math, is not the goal but the method. Programming is not the goal either; it is the method. Understanding that both are languages to express the path is of the essence.

The XML as a vessel of human knowledge is limited. Good intentions have brought us OWL/RDF, XML Schema, and XSLT: examples where others before us have tried to extend XML into the domain of semantics and algorithms. Nevertheless, it was found that, without an expressive type system, large and complex business domains cannot be modeled. Apparently, in order to model abstract business domains, we need a language that composes both high- and low-level with near-invisible seams.

So, that click-your-XML application might benefit from a reflective logic, enabling the user to explore the possible state spaces. If your app uses relational algebra from DBMSs, it might be able to combine the relational algebra with the algebra defined by your schemas. The UI state space and the XML schema might be isomorphic, which helps prove completeness of your UI-builder implementation.

Above all, the mathematical way of thinking helps reasoning, communication and correctness. It might not be the only way or perhaps the way is dated. Nevertheless, ignoring math as a programmer, feels like ignoring music theory as a musician or linear algebra as a structural engineer.


We haven't been arguing that programming can't benefit from math.

> ignoring math as a programmer, feels like ignoring music theory as a musician or linear algebra as a structural engineer.

We're not ignoring math.


You're defining relationships between types through functions.


The Curry-Howard correspondence says that a program is the same as a proof. A proof of what, though? Not that the program does what it's supposed to. No matter how far you go with formal methods, what the program is supposed to do is always informally specified. (There may be a formal specification. That specification wasn't handed down from on high at Mount Sinai, though. It's a formalization of the badly-stated, half-unconscious informal specification that is the impetus for creating the program. Does the formal spec match the informal one? Can you prove it? No, you can't - certainly not formally.)

It's like you read the article, and set about disproving it, without ever really understanding it.


> You're wrong, and I've explained this upthread.

No. I'm not.

Are formal math and engineering useful in cooking food? Yes, they can be (particularly if executing on an industrial scale). Are they necessary? Not really. Plenty of great cooks just throw in ingredients in the amounts that seem right to them, perhaps tasting the result once in a while. Are they sufficient? Nope. If the best mathematician and best engineer in the world collaborated, the result might be edible or it might be an inedible mess.

If formal math and engineering are neither necessary nor sufficient to produce good cooking, we can safely conclude that cooking is not math or engineering.

> Static type systems are arguably a product of this, especially advanced ones like Haskell's.

Haskell is used by, to a first approximation, no one. Which was my point.

> Now you're just being anti-intellectual.

No, I'm not.

> The only question is what you choose to do about that: learn the math, or stay ignorant.

I was one math class away from getting a dual BS in math and CS in undergrad. Rather than stick around for another semester, I took the BSCS and went to grad school.

You can safely assume that I "learned the math", and that I am not "anti-intellectual".

The point here is that while, on the most fundamental level, the universe may indeed be made of math, that doesn't mean that treating everything with the math toolbox is the best way to proceed. Expecting that math methods will produce great software is a fundamentally goofy idea -- just as it would be to expect math to produce great poetry, painting, architecture, or anything else (and the same for engineering).


>If formal math and engineering are neither necessary nor sufficient to produce good cooking, we can safely conclude that cooking is not math or engineering.

Bingo. Good thing we're talking about programming, and not cooking.

I asked if you understood what you were saying because you can't really cook declaratively; recipes are inherently imperative. The comparison to programming thus only fits for imperative languages.

>Haskell is used by, to a first approximation, no one. Which was my point.

If that's your point, then I agree. Not sure how that's in disagreement with my points though.

>No, I'm not.

So what then were you intending to convey by vague references to incomprehensible academia? Surely you weren't meaning to imply they're just wrong, were you?

>The point here is that while, on the most fundamental level, the universe may indeed be made of math, that doesn't mean that treating everything with the math toolbox is the best way to proceed.

Yes, but such a general claim is not what I'm arguing for.

>Expecting that math methods will produce great software is a fundamentally goofy idea -- just as it would be to expect math to produce great poetry, painting, architecture, or anything else (and the same for engineering).

Well, it of course depends on what you mean by "great" software. But I still think you're missing the point here. Computer science is a lot closer to math than poetry, painting, and architecture are. There is a direct, elegant, simple, formal correspondence between programs and proofs. The same cannot be said for those other disciplines.

My only claim is that proofs are slightly more solid, intellectually and formally speaking, than programs, so converting more programs into proofs will make them easier to reason about; and since programs can be easily converted into proofs (relative to poems or architecture or paintings or whatever), this is probably a good idea. I still don't understand what your objection to that claim is.


> The comparison to programming thus only fits for imperative languages.

I am not comparing recipes directly to programming. I used recipes as an example of something that has mathy and engineery facets, but that is not engineering or math.

> So what then were you intending to convey by vague references to incomprehensible academia?

I wasn't making "vague references" to anything, nor did I say that academic languages were "incomprehensible". I did say they were bizarre, which is a different thing entirely.

I'm not sure why you find it hard to believe that someone could understand academic languages of the sort you evidently prefer, and yet somehow still choose not to use them. Your attitude seems to be that anyone who doesn't use your preferred methods is "anti-intellectual", "ignorant", or any of the various other personal insults you've used.

Why not just go off and write some awesome software using your methods? If they work as well as you claim, you'll have some hard evidence to back up your assertions.

>There is a direct, elegant, simple, formal correspondence between programs and proofs.

You are defining great software as "software that can be proved to behave in accordance with some formal spec", while people who actually use software (i.e., the people who pay the bills) define great software as software that performs the task they need to have done, can be written economically, and that is easy to use.

By your definition, a great recipe would be one that came out exactly the same every time, even if it tasted like shite, or took three weeks to make, or...


>>Haskell is used by, to a first approximation, no one. Which was my point.

> If that's your point, then I agree.

You are agreeing to something that is false. See my reply to the grandparent.


> Haskell is used by, to a first approximation, no one. Which was my point.

Unless you have a very loose definition of "no one", that simply is not true.

Just off the top of my head:

Haxl at Facebook, Bond at Microsoft, supply chain management at Target.

Also see:

https://wiki.haskell.org/Haskell_in_industry


>Unless you have a very loose definition of "no one", that simply is not true.

What part of "to a first approximation" was unclear?

https://www.tiobe.com/tiobe-index/

Haskell is in 47th place, which is consistent with where it ranks in every other popularity list I've ever seen.

I'm standing behind "to a first approximation, nobody".

Note that I'm not saying that Haskell is a bad language, or that Haskell programmers are bad people or anything like it.

I'm saying that the vast majority of programmers do not use Haskell. An anecdote about a particular group that uses Haskell (or even several groups) does nothing to refute that fact.


Surely you can see the part about recipes was an analogy to make a point and not directly equating the two?

I think you're dismissing the post without engaging its arguments, so it hardly seems fair to start calling people naive, anti-intellectual, and ignorant. (btw, we all get that these are different ways of calling someone stupid, which is never productive and isn't justified in this case.)

To attempt to engage your argument, as far as I understand it... I think type systems and programming paradigms, however formal, can at best solve problems in only a corner of the problem set of software development. The limitation is that these do not take into account various kinds of constraints on software systems which nevertheless exist and are often the dominant constraints, depending on the project... requirements, maintainability, usability, estimation, etc. - most of the stuff above the red line from the article.


What is there to prove in most cases though? I spend half my time redefining requirements.

I like Idris, and there's room for these languages. I think you give them too much credit though, programs written in them still have bugs. Your spec can be wrong. But above all else, they can't help you with scale, performance, recovering from hardware faults, and delivering what users wants.

The languages are still new, they'll gain traction, and for certain use cases they'll make sense, for others they won't.


The reason to use formal systems is to be able to reason about larger, more complex systems. Our brain is rather limited and to empower ourselves, we've found the divide and conquer approach. For that to work, we need assurances that our composition is correct due to the subcomponents being correct (and the composition operator). This is something we do all the time. You assume your compiler is correct, or your common libraries. However, if we want to grow larger systems, our foundations become more critical and we need stronger assurances.


I don't consider myself anti-intellectual, and I wrote a blog post on my claim that computer science is not math: http://www.scott-a-s.com/cs-is-not-math/ We discussed it a few years ago: https://news.ycombinator.com/item?id=3928276


well that pretty much nailed it.


I haven't thought of this before, but I'm a mathematician who does software engineering, and I'm starting to see some stark similarities between a mathematical proof and a "test", as in a unit test. I imagine if you went down this road you would end up with something very similar, if not identical, to TDD. What I was hoping this thread would discuss is how to turn software design patterns like the adapter or factory patterns into a science, in the same way that differential topology turns differential equations from a haphazard collection of methods into a rigorous science. I don't know why more people haven't focused on formulating this more practical side of software engineering. Why is it that SOLID works so well, etc.?


Dijkstra spent a lot of his research effort on ways to prove that programs are correct.

"Today a usual technique is to make a program and then to test it. But: program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence. The only effective way to raise the confidence level of a program significantly is to give a convincing proof of its correctness."

https://www.cs.utexas.edu/users/EWD/ewd03xx/EWD340.PDF


There is great wisdom in that observation, but I also enjoyed Knuth's take on the matter:

"Beware of bugs in the above code; I have only proved it correct, not tried it."

http://www-cs-faculty.stanford.edu/~knuth/faq.html


"I don't need to waste my time with a computer just because I am a computer scientist." - Dijkstra


This basically describes the dichotomy between CS and software engineering.


Yes, this is of extreme interest to me as well. The why behind SOLID. I think you could formulate more rigorous support based upon information and graph theory, maybe neuroscience, and other areas I'm less familiar with.

For example, the Single Responsibility Principle (SRP) is primarily concerned with making software more manageable on both an individual and team basis. How? By minimizing:

1. Communication overhead between teams/modules

2. Information overload in an individual

You could look at #1 from a graph-theoretic and information basis. I'm sure there are many interesting things to prove there. Like Amdahl's and the opposite of Metcalfe's Law [1][2] applied to team/communication/module dependencies. Just as a basic example, if one class has 10 responsibilities shared by 10 engineers with no boundaries specified, then the probability of conflict and unintended consequences rises. Thus the rate of development slows.

As for #2, applying more rigor and proof to the question of how these principles help a human understand quicker.

I don't have the necessary background right now to explore this in more detail, but like you, I'm very interested in any possible formulations.

[1]: https://en.wikipedia.org/wiki/Amdahl%27s_law

[2]: https://en.wikipedia.org/wiki/Metcalfe%27s_law


The problem is that even things like "how many people touch this class" and "how many external teams do we work with" aren't simple at all. Some external teams are much easier to work with than others. Sometimes you want your whole team to intimately understand an area of your code. Making software is an intensely human process, which is why managing software teams is almost entirely about considering the people on your team, and what makes each of them different.


Tests are negative proofs; you can show that a specific invocation doesn't fail. But unless they are exhaustive over the input space, they are never positive. You can't say something works universally.
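A tiny illustration of that asymmetry (hypothetical function, for demonstration only): every test below passes, yet the function is wrong on inputs the tests never exercise.

    def is_leap_year(year: int) -> bool:
        # Buggy: ignores the century rules (divisible by 100 but not by 400).
        return year % 4 == 0

    assert is_leap_year(2016)        # passes
    assert is_leap_year(2000)        # passes (right answer, wrong reason)
    assert not is_leap_year(2019)    # passes
    # ...but is_leap_year(1900) returns True, and 1900 was not a leap year.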

I am not a full subscriber to SOLID. I think it promotes a certain kind of degenerate over-abstraction that leads to bugs of a different nature, premature decisions on what needs future substitution, and decreases agility in the medium term.


> I think it's self-evident that there is no theoretical basis for software development in the way that would satisfy your criteria. Software dev is generally a scattershot haphazard endeavor that involves trying dozens of angles until one of them works.

But the actual practice of doing mathematics (i.e. coming up with novel proofs) works exactly like this as well. The differences are in how much confidence we have in the results, and how much we're able to reuse them.

> It's also self-evident that in most situations, correctness and reliability aren't a concern. The counterexamples account for maybe 1% of the field. 99% of the time, if your program breaks, you can pay someone to go fix it and no serious harm was done. Even Github outages, which affect almost all of us, hardly matter.

I agree that we collectively don't currently care as much about correctness as we pretend we do. I believe software correctness is becoming more important (e.g. the rise of ransomware) and is going to become much much more important, but that's due to what I accept is a non-mainstream view of the future.


Not really familiar with RTOS/embedded/security/military/medical software then, are we.

If those examples are 1% of software, they take disproportionately more funding than 1%.

And I'm sure customers of software, from personal computers, to business apps, to corporate websites would appreciate more consistent results as well. Reckless software engineering has damaged the reputation of the field as a whole.


> scattershot haphazard endeavor that involves trying dozens of angles until one of them works

Sounds a lot like engineering/design/art. There's a large amount of overlap between the three.

I had the pleasure of speaking to the principal electrical engineer whose firm was responsible for designing the entire electrical system for One World Trade Center in Manhattan, and for Citi Field as well.

He described his work a lot like a software developer would describe their own. And then, in his words, when his company was done they "passed the plans over to the electricians to build it".

It just so happens, in our profession, we've automated the part that electricians are responsible for in his project. Our electricians are called compilers. They dutifully carry out the plan and from time to time they surface warnings/errors back to the engineer for input/correction.

People conflate the meaning of software development because one person shares so many different responsibilities, that are usually handled by separate people in other professions: architecture, design, analysis, implementation, maintenance, etc.

What a civil/electrical/mechanical engineer does isn't so different from what a software engineer does. It's a "scattershot haphazard endeavor that involves trying dozens of angles [based upon guiding principles] until one of them works" usually run through loads of simulation and analysis. Now, the difference is, when these engineers want to bring their idea into the world it takes physical labor, unlike software development where the feedback loop is near instantaneous. We bring our programs into the world with compilers and can run them immediately.

I mean, there are plenty of examples where civil/electrical/mechanical engineers failed in their design and thus created a bug in their project. See [1] or any contemporary CPU from AMD/Intel or automobile recalls or spacecraft failures, etc.

There are certain principles of software engineering that lead to more effective software. That's a fact. And the companies that understand this fact will not fail to startups, in fact they exhibit a severe technological advantage.

Google and Facebook are two companies that understand software engineering. Talk to the YouTube/Instagram teams and see whether they felt better off -- technologically -- before or after acquisition.

[1]: https://en.wikipedia.org/wiki/Citigroup_Center#Engineering_c...


Agreed. Reading [1] was a fantastic eye-opener for me. The code is the design, and design is always "trying until it works".

[1] http://www.developerdotstar.com/mag/articles/reeves_design.h...


That was a great read, thanks for that. Indeed that article really codifies a lot of the thoughts that have been floating around in my head.


>I think it's self-evident that there is no theoretical basis for software development in the way that would satisfy your criteria.

I don't think this is self-evident whatsoever. One of the problems in software engineering is that we never know when a solution is "right," we only know when it's vaguely not-wrong, and even then almost everything we create is still subtly wrong in a way we forgot to think about. Math is solid because there is certainty in the correctness of certain proofs, and these can be used as building blocks for further results. Philosophically speaking this is the closest we're going to get to "real" engineering practice in software, and we're pretty dang far from it if you know what that looks like.

>Software dev is generally a scattershot haphazard endeavor that involves trying dozens of angles until one of them works.

But this is a symptom of any pre-scientific field in engineering, it's not specifically endemic to software engineering. The main difference is that in software engineering we're applying math directly, and in other engineering fields we're applying science. In a certain sense, the natural sciences are purely empirical (modulo advanced physics of course, which overlaps with pure math and philosophy these days) while math and consequently computer science are purely rational. So, what are the solid elements in math? Proofs. Proofs are also how you ensure that the field actually moves forward and you're not stuck recreating solid results that already exist. Sound familiar? It happens constantly in software, but imperative and poorly-typed languages inhibit composability and reuse because they lend themselves easily to extremely specific solutions to general problems.

>It's also self-evident that in most situations, correctness and reliability aren't a concern. The counterexamples account for maybe 1% of the field. 99% of the time, if your program breaks, you can pay someone to go fix it and no serious harm was done. Even Github outages, which affect almost all of us, hardly matter.

This is preposterous. Just because lives are not in danger does not mean that it's not worth doing right by the solution, and frankly I think a lot more lives are waiting to be put in danger by crappy software with bad security than you realize. It's already become almost dogma that all software engineers need to have a deep understanding of security, and I don't understand why correctness can't fall under that rubric as well. If we're going to be having Geohot or even Google or whoever building self-driving cars for possibly billions of people, enormous critical infrastructure projects coming under computer control, etc, the whole industry from education on up is going to have to have a serious attitude adjustment to keep up with the demands of safety, security, and reliability.

It might not matter 99% of the time, (I think it's a lot more than that of course) but we need to make sure that we can as a discipline deliver that 1% when it is absolutely critical, and as of right now it doesn't seem like we can.


> It might not matter 99% of the time, (I think it's a lot more than that of course) but we need to make sure that we can as a discipline deliver that 1% when it is absolutely critical, and as of right now it doesn't seem like we can.

FWIW, we agree on this point. But it seems worth treating this 1% case as a separate discipline rather than trying to lump it together with software dev. No one would claim that NASA's software engineering is the same type of work as, say, writing a new HN feature.

> It's already become almost dogma that all software engineers need to have a deep understanding of security

The most secure programs are those that undergo frequent penetration tests and have bug bounty programs. Speaking as a pentester, I think there's not much chance of regular software devs being any good at security. There's just too much to know.

I want to agree with your other points, because in principle it's the correct thing. Unfortunately experience has taught us that the correct thing usually loses in the real world. Being first to market mattered way more for Ethereum than the fact that the DAO had a bug in their smart contract, for example. But there are hundreds or even thousands of examples of this type.

If we pretend that a teenager hacking in their bedroom is doing something fundamentally different than what most developers do each day at their jobs, then we lose out on the ability of that teenager to innovate. We become an exclusionary clique rather than an inclusive group. Luckily market forces still prevent us from becoming that insular, but in the era of walled gardens it's easy to imagine we're not too far off from that fate.

The main issue is that if we try to restrict the free market e.g. with legislation, then the important work will simply move overseas to areas without those restrictions. And unless you're proposing legal restrictions on the software dev trade, it's unclear how to enforce any of the proposals upthread.


>No one would claim that NASA's software engineering is the same type of work as, say, writing a new HN feature.

Well no, but the same underlying principles are still operating. The only difference is how much you care about heeding them. You don't need an aerospace engineering degree to make a paper airplane or a short bridge or a raft or whatever, but that doesn't mean that your knowledge of how to do so reliably and correctly wouldn't be improved by such a degree, or that if you're going to sell a product that you hope people will pay you for and subsequently depend on that you shouldn't bother trying to make it functional (in the "it works" sense) to the best of your abilities.

>If we pretend that a teenager hacking in their bedroom is doing something fundamentally different than what most developers do each day at their jobs, then we lose out on the ability of that teenager to innovate.

Ok, that teenager can hack, sure, just the same as they can build a two-stroke engine or an electronic alarm for their door or play around with nuclear fusion or whatever. But if they're going to sell those things and make claims about their safety and reliability, the validity of those claims should be enforced by an industry guild accreditation program or legal regulations or whatever. There's more than one way to skin this cat, but it really needs skinning.

>The main issue is that if we try to restrict the free market e.g. with legislation, then the important work will simply move overseas to areas without those restrictions.

The goalposts are being moved here, though, near as I can tell. I told you what I wanted done and why I thought it would work, and now you're telling me I have to figure out how to do it specifically in such a way that it can't be circumvented. I haven't thought up a specific solution to this question you pose, and as such I would point you in the direction of how liability works in other disciplines for similar cases and so on. I do actually think a lot of the same systems could work for enforcement, the biggest difficulty is actually figuring out what the principles should be. If the biggest problem with enforcement is that teenage hackers can't innovate anymore, I'm not really all that concerned. Romanticizing that image does nothing to change the hard realities of the industry.


> Well no, but the same underlying principles are still operating. The only difference is how much you care about heeding them. You don't need an aerospace engineering degree to make a paper airplane or a short bridge or a raft or whatever, but that doesn't mean that your knowledge of how to do so reliably and correctly wouldn't be improved by such a degree, or that if you're going to sell a product that you hope people will pay you for and subsequently depend on that you shouldn't bother trying to make it functional (in the "it works" sense) to the best of your abilities.

Not really. The time scales are completely different: that HN feature that is scheduled a week of dev time is incomparable to that probe feature that is scheduled years of dev time, even if the amount of code involved is the same. You could argue that both are "just programming", but the activities and processes involved in the two are going to be completely different.


>The time scales are completely different: that HN feature that is scheduled a week of dev time is incomparable to that probe feature that is scheduled years of dev time even if the amount of code involved is the same.

You missed the point though. Whether it takes you five minutes to code the feature or fifty years, you're still doing math. The so-called "engineering" principles might change, with regards to division of labor and so on, but the underlying science/math you're dealing with doesn't significantly.


And this is the difference between math and engineering: in engineering you just have to be close enough to perfect that it doesn't break in the real world. It can never be absolutely perfect, because that would be either physically impossible or too expensive. This is true for mechanical, civil, chemical, and yes, software engineering.

In engineering the time it takes to implement the design/proof/code/output is one of the constraints. In math this constraint is ignored.


How are you "doing math"? Math might be involved in the feature, but many features just require lots of pattern matching and decision making. Maybe for some vaguely definite non-specific definition of meth that also encompasses most of human thinking this is true, but I fail to see how that is useful.


"If we pretend that a teenager hacking in their bedroom is doing something fundamentally different than what most developers do each day at their jobs, then we lose out on the ability of that teenager to innovate. "

The fact that not all contributors have the capability to formalize their output has never been a viable argument against formalization.

Before music education was formalized, music was a mysterious craft you could only learn by apprenticing with a master for a decade, unless you were incredibly talented.

Once the orphanages of Naples formalized music teaching in the 17th century, it became something much easier to learn and teach.

Sound familiar?

Yet music teaching did not eliminate the capability of non-formally-trained craftsmen to innovate. They just have collaborators to help formalize their thought process. (John Lennon did not know music theory and still made awesome stuff. But without formal theory his output could not live on in sheet music, and it would be a lot harder to reproduce.)

Even Einstein needed help with his math. But without math, theory of relativity would not have been much of anything.

It's all fine to imagine one is traveling in a train car, but once one needs to compute, say, the orbit of Mercury, one needs formal methods.

Creativity and formalism go hand in hand. You need both for a superpowered discipline.


> I think there's not much chance of regular software devs being any good at security. There's just too much to know.

I have over 30 years of experience and I couldn't agree more. The security field - heck, the website security field - is way too complex for me to navigate properly. Sure, I know about SQL injection, but "authentication is not authorization" is something I tend to forget, and I am pretty sure I have no clue about at least half of the OWASP top ten.


Looking at the discussion around your and Turing_Machine's points, I think one could generalize the problem like this: in a large and complicated search space, finding a solution is much faster and cheaper than trying to comprehend and formally describe the search space (to later derive the solutions). The up-front costs of doing the latter are so high (I don't think we even have good bounds on them) that nobody who needs a solution bothers with it. Iterative search is the only way individuals and teams can feasibly find the solution they're looking for.

This way, I think formalizing software development is a worthwhile goal - just like formalizing cooking is - but it's also obviously so uneconomical that we can't expect the industry to bother with it. Formal methods are pretty much basic research - not useful for us in any meaningful timeframe, but hopefully our grandchildren will get some mind-blowingly amazing tools out of it.


> Software dev ... involves trying dozen angles until one of them works

It is a way to solve mathematical problems. See "Guess and Check" in this book: https://en.m.wikipedia.org/wiki/How_to_Solve_It


> Software dev is generally a scattershot haphazard endeavor that involves trying dozens of angles until one of them works.

Maybe in an agile web startup. SW engineering in a big industry project is a very deliberate process. You start with a requirements engineer writing very precise requirements. Then a SW engineer turns these into a module design and a test specification. These are turned into source code (for which we use code generators to an ever-greater degree) and test cases by yet other SW engineers. In the end everything gets reviewed and/or tested.

Trying things until they work is just not done. That would be way too dangerous for SW that controls cars, airplanes, nuclear power plants, rockets etc.


Safety critical software is a small subset of the software that gets written in "big industry" projects.

There are hugely more systems not written to that standard.


Yes, but those systems are engineered in the exact same processes. Just with less testing and fewer documents.


> Software dev is generally a scattershot haphazard endeavor that involves trying dozens of angles until one of them works.

So software engineering isn't engineering, either? I would imagine any civil engineer who actually knows what it takes to build a bridge would tend to agree, at least on the basis of the points you have put forward so far.


If you look at civil engineering in its childhood stages - that is, back when prehistoric man tried to build huts and lean-tos - you'd find that they, too, didn't use any solid engineering practices, but simply tried different things and found out what worked. Being physically constrained makes it easier to determine what works, but the essence isn't any different from what software engineering is today.

I bet if we fast forward 1000 years, you'd find that software engineering will have become as rigorous as what we have today in civil engineering.


This. Someday software developers will look back on us the way doctors look back on barbers' bloodletting.


Now that we brought out the doctor analogy, I have news for you....doctors deal with an unknown system, gain personal and institutional experience through trial and error, are basically detectives, are rarely working with full information, and have, despite their heroic efforts, a significant task failure rate (the patient dies).

How is that not like software engineering?


I don't think doctors fail as often as software projects, and they only fail on the fringes of medicine - like treating cancer or rare diseases. You don't often see doctors treating a cold or a broken ankle so badly that they kill their patients!

But in the software world, the equivalent of the cold seems to be a CRUD app, which still fails so often that it's newsworthy when such a project succeeds!


No, that's not equivalent. A doctor treats a cold the same in every case. Software engineers are building different systems for each case, or else they would just use existing software.


Part of that is because software development is usually concerned with optimizing things for their sale value, not for being useful. In my example, a lot of problems could easily be solved with the same CRUD stack, if not for the needs concerning the design (i.e. it can't look similar to that competitor's thing) and other goals unrelated to actual use of the product.


A doctor doesn't really treat a cold though. They can treat an infection with antibiotics; they can treat something more specific than a cold with a specific remedy. But if you come in with just a typical cold and say "treat me," they'll give you a placebo and prescribe lots of rest and water.


> I bet if we fast forward 1000 years, you'd find that software engineering will have become as rigorous as what we have today in civil engineering.

This isn't evident at all. In software, every situation is novel, so there is no room for repeatable processes.

You can build a "bridge library" once and tweak it for each situation, but you never need to develop a rigorous process to build that library again.


> every situation is novel

This is the point I'm contesting - most situations aren't as novel as the stakeholders think they are. The failure happens because the assumption of novelty is there!


In general there may not be that much novelty per project, but that isn't the novelty one needs to worry about in software projects. The relevant novelty is the novelty to the project team. Also, a lot of the novelty comes not from the technical side, but from the domain and the organization that will be using the software.


This is a big argument in favor of experienced devs, who have deep technical and domain expertise.


I've been doing a lot of low-ish level programming with video hardware on an embedded system. I mmap a buffer, which gives direct access to the hardware. I trust that the driver reports the correct size of the buffer, and that the hardware puts the bits in that buffer in the way its spec promises. How the fuck do you formally prove any of that? I can prove that my software, for example, moves bits from offset x to offset y in the buffer correctly, but how do I prove that it makes sense to move those bits in that way? Or that I have interpreted the driver's documentation correctly, or the v4l2 API correctly, or that the hardware or driver does what it says it does? Keep in mind that this is C++ of course, so there's no mathematically rigorous type system; I just have a uint8_t pointer to a place in memory.

I can potentially lay down a bunch of assumptions, and prove that, given those assumptions, my program acts correctly. However, most of the bugs arise from incorrect assumptions.

One concrete example: When feeding pixel data to the onboard hardware h264 encoder's driver, after having set a resolution of 1920x1080, the resulting h264-encoded frame will be displayed in a h264 decoder with a resolution of 1920x1080. This turned out to be wrong, because the hardware can only deal with blocks of size 16x16, so the width and height must be a multiple of 16, and the driver isn't smart enough to add the necessary metadata to make a decoder crop it, so there's 8 green (because YCbCr) pixels on the bottom of the video. The solution to this is to manually splice the h264 bitstream, to insert my own metadata which has the correct cropping. How the fuck do you formally prove any of that?
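(To make the arithmetic concrete - a rough sketch in Python, purely to illustrate the numbers; the real code is C++ against the v4l2 API, and the names here are hypothetical:)

    MACROBLOCK = 16  # the encoder works on 16x16 blocks

    def aligned(dim, block=MACROBLOCK):
        # round a dimension up to the next multiple of the block size
        return (dim + block - 1) // block * block

    width, height = 1920, 1080
    coded_w, coded_h = aligned(width), aligned(height)  # 1920, 1088
    pad_rows = coded_h - height  # 8 rows of padding ("green" in YCbCr) at the bottom

    # Without cropping metadata in the bitstream, the decoder displays all 1088 rows,
    # so those 8 padded rows stay visible unless you splice the crop info in yourself.
    print(coded_w, coded_h, pad_rows)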


>Keep in mind that this is C++ of course, so there's no mathematically rigorous type system; I just have a uint8_t pointer to a place in memory.

Well you basically answered your own question - you would need a language with a more sophisticated type system. You would then need an API/DSL to interact with the graphics hardware.


The difference is that software development interacts with life, and that's where most problems lie: specifications from people who don't actually know what they need; interaction with ill-defined data and systems; unforeseen cases; and everything changes over time, beyond the scope of the project.

I suppose that life is irregular.


And I'd argue that for a whole bunch of applications and software systems these days correctness has no business value.


Software engineering isn't math research. While I'm not against making it easier to formally prove correctness, at the same time I'm not sure why you think it's so important. We've been doing pretty well without it. How do you justify your certainty that we need it?


"How do you justify your certainty that we need it?"

Not OP, but all pre-scientific crafts from medicine to construction have benefited considerably from the application of formal methods. I don't see why software engineering should be different.


How many doctors in the field are practicing formal methods these days beyond say statistics?

Anyway, surely SE as a field benefits from formal methods like other fields do - whether they're applied by specialists or in the field by practitioners, we can debate - but hopefully we can all agree that SE isn't defined as just applying formal methods, and that there is a lot in development that will never go near a formal spec.


Doctors are required to have extensive knowledge of chemistry, physics, and biology before even beginning medical school. Those are all highly scientific fields with centuries of practice behind them, not to mention medicine itself which is nearly as old as people are.


Yes, but computer scientists are required to take engineering physics also, as well as creative writing sometimes. We are talking about the practice of medicine, and if you know any doctors, they aren't exactly very interested in math (they have a lot of other stuff to do!).


I'm not required to take any physics in my degree actually.


UW CSE requires first year engineer-level physics. I don't think that's unusual.


"we can all agree SE isn't defined as just applying formal methods, that there is a lot in development that will never go near a formal spec."

I've yet to see a field without artificial restraints that has abandoned the need for human intuition and creativity.


There are multiple problems here, and they all stem from a lot of misunderstandings. Disclaimer: I do have a maths background, and formal methods do have their use. But not in the general case.

>they are for programming (as in math) a well-defined, solid formal system that solid systems can be built on

Except you forget one big thing. Maths does not deal with errors or real-world complexity. Maths exists by itself in a world where errors and failures at other levels do not exist. And I will not even talk about Gödel's incompleteness theorems or Turing's proof of the halting problem as other fundamental mistakes here.

> can be built on in a way that nothing else can be in software, at least thus far.

What about systems engineering? Because guess what? Other complex systems exist in the nuclear, space, chemical, and aviation industries. And they are not developed with formal proof... and they still work. But they use engineering - proper engineering - not wishful thinking.

>In a world where software flows more freely than water, the correctness and reliability of software systems must be taken with utmost seriousness.

There is a really small but essential mistake here. Two, to be honest. The first one is that any sufficiently big piece of software cannot be proven correct and reliable. That is what complexity theory tells us.

But most importantly, it shows a complete lack of knowledge of the meaning of the word reliability and of the research on complex systems. Reliable and correct systems are inherently dangerous. What you want is a system that is SAFE. A safe system is one that accepts doing the "wrong" thing if that is the safe thing to do.

>What do you propose as the theoretical basis for the engineering science of software development?

Complex systems. SNAFU catching. Stop trying to ignore the stack we live in and begin to think about runtime instead of overly complex architecture. Safeguards and operators. Debugging as a first-class citizen. Bulkheading and recovery. In general, systems engineering and human factors.

You can begin by reading web.mit.edu/2.75/resources/random/How%20Complex%20Systems%20Fail.pdf and then follow with Nancy Leveson's free book https://mitpress.mit.edu/books/engineering-safer-world .

PS: oh, by the way, to everyone saying that other engineering disciplines are based on a rigid step-by-step methodology that enforces correctness: they are not. It is even more of a mess, with slower feedback loops. I worked in other engineering fields before IT, and it is not really better. It seems to be a very American thing to believe in the "scientific method". But that does not exist. The world outside is messy and complex.

Try accepting that. There are solutions, but by living in a dream, you are working around the bug instead of fixing it.


Six types of bridges... Were you serious? There are definitely ways to apply formal engineering methods to software that greatly help its development.

Computer programming is not a special snowflake. It's a field that (in many areas) refuses to grow up.


Forcing it to grow up would harm it, though. The engineers would move to companies that don't force those kinds of constraints.

It's important to keep in mind what "growing up" translates to: slowing down. This has both economic and competitive implications. Sometimes it might be a win, but the vast majority of the time your ability to move quickly (and yes, occasionally break things) is an advantage.


The current default is for people to work extremely hard and make minimal progress in software. Startups or giants, most development is horribly inefficient. So I don't think there is much to defend about how things actually work in practice.


I agree with this. Anything is better than how it's currently done.


Yes and no. It is a natural progression of the industry. Those who cannot do "formal engineering" probably don't belong in engineering roles. Engineering, in general, imho, can be distilled down to this: clearly define the constraints under which your project will operate, and prove that they will hold. If someone as an SE cannot do that, they really have no business being anywhere near any field of engineering.


Depends on the constraints you want to prove that the program will uphold. Some constraints may be impossible to prove, especially in non-real time contexts.

For example, a program may have to run on a platform that doesn't have everything formally defined or guaranteed. If the program relies on that platform, and the platform influences the program in a way that makes the constraint depend on the platform's behavior, then you are out of luck.

Another issue is choosing the constraints that the program will operate under. You would hope that the clients would give you the requirements, including the constraints the software must operate under, but this often isn't the case. An incremental approach - clarifying requirements by mapping out what the clients need to do to get their work done, including recording the specific steps they perform or need to perform - works pretty well when clients don't provide the requirements or constraints upfront.

I am relatively ignorant of formal engineering, but I do like that you have provided a fairly clear definition: "clearly define the constraints under which your project will operate, and prove that they will hold". It is better than people saying "know more math" without saying how, or what, or why it is useful in any clear way; saying "software is math" just doesn't cut it.


That is a great counterexample. I expect there to be a slow migration to platforms like [this formally verified microkernel](https://sel4.systems/). This will take time and a tremendous amount of effort, but it will ultimately address the counterexample. At the conclusion of this migration, the SE field will be in line with the rest of engineering.

Choosing the constraints for a project is no different in the context of CS than the context of any other fields -- electrical, mechanical, etc. It is a conversation, as you stated. This helps in many ways: it helps the developer understand exactly what the client wants, helps the client understand exactly what they are going to get, and helps build an implicit timeline replete with discrete and obvious deliverables and (if done correctly) those deliverables are fairly modular.

Thanks! I, too, get frustrated that people just say 'math! math! math!' That isn't helpful or meaningful. This is a conclusion I came to after talking to quite a few of my friends and colleagues in different engineering disciplines.


Thanks for the information, it is very much appreciated. I was a little frustrated that comments on this thread had people saying you should learn the math and prove your programs while providing no information on how to do so. It would be great if they provided resources to show people how to do it. Or even better, provided a project they worked on and showed in detail how they used formal methods to move through it - that would be wonderful. If someone wants to promote formal methods, they should not just argue for it, but teach it, even a tiny bit. That would get a much better reception.

A little off topic, but it would be great if there were a method/system which could provide requirements tracking down through the set of artifacts, so that you could point to a bit of source code and say it implements a certain requirement. Sort of like a chain of responsibility for code and associated documents. That might be a pipe dream, I don't know.

Anyway, thanks for the information, I remember seL4 being mentioned in discussion of the sorry state of software on medical devices, but had forgotten it until you mentioned it. Thanks.


To address your specific concerns regarding "learning the maths": that's a bit disingenuous from the people stating it. It is actually logic they are talking about, rooted in the discrete math branch. For seL4 you can actually read the proofs and their conclusions: http://sel4.systems/Info/FAQ/proof.pml. It's a lot of (structural) induction-style proofs. That page is a good starting point; just google around to fill in the gaps as you need.
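If you just want a taste of what a structural-induction proof looks like, here is a toy example (in Lean, purely for flavor - seL4's actual proofs are written in Isabelle/HOL and are vastly larger):

    -- Toy structural induction over the natural numbers (Lean 4):
    -- a base case for zero and an inductive step for succ that reuses
    -- the induction hypothesis `ih`.
    theorem zero_add' (n : Nat) : 0 + n = n := by
      induction n with
      | zero => rfl
      | succ k ih => rw [Nat.add_succ, ih]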


As far as I've seen, there are generally more inefficient and unsustainable ways of doing anything than efficient and sustainable ways. If this is true, the most likely to occur software implementations are most likely wrong, and making money from these wrong implementations by exploiting their low burden of knowledge and low initial cost serves mostly to create a positive feedback loop of less and less efficient and sustainable software systems. Assuming civilization is going to last and keep using software longer than your own lifespan, this positive feedback loop is a bad thing.


Actually, it is fundamentally different in that just building the thing / just running the experiment is cheaper than doing the simulations or proofs.

Every other discipline does simulations because actually running the experiment is orders of magnitude more expensive and time-consuming.

See "Why software development is an engineering discipline"

    https://www.youtube.com/watch?v=zDEpeWQHtFU


I agree. The primary thing distinguishing software development from other engineering fields is that the cost of building the product is essentially zero. Because building is what your compiler does. What a programmer does is design, and that is always influenced by the constraints of manufacturing - which, for software, are almost non-existent.


It is a special snowflake because the semantics and correctness of nontrivial programs are often in the eye of the beholder (is Microsoft Word or Google Search 'correct'?). This isn't the case for bridges, cars or toasters. It's not just a matter of adding magic engineering dust until programming is engineering.


You're missing the point, though: formal verification ensures that a program hews to a formal specification (proof) which is well-defined and can be reasoned about. Whether the formal specification fits the problem is a different story, but seeing as you can currently prove neither that you understand what the program does nor that you understand the problem, I think getting one half of that dichotomy is really important to getting the other half.


I don't see how I'm 'missing the point' nor do I have anything against formal methods. They just don't turn programming into engineering - we're a long way from that and some of it has to do (in a completely unglamorous way) with the nature of programming. We just don't know how to turn it into engineering the way bridge-building became engineering.


> They just don't turn programming into engineering

No, what turns programming into engineering is having a rigorous understanding of proof methods and being able to apply them to solve problems the way solid science is used to solve problems in other engineering disciplines. Civil engineering and electrical engineering and mech-e and so on became engineering through the discovery of underlying principles that allowed processes to be formalized, standardized, and improved.

Formal methods have to be at least part of the answer here, unless you can think of a better way to formalize, standardize, and improve the process of software development.


what turns programming into engineering is having a rigorous understanding of proof methods

To believe that, you have to believe that simply adding maths to something turns it into engineering. I think this is fairly obviously not the case.


You literally lopped off the second half of that sentence which does all the work. Of course the part you quoted is not the case, that's why it's not what I said.

You're also misunderstanding me if you think I think "adding maths" to CS will turn it into engineering. We're already doing math! I think CS could be made a lot more solid by being honest that we are doing math and adopting the relevant formalisms that already exist to make reasoning about mathematics easier.

Do you just disagree that that would help, or do you reject the premise that it could be made better entirely?


The thing is that math isn't all that rigorous either. Formal methods (i.e. computer-verified proofs) are still very much the exception. Proofs are written for other humans to consume, and details tend to be glossed over. Usually this is fine, because the readers are mathematicians themselves and can fill in the gaps, but occasionally this leads to a crucial counterexample being overlooked.

Now what does make mathematics slightly more rigorous than other disciplines is the fact that once someone bothers to actually work through the details and finds a counterexample, they can convincingly demonstrate to other mathematicians that the proof is incorrect. Often, the detail in question isn't even all that important, and a workaround can be found that saves the overall proof. But software development is also rigorous in this sense; once you find a case the program handles incorrectly, you can write a testcase to show the difference between expected and actual behavior. And that doesn't usually show the whole program to be misguided, you can just rewrite a small part and things work again.
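(As a tiny, contrived illustration of such a demonstrative test case - hypothetical code, in Python:)

    def days_in_february(year):
        # buggy implementation: ignores the century rule of the Gregorian calendar
        return 29 if year % 4 == 0 else 28

    def test_1900_is_not_a_leap_year():
        # regression test written once the counterexample is found; it fails,
        # which is exactly the convincing demonstration of expected vs. actual
        assert days_in_february(1900) == 28

    test_1900_is_not_a_leap_year()  # raises AssertionError against the buggy version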

You seem to want a method that can not only make definitive statements after the fact, but that can actually ensure for almost all programs and almost all desirable properties that the program fulfills the property. But this is actually much more rigorous than most mathematicians ever bother with, and for good reason.

Complete verification requires stating every obvious fact in excruciating detail, because that is the only way to be sure that it is indeed obvious and a fact; in addition to tracking the complex interactions of the whole thing. Most humans really don't want to make this kind of mental effort, if they can avoid it. Even static type annotations are too verbose for some, which has led most modern languages to include some form of type inference. I don't think you will see widespread adoption of formal methods before proof assistants are developed that can similarly handle most of the simple but tedious tasks, so that humans can focus on the actually important bits.


I actually think dependently-typed languages like Idris are the most promising in this area right now, at least until proof assistants get a lot better.

It's also not specifically the rigor in mathematics, but the grounding in solid principles. They have axioms, they know how to prove things, and they know which proofs to trust and why for the most part. What frustrates me most is stuff like programmers haphazardly reimplementing monads over and over again instead of moving on with what they were actually working on before the language and type system got in their way.


> formal specification

Where did the formal specification come from?

How do I know the formal specification is what I actually wanted?

Is the connection between the informal requirements to the formal specification easier to trace/follow than the connection between the informal requirements and the code?


Code is the formal specification.

There are two separate problems here - one is that you might not be able to clearly explain what you actually want. The other is that you might simply fuck up when writing the formal spec (code). Formal verification is meant to assist you in the latter - when you know what you want, but can make mistakes writing it down, or not realize your description is self-contradictory in some aspects.


Code is a formal specification. You can have layers of such specifications. The problem with viewing code as the specification is that a lot of languages obscure the intent and only reveal the how.

I'm working on a radio product, it's clear how data is being moved into certain buffers. It's clear when data is moved into certain buffers. It isn't clear why data is moved into certain buffers, or if the code is correct, only that it's being done.

A higher level specification set allows us to have a documented understanding of why the code does what it does, and allows us to reason about it at that level. It also makes it feasible to bring new people into the project, because 150k sloc (not a huge project, but not small) of largely low-level code is not something someone new can jump in on and understand quickly.

We also have something that's much easier to reason about when designing system tests, and to test the code against. We write the tests as though the specification were the reality, and test the code against that model to detect where it diverges. If we only had the code, what would we test it against? If it's its own spec, then it can't be wrong.


> Code is the formal specification.

Yes, people tend to forget that. Having a second formal specification can be helpful, because writing things down multiple times (differently) often helps understanding.

What I've seen so far is that (non-code) formal specs can be very useful when the domain is highly technical, for example network protocols, because they illuminate aspects that are hidden in "production" code.

Of course the fact that important aspects are hidden is a more general problem.


No, the correctness is in the eye of a spec. Now, whether companies formally define their specs is a different story. Many startups don't.

With that said, if Microsoft Word meets the given spec, then it's correct. One way to prove this correctness is with testing.

This is the case for bridges as well. If I hand a spec to an engineering firm requesting a bridge from A to B at a given price, well then, there are a lot of bridges that satisfy those constraints.

Just because many software companies today don't choose to formally write a spec and prove a program's correctness upon delivery doesn't mean it's a special snowflake. It just means those teams are immature.

Sometimes software development doesn't require a mature team. Like most decisions, there's a cost and timing tradeoff.


The problem with this line of thinking is that the problems that matter are not related to 100% implementing the spec correctly.

The problems that matter are in the design of the spec to begin with. The important part is picking the CORRECT business requirements, NOT in implementing those business requirements correctly.

And do you know what the best way is to test whether you got the spec right? You deploy and see if the feature gets traction.

The "spec" that matters is the success of the business.

And building a product, and releasing it, is the way to formally test if your feature is "correct" according to the spec that is The Market.


>The important part is picking the CORRECT business requirements, NOT in implementing those business requirements correctly.

It's both, though. You also can't tell if you've implemented them correctly if you don't have a formal spec.


Not really.

If I make a mistake building my web app, then it will either not matter, or someone will notice the bug in production.

And then when someone notices the bug in production, it can be fixed. If nobody notices it, then I guess it wasn't very important to begin with and can be left "broken".

Implementation bugs are the EASY part, and don't matter a lot of the time.

Or here is a better scenario.

Let's say I am writing a feature, and I think of a way to implement it much quicker, but one that isn't 100% "correct" according to the spec. The quick and dirty but "incorrect" way to implement it may actually be the better thing to do, because now I can spend my time working on other stuff that is more important.

Purposefully doing the "incorrect" thing according to spec, may actually be the right decision.


Your notion of correct does not apply to everyone: for many of my projects, the implementation that goes against the spec but has higher traction is implemented correctly, and the implementation that follows the spec but has lower traction is implemented incorrectly. The spec is just someone's idea of what will have the highest traction - it doesn't make code that follows it correct.


No, you just don't understand the definition of correctness as it relates to software. Correctness -- like safety -- is formally defined. You might want to look it up.


Uhm... I was responding to a conversation about software engineering, not product development. The two are orthogonal.

You just described the responsibility of a product manager, and finding product-market fit. None of which requires software development. Software development may help but product management certainly doesn't require it.


Your job as a software developer is to solve problems with code - not to take the thing someone else designed and make it go. That line of thinking just doesn't work.

Designing and building software are now the same thing. And that is only going to become increasingly true. Designing solutions in the context of modern systems inherently requires intimate knowledge of those systems and the capabilities of the technology being used to solve the problems. The "design/spec/build/test" pipeline is dying.

You can recognise that that's a good thing and get on board or you can be left behind.


Thanks for the laugh.

You misunderstand what a spec is. And you misunderstand what product management is. At no point did I say software engineers aren't responsible for designing and building software. And at no point did I say designing solutions doesn't require intimate knowledge of those systems and capabilities. In fact, that's exactly what I've been saying.

That's what software engineers are there for. To provide expert knowledge and guidance.

But, do you think the electrical engineers defined the spec for One World Trade Center? No they didn't. They got the spec from the architects and they worked to satisfy those requirements.

The spec is not static. There is no design/spec/build/test pipeline in the strictest sense. The spec is dynamic. It's updated through design/spec/build/test iterations.

Just like an electrical engineer may surface new knowledge back to an architect that requires the spec to change... or an electrician in the field will surface new knowledge back to the electrical engineer that requires the spec to change.

Changing the spec is expected. It's called change management.

I don't need to get on board with anything. Google isn't getting left behind anytime soon. And the way they approach software engineering is largely the correct way; I agree with it. (They do however lack coherent product management in some areas.)


This is not at all the case.

Product development skills are extremely important for a software engineer to have, especially at smaller companies, because a lot of the time the person making these product decisions IS the engineer.

During my engineering career, most of the time my boss gives me a general goal for a product or feature that needs to be built. And then I take that general idea for a product, and make all the product decisions about what to build and how to build it MYSELF.

At smaller companies there may be NO product manager. Or maybe the product manager is only making very high level decisions, and isn't really involved in every little nitty gritty detail about the product.

You the engineer have to make the product decisions. And you have to balance those product decisions against tradeoffs, such as how long would it take to build, how high quality it is, and other software engineering design tradeoffs.

Product design and software engineering are very closely related, and any good senior engineer should be competent at both. Software engineering makes you better at product design, and vice versa.


Like I said: A may help B or B may help A, does not imply B requires A. Where A = software development; B = product management.

Yes, most startups don't formally define their spec. That's what I've said.

Managing the spec and ensuring product-market fit falls under the domain of a product manager.

So, that's great, you did software development and product management. You wore many hats... like most people do in startups. Doesn't negate the fact that you were a software engineer taking on additional responsibilities.

Trust me, when you work on a team with a clear separation between product management and software engineering and you have a great product manager... it is pure bliss. The only company I've ever felt that technical nirvana with was Google. My god they know what they're doing when it comes to software engineering and separation of responsibilities. At least the team I was on did.


But software is the spec. In many cases, if you can formally specify what a function should do, you have already written it.


You're misunderstanding what I mean by spec. A spec is made up of a list of requirements. This list of requirements can be considered constraints on processes, performance, features, limitations, whatever. Whatever makes sense in the world of the specifier.

For example, I can have a requirement that says: "The program shall generate a set of weekly time schedules that are maximally preferable, based upon each student's preferences, for all students at a given location while taking into account resource constraints re: room availability and teacher availability." Of course, this isn't specified to the level of detail I'd put in a real spec, but it serves as an example.

There's many ways to solve this requirement, i.e., there are many different programs that satisfy this spec.

There are two solutions that may stand out: (1) a brute-force approach, and (2) a convex optimization approach.

Since I've not explicitly defined a speed-of-execution requirement in this spec, a brute-force approach may make sense... even if it runs for 5 days. In the real world, you'd confirm this unstated assumption.

If instead my spec said this needs to complete within a day, then maybe the convex approach would make more sense. However, you'd first seek more definition in the spec, i.e., how many students are we generating schedules for? If it's only 10, maybe brute-force, if it's 20k, then convex.

So on and so forth.

Now, the interesting part comes when you create the end-to-end (E2E) and acceptance tests. These tests should be the first thing you write because they follow directly from the spec. They will stand up the program as if it's running in production and drive it as such to test whether it adheres to the spec.

Once all your E2E and acceptance tests are passing, and we've established that your tests cover the spec completely, then we can mark the program as correct with respect to the spec.

There are many different software designs that satisfy a spec.

The software is not the spec.
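To make the "tests follow directly from the spec" idea concrete, here is a minimal sketch of one acceptance check for the scheduling requirement above (hypothetical names, in Python; any implementation - brute force or convex - has to pass it):

    from collections import namedtuple

    Entry = namedtuple("Entry", "room_id time_slot class_id")

    def check_no_room_double_booked(schedule):
        # Acceptance check derived from the resource-constraint clause of the spec:
        # whichever algorithm produced the schedule, a room holds at most one
        # class per time slot.
        occupied = set()
        for entry in schedule:
            key = (entry.room_id, entry.time_slot)
            assert key not in occupied, f"room {entry.room_id} double-booked at {entry.time_slot}"
            occupied.add(key)

    # Drive it against the output of any candidate implementation:
    check_no_room_double_booked([
        Entry("R101", "Mon-09:00", "algebra"),
        Entry("R101", "Mon-10:00", "chemistry"),
        Entry("R202", "Mon-09:00", "history"),
    ])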


Thing is, the need to formally define "behaviour" like that is a symptom of poor organisational/team structure. If you have multi-discipline teams with end-to-end accountability for a complete product (or vertical slice of a product) then your "spec" can be replaced with a problem statement and all of a sudden people can start focusing on whether the output solves the problem rather than whether it satisfies some arbitrary behaviour.


I agree with that totally.

Put another way, the problem with having a "software spec" is that you already have the "software which is a spec" and now you have 2 specs.

Or a 3rd way, figuring out the right interface for 2 components is at least 50% of the work when writing software. So, spec writing is like programming. And you wouldn't want to program by committee, would you?

This is one thing Amazon got right - having small internal departments and interacting like separate companies.


Software is not the spec, it's an implementation... of which there are many implementations that satisfy one spec.

Spec writing is not like programming, and if it becomes like programming, then you're doing it wrong.

Specs are meant to constrain the visible solution space by defining the problem sufficiently.

To put it in programming language terms: a spec is declarative not imperative.


I guess you have never heard of declarative programming, design by contract, or logic programming, which explicitly do constrain the visible solution space by defining the problem sufficiently.

Though imperative programming looks at the problem from another angle which is closer to execution details, it's not fundamentally different. A smart compiler/interpreter may completely ignore those details as long as it produces the right output, as specified by the spec. That spec being the code.

This is obvious when you see that one style of program can be converted to another style without losing the semantics.


It's blindingly obvious that you've never worked in, worked on, or managed a successful large organization (say >500 people || >$1B mkt cap) or a successful large project (say >5k man-hours || >50 people || >$10M budget).

> whether the output solves the problem rather than whether it satisfies some arbitrary behaviour

The spec is considered the solution to the problem! It's not arbitrary in any sense of the word.


Re: first paragraph - incorrect.

The problem with "specs" is that you don't know whether they solve the problem until the solution is implemented and shipped. And even then, you have no real way of knowing what's meat and what's cud. So lots of time, money and people are thrown at solutions that - at best - are wildly bloated or inefficient and - at worst - completely fail to address the actual problem at hand.

The _need_ for specs is - even in large organisations - typically down to misplaced accountability. When you decompose the problem space into small pieces and give development teams a high degree of problem-solving autonomy _and_ accountability for production, the organisational disconnects that lead to the "need" (or, rather, the ill-conceived desire) for burdensome process and specification largely go away.

This isn't witchcraft - it's progress. It works; and it works in large organisations. You haven't witnessed it, so you don't believe me. If/when you do, I'd be willing to bet you'll be converted as I was.

But I'm sure I've given you plenty of reasons to double down on your skepticism.


> Re: first paragraph - incorrect.

Please, do detail.

> The problem with "specs" is that you don't know whether they solve the problem until the solution is implemented and shipped.

No. Good product management eliminates this risk. And that's what you're talking about: risk.

You can test a product, feature, anything, many different ways before you build an actual implementation.

That's basic product and risk management.

> And even then, you have no real way of knowing what's meat and what's cud.

Monitoring.

> When you decompose the problem space into small pieces and give development teams a high degree of problem-solving autonomy _and_ accountability for production

You just like... defined what a spec is... man.

> This isn't witchcraft - it's progress. It works; and it works in large organisations.

Please provide the proof to back this up. Otherwise it's a baseless claim.

> You haven't witnessed it, so you don't believe me.

No, I've arrived at my beliefs through the data on this point.

And the fact of the matter is that 100+ successful companies that have shipped successful products/projects operate with specs across many different industries from pharmaceuticals to construction to aeronautics and so on and so forth.

If the data supported your conclusion, then I'd agree with it. But the data does not support your conclusion.


"But software is the spec."

This whole discussion can be replaced with this sentence.


There are very, very few languages (or few applications of every language) where we can honestly say "software is the spec".

Please tell me, what is the specification of this code so that we can verify and validate it:

  (defun f (x y) (* x y))
Is the specification: `f` shall return the product of two numbers?

Perhaps. Perhaps it was supposed to be addition. Perhaps it was only supposed to apply to integers. If the above spec is correct, is the code correct? Maybe. It doesn't react well when given non-numeric values. Is that a problem? I don't know, the code doesn't explain who is responsible for validating input and who is responsible for handling errors.

A specification is a hybrid prose/formal document that would give us all that information (if it had any value). The code above is not a specification, it is an implementation. No different than a gear or a cog or a lever in mechanical engineering. It is a thing which does some work. We can examine it and see what it does. But we cannot, by observation or execution, determine why without greater context. That context is the specification.

The software is an artifact, one among many, which (hopefully) satisfies a specification.


So like, for example, Monads. Monads are applicable across every program ever written and with some rules laid out around them have remarkable properties. Comonads too.

Can we use those? By your criteria, I mean.


There may be thousands of types of programs, but how many software design patterns are there? This is a much better analogy, I think.


You can't seriously be putting software engineering above civil or mechanical engineering... They're two completely different fields which both require years and years of training.


Wouldn't it be more fair to compare the bridge types to for loops?

You need science to come up with it and engineering to apply it in practice.


> why the author is so suspicious of formal methods

I studied Z-Notation and CSP at what is perhaps the home of formal methods (cs.ox.ac.uk), and I've yet to come across a real-world situation where I've found either of them the least bit useful.


I agree with you about the potential value of formal methods, but I think you completely missed the point of the article. Formal methods can't supply specifications. They can perhaps tell you if your specification is consistent and therefore implementable, but they can't tell you whether the system you've specified will meet your needs.

I think those of us who promote formal methods need to remember this. At best only verification -- making sure the implementation matches the specification -- will ever be fully automated. Validation -- making sure that what we specified is actually what we wanted -- will always be a human activity.


Yes, you're entirely correct. But you also seem to be the only person in this thread who can pull that distinction out of this discussion, everyone else is ranting about how The Market is the only real spec or whatever.


It is right to be suspicious of formal proof. In most areas of software engineering, employing formal proof makes you about 10-1000 times slower. Knowing how to do a formal proof in principle though lets you often reap a lot of the benefit without actually getting slowed down much. This is similar to how mathematicians know how to prove something in principle without running it through Coq or Isabelle.

By the way, Curry-Howard is just one way of doing formal proof (one I personally don't like). There are many foundational and practical problems that need to be solved before formal proof is ready to go mainstream (but I am convinced that it will one day).


Because entropy. The natural state of the world is chaos.


That's about as meaningless a statement as they come. How do you mean?


When you take your nice, theoretical software package and expose it to the real world, you always find things you wouldn't have predicted in the design. The world in general and humans in particular are messy and chaotic. People enter things you wouldn't have expected. People use your software in ways you wouldn't have predicted. Simple communication protocols suddenly become victims of race conditions.


Here's the graphic transcribed as text for non-English speakers.

Software Engineering: Requirements, Modifiability, Design Patterns, Usability, Safety, Scalability, Portability, Team Process, Maintainability, Estimation, Testability, Architecture Styles.

Computer Science: Computability, Formal Specification, Correctness Proofs, Network Analysis, OS Paging/Scheduling, Queueing Theory, Language Syntax/Semantics, Automatic Programming, Complexity, Algorithms, Cryptography, Compilers.

In my opinion, some of those could be on the other side of the line (estimation could be CS, language syntax/semantics and network analysis could be SE). But I agree with the general division.

I studied Electronic Systems Engineering, but somehow always found jobs in software companies. One problem I struggle with is the division between DRY (Don't Repeat Yourself) and WET (Write Everything Twice) coding styles.

Most programmers hate it when code is repeated. They prefer to spend days trying to integrate external libraries instead of just copying the necessary functions into the main branch. There are good reasons for this (benefiting from new features when the library gets updated), but there are also risks (the code breaking when the library gets updated).

Software Engineering priorities include Safety, Portability, Modifiability, and Testability. I interpret that as a WET programming style. "If you want it done well, do it yourself." There's no arguing about responsibility then - the code is mine, and I should fix it if it breaks.


I don't think you understand DRY. It's a concern within the code you write rather than without. Whether you choose to freeze your dependencies is an entirely different concern.

Say, for example, you have a complicated condition you test for frequently within your code. DRY is when you decide to extract that condition into a testable function you can rely on everywhere in your code (e.g. `isLastThursdayOfMonth(date)`) You can extend this same DRY thinking to all the other abstraction tools (e.g. types/classes) you have as an engineer too. I'm sure you'd agree that it would be an enormous liability and maintainability nightmare to rewrite the logic for that function everywhere. God forbid you're ever asked to change your littered logic to the equivalent of `isLastWeekendOfMonth(date)`.
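A minimal sketch of that extraction (the helper is the hypothetical one from my example above, in Python rather than whatever your codebase uses):

    import calendar
    from datetime import date

    def is_last_thursday_of_month(d):
        # The complicated condition lives in exactly one place.
        if d.weekday() != calendar.THURSDAY:
            return False
        days_in_month = calendar.monthrange(d.year, d.month)[1]
        return days_in_month - d.day < 7  # no later Thursday fits in this month

    # Callers rely on the single authoritative definition...
    if is_last_thursday_of_month(date(2023, 6, 29)):
        print("run the end-of-month job")
    # ...so changing the rule (say, to "last weekend of month") touches one function,
    # not every call site that re-derived the logic inline.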


> Software Engineering priorities include Safety, Portability, Modifiability, and Testability. I interpret that as a WET programming style. "If you want it done well, do it yourself."

None of those demand “write everything yourself”, only setting the same criteria for external code you integrate as you would have for code you write yourself.


> but there are also risks (the code breaking when the library gets updated).

This is the entire point of Semantic Versioning: to communicate breaking changes through the version number, and to build tooling to programmatically avoid breaking dependent code.

(No, it isn't generally perfect: it does require that a human realize what the API is and that a given change is breaking it. If we had some programmatic language for specifying the API… type systems start this, but tend not to capture everything¹)

¹I suspect there are some formal analysis folks who know more than I do here, screaming that there is a better way. I work in Python day-to-day, so generally, it's all on the human.
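A rough sketch of the contract SemVer is supposed to give dependents (hypothetical helper, not any particular package manager's logic; Python):

    def is_breaking_upgrade(installed, candidate):
        # Per SemVer, a major-version bump signals incompatible API changes.
        # Pre-1.0 (major == 0) is conventionally treated as "anything may break".
        installed_major = int(installed.split(".")[0])
        candidate_major = int(candidate.split(".")[0])
        return candidate_major != installed_major or installed_major == 0

    print(is_breaking_upgrade("2.3.1", "2.9.0"))  # False: minor bump, should be safe to take
    print(is_breaking_upgrade("2.3.1", "3.0.0"))  # True: major bump, expect breaking changes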


See this talk [0] about this addition to Clojure called core.spec [1]. I read the paper first, and it isn't until the end of the presentation that I understood how it was related to the feature at all. Basically, core.spec is a kind of formal verification tool designed to deal with the growth of software. It is not a type system, though it resembles one in some ways. The objective is to support comparisons between the signatures of a function at two different points and say, are these compatible? Is this a breaking change? You have to design for growth: make no assumptions about unspecified keys and always assume that later versions may return more than they used to. And so on.

If you're a fan of semver, be warned.

[0] https://www.youtube.com/watch?v=oyLBGkS5ICk

    https://news.ycombinator.com/item?id=13085952
[1] https://clojure.org/about/spec


Elm has automatic semantic versioning based on the type system:

https://github.com/elm-lang/elm-package/blob/master/README.m...


And freezing / pinning dependencies (and their dependencies...) to avoid deploying a version of a library you haven't already tested first.


> Computer Science: Computability, Formal Specification, Correctness Proofs, Network Analysis, OS Paging/Scheduling, Queueing Theory, Language Syntax/Semantics, Automatic Programming, Complexity, Algorithms, Cryptography, Compilers.

It's interesting/funny that when talking about CS, or more academic points of view on software development, the terms "Formal Specification" and "Correctness" are often mentioned, yet most CS students/labs still use languages that are really badly suited for this job, such as dynamically typed languages like Python.


There are various fields in computer science; nobody claims formal specification and correctness for an NLP/ML library or a graphics library. Formal verification is orthogonal to other fields.


Are languages like C & C++ that much better?


Neither are "better". Its all about tradeoffs.

I hate large python code bases with no strict types specified anywhere. It's a nightmare to maintain that code.


Of course not, but Haskell/F#/OCaml/F*/Rust are.


No, but languages like Haskell, OCaml, F#, Idris, PureScript, Elm, etc. definitely are, when talking about correctness.


How does copying the english words from the image to english words as text help non-English speakers?


Makes it easier to look up (copy/paste) and/or use dictionary tools.


They can copy and paste them into google translate?


Software Engineering looks like Systems Engineering with code applied.


That's a wonderful description, thank you! Yes. Software Engineering is applying Systems Engineering principles to the field of programming. It's a pretty good field to work in, because the biggest risk I face is only data loss, unlike other engineering disciplines where the stakes are much higher.


Software Engineering definitely has jobs/applications that can cause catastrophic loss if done incorrectly. For example, software in commercial airlines.


WET is "we enjoy typing" where I'm from


It's a naivety of early programming that DRY is the chief principle of software engineering. Try understanding the underlying patterns and applying the right design patterns for the job, and you will realize that DRY without deeper thought is just shuffling the problem around, like a kid throwing his mess into the closet.


It's a misunderstanding of DRY that leads to most of the nonsense. The original formulation, in the Pragmatic Programmer, was: "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system."

This very much requires "understanding underlying patterns" - what knowledge does your program encode? How can that be broken apart and localized?

Obsession with repetition of surface syntax isn't DRY and isn't useful. It goes wrong in two ways. First, knowledge can be repeated without repetition of syntax. If I'm telling my HTML "there is a button here", and my CSS "there is a button here", and my javascript "there is a button here", those three statements may look nothing alike but that's not DRY.

Second, if two pieces of code happen to look identical, but each is the way they are for different reasons, they encode different pieces of knowledge and collapsing them is not DRYer. As I've said before, in that case you're not improving your code, you're compressing it. I like to call that kind of overzealous misapplication of DRY "Huffman Coding".
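A contrived sketch of that second failure mode (hypothetical names, in Python):

    # These two constants happen to share a value but encode unrelated knowledge;
    # merging them into one "DRY" constant would couple a security policy to a UI rule.
    MAX_LOGIN_ATTEMPTS = 3   # policy: lock the account after three failed logins
    THUMBNAILS_PER_ROW = 3   # layout: the gallery grid is three columns wide

    def should_lock_account(failed_attempts):
        return failed_attempts >= MAX_LOGIN_ATTEMPTS

    def rows_needed(thumbnail_count):
        return -(-thumbnail_count // THUMBNAILS_PER_ROW)  # ceiling division

    # If design moves to a four-column grid, only THUMBNAILS_PER_ROW changes;
    # collapsing the two "identical" threes would have compressed the code, not DRYed it.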


I've seen similar articles to this one, both in print and on web sites. I used to explain it to people as the difference between 'coders' and 'engineers' but I think my own hubris at having a degree got in the way of my thinking on it.

Over the decades I've met a bunch of people who program computers for a living, and there is clearly a spectrum: on one end is the person who spends the weekend benchmarking different sort algorithms under different conditions for the fun of it, and on the other is the guy who left the office at 5PM once an integration test passed on a piece of code he pasted in from Stack Overflow and it was deemed to have no regressions. Many disciplines have such a wide range, from chefs who spend their weekends trying different flavors to cooks who take frozen patties out, reheat them, and serve. Painters who throw human emotion into a painting, and painters who lay down a yellow line on a road according to a template for $20/hr.

It seems to me that most, if not all, of the 'theory' stuff in computer science is just math of one form or another. This is not unlike how all the 'theory' stuff in electrical engineering is just physics. You can do the tasks without the theory, but you rarely invent new ways of doing things without that understanding.

But just like carpenters and architects there is a tremendous amount of depth in the crafting of things. That brilliance should be respected, college trained or not, so trying to 'split' the pool doesn't lead to any good insights about what being a good engineer is all about.


You seem to be labeling overworking as "good" and leaving on time as "bad"... Here's an alternative perspective - I've seen underperformers who try to make up for it by staying late and even giving their weekends to their employers; on the other hand, I've seen brilliant engineers who do the job and leave the office at 5PM, spending personal time on recreation, including learning other tech/stuff not relevant to their employers...


If I gave that impression I apologize it was not my intent. I don't attach any judgement to either hours invested or method of code construction. I was just sharing my observation that it is a multi-variable spectrum. That leads directly to your exact observation. Very little correlation between hours spent 'at work' and the quality of code produced.

When I started managing programmers I realized the best thing I could do was to manage people by agreed upon deliverables for their capabilities, quality, and maintainability. And if it took them 10 hrs to do it or 70hrs didn't matter. As long as I understood, and they understood, how long it would take them to do something we could manage to that schedule.


That's because you just fixated on the 5PM and not the rest of the words in each sentence.


I don't think so, since the examples mention overworking towards the current assignment. Anyway, the 5PM stigma exists and there is no harm in reiterating this topic.


>I used to explain it to people as the difference between 'coders' and 'engineers'

This isn't going to be popular, but it's true.

Coders are what people called themselves before business started making the decisions about how to write software.

Engineers are what people called themselves after business started making the decisions about how to write software.

Guess what? People who write software aren't engineers, they are programmers. You have crap programmers and you have exceptional programmers, but they are paid to write programs. "Software Engineer" is as valid as a "Sanitation Engineer."

Coder is slang for programmer, because they write "source code." It dates back to at least the 80s, probably earlier. It is also an acceptable term for a programmer.

If you have to make up a fancy term, call yourselves software developer or software designer. If you want to be called an engineer, go to engineering school.

https://www.theatlantic.com/technology/archive/2015/11/progr...


The terminology is shifting though, and has been for some time now. While I agree that CS/SE is not "engineering", it's the common term across the top companies, with words like "developer" now fading in use.

For me, I have the opposite reaction to coder vs engineer.

For a coder, I think code monkey, someone who writes boilerplate MVC apps and can't handle algorithmic complexities, write code that others can understand, or consider lower levels of a system they are interacting with. Also synonymous with hacker, which I would bet also has a reversed connotation for those who see coder as a good thing.

When I hear engineer, it's synonymous with the jobs at the big name companies and implies the person thinks their job is complex enough to warrant the title, even if not deserved by the standards of other engineers.

Of course, all of this is semantics, including the debate over what is an "engineer". In the end, it's pretty meaningless in terms of impact. CS vs SE I think should be the focus, as yes, the two can be quite different, even if most CS degrees end up working in SE.


>The terminology is shifting though, and has been for some time now.

Yes, I agree. "Coder" and "Programmer" are grey beard terms.

>Also synonymous with hacker, which I would bet also has a reversed connotation for those who see coder as a good thing.

Yes; a hacker is someone who can bend the computer to their will, even when it isn't supposed to do it. That's one of the reasons people who crack software and infiltrate systems sometimes get that moniker.

>implies the person thinks their job is complex enough to warrant the title, even if not deserved by the standards of other engineers.

Ya, that's kinda the problem. It has an inferiority aura about it.

> In the end, it's pretty meaningless in terms of impact. CS vs SE I think should be the focus, as yes, the two can be quite different, even if most CS degrees end up working in SE.

It absolutely is meaningless, I guess that's what bugs me about it. CS is for creating and improving algorithms. John von Neumann discovered/created merge sort in 1945 (among other things). He is CS. "Engineering" is taking merge sort and using it for efficient joins in an RDBMS; applying science to an actual thing.


It's sad to me when people try to trivialize the profession of the people responsible for the information age like this. Why does the term 'software engineer' hurt your feelings so much when these are the people responsible for taking mathematical concepts and turning them into working marvels like the Internet, wifi, cellular data, networks, etc?


Don't be sad, it doesn't hurt my feelings, it's just silly to call yourself one. Why not call yourself a "Software Doctor?" You should be proud to call yourself a programmer, developer, coder or hacker; those are great skills to have. Impress people with your ability, not some made up title.

BTW, I've been a "Software Engineer," and "Software Architect" for 20+ years. Don't get me started on people who call themselves "Software Architects" that don't even code out their designs.


There is nothing special about "engineer" either. It's just a made up title.

Look at the origin of the word:

>Middle English (denoting a designer and constructor of fortifications and weapons; formerly also as ingineer ): in early use from Old French engigneor, from medieval Latin ingeniator, from ingeniare ‘contrive, devise,’ from Latin ingenium (see engine); in later use from French ingénieur or Italian ingegnere, also based on Latin ingenium, with the ending influenced by -eer.

Electrical engineers certainly aren't building fortifications and weapons, yet they have it in their title. Software controls the behavior of physical systems just as much as electrical engineer-designed circuits do.

The only people that get pissed off by the use of the word are people that think people who write software are intellectually beneath them.

>Why not call yourself a "Software Doctor?"

If you have a PhD, that would certainly be fine. Doctor is pretty well-defined throughout the history of the word.


Colleges have "Schools of Engineering," and they are typically 5-year degrees. Engineers also have requirements such as required professional certifications and ongoing learning requirements, similar to CPAs, MDs and JDs. You have a reasonable guarantee that the person who is an actual engineer knows what they are talking about. "Software Engineers" have none of that.

To become licensed, engineers must complete a four-year college degree, work under a Professional Engineer for at least four years, pass two intensive competency exams and earn a license from their state's licensure board. Then, to retain their licenses, PEs must continually maintain and improve their skills throughout their careers.

https://www.nspe.org/resources/licensure/what-pe

>The only people that get pissed off by the use of the word are people that think people who write software are intellectually beneath them.

Great theory presented as a fact, but I write software for a living, so I guess that theory just flew out the window.

>If you have a PhD, that would certainly be fine.

Just like calling yourself an engineer if you actually had a degree from a school of engineering would be fine too ...

Honestly, if software engineering were treated like real engineers with licensure, required degrees, etc, we'd make a whole lot more money. Companies like to call us engineers because they can blow flowers up our posterior in lieu of actually paying us for it. Personally, I'd rather make double than feel good about myself.


Your whole justification is just that some engineers have tried to guard the term by forming schools and associations. However, unlike in Canada (whose approach I think is stupid), in the US it's not legal to just co-opt a common, nebulous term and ban everyone else from using it.

Your argument as well as that Atlantic piece is just "they don't do the same things to what previous engineering fields did so it doesn't count." If that kind of stupid logic held, there would be no such thing as petroleum engineers or electrical engineers because neither of those were things for a long time.


I didn't study Computer Science in college. Not one single course. But I'm not stupid. I made straight A's in math through Calculus III. So a lot of these comments frustrate me. I've taught myself literally everything I know. I've read dozens of books. I practice coding obsessively--it's my passion. Do I "get shit done"? Yes, absolutely. Do I not care about the efficiency of my algorithms? No, I care deeply. I don't always know the "computer-sciency" term for things. But my goodness, get off your high horse and tell me what you want accomplished. Chances are I'll implement a solution that's just as efficient and arguably much better than most "engineers" can. And no, I'm not going to be obsolete at age 40. By the time I reach age 40, PhDs will be coming to me for advice. Because I didn't study computer science in college. I'm studying it for life.


I think your type is the exception rather than the rule. I have a background in computer science and I love the beauty of correctness. But at work, I constantly find myself struggling to get my fellow developers (even those with CS degrees) to think in terms of mathematical correctness instead of just "how can I get this program to work today?" For example: can we define our methods to use the most general base type that is appropriate, or an interface that is stricter still, plus guard clauses, the combination of which creates a method that mathematically cannot fail except for upstream, downstream, or machine-level (e.g. out of memory) issues? If we do so consistently, then we have rock-solid methods that we can trust. But usually I find that my co-workers don't want to think in terms of these sorts of rock-solid contracts. Instead they just want to get the program to work today; to get their scrum story done. Inevitably they keep going back to these methods to fix a scenario they didn't anticipate. It's such a colossal, collective waste of time. I don't think everyone has to study computer science, but an appreciation that programming can be more mathematically precise than many programmers try to make it would benefit the industry. It would drastically reduce bug counts, improve the ability to reason about code, and increase developer productivity.
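As a rough illustration of what I mean (the names are invented, and Python stands in for whatever language your team uses): accept the most general type that still supports what you need, and put the guard clauses up front so the body can't be reached in an invalid state.

    from collections.abc import Iterable

    def mean_latency_ms(samples: Iterable[float]) -> float:
        # Guard clauses: past this point the arithmetic below cannot fail,
        # except for upstream, downstream, or machine-level issues.
        values = [float(s) for s in samples]
        if not values:
            raise ValueError("samples must be non-empty")
        if any(v < 0 for v in values):
            raise ValueError("latencies cannot be negative")
        return sum(values) / len(values)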


Why do you consider asking for advice as a bad thing or being inferior? Every PhD knows that if you don't know something, you ask somebody that does. Collaborating is a huge part of academia. If you think you are better than a PhD just because they ask you stuff, you are definitely wrong.

You seem to have a completely wrong understanding about why people do a PhD. They (okay, most) don't do it to attach a title to their name but because they are passionate about a specific field and want to expand their and other's knowledge about it.


Or because they are more interested in theoretical application rather than practical application. PhDs are where you explore questions that have unknown market value.

This is often interpreted as, "you can't hack it in the real world, so you hide in academia," and in some cases that is true. But really it has more to do with what problems you are interested in solving/exploring.


That works great for you. What happens when my objective is to develop a solution that requires 100s of developers, spans multiple years, costs billions, and has major liability concerns?


My recommendation would be to hire people who are smart, responsible, and passionate. The language or framework can be taught. That's what the FAA does. Their initial assessment exam determines cognitive aptitude--not textbook knowledge. And air traffic controllers from all backgrounds are responsible for billions of dollars worth of aircraft and the lives of hundreds of thousands of people every single day.


You're confusing systems engineering with knowledge and skills


...and by the time you finish, it most probably will be obsolete :)

Jokes aside, software engineering discipline of its own would not give you skills needed to accomplish that.


I'm convinced that the only useful definition of a Software Engineer is "someone who has 'Software Engineer' in their job title". Most other Engineering disciplines are far more rigorously defined. That said, observing a disconnect between theory and application is hardly novel or unique to software disciplines.


> all computer hardware is essentially equivalent.

This is quite inaccurate. Hardware directly influences software. "if" statements, functions, and threads didn't exist at one time, and all require explicit hardware support. I believe that as we come up with different abstract constructs at the hardware level, we'll influence the possible software that can be written.


Sometimes true (e.g. the PDP-11 instruction set did influence C), but software can also influence hardware.

Many early computers had very rudimentary subroutine call mechanisms (e.g. the B-line of the Elliott 803), but this didn't prevent programmers from using functions which returned values, sometimes recursively.

Burroughs mainframes were designed to run Algol 60 (with a few additional instructions for use by COBOL programs), and Lisp Machines were designed to run Lisp. In these cases, the influence of the languages extended to the entire instruction set. This is a better approach, as it's easier to experiment with language design than it is with hardware design.


> software can also influence hardware

This happens more often than you'd think. Intel (and later, AMD) added AES primitives to their instruction set to speed up encryption. VT-x (and the AMD equivalent) were both designed to improve the performance of virtualization. Outside of the realm of CPUs, the use of FPGAs -> ASICs for accelerating bitcoin hashing certainly wouldn't have existed if not for the software. Hardware support for CUDA / OpenCL accelerated existing parallel workloads.

> This is a better approach, as it's easier to experiment with language design than it is with hardware design.

FPGAs certainly lower the barrier to experimenting with hardware design, although yes, it's probably still higher than language modifications.


There are many abstract concepts that need to be realized before we can usefully call a particular device a 'computer'. Many, but certainly not all, of these, are gathered up into things like Turing Completeness, Von Neumann Architecture, etc. On that (admittedly mostly theoretical) scale, it is meaningful to discuss computers as a broad class having certain characteristics. That's what allows us to reason effectively about things like efficiency and correctness in computer algorithms. They even allow us to meaningfully compare digital, analog, mechanical and quantum computers, despite radical differences in the physical hardware. If the object you're showing me doesn't support conditional behaviour ('if' statements), it's going to be pretty hard to convince me to discuss it as though it were a computer.


As an example of different hardware, there was a Russian engineer who built a couple of ternary computers in the 70s. And of course there have been analog computers.

Quantum computers would certainly not be considered "essentially equivalent".


Software engineering is where the rubber hits the road in terms of requirements definition, creating a solid design, fitting stuff into an existing legacy environment (SAP anyone? Java EE?), iterating prototypes with stakeholders... and usually in large corporations. It was out of many years of budget overruns in defense procurement that software engineering cornerstones such as CMMI emerged.

To me, the essence of software engineering is that 20% is about building the 'good' solution itself, e.g. architecture, code, release / deployment, ... the remainder of the engineering is navigating / tolerating the inherent corporate messiness of politics, opinions, power, and everything else... engineering the solution is the easy part; engineering good requirements and quality is tough.


Let's revisit the definition of "engineering", in a simplified form:

    Science -> Engineering -> Technology
Engineering borrows scientific[1] knowledge to create[2] technology[3]

[1]: or empirical knowledge

[2]: or maintain or implement

[3]: or processes

The relationship between science and engineering has been clear for a while now, even before the appearance of software engineering.

There's a lot of science at work in existing software, so it would be inaccurate to say that software is "unscientific". However not many people get to work on those projects.

A vast majority of people can make a decent living working on user facing technologies built with existing technology. At that level appealing to non-technical stakeholders has much more weight than applying engineering rigor.

But that's not the reality for everyone.


The author has a slightly funny use of the word engineering. If you look at its use in a conventional field like making cars, then the science bit is the basic physics and chemistry of how gases expand, etc.; the engineering is designing the machinery so the brakes work, the engine produces enough power and doesn't break, and the like; and then human issues like whether the workers go on strike or the end users are idiots and crash are not engineering.

Similarly I'd say in software the engineering bit is making reliable systems that are fault tolerant and secure and so on and then the people bits like the user interface are something like design and psychology, not engineering.


Most developer jobs contain parts of both, with more time spent in software engineering.

Software development, app development, game development, web development are all probably 90+% software engineering and 1-10% computer science depending on the project. Specific projects may differ such as writing standard libraries, engines, data, standards, teaching, etc. In the end most of it is production and maintenance as part of shipping.


These are complicated terms. Harvard's CS was part of their Applied Math department. There are Applied Maths of scheduling programmatic Engineering outcomes for sure. Fred Brooks taught us all that.

I studied Russell, Godel, Tarski and Quine and then compiler and runtime logic (as a Philosophy major). Back then CS was mostly a realm of 3-Page proofs on alpha renaming or newfangled Skip List speed/space utility.

As an old VAX/Sun or 512K/DOS C programmer working in DC for decades around lots of TC, datacenter and transaction processing folks, an SE MUST have basic speed/space, set theoretic, programming by contract, data integrity and MTBF abstractions in their heads while they plan and develop. Both accuracy and performance against test and measure just matter for the business cases 24/7.

Content software developers patching together framework components on 2 day schedules for consumer Web bloatware rarely understand something like data integrity needs of billing system logic embedding in redundant switches failing over on rough schedules. Typing commands is not even Software Engineering.

Software Engineering is not an individual identity phenomenon. SE is how groups show responsibility for stakeholder outcomes unrelated to paychecks. First rule of SE is everyone on the team passes the bus test. Nobody is essential. Unless we seek luck, we can't improve what we don't measure. Learning how and what to measure takes real training and group method application. So many out there never know what they are missing.

Business competition minus lucky windfalls is largely based on COST ACCOUNTING. Successful operations will discover heat dissipation costs challenges. Basic CS speed/space, contract covenant assertions, data integrity and MTBF logic in Software Engineers translates very easily into understanding business innovation problems.


Rarely has asymptotic complexity mattered to my code. Usually the most important factor is modularization and readability. I spend more of my time reading or re-using code, and my time is more expensive than a computer's. Plus, highly optimized code can sometimes be unreadable and lead to bugs, which are also more costly.


> asymptotic complexity

If it hasn't mattered to you, it's probably because you are using libraries or apis which have solved for optimal performance.

In short, performance mattered a lot to your code. Only, you didn't slog long hours to make it so.

Back to the topic at hand, if you didn't spend time to understand why a particular module or library is part of your code base - be it for performance or maintainability or any other -ities - you're halfassing your job as a software engineer. Would a structural engineer ever claim with a straight face that they have never worried about the integrity of their struts? That's basically what you said with your claim.


Nah, that isn't what was claimed. What you describe is more like claiming an engineer is half assing it if he doesn't verify that a given strut (library) that has been analyzed to death (already verified to have characteristics meeting the requirement) in fact meets them.


The OP said they don't care/haven't had to care about performance. What you're saying is that they used a library which is known to be performant. If the OP knows it to be performant, they are misstating that they don't care/have never been asked to care about performance. If they truly don't know about the performance and picked a library at random, they are halfassing it and aren't doing their job as a software engineer.


Well, remember: don't start with performance in mind unless you have to or you already know the right way.

Take a Python library like requests. Who the hell will read the source code and run a profiler on that module when they're just a consumer of it? I don't care until I have to. Perhaps before shipping my production code I can run a profiler. If you are going that deep at the beginning, you are wasting your time instead of building an MVP and iterating. Library code out there is ever-changing. One version can be slower than another. You aren't doing meaningful work if you start with performance.

OP didn't say he/she is picking a random library out of the blue. I don't remember reading that "random" part.


This entire discussion is about what an SE does vs what a CS does. Item #1 for an SE is to pick a library to use instead of reinventing every wheel every time. OP says they don't care about performance, which is a naive, potentially stupid, approach to software engineering. Beyond this, I'm out of crayons.

Just because some hotshot (PG?) said to not optimize prematurely doesn't absolve you of your responsibility towards picking a library to use. Don't optimize early by all means but at least know why the library you picked might not be a great choice for problems you face in the real world.

Do you write your websites in C outputting raw HTML? I'm sure you don't. Clearly you made some sort of reasoned deduction about the tools at hand and went with one which got the job done. Why did you not pick C code spitting out HTML? No point premature optimizing your productivity, right?


The "hotshot" was Donald Knuth, and he was talking about getting clever before you're sure your program does what it's meant to do. The keyword is premature; analysis and optimization pretty much forms the bulk of his body of work, although he doesn't believe in sacrificing readability until you're out of other goats to kill.


You're right! @ Knuth. I knew that but it got overwritten by PG's point about scaling prematurely.

Like a good astrologer, I'm going to read into your comment to assume support for my larger point - know your tools.


Let's talk about goals.

If the project you are taking on is to optimize the performance of an existing codebase, then yes, you worry about performance, because that is your primary objective.

I do care about performance myself. I did look up which Python json implementation library out there is the best in terms of performance as well as whether the library is actively used and developed.

But that's only because I knew from the beginning that json marshal and unmarshal are expensive. However, I stopped worrying and now only use the native json module that comes with the standard library, because I see no gain for my projects. Perhaps that matters if I am Google. A 10ms gain was not even a problem for me in my projects.
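That kind of check doesn't have to go deep, either. Something like the following (a rough sketch; swap in whatever third-party library you're considering) is usually enough to decide whether the stdlib json module is fine for your workload:

    import json
    import timeit
    payload = {"user": 123, "items": list(range(1000)), "ok": True}
    stdlib_secs = timeit.timeit(lambda: json.dumps(payload), number=10_000)
    print(f"stdlib json.dumps: {stdlib_secs:.3f}s for 10k calls")
    # Time the same call with ujson/orjson if installed; if the difference works
    # out to a fraction of a millisecond per request, the stdlib is probably fine.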

Anyway, back to the argument. Let's take architecture and structural engineering. Building a skyscraper is a complex task. Everyone wants the next skyscraper to look different and taller. But no architects or structural engineers I am acquainted with would start with the question "how do I reduce the cost? How do I make my building taller while maintaining resistance to a 9.0 earthquake?"

Those are concerns, but they will use whatever knowledge they already have to draw a model. Then they run simulations and go over challenges and problems they need to resolve to meet the requirements. No one starts the actual project by looking at how much they can save.

The only people in computer science and software engineering who would always bear performance in mind from the very first step are computer scientists. No one designs an algorithm or a new data structure or a novel method to build origami unless the purpose is to find a better complexity (space and run time). But I want to emphasize that software engineers are computer scientists if they want to claim to be one. A formal degree is not a requirement to be a computer scientist. A good software engineer does take performance into account, but not until some MVP working code exists. One might implement the solution using quicksort, knowing it is easy and efficient enough, until they recognize quicksort is not fast enough, at which point another sorting method may be used or developed.


Productivity is the thing that is worth optimizing.

Seriously, stop worrying about performance. Don't consider it when picking libraries. You'll write better programs as a result. I know how implausible it is, but it happens to be reality.


Tell me once more how I should be approaching my profession. I'm all ears.


I have seen the impact of linear or exponential complexity quite a bit in real world code so I think it's good to be aware of it in case you are having performance problems.
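Agreed, and the gap is rarely subtle once you hit it. A toy sketch of the exponential-versus-linear case:

    from functools import lru_cache
    def fib_naive(n: int) -> int:
        # Exponential: the same subproblems are recomputed roughly 2^n times.
        return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)
    @lru_cache(maxsize=None)
    def fib_memo(n: int) -> int:
        # Linear in n once each subproblem is cached.
        return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
    # fib_naive(35) takes seconds; fib_memo(35) returns instantly.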


In Germany we have Informatik, which was treated as CS & SE for a long time.

Lately there are sprouting more and more SE degrees.

On the other hand, we also have universities of applied science, where Informatik is often more like SE.


In Peru we also have "Informatics Engineering", which is my degree. It is a mixture of CS courses (and a lot of math courses as well), plus SE and also EE courses (i.e. digital electronics' fundamentals / computer ALU/CPU design).


Scientists use the scientific method to make testable explanations and predictions about the world. A scientist asks a question and develops an experiment, or set of experiments, to answer that question. Engineers use the engineering design process to create solutions to problems.


We also wrote an article about this about a year ago: https://code.berlin/en/blog/computer-science-software-engine...


I neither agree nor disagree with the article. I think it conflates a lot of stuff.

But look, what the math and science sides of the room throw at us definitely informs the engineering. In every other engineering discipline, from architecture to ditch digging, there is a feeder system from a variety of mathematical and scientific disciplines. While many other engineering disciplines are well established, they are not immune to this and in general don't begrudge it.

Doctors are required to keep up on the state of treatment. Architects need to keep up on materials science AND new mathematical modeling techniques and tools. Car designers care about new discoveries in lighting, battery and materials technology.

Here's a good example of the kind of stuff we all should be on the hook for. I've tried to push this paper up to the front page a few times now because it's roughly the same as if someone walked up and calmly announced they'd worked out how to compress space to beat the speed of light:

http://www.diku.dk/hjemmesider/ansatte/henglein/papers/hengl...

Folks are generalizing linear-time sorting to things we previously thought were only amenable to pairwise comparison sorts, short of a custom programming model and tons of thought. Not anymore! And then a famous engineer-and-also-mathematician made an amazingly fast library to go with it (https://hackage.haskell.org/package/discrimination).

We're seeing multiple revolutions in our industry made of... well... OLD components! While deep learning is starting to break untrodden ground now, a lot of the progress is about having big hardware budgets, lots of great training data, and a bunch of old techniques. The deep learning on mobile tricks? Why, that's an old numerical technique for making linear algebra cheaper by reversing the order in which we walk the chain rule. O(n) general sort is arguably bigger if we can get it into everyone's hands, because of how it changes the game for bulk data processing and search (suddenly EVERY step is a reduce step!)

We've similarly been sitting on functional programming techniques that absolutely blow anything the OO world has out of the water, but require an up-front investment of time and practice with a completely alternate style of programming. But unlike our fast-and-loose metaprogramming, reflection and monkey patching tricks in industry these techniques come with theorems and programmatic analysis techniques that make code faster for free, not slower.

Even if your day job is, like mine, full of a lot of humdrum plug-this-into-that work, we can benefit from modern techniques to build absolutely rock solid systems with good performance and high reliability. We could be directly incorporating simple concepts like CRDTs to make our systems less prone to error.
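For instance, a grow-only counter CRDT fits in a few lines (a toy sketch, not a production implementation), and the merge is commutative, associative, and idempotent by construction, so replicas can sync in any order after a partition:

    # Toy G-Counter CRDT: each node increments only its own slot;
    # merging two replicas takes the element-wise max.
    def increment(counter: dict, node_id: str) -> dict:
        return {**counter, node_id: counter.get(node_id, 0) + 1}
    def merge(a: dict, b: dict) -> dict:
        return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}
    def value(counter: dict) -> int:
        return sum(counter.values())
    # value(merge(increment({}, "a"), increment({}, "b"))) == 2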

It's our job (and arguably it's the hardest job of the field) to dive into the world of pure research, understand it, and bring what's necessary out to the world of modern software. That means more than just tapping away at CSS files, or wailing about NPM security, or shrugging and saying, "Maybe Golang's model is the best we can hope for from modern programmers."


What is software Engineering !? Making an excel sheet ? Making a web site ? Writing SQL ? Using programming language X, Y, Z ?


Software Programming != Computer Science

Software Programming and Science != Engineering

While we're drawing distinctions, stop calling yourself an engineer unless you're legally licensed as one. Programming may share similarities with engineering, but it lacks the professional accreditation and liability.


Computer science is neither about computers nor is a science :).


People are obviously not getting this popular joke in academia. Open a book such as Introduction to Algorithms by CLRS and you will see it's all about creating algorithms in pseudo-code, proving their correctness, evaluating runtimes, etc. The authors not only pass on "systems engineering" but even on producing algorithms in an actual language that can be compiled on an actual computer. Don't get me wrong, I love the book, have gone through every single page and the vast majority of exercises twice, and truly enjoyed it. It is then that it hits you what computer science is really about.

The folks in fields like mathematics or physics didn't use to consider "Computer Science" as "real science". As a fun fact, there were no journals on computer science for quite a long time. Researchers like Dijkstra would identify themselves as "Mathematician" and publish their now very well known algorithms in mathematical literature :).


It's about the math of computation then, which can be carried out by any arbitrary system that we humans deem as carrying out symbol manipulation.


This is not a particularly new observation.

My half-assed analogy:

CS is to SE as Physics is to Mechanical Engineering.

In both cases, it's unwise to trust one category with screwdrivers...


I've often thought there should be a third position, a 'software designer': someone whose job it is to translate customer requirements into a design which the engineer can build.

In Engineering you have architects/industrial designers etc. They work out the product specifications and then ask the engineers to deliver an efficient workable solution that fits those specifications.

Sometimes, at least from my view, it feels like software engineering reverses the two roles, i.e. the engineer supplies the API and the customer works around its design. Think about something classic like Unix: it feels like in some cases engineering has constrained the design rather than design constraining the engineering. This is not necessarily a bad thing, but it is different.


> In Engineering you have architects/industrial designers etc. They work out the product specifications and then ask the engineers to deliver an efficient workable solution that fits those specifications.

I've always thought programming would eventually go the way of mechanical engineering, with engineers doing the design and fabricators doing the manufacturing. The closest we've come so far in software, I think, is having one person do the architecture or write a spec and others implement the code. Not quite the same, but I wonder if we'll get there eventually.


I quite like your analogy. If software engineer is like a mechanical engineer, what would a programmer be like, a mechanic?


I love your analogy! Then there's the Howard Wolowitz's of the world who try to throw a pitch at a ball game using the Mars Rover. Not quite an applied technical person and not quite a pure theorist like Sheldon.


Sure. Engineering is an applied science. So, these cannot be equal.


100% agree.

So unless you spend all day writing compilers from scratch or calculating Pascal's Triangle, please stop with the ridiculous CS questions in interviews.

Software Engineering is more of a trade, and requires vocational knowledge and experience. A mountain of theory may not always be required to Get Shit Done.


I disagree.

* I routinely find people using the wrong data structure, when there exists a better one, with better O() time/space.

* I find people tend to not understand BTrees, particularly when there are two attributes being indexed. Given an index on (a, b), I find it a common misconception that the BTree can efficiently answer `$a_min < a < $a_max AND $b_min < b < $b_max`. (I.e., people do not understand that the tree cannot make use of the second < condition, and must scan potentially many more rows than they intend. See the sketch at the end of this comment.)

* Graph theory. git uses it. Any sort of dependency tree uses it.

That said, I acknowledge that software engineering does require a lot of non-theoretical knowledge, which is why I ask both types of questions in an interview.
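Here's the (a, b) index point as a rough sketch, using a sorted list of tuples to stand in for the index's ordering (toy code, not a real BTree):

    import bisect
    # A composite index on (a, b) keeps its entries sorted by a, then by b.
    index = sorted((a, b) for a in range(100) for b in range(100))
    def range_scan(a_min, a_max, b_min, b_max):
        # The index can seek straight to the entries with a_min < a < a_max...
        lo = bisect.bisect_right(index, (a_min, float("inf")))
        hi = bisect.bisect_left(index, (a_max,))
        # ...but inside that range every entry must be examined to apply the
        # b condition; the sort order gives no help with the second range.
        return [(a, b) for a, b in index[lo:hi] if b_min < b < b_max]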


That's where I disagree. For 90% of the code it wouldn't ever matter if you're using a list with O(n) or an array with O(1). And in the 10% it can usually be remedied afterwards. Usually the reviewer will point it out and it's a 10min fix.

On the other hand I've seen so much hand-optimized code that used the correct data structures to solve the wrong problem.

Yes, that is kind of unfair - you want people who grasp the problem and find a good solution. But I prefer the people who solve the problem at hand in a wrong way instead of solving the wrong problem in the best imaginable way.


> Usually the reviewer will point it out and it's a 10min fix.

And in order for the reviewer to recognize the situation at all, and then point it out, they would need knowledge of the underlying data structures, their performance, and as you mention, pragmatism to know when to use it.

(I'm presuming that everyone on your team shares the responsibility of code review, and you don't funnel it through a few people with knowledge of how things work. (I believe that robs those that don't of the opportunity to learn through review.))


You're absolutely right. But there's a difference between "spotting an error" and "being able to construct $datastructure_I_use_once_a_year" in an interview


Yeah, but data structures and binary trees are about as much as you really need to get. There's a big difference between selecting the right data structure and re-implementing Dijkstra's work on a whiteboard.


Counterpoint - I routinely see people ignore the actual real life cost of pointer chasing because O(n) (you don't need a linked list if you only have 10 small elements).


I am assuming you are talking about database indexes. The documentation for the database that I have most used actually specified what the various supported types were good for. It made choosing the correct type of index much easier. The documentation may be sufficient for determining which index types will perform well for the expected queries, even if knowing detailed information about data structures is nice to know. Having documentation which describes the data structures well enough is really nice though, and should be encouraged a lot.


Does your job actually require knowledge of BTrees?


Yes. They're the typical data structure backing most relational database indexes. Knowing when they will perform well (and more often, when they won't) follows directly from knowing their structure. (The example of trying to do a range search on two dimensions in the post above is an example that doesn't perform as well as — again, I find — people naively expect it to.)

Have I ever implemented one? Not yet. Do I ask for a BTree implementation or exact, low-level understanding on an interview? No.


Yes! Our database blew up in our face because we misused some indexes and had wrong performance assumptions about them. A colleague had knowledge of BTrees (the underlying data structure of the database's indexes, iirc) and database internals. Optimised the whole thing. Queries are now 500x faster.


Computer engineer != programmer.

1. Computer engineer is someone who knows computer science. Acquires knowledge that survives when tools die. Hired by Google and other firms.

2. Programmer. Blue collar worker writing bean counting programs. Mainly writes customer software for order. Gets shit done. Becomes difficult to employ when he turns 40 if the shit he does is outdated.


Google and the majority of silicon valley grossly abuse the term software engineer (which is what you describe). There are very few software engineers at google, mostly just developers, many of whom are actually PhDs, which is also not an engineer.

Engineering is about process and responsibility. A professional engineer is often a protected term, and comes with a certain legal authority to approve designs as sound and correct. The purpose of this is to give people a reasonably reliable way to ensure that the people they go to for engineering designs meet a minimum level of professional knowledge. AFAIK in some cases the engineer can be held liable if their design fails causing death.

Can you imagine if developers had to take on some liability by legally and officially "approving" that a software design meets some criteria of being "free from error"? It's almost crazy to consider it, though you can imagine that such a thing would actually be pretty important for say, medical equipment software, or aircraft control software. Unfortunately this is pretty far from the reality of software engineering.

Unlike other forms of engineering, there is no such thing as a "professional software engineer", though you can get a degree in the _topic_ of software engineering.

TLDR there are no "software engineers" in the same sense that you have electrical or mechanical engineers. The protection on the term engineer in software is commonly contested, though perhaps less so in the US than elsewhere.


I agree with you that "Software Engineer" is often used pretentiously. Supposedly a lot of North American companies advertise "Software Engineers" because the job category "engineer" is mentioned in the NAFTA treaty (i.e. it's easier to hire people that way).


Funny, in Canada they appear to do the exact opposite because engineering jobs are regulated. So instead of "software engineers" it's more common to find "software developers".


Most electrical and mechanical engineers aren't Professional Engineers either. A Professional Engineer's sign off is only needed on safety critical projects, so most don't bother with it because they don't work on safety critical projects.


A "computer engineer" is someone who designs microprocessors and physical computational hardware - not software. (And for the laity, it is not someone who fixes your broken desktop PC either).

Even if you meant SE instead of CE, I feel your distinction is arbitrary and almost classist. While it's true that the top-notch folks hired by Google and Microsoft could write "bean counting programs", it's fallacious to assume the converse - major accounting and business software firms are just as picky when it comes to hiring. Similarly, I know plenty of small startups writing exotic software that are able to ship without needing to hire everyone from MIT and Stanford. I also know plenty of very intelligent and capable minds going to waste at companies like Google, MSFT and Facebook, working on projects they dislike or on low-impact internal systems, while their contemporaries who went to a coding boot camp got picked up by Snapitterbook and became hot stuff despite never having read The Mythical Man-Month.


Computer Engineer != computer scientist.

1. Computer engineer works at and around the boundary between electrical engineering and computer science. Hardware, firmware, drivers, and other low-level systems that interact closely with the physical world. Hired by Intel and other firms.

2. Computer scientist knows the study of computation. Computer science tends to abstract the hardware away and consider ideal systems.

There is of course some overlap. Computer engineering is all about the overlap between CS and EE, so some CS people will work in some of the same areas as CEs. Likewise with EEs. And the terms are fuzzy, and may have different definitions to different people, but the distinction I made seems to be present in most college degree programs I've seen.


Assuming you meant "Software Engineer" not "Computer Engineer".

>> Acquires knowledge that survives when tools die

I would actually agree with this, the kicker is though this "acquired knowledge" is not CS knowledge - it's understanding of how to get shit done in the "real world" and is not something you can easily acquire by reading a CS book.


No, programmers get paid 6 figure salaries to work at top tech companies in the bay area.

Also, the "computer science" stuff that is asked in Google-style interviews is vastly overstated in terms of difficulty.

The interview stuff can be learned by most any programmer, in a couple months of intensive self study, involving going through practice problems in Cracking the Code Interview. You don't have to be a genius to do that. It is mostly just repetition and pattern matching.

I went through Google's process recently, and every single algorithm question that I was asked, was a problem that I had studied/seen in advance, word for word.

It turns out that there is only a pool of a couple hundred (that's a high estimate, actually) algorithms questions that most interviewers will ask you, and if you've studied all X hundred of them, then you are good to go. No PhD or CS degree required.


> So unless you spend all day writing compilers from scratch or calculating Pascal's Triangle, please stop with the ridiculous CS questions in interviews.

> Software Engineering is more of a trade, and requires vocational knowledge and experience. A mountain of theory may not always be required to Get Shit Done.

First, interview questions are engineering problems that are either applicable to real-world systems or were inspired by them.

Second, interview questions are used to weed out candidates. At a certain point, there will be a few candidates who can answer the questions and articulate a sound engineering process. And that's who gets hired.


"Implement a recursive depth first traversal of a tree" isn't an engineering problem. It's a CS textbook problem. A fairly basic one, surely, but about as useful to a software engineering project in most cases as the ability to smelt and construct a screw is to an automobile engineer. And the software interviews GP complained about these days don't even start with something that trivial. Usually it's more absurd, and involves what are essentially little math tricks.


> but about as useful to a software engineering project in most cases as the ability to smelt and construct a screw is to an automobile engineer

I strongly disagree. Having a deeper understanding of how the data structures you use day-to-day work (whether you implemented them or not) is vital to picking the correct structure for any particular task. Not knowing the basics of how to implement a binary tree, or a singly or doubly linked list, is a massive red flag in terms of the competence of a candidate, in my humble opinion.

I think absolutely there are classes of data structures that are obscure or leftfield that shouldn't be asked of a candidate, but there's a hell of a lot that a professional should know, regardless of whether they're using a library that does it for them.
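To be clear about the bar I have in mind, it's roughly this much code (a minimal sketch of a singly linked list):

    class Node:
        def __init__(self, value, next_node=None):
            self.value = value
            self.next_node = next_node
    def prepend(head, value):
        # O(1) insertion at the front is the whole point of the structure.
        return Node(value, head)
    def to_list(head):
        out = []
        while head is not None:
            out.append(head.value)
            head = head.next_node
        return out
    # to_list(prepend(prepend(None, 2), 1)) == [1, 2]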


> Not knowing the basics of how to implement a binary tree, a singly or doubly linked list is a massive red flag in terms of the competence of a candidate in my humble opinion.

I agree.

On the flip side -- upon successfully implementing a linked list, a clever candidate might let it slip that he/she designed the interface to be performant with a particular use case in mind, and that they weren't sure if the interviewer had the same expectation for that use case. If the interviewer consequently signaled agreement with the premise that linked lists have performance benefits in certain cases, it's a good indicator that the people in charge probably aren't thinking critically about the software they are producing.


Is having a deeper-than-big-O understanding of data structures truly necessary?

A mechanical engineer will have an understanding of the characteristics of different screws (tensile strength, corrosion resistance, material compatibility, etc.) enabling them to pick the right one for the task at hand. Understanding the crystalline structure and having the ability to smelt one yourself is likely out of scope and probably predefined by domain experts.

Why can’t software work like this?


By the time you've memorized all the cases where a binary tree does and doesn't make sense, you might as well have just learned how they work.


Exactly. I'd be very wary of any professional that hadn't had the intellectual curiosity to at least understand how they work, never mind actually try to implement one.


Where are you working that you don't have tree structures, or can't recite the generic three lines of pseudocode to traverse one?

I have used trees in almost every moderately complicated program I've ever written, both for fun and at the office.


Sounds a bit like deriving the equations of kinematics over and over again to me. To each his or her own, but that just sounds incredibly boring to me.


I mean, it's literally a for loop:

    def visit(node):
        # depth-first: handle the node, then recurse into each child in order
        for child in node.children:
            visit(child)
I don't hire people in my current position, but I certainly think anyone baffled by that would be a giant red flag. It's simpler than FizzBuzz.


I mean, I keep writing for-loops even though I've written many of them before.


Found the imperative programmer.

I'm only half-joking: how often do you need an actual for loop instead of the more abstract concepts of mapping/reducing collections?

I think you made his analogy stronger, not weaker.

As a programmer I am (should be?) concerned with the actual implications of which data structures I use, not with how to build them. It should be seamless.

You wouldn't judge an EE on his ability to build wires after you give them some copper and plastic. Sure, he could with tools and a bit of time. It's just a waste of everyone's time and effort.


Some languages don't have first-class functions that you would pass into a map or reduce function, which makes writing loops unavoidable.

But even if you do use map/reduce all the time instead of loops, you're just saying the same thing I did in another way - unless you're doing it for the intellectual thrill of writing a call to map/reduce for the n+1th time.


I wouldn't ask this question but I don't think it's irrelevant.

It shows you that the person understands concepts like recursion, is familiar with data structures like trees, and understands how to express and work with those concepts in code.

Completing this problem should take only a few minutes and tells you whether a candidate has a basic degree of competence. When someone breezes through a problem like this, you learn something. If someone struggles with it, you also learn something. You can use that knowledge to calibrate subsequent questions and find the limits of their knowledge.

I would be skeptical about the effectiveness of software engineer candidate who isn't comfortable with trees or recursion or algorithms like search. It's a weed-out question like FizzBuzz.

I wouldn't ask this question because I think there are better questions available that allows candidates to demonstrate mastery of concepts like these and also have depth so you can go into substantially more detail when the candidate performs well. (And questions that permit multiple solutions, etc.)


Then you want to hire people who are good at interviewing; not necessarily people who are good at their job.


I have learned programming on the job, not in school. I think I have a pretty good handle on things like recursion or traversing or implementing structures. But I don't speak the lingo so I don't know what "Implement a recursive depth first traversal of a tree" means exactly. I have been doing this for 25 years now and I don't think I have ever heard anybody formulating a problem that way.


What do you mean, you don't understand the "lingo"? Do you not understand English? Because that's an easily understood English sentence, and you just claimed a "pretty good handle on things like recursion or traversing or implementing structures".

What's the difficulty with questions like that? But honestly you are lucky if you get asked these questions. Usually it is more about some absurd dynamic programming questions.


I suppose the debate is really between those who want to know and those who don't want to or feel that they don't need to.


That is a perfect example of why you need CS in engineering! Do a right-click → View Source on this page: what do you see? Look at the comment structure on this page: what do you see? Nobody could work with either of those without using recursive depth-first traversal. And once someone understands recursion, the implementation is absolutely trivial and obvious. Why would you not use that? The same goes for all the other CS textbook problems; they are specialized screwdrivers that are useless until they aren't.
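A small sketch of what I mean, walking a nested comment structure like the one on this page (the data shape here is invented):

    comments = [
        {"text": "top-level comment", "replies": [
            {"text": "a reply", "replies": [
                {"text": "a reply to the reply", "replies": []},
            ]},
        ]},
    ]
    def walk(comment, depth=0):
        # Depth-first: print a comment, then recurse into its replies,
        # which is exactly the order the page renders them in.
        print("  " * depth + comment["text"])
        for reply in comment["replies"]:
            walk(reply, depth + 1)
    for c in comments:
        walk(c)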


It's hazing.


I would never hire someone who would rather spend 20 minutes thumbing together a solution that may work instead of someone who would rather read available documentation and literature to find either...

    1. A trusted library that implements a complex feature
    2. A way to abstract away the need for a complex solution
    3. Feel the need to rush an implementation of a mission critical piece of code
I'd say software engineering is more about organization, abstraction, and simplification of a problem than it is about writing complex data structure implementations or complex algorithms. When you do need to implement a complex data structure or algorithm, I think it's much wiser to survey available (and current) literature and implement the algorithm after thinking about the problem for a day or two than it is to attempt to implement it yourself in front of 3 people in a stressful setting.

I'd expect no one I've ever worked with to be able to correctly meet any business requirement I have (full battery of tests, an attempt to avoid the more complex solution, documentation for the need and edge cases of an algorithm, real error messages, abstraction or library-extraction of this algorithm into a documented sub-project in our company's git server, etc.) in the format of a big-5-style interview question.

It boils down to this:

    1. Your question is so simple it is stupid to ask someone who is actually qualified
    2. Your question is so complicated that anyone who would feel semi-confident in having actually solved it in 40 minutes is someone too dangerous to keep around


> It boils down to this: 1. Your question is so simple it is stupid to ask someone who is actually qualified 2. Your question is so complicated that anyone who would feel semi-confident in having actually solved it in 40 minutes is someone too dangerous to keep around

Really? You don't think there's any possibility of a middle ground here?


I don't think there is anything representative of an employee's quality of work that can be done during a short interview that would have a strong correlate to productivity and quality. I think a real-world take-home problem followed by a meeting-style presentation of your work, your solution, and a Q&A about the implementation with a representative subset of your peers would be better as this would actually be representative of the work load at hand at the company.


the basis of the question itself invalidates the middle ground


bingo


> interview questions are engineering problems that are either applicable to real-world systems or were inspired by them.

My experience has shown me that in all cases except for two, this is false.


Agreed, a good interview question is also a springboard to a discussion of engineering practices related to the problem.


The only viable engineering answer in a work environment is to get a library that solves whatever problem is at hand.

At best a top candidate should be able to turn a whitepaper into working code. Knowing trivia rarely translates into solid, productive teams.

Very, very few companies are pushing the boundary of what's known.


It's not as simple as being "more of a trade." It's quite similar to the distinction between physics and, say, aerospace engineering in that regard. I'd never describe the latter as being a vocation or trade.


Agreed. Engineering is a "profession". You wouldn't call medicine or law "trades". In some jurisdictions (Canada for example) calling yourself a "software engineer" without having a professional engineering license is unlawful (although lots of people still do it and it's difficult for the regulatory agencies to enforce it at scale).


[flagged]


Personal attacks like this don't belong on Hacker News. We detached this subthread from https://news.ycombinator.com/item?id=14945322 and marked it off-topic.


Ha! Which one of your syllable-splayed psych terms defines a person who is tired of being stereotyped by academia?


I'd say neither. The parent sounds like he knows his stuff without downplaying his knowledge (inferiority complex), and isn't pretentious enough to call himself a software engineer (superiority complex).


I don't think there was any need for that kind of response.


Yeah, no shit...

Truck Driver ≠ Road Planner



