The Physics of Software (physicsofsoftware.com)
86 points by tronotonante 10 months ago | 37 comments

I agree this is an abysmal article.

A much much more rigorous proposal was put forward by an American mathematician several years ago regarding the geometry of trajectories of computer programs, which holds considerably more promise than whatever flimflam this author was going on about:


Thank you, this looks quite interesting.

Haven't read anything ever, including the subject of this thread, and including any number of language-advocacy opinion pieces, and including software development methodology opinion pieces, that in any way compares with the empirical data-based work of Manny Lehman (R.I.P.).


His resulting model of software evolution explains many observations we engineers make about "feature creep," "technical debt," balance of effort between development and maintenance, etc.

I think his model also helps point the way to techniques we can use (e.g., DSLs in preference to OOP) to help improve how we build and maintain software.

These aren't laws of nature. It's just people being shit. The solution is fixing our culture so that people aren't shit.

Sorry to not understand exactly what you mean by those two statements. Would you please clarify?

I of course realize that people are not perfect (along any axis at all), but I think the importance of this work is highlighting:

(1) software must evolve in order to continue being useful (is this the "cultural" part of your comment?), and

(2) that there are some unavoidable limitations of human capability to achieve that (is this the "people being shit" part of your comment?), particularly in dealing with complexity. And that complexity is an increasing function of lines-of-code count.

So, the obvious conclusion to draw (for me, anyways) is that we should work to minimize SLOC by various means: DSLs and/or expressive, powerful languages.

(0) We need a culture that penalizes writing incorrect specifications and programs as harshly as possible.

(1) we perform vastly below our potential because we spend so much time working around our own and each other's bugs. People are shit not because they're dumb, but because they stick to the wrong attitude in the face of disastrous results.

Regarding (0): I am not sure what mechanisms for penalty would be better than those currently in play (e.g., disuse/cloning in the case of open source, bankruptcy in the case of closed source, etc.).

I believe Lehman formalizes the dichotomy you imply as bugs in the "model" (you say "specification"), and bugs in the model's implementation. I think the distinction is important in that the model/spec is an evolving/moving target, subject to evolutionary forces. And the number of bugs in the implementation of any given model's snapshot in time is an increasing function of SLOC, and updating the model can produce more/latent bugs in previously un-buggy code. (Which I take to mean we need to minimize lines of code for any given amount of functionality. Less to write, fewer lines containing bugs, less to modify, etc.)

Regarding (1): I agree that we perform vastly below our potential due to bugs, but also many other factors - though it might be hard to agree on what those contributory factors are, and their relative contributions.

Caveat: I’m a high school drop out with zero college and poor study habits. I’m a consultant and have a tendency to spout bullshit from time to time.

I’ve been striving to find a set of fundamental laws or first principles for software development for several years now. I don’t think we should confuse a need for first principles with physics itself. I see no benefit to finding parallels between each aspect of physics and some contrived relationship with software.

What I humbly believe is necessary is an understanding of the natural fundamental laws that come into play when making software. These would be evidenced by natural observable phenomena. I believe that most of these laws WILL NOT COME FROM COMPUTER SCIENCE. This is because making software is more than computer science.

As a brief introduction to my thesis let’s consider what things come into play when making software and their relationships to each other. Then let’s consider the areas of study which come into play for each relationship.

We have humans, computers, and software. There are fields of study for each relationship.

Humans <-> Software (HCI, semiotics)
Humans <-> other Humans (group psychology, language)
Humans <-> themselves (positive psychology, et al.)
Software <-> groups of humans (complex adaptive systems, economics, ?)
Software <-> computers (computer science, computer engineering)
Computers <-> other computers (networks, information theory)
Software <-> other Software (complex adaptive systems)

If we look at each of these fields what laws or axioms can we find that come into play when making Software? Can we reason logically from those principles into a better understanding of software development? Perhaps some kind of field theory? That would be nice.

Software engineering is almost a pure design activity, compared to every other engineering discipline which ultimately results in manufacturing and assembly processes at some stage (software build systems are not the same for various reasons).

Software also suffers in that it is largely unconstrained by physics. If I want to design a new car I am physically limited in how components can interface with each other.

The same constraints are largely lifted inside a computer. I can write a module that interfaces with 5000 other modules directly (it'd be horrifying to maintain, but it can be made). In order to get a handle on software engineering, we have to introduce artificial constraints into our process. Systems engineering and the other engineering disciplines already have these constraints built-in (what materials are available, how does physics work, what kind of power can we actually send down a wire of certain gauge and metal).

What sorts of constraints do I mean? Well, first is simply the structured and modular code of old. This just makes reasoning about the system much easier. The second is things like layered architectures or designing to interfaces. By having to narrow down my activities and communication at and across boundary layers I force myself to come up with either more reasonable designs or something horrifying. Hopefully I go for the former.
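To make "designing to interfaces" concrete, here is a minimal Python sketch. All of the names (`Storage`, `InMemoryStorage`, `rename_user`) are made up for illustration; the only point is that the upper layer can reach persistence through one narrow boundary and nothing else.

```python
from typing import Dict, Protocol


class Storage(Protocol):
    """Hypothetical boundary: the only way upper layers may reach persistence."""

    def load(self, key: str) -> str: ...

    def save(self, key: str, value: str) -> None: ...


class InMemoryStorage:
    """One implementation of the boundary; a database-backed one could replace it."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def load(self, key: str) -> str:
        return self._data[key]

    def save(self, key: str, value: str) -> None:
        self._data[key] = value


def rename_user(storage: Storage, user_id: str, new_name: str) -> str:
    """Upper-layer logic: it sees only the Storage interface, never its internals."""
    old_name = storage.load(user_id)
    storage.save(user_id, new_name)
    return old_name
```

Because `rename_user` depends on two methods rather than on a concrete class, the number of things it can couple to is capped by the interface, which is the artificial constraint I mean.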

You can also introduce constraints via our languages (this is going to come out of CS). Static type systems introduce a constraint. These force you to negotiate with the type system and be more deliberate in your decisions. A dynamic type system or duck typing lets you get away with a lot of things (for good or ill) that can produce systems that are hard to maintain and hard to reason about. Does this mean we need static typing everywhere? No. But if we don't have it, we need to find other ways to constrain ourselves.
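A small sketch of what negotiating with the type system can look like, using Python type hints. The unit types and function names are invented for the example, and note the caveat: Python itself does not enforce these hints at runtime, so the constraint only exists if you run a static checker such as mypy.

```python
from typing import NewType

# Hypothetical unit types. At runtime these are plain floats; the distinction
# exists only for a static type checker.
Meters = NewType("Meters", float)
Feet = NewType("Feet", float)


def feet_to_meters(length: Feet) -> Meters:
    """The one sanctioned way to cross from Feet to Meters."""
    return Meters(length * 0.3048)


def descent_rate(altitude: Meters, seconds: float) -> float:
    """Average descent rate in meters per second."""
    return altitude / seconds


altitude_ft = Feet(1000.0)

# A static checker rejects the next line because Feet is not Meters,
# even though it would run without error:
# descent_rate(altitude_ft, 10.0)

rate = descent_rate(feet_to_meters(altitude_ft), 10.0)
```

The commented-out call is the kind of thing duck typing lets you get away with. With the checker in the loop you are forced to make the conversion explicit.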

Perhaps it's with conventions. Thou shalt have 100% code coverage in unit testing! If it's not covered, you have to get it covered before it can be integrated into the master branch.

Practices help. TDD introduces a physics to the world of your code. TDD may result in a test that says: It is a violation of the physical universe I'm creating for this module to ever change this value (say you're using shared memory across multiple threads, processes, or objects). Obviously, it can be violated, but your test tells you when this violation occurs so you can address it. Much like a CAD system may help a vehicle designer realize that the door cannot open because it will hit some other object. Sure they can place the door there, but physics will not allow it to behave the way they want it to.
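Here is roughly what such a "law of physics" test could look like with Python's standard `unittest`. `Config` and `handle_request` are hypothetical stand-ins for whatever shared value and module you actually have.

```python
import unittest


class Config:
    """Hypothetical shared state that several workers read."""

    def __init__(self, max_connections: int) -> None:
        self.max_connections = max_connections


def handle_request(config: Config, payload: str) -> str:
    """Code under test: it may read the shared config but must never modify it."""
    return payload[: config.max_connections]


class SharedValueIsNeverChanged(unittest.TestCase):
    """The 'law' for this module: handling a request leaves the config untouched."""

    def test_handle_request_leaves_config_untouched(self) -> None:
        config = Config(max_connections=8)
        handle_request(config, "some long payload")
        self.assertEqual(config.max_connections, 8)
```

Nothing stops a later change from writing to `config.max_connections`, but this test fails as soon as one does, which is the CAD-style warning described above.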

Ultimately, to engineer a system is to design within a set of constraints. We need our software to be constrained in order to start engineering it. There will be no single set of constraints for every activity (see Fred Brooks' No Silver Bullet). But there are activities and techniques we can use, language options, etc. that will help within and across many of our programming domains.

Weekend reading material found, thanks!

While not as well thought out and presented, I've been working on a similar project for the past 6-8 years. I called it the "Grand unified theory of software" (after its physical equivalent) [1] and the attempt was to define, describe and quantify code, how it's created, and changed.

The best exposition of my thoughts is probably this mindmap:[2], but the TLDR version is:

* Software has size. Sequences of code add height, modules add width; together giving area

* Software has weight, like page weight or memory weight.

* Forces act on software to change it, and they could be design constraints, organizational will, etc

* Code is data REALIZED, data is code ABSTRACTED

* and so forth

There is also a book (not mine) called the "Grand Unified Theory of Software Engineering"[3] which takes into consideration not just what software is, but the processes and organizations that cause it to be created and explores justifications for statements like "you ship your organization", for example.

We need more such exploration and discussion, IMO. Since software is largely an abstract thing, having models to help thinking about it in ways analogous to concrete things is useful, if not precise.

1: https://github.com/vinodkd/guts

2: Mindmap flash version: http://vinodkd.github.com/guts/mmap/full/guts.html

Mindmap non-flash version (look below diagram for complete text): http://www.vinodkd.org/guts/mmap/basic/guts.html

3: GUTSE: https://books.google.com/books/about/The_grand_unified_theor...

Having scanned over some of what you link, I would suggest learning a bit about fractal geometry and the possibilities of non-integer dimensions. I suggest this because code and modules probably don't multiply exactly as an area. This will be especially true if you try to go beyond 2D, because the discrepancies between the full R^n space and what software actually does (streak through it in something like a powerlaw distribution) will become worse and worse. (See something like https://physics.stackexchange.com/questions/55269/why-do-fra... .)

See also some ideas in https://codeburst.io/software-estimation-in-the-fractal-dime... or https://www.quora.com/Engineering-Management/Why-are-softwar... for more thoughts that head in that direction.

Yes, I was heading in that direction, while having misgivings about using terms like area. The advantage is that it's a familiar term, the disadvantage is that it's really not the same physics (if there is one to be found). Thanks for the links.

Sometimes, an analogy is just an analogy (and not a model). They both have their uses, but if you are taking an analogy to be a model, you are likely to wander into areas of inapplicability. In particular, you cannot validate a claim about software simply on the grounds that it is so in the physics analogy.

That is true. In many countries, there have been efforts to create a "software engineering" profession, mostly motivated by the desire to elevate software development to the level of a profession, with the attendant standards, professional accountability and governing body.

However, this has been fraught with problems, because software engineering differs from physical engineering in fundamental ways and many people have realized that applying physical engineering principles to software leads one to commit a fundamental error in paradigm.

At a basic level, software is a procedural epistemology that manipulates ideas that do not necessarily have to correspond to physical objects. It doesn't mean that physical principles cannot be applied to a subset of the software development process.

However, there are usually better ways to achieve ends in software if one is not constrained by paradigms adopted from the physical world. Even paradigms that are native to the virtual world (like mathematics -> category theory, functional programming) have not gained the kind of widespread acceptance that one would think they ought to.

> software is a procedural epistemology that manipulates ideas that do not necessarily have to correspond to physical objects

This is why I often argue software development is applied semiotics.

"The map is not the territory." -Alfred Korzybski

This is really bad. It's a set of loose analogies that tries to borrow from the language of physics in order to justify some intuitions about software in general.

On a quick skim there was no solid empirical principle, only a couple of wishy washy equations thrown in to support some wishy washy principles.

There is a principle of minimal action for the physics of nature. Is there such a thing for software? If there is, what is it exactly?

Probably that software grows to fill the available cycles and memory in the computer.


It is often easier to add code than refactor code.

Refactoring is a way to fight against the entropy of software code growth. But it is a deliberate activity and rarely happens by happenstance. Outside of languages like Forth where factoring is a way of life, and disciplined environments like what I read about in Smalltalk shops, entropy is the order of the day.

Software, the whole process and not just writing the code, is so complex (oftentimes even irrational) in terms of interactions (all kinds) that as humans we continue to treat it as damn near alchemy because our comprehension can't keep up. I think to get what you're looking for there might be more hope for systems writing systems to understand more about what's going on than with humans.

I'm not sure how much sense that makes but anyway I look forward to reading more

Suppose you are a physicist and you have a theory you want to test, ok? Then, you design an experiment to test your theory. If it fails, you are sure that your theory is wrong. Otherwise, you know that it works, unless future experiments show that it fails.

In my opinion, this is pretty much like Test Driven Development, which has lots in common with the practice of physics (aka Popper's falsification).

In this physicist's opinion, that's a fairly strained analogy. I sort of chuckle a bit when I hear things like "code coverage" and "test driven development" and the like. These things are dressed in the robes of empiricism, but there's very often a lack of genuine rigor that makes the analogy a bit of a stretch.

It could also mean that the experiment didn't approximate the ideal you were aiming for. It's not nearly that simple.

TDD has almost nothing in common with physics. Engineering is about repeatable process. Science is about repeatable results. Testing, CI, and all the rest are part of a process of repeatably shipping releases of software that are functional. That's all a distraction for science (though some engineering may be necessary to construct the apparatus you want).

TDD is one solution to the troubles humans have when designing complex systems where the entire solution cannot be arrived at through a constructivist approach. Put differently, the solution is too complex for a human to keep in their head at once. I came to this conclusion after reading “Notes on the Synthesis of Form” by Christopher Alexander.

TDD is about people, not software.

This looks really nice. I have been thinking of something along these lines for a while, both as a way to handle larger software systems and to grasp quantum mechanics in familiar terms.

What are the “quanta” in software? The bit, byte, computer, program? I think any potential parallels are a seductive trap. Quantum physics is elegant and we desire an elegant model for thinking of software. I don’t think quantum physics is the right model for understanding software.

Quantum mechanics is not mostly about quanta, but a qubit and a bit are parallels, in that both can store 1 bit of information.

But QM is at least 99% about information, maybe 100%. How information is organised - how it can evolve. And QM is code + data rolled into one. It's also made of pure functions; actually reversible ones.
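For anyone who wants the "reversible pure functions" point in programming terms, here is a toy Python sketch. It only uses classical bit values, so it shows reversibility and nothing about superposition or entanglement. `cnot` mirrors how the controlled-NOT gate acts on basis states; `and_gate` is an ordinary irreversible gate included for contrast.

```python
from typing import List, Tuple

Bits = Tuple[int, int]


def cnot(bits: Bits) -> Bits:
    """Controlled-NOT on classical bit values: flip the target if the control is 1."""
    control, target = bits
    return (control, target ^ control)


def and_gate(bits: Bits) -> int:
    """Ordinary AND: pure, but it discards information."""
    a, b = bits
    return a & b


all_inputs: List[Bits] = [(0, 0), (0, 1), (1, 0), (1, 1)]

# cnot is its own inverse: applying it twice restores every input.
cnot_is_reversible = all(cnot(cnot(bits)) == bits for bits in all_inputs)

# and_gate maps four inputs onto two outputs, so no inverse function can exist.
and_outputs = {and_gate(bits) for bits in all_inputs}
```

Both functions are pure, but only `cnot` is a one-to-one mapping of inputs to outputs, which is the extra property that quantum evolution always has.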

It would be good to understand entanglement from a programming perspective; the article does try to shoe-horn it into his system, but it seems like a very shallow understanding of the physical situation.

Perhaps we can build the software-equivalent of the Large Hadron Collider, and finally get to the heart of Google's search algorithm!

It's revealing that the only measurable qualities of software the author cited were "coupling" and "cohesion".

Physics is based on measurable values and measurable effects.

>the only measurable qualities of software the author cited were "coupling" and "cohesion". [...] Physics is based on measurable values

You've misinterpreted the author's message and didn't realize he actually agrees with you.

He wrote, "However, a property in physics [...] precise stimulus [...]" to make the contrast that "coupling" and "cohesion" are not measurable using our current mental models for discussing software.

His 2nd paragraph explains the flaw in his 1st paragraph just like your 2nd paragraph did to your 1st paragraph.

I went back and read it again, though I had found it a waste of time the first time through. I think the third paragraph is more clear: "We just don’t know those forces well enough to even name them"

I feel like he's straining too hard to make the analogy with physics, which doesn't seem very beneficial. Spoken as a fan of both physics and software, the analogy with physics just doesn't seem to offer much additional perspective on the target field (software).

Civil engineering seems like a better analogy (constructed artifacts involving engineering tradeoffs that must work in a complex environment), or biology (evolving systems responding to selection effects in complex, evolving environments).

Not read yet - does he aspire to make the things you're writing off as unmeasurable, measurable?

Can you quantify and measure it? If not, shut up.

There can be value in talking about non-quantifiable, vague things too. And besides, this is just a random blog on the web. It hurts no-one and exists mostly to provoke thoughts; it's not yet a finished product.

> There can be value in talking about non-quantifiable, vague things too.

There is no scientific value in it, so it shouldn't be presented as a scientific theory (or even as a draft version of one), and especially not as “applied physics to software”.

...And then see those measurements independently reproduced over and over, and does the result offer fundamental insight into the universe we occupy?

The number of times I see the word "design" in a statement correlates well with how high it registers on the bullshit-ometer.
