I'm not sure how they could even measure the goal here. The synopsis from FBO implies this is a broad program:
----
DARPA is soliciting innovative research proposals in the areas of formal methods, program analysis, compiler design, and runtime and virtual machine implementation to realize tools for the construction of long-lived, survivable, and scalable adaptive software systems.
----
Skimming over the PDF, they seem to be brainstorming all over the place without much of a clear direction.
There's nothing I can see that is theoretically stopping us from having century-long running software systems. It's already fully possible. All this talk about formal verification, replacing protocols, defining API/ABI subsets sounds over the top, when the more fundamental issue is providing language runtimes that allow for highly concurrent units of execution with resource and failure isolation, restart strategies, preemptive scheduling, live upgrade mechanisms and so forth. The VM is your platform.
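To make the restart-strategy point concrete, here's a minimal toy sketch of the idea in Python (names like supervise and flaky_worker are mine, and this obviously elides the real isolation an Erlang-style VM gives you):

    # Toy sketch of a supervisor with a restart strategy: keep a worker alive,
    # restart it when it dies, give up once the restart intensity is exceeded.
    import threading
    import time

    def supervise(worker, max_restarts=5, backoff=0.5):
        restarts = 0
        while restarts <= max_restarts:
            done = threading.Event()

            def run():
                try:
                    worker()
                except Exception as exc:            # isolate the failure
                    print(f"worker crashed: {exc!r}")
                finally:
                    done.set()

            t = threading.Thread(target=run, daemon=True)
            t.start()
            done.wait()                              # block until worker exits
            t.join()
            restarts += 1
            print(f"restart #{restarts}")
            time.sleep(min(backoff * restarts, 30))  # capped backoff
        print("giving up: too many restarts")

    def flaky_worker():
        # stand-in for a real, isolated unit of work that occasionally crashes
        time.sleep(0.1)
        raise RuntimeError("simulated crash")

    if __name__ == "__main__":
        supervise(flaky_worker, max_restarts=3)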
And it's not like even that is strictly necessary. COBOL systems running on old iron chug along merrily, though then again I've heard that mainframes have some rather elaborate fault tolerance mechanisms that haven't really leaked much into the rest of the mainstream.
Errrm, you are right so long as nothing changes. As soon as anything changes, the merrily chugging COBOL becomes a weird frizzy step-aunt from an obscure island near Antarctica and, well, stops the chugging and starts jitterbugging.
Then everyone does a "what the hell is going on" dance and wheels barrows full of cash into a carpark where "consultants" pour lighter fluid onto them and burn bonuses in a horrifying money sacrifice to the dark gods of bug fixing.
After a while everything returns to normal.
Where did you hear the thing about fault tolerance from? Is there a seat at that bar for me?
> Skimming over the PDF, they seem to be brainstorming all over the place without much of a clear direction.
That's exactly their remit - fantasize about cool stuff that seems technologically feasible and would provide a strategic advantage if we had it, and wait to see what sort of ideas people try to sell to them.
That's a reasonable point about COBOL, but it'd be nice to have that sort of 'it just works' predictability in smaller and more modular packages.
As I said in another comment, but will repeat here: the goal is to make a new COBOL that can be mechanistically upgraded to run on new hardware, so that way we don't have to maintain legacy systems to run our dated code.
> The Building Resource Adaptive Software Systems, or BRASS, program seeks to realize foundational advances in the design and implementation of long-lived software systems that can dynamically adapt to changes in the resources they depend upon and environments in which they operate. Such advances will necessitate the development of new linguistic abstractions, formal methods, and resource-aware program analyses to discover and specify program transformations, as well as systems designed to monitor changes in the surrounding digital ecosystem. The program is expected to lead to significant improvements in software resilience, reliability and maintainability.
Essentially, DARPA wants someone to build them a JVM and languages on top that formally specify the VM dependencies to ensure secure execution (and execution at all), as well as the semantics of those languages on top of the VM. Their goal is to use the formal specification to allow for automatic transfer of the VM to new hardware and platforms without breaking the security by changing something about the security model. A secondary goal is to allow automatic forward porting of things, so we don't have to run old iron (which presents security problems of its own); we can run the latest iron with old code.
An example of what they're looking for is a system where you can provide a formal model of the new hardware, and it will compile a new VM for it which ensures the VM promises are true (which in turn ensures that the software promises are true, and hence that your security works). They're just not ready to produce such a thing yet, so they're still at the phase where they're gathering theoretical work on the topic, to decide on what their final approach will be.
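As a toy illustration of that workflow (everything here - the MachineModel fields, the listed promises - is invented for the example; a real system would use an actual proof tool, not dictionary lookups):

    # Feed in a formal-ish model of the new hardware, then check whether the
    # VM's promises still hold on it.  The model and promises are made up.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class MachineModel:
        word_bits: int       # native word size
        atomic_cas: bool     # hardware compare-and-swap available?
        ieee754: bool        # IEEE-754 floating point?

    # promises the VM makes to programs running on top of it
    VM_PROMISES = {
        "word_bits >= 64": lambda m: m.word_bits >= 64,
        "atomic CAS":      lambda m: m.atomic_cas,
        "IEEE-754 floats": lambda m: m.ieee754,
    }

    def check_port(target):
        """Return the VM promises the target machine cannot uphold."""
        return [name for name, holds in VM_PROMISES.items() if not holds(target)]

    new_hardware = MachineModel(word_bits=128, atomic_cas=True, ieee754=True)
    broken = check_port(new_hardware)
    print("port looks safe" if not broken else f"unsatisfied promises: {broken}")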
This likely relates to their heavy funding of projects like HoTT (along with its applications to cryptography and other uses of formal methods), and is exactly in line with other major DARPA undertakings.
Thank you! That was an excellent summary that didn't require me to look at that page any longer. I mean, I like contrast, but I want to keep my retinas, too.
yeah, that's why it's pretty cool - I'm not sure people get that easily from the DARPA description.
I've been waiting for similar things to get done in the public domain or commercially, but whatever comes closest to it generally fails to get adoption (mostly for financial reasons...)
Having read the Wikipedia article, it sounds like DARPA is looking for research papers about making a formally verifiable version of that idea, possibly with some more modern architecture ideas worked in.
You could compare this to a 100 year old car. But that's not accurate. There are 100 year old automobiles in use, and likewise I expect some 40 year old software to still be in use in 60 years. Launch a space probe and its software will run as long as there is power.
But that's not interesting.
Here we are talking about software not as an instance, a physical entity of computation. We are talking about software as an idea, a design for computation.
What DARPA is looking for is the equivalent of a car design that is still in use after 100 years. One that can be maintained and occasionally modified in backwards compatible ways.
But that's a little stupid, both for automotive designs and for software (computational designs). Creating a 100 year car design is probably at best fruitless and at worst counterproductive. Trying to create a 100 year software project shares the same pitfalls.
I'll bite and play devil's advocate, with a few qualifications. If we consider a class of vehicle rather than a specific make/model, there are certain kinds of vehicle designs that are approaching 100 years in age. The pickup truck, as a basic design, came about ~1930 and has proven to be an immensely important automotive variant. In 15 years it will be a 100-year old automotive design.
I think the question DARPA is asking is of course different and maybe more fundamental: what are the reasons software stops being useful?
My guess is that the list of reasons is fairly finite, e.g. assumptions made about memory availability and speed no longer hold, so new software can now outperform old software, etc.
But it looks like we've arrived at a kind of plateau, and we also have a pretty good idea what's going to happen in the near future with basic computing.
- Computers with 64-bit memory addressing provide enough memory for just about anything we can think of over the next 100 years
- We'll continue to outfit computers with more RAM, more cores, more main storage and all of it will start to get faster following observed trends
- and so on
So if we can write software assuming these questions and answers, why wouldn't it be capable of expanding to fit computers 100 years from now and continue running?
> - Computers with 64-bit memory addressing provide enough memory for just about anything we can think of over the next 100 years
Keep in mind we use memory mapping for files too, so the memory addressing needs to be suitable for that as well.
Storage capacity for both the desktops and servers I've used has increased quite steadily at a rate of something like a factor of 1000x per decade over the last 3 decades.
(RAM has followed a similar trajectory, so the below largely applies for RAM as well)
Let's say we are conservative and start from a base of 1TB today (considering it's trivial to get servers with tens of TB, that is conservative). To be able to memory map 1TB we already need 40 bits. Each 1000x increase costs us approximately another 10 bits. That gives us less than 30 years of similar growth in storage capacity before we lose the ability to memory map the data on it with 64 bits. (In case you think memory mapping 1TB today is silly, consider that I can pick up off-the-shelf servers with near a TB of RAM today, and can certainly get higher-end systems with far more than that.)
Even if your starting point is that you "only" want to be able to memory map 1GB today, that still only gives you one extra decade if the growth remains the same.
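The back-of-the-envelope arithmetic, as a quick sketch (the numbers are the assumptions stated above: ~1000x growth per decade, flat 64-bit address space):

    # How long until a flat 64-bit address space can't memory map the data?
    import math

    def years_of_headroom(start_bytes, addr_bits=64, growth_per_decade=1000):
        bits_now = math.log2(start_bytes)
        bits_per_decade = math.log2(growth_per_decade)   # ~9.97 bits
        return 10 * (addr_bits - bits_now) / bits_per_decade

    for label, size in [("1 GB", 2**30), ("1 TB", 2**40)]:
        print(f"{label} baseline: ~{years_of_headroom(size):.0f} years")
    # prints roughly 34 years for 1 GB and 24 years for 1 TB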
And while it may be realistic to assume that "nobody" mmaps files that big on desktops, it's not on servers - the first time I ran into limits mmap()'ing data on 32 bit computers was about two decades ago. Over the last few years I've repeatedly had to rewrite stuff to use 64 bit because 4GB most decidedly was not enough. And stuff keeps ballooning.
I fully expect to see systems hit 64 bit limits in my lifetime (I'm 40 now). For that matter, I expect to still be working when 128 bit systems become mainstream.
In the car analogy, I think they're looking for a way of designing cars that can have the engines swapped out for newer faster engines, and then for electric motors, and then even hydrogen fuel cells and beyond, without having to change the rest of the car.
I thought all large software products effectively last forever. Isn't this why COBOL programmers still have jobs for life? I would have thought the challenge is to create systems that will die.
DARPA is looking to invent a COBOL that formally specifies what the dependencies of the program (regarding the machine) are, and allows for it to automatically be transferred to newer hardware (and apply security patches to it, as the underlying abstractions change).
> The Building Resource Adaptive Software Systems, or BRASS, program seeks to realize foundational advances in the design and implementation of long-lived software systems that can dynamically adapt to changes in the resources they depend upon and environments in which they operate. Such advances will necessitate the development of new linguistic abstractions, formal methods, and resource-aware program analyses to discover and specify program transformations, as well as systems designed to monitor changes in the surrounding digital ecosystem. The program is expected to lead to significant improvements in software resilience, reliability and maintainability.
If DARPA succeeds in this, it will be a HUGE advantage for everyone.
The way I personally see this happening is everything has to be formally verified. This will guarantee that the file written tomorrow will be successfully read by yesterday's programs, and that contemporary software is compatible with future OSes. (Does this mean that we freeze libc?)
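You can get a long way toward the "yesterday's program reads tomorrow's file" property with plain format discipline rather than formal verification - roughly the tagged skip-what-you-don't-know trick that protobuf/RIFF-style formats use. A toy sketch (the tags and layout here are invented):

    # Tagged, length-prefixed records: old readers silently skip unknown tags,
    # so a file written by a newer program still loads in an older one.
    import struct

    def write_records(records):
        out = bytearray()
        for tag, payload in records:
            out += struct.pack("<HI", tag, len(payload)) + payload
        return bytes(out)

    def read_records(blob, known_tags):
        i = 0
        while i < len(blob):
            tag, length = struct.unpack_from("<HI", blob, i)
            i += 6                      # 2-byte tag + 4-byte length
            payload = blob[i:i + length]
            i += length
            if tag in known_tags:       # unknown tags are skipped, not fatal
                yield tag, payload

    # "tomorrow's" writer adds tag 3, which "yesterday's" reader never heard of
    blob = write_records([(1, b"hello"), (3, b"new fancy field"), (2, b"world")])
    print(list(read_records(blob, known_tags={1, 2})))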
But there is a problem: there are a lot of quirks for hardware in modern OSes/drivers, which add weird and possibly unreliable code. Does the hardware+firmware have to be formally verified too?
Do we have to run our software on all hardware that exists, e.g. starting from 6502 and until some future CPU?
Do we want to use POSIX? Subset of POSIX? cough Plan9? cough
Another problem I see is seemingly inevitable software bloating over time.
- What features do we have to include in our OS, our kernel?
- Does this list of features only grow over time?
- Do we want to have a GUI as a requirement? What if the UI paradigm changes?
There is also bloat at the protocol level, e.g. TLS. Maybe replace TLS with CurveCP?
Sounds unpleasant. There are loads of ways of maintaining this property without resorting to formal verification.
Hardware is already largely verified, thankfully, and where formally verified software exists it is probably good to use, but in general?
You wouldn't have to worry about code bloat for sure; it'd take weeks to write every line of code!
The problem is that this sort of code has to be writable by average developers, and proof assistants just aren't at that level yet (and might never be). A verified implementation of the register colouring algorithm (100 lines of C, perhaps) takes 10,000 lines of code.
Useful, yes, but not practical, especially for a whole OS.
Some of the comments here claim that we already have pretty long-lived and robust software. While this may be true, I don't think that's what DARPA is looking for. I believe that they are looking for something that could keep an interstellar probe running for a few hundred years, or something like a remote base somewhere in space. Something that could detect and fix any issues with the software and hardware. But it's not only about reliability. Think about a Moon base, where the software keeps running for decades, and once in a while a new module or piece of hardware is added to it. MoonOS would have to detect the new addition and reconfigure itself to work with it. Maybe it could even decide to move some critical parts of itself to the new hardware. It would do all of this without any input from humans.
>users have become accustomed to periodic cycles of updating and upgrading to avoid obsolescence—if at some cost in terms of frustration.
That fear of obsolescence is what drives those upgrade/subscription revenues. Finding replacement revenues thus is the real task here.
Anyway, TCP is 40 years old and will be with us for another 60 years (with some fine-tuning for the big-latency round trips to Mars) - so there's your 100-year system, for example. Others mentioned COBOL already, and one can also look at Fortran libraries like in astronomy, etc...
> Others mentioned COBOL already and one can also look at Fortran libraries like in astronomy, etc...
If you do that, you might as well mention Algol, as currently instantiated as C++, C, Java, C#, and so on.
COBOL from 1965 isn't the same language as COBOL from 1995; ditto FORTRAN-IV vs Fortran 95. The only difference is that the names haven't changed as much compared to what the Algol family has gone through.
The only reliable approach to this I can think of is to 'write' the program purely as a declarative specification mapping every supported input pattern into a corresponding expected output pattern (for a familiar example, spec driven unit testing: http://jasmine.github.io/edge/introduction.html).
Then one needs to write whatever's going to generate the implementation (currently that 'whatever' is a human programmer). If done right, it doesn't matter what the underlying platform is (theoretically, this could even be used to train a neural network).
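In Python terms (rather than the Jasmine example), a toy version of "the spec is the program" might look like this - the spec table and the candidate implementation are invented for illustration:

    # The "program" is just a table of input patterns -> expected outputs,
    # plus a checker that any candidate implementation must satisfy.
    SPEC = {
        (0,):          0,
        (1, 2, 3):     6,
        (10, -10):     0,
        (5, 5, 5, 5): 20,
    }

    def violations(impl, spec):
        """Cases where the candidate implementation disagrees with the spec."""
        return [(args, impl(args), want)
                for args, want in spec.items()
                if impl(args) != want]

    # today the implementation is hand-written; tomorrow it could be generated
    def my_sum(xs):
        total = 0
        for x in xs:
            total += x
        return total

    print(violations(my_sum, SPEC) or "implementation satisfies the spec")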
I would say that they should build it as modular and clean as possible. By modular I mean the possibility to rewrite parts of the system in different languages, etc. A distributed system with multiple copies of each module running on different machines in different locations. Build it around a "live" monitoring and control system.
Use the base from the last 50 years, and add tested modern practices. This would include the possibility to upgrade any part of the system without taking it offline.
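The "upgrade any part without taking it offline" bit is basically hot-swapping an implementation behind a stable handle. A toy sketch (the HotSwappable class and the pricing functions are mine):

    # Callers go through a stable handle; the implementation behind it can be
    # swapped at runtime, so the module is upgraded without downtime.
    import threading

    class HotSwappable:
        def __init__(self, impl):
            self._impl = impl
            self._lock = threading.Lock()

        def swap(self, new_impl):
            with self._lock:
                self._impl = new_impl       # next call uses the new version

        def __call__(self, *args, **kwargs):
            with self._lock:
                impl = self._impl
            return impl(*args, **kwargs)

    def pricing_v1(amount):
        return amount * 1.20                # old rate

    def pricing_v2(amount):
        return amount * 1.25                # new rate, deployed live

    price = HotSwappable(pricing_v1)
    print(price(100))                       # old implementation answers
    price.swap(pricing_v2)                  # upgrade while callers keep calling
    print(price(100))                       # new implementation, no restart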
I don't quite get what the article is talking about. It starts by saying that updates are a problem. Then it talks about things I can't really distinguish from quality issues, reliability, complexity issues...
What does "BRASS" even mean, anyone got a clue? "Resource Adaptive" seems to be the important part. What are they talking about?
So, for long-lived projects, I'd honestly look at games.
Out Of This World has survived a couple decades of frenzied development, as have Doom and Quake and Adventure and Zork. That gets you around the 20-30 year mark.
I'd look at old Unix utilities like awk and sed. That gets you around 40 years.
I'd look towards the banking industry at old COBOL credit-card processing and reporting code. That gets you into around 50 years.
The interesting thing to note is that in the first two cases, long-term support was caused by reverse-engineering and by having the source available to port. As long as there is a machine with an integer unit, we have Doom. As long as there is a floating-point unit, we have Quake.
The third case, though, relies on turducken VMs of 360s running on 370s running ZMachines running on...whatever comes next. For long-term support, one wonders how tenable this is.
https://mcnp.lanl.gov/ has been in development since at least 1957 (according to wikipedia) -- so 58 years, give or take.
It might be even older than that: it looks like Enrico Fermi may have started it in 1947 (per a note in the introduction of http://arxiv.org/pdf/1207.5076.pdf).
---
At Los Alamos, Monte Carlo computer codes developed along with computers. The first Monte Carlo code was the simple 19-step computing sheet in John von Neumann's letter to Richtmyer. But as computers became more sophisticated, so did the codes. At first the codes were written in machine language and each code would solve a specific problem. In the early 1960s, better computers and the standardization of programming languages such as Fortran made possible more general codes.
[..]
MCNP5, released in 2003, is rewritten in ANSI standard Fortran 90. It includes the addition of photonuclear collision physics, superimposed mesh tallies, time splitting, and plotter upgrades. MCNP5 also includes parallel computing enhancements with the addition of support for OpenMP and MPI.
Large production codes such as MCNP have revolutionized science - not only in the way it is done, but also by becoming the repositories for physics knowledge. MCNP represents over 500 person-years of sustained effort. The knowledge and expertise contained in MCNP is formidable.
---
So it seems that Fortran is already the 50-year programming language.
I've been thinking about that last question. The closest I can imagine to a solution would be coding programs to run within stripped-down environments, the specifications for which would be included with the program. The system running the program would read that spec, generate a version of that environment (as a container/vm/whatever), and then start the program within it.
So, to offer a slightly stronger version of your requirements, we'd need to ship:
1. The program itself.
2. The behavioral specs for the program.
3. The behavioral specs for the runtime environment.
4. The configuration specs for the runtime environment.
Somewhat hidden in there is the ability to read an arbitrary program, transmute it to another program. That can be hard -- especially with a language like C++, where significant computation can occur during compilation.
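To make items 1-4 concrete, a sketch of what such a bundle might look like as a machine-readable manifest the future host inspects before it builds the sandbox (all field names and values here are invented):

    # Ship the program plus the specs of both the program and the environment
    # it expects; the host checks the environment spec before generating a
    # sandbox for it.  Purely illustrative field names.
    from dataclasses import dataclass, field

    @dataclass
    class Bundle:
        program: str                                      # 1. the program itself
        program_spec: dict = field(default_factory=dict)  # 2. program behaviour
        env_spec: dict = field(default_factory=dict)      # 3. runtime behaviour
        env_config: dict = field(default_factory=dict)    # 4. runtime config

    example = Bundle(
        program="payroll.tar.gz",
        program_spec={"inputs": "csv of hours", "outputs": "csv of payments"},
        env_spec={"syscalls": ["read", "write", "clock_gettime"],
                  "endianness": "little"},
        env_config={"memory": "512MiB", "filesystem": "read-only /data"},
    )

    def host_can_run(bundle, host_syscalls):
        """The host refuses to generate an environment it can't actually honour."""
        return set(bundle.env_spec.get("syscalls", [])) <= host_syscalls

    print(host_can_run(example, {"read", "write", "clock_gettime", "mmap"}))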
"Somewhat hidden in there is the ability to read an aribitrary program, transmute it to another program."
There'd probably be cases where it'd fall short of that capability, but ultimately, if the desire is to make sure you can run an arbitrary current program on an arbitrary future computer, you probably have to get pretty close to that.
Sometimes when I read DARPA headlines I imagine someone like the Dude stroking his beard and going "yeaaaaah, let's do THAT man! Wouldn't that be wild?"
I would say it is the same plane in exactly the same sense that DARPA is interested in having the same computing system running for 100+ years; i.e., they want something that can be in use, maintained, and adapted to current needs by incremental updates, without ever being tossed out all at once and replaced with a from-scratch new system.
There's a second reason, I think, why they're phrasing it as "long term" computation while their requirements discuss formal verification in some detail.
It's true that building up a large bank of formal proofs could take large amounts of supercomputer time -- e.g., dealing with something the size of drivers or a kernel -- and that with the current upgrade cycle and the manner in which hardware is specified, this might not be practical.
However, suppose you developed a framework for your proofs that generated a proof generator, one which easily produces a final proof when given a schematic of the hardware that will execute the algorithm, and made that framework general enough that you can produce new schematics when you upgrade your hardware. Then the time spent making the proof generator isn't wasted every hardware cycle.
Once your proof system isn't reset every time you upgrade hardware, it begins to make more sense to develop things like your firewall, IDS, routing, crypto, etc. code in terms of formally verified systems (that generate machine-specific code in automatic ways).
When you combine this project with other DARPA cybersecurity projects, it becomes clear that DARPA is looking to transition key infrastructure into formally verified, machine-upgradable systems in preparation for high-speed automated cyberwarfare. (Think the difference between HFT and a guy on the E-trade website, applied to scanning software for vulnerabilities and generating things like Flame automatically.)
tl;dr: DARPA's plan for the next 10-20 years seems to be that they're afraid of AIs beating them at cyberwar, and this is a piece of their solution.
People with paper and pencils? The original computers were people who calculated stuff. It's funny to think both special and general relativity were figured out by then.