- Tabular data : TSV (almost all Un*x/GNU tools handle this out of the box)
- Simple "records" : GNU Recutils format (https://www.gnu.org/software/recutils/)
- Formatted texts : Markdown, LaTeX, ...
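For anyone who hasn't seen recutils before, a record file is just labelled plain text, and the tools query and convert it directly. Roughly like this (the record type and field names are made up, and I'm going from memory on the exact tool invocations):

  %rec: Contact

  Name: Ada Lovelace
  Email: ada@example.org

  Name: Alan Turing
  Email: alan@example.org

$ recsel -t Contact -e 'Name = "Ada Lovelace"' contacts.rec
$ rec2csv contacts.rec > contacts.csv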
I know that not everything can be stored as text. But I try to use open, well documented and future proof formats. Examples:
- Images : PNG
- Music : FLAC, Ogg, ...
- If I really need to preserve the original format/design of a web page: PDF
And of course, interacting with those files through the vast ecosystem of simple command-line tools, and editing almost all of my documents with the same efficient text editor, makes the whole thing a much better experience than the alternatives - at least where text is viable; IMHO diagrams etc. are still too cumbersome compared to a quick free-hand sketch.
Unless Apple and Microsoft decide that the ability to view and manage files directly is too confusing and dangerous for users and remove your ability to have portable raw text files.
And then all the HN users rejoice about how much simpler life has become when they let tech companies make choices for them and how files were never that good anyway.
Wow, thanks for this recommendation. I've got a few things lying around that I've been using awk/bash for, where even SQLite is overkill, and it looks like this solves the same problems in a much better and more concise way. I might try converting them this afternoon. Can't wait to give the CSV conversion a try too.
I think PDF is complicated, though. (However, there are simpler subsets defined, such as PDF/A, which omit some of the complicated stuff.) (If you really need to store the contents of a page, PNG might do.)
The SQLite version 3 database format is also unlikely to change, I think, and it is documented. (SQLite is also in the public domain, which helps. You can avoid WAL and the other newer features if you want to be sure it keeps working in the future, I suppose.) (If the format does change a lot, it probably won't be called version 3 any more.)
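For example, on an existing database you can switch back to the plain rollback journal and sanity-check the header (database name here is just an example):

$ sqlite3 archive.db 'PRAGMA journal_mode=DELETE;'
$ head -c 16 archive.db
SQLite format 3

(The first 16 bytes of every version 3 file are the magic string "SQLite format 3" followed by a NUL byte.)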
I agree, SQLite will live forever, thanks to its public domain status.
> by no means a format for the long term.
I would take that bet :)
Is this an acceptable case of non-text format? And if it is, what makes it different?
Practically speaking, I'm fine with a constraint where the systemd database format + tools need to be roughly kept in sync. I can't think of a realistic example where this wouldn't be the case. Most of the logs in this database are supposed to be ephemeral.
If you're in banking or medicine or something and are required to keep certain logs for a decade+, you should figure out what you actually want to keep and put it in a format that would be reasonable to access on that kind of timeline.
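journalctl can already dump the binary journal into plainer formats, so the archival copy never has to stay binary - something along these lines, with the unit name and dates obviously made up:

$ journalctl -u billing.service --since 2019-01-01 --until 2019-12-31 -o short-iso > billing-2019.log
$ journalctl -u billing.service -o json > billing-2019.json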
That can only happen in practice if you take the logger and tools from the same group of developers. Yet another case of forced lock-in from Systemd.
The premise is that binary logs somehow buy you something - either more precise data, or faster access, or something else.
Truth is, it could all have been done in plain ASCII and it would have been more portable, accessible, and resilient to failure.
To answer your question, I thought and still think it was a bad choice. The only time I would ever be interested in the contents are during early-boot failures, exactly the time when the toolset is limited and most folks aren't familiar with what's available - exactly when simple text is easiest to work with without finding another machine to stare at the `journalctl` man page.
The rationalization about detecting record corruption makes very little sense to me. (Now, there is a valid concern about potential log forgery, enabled by poorly written apps that directly log user input without sanitation. But that's better mitigated in the buggy app, which almost certainly is doing other unsafe things with user input. And if that were actually the concern, they had other choices that would have been far less annoying.)
I'd love to use CherryTree because it supports encryption and is more functional, but it stores everything in a single XML/SQLite file. Neither Zim nor CherryTree has a mobile app.
I really tried to use Joplin, which saves markdown too and has mobile apps, but the desktop app is huge. I prefer to use those resources for Keybase.
I've had my daily driver for about 30 years now. My stereo is 40 years old (I use it all day every day).
Modern cars are all run by a computer. I bet this will be really hard on anyone in the future who wants to restore one - there's just too much highly specialized technology in them. Most parts for my '72 Dodge can be made by a competent machinist or metalworker if necessary.
I see this in airplanes, too. People resurrect or replicate airplanes right up to the jets. But the jets? Sigh. You can't make a jet engine in a machine shop. So any that survive are static museum pieces, while the WW2 "warbirds" buzz around outside.
Incidentally my car is also nearly 50 years old and doesn't need a computer to run, although it's received some powertrain and suspension upgrades over the years as well as computers in the form of GPS, cameras, and proximity sensors.
I recently finished restoring a refrigerator from the late 1930s. With new insulation and seals it uses less power than a lot of the "smart" fridges today, and doesn't need a computer to function either. It would probably last another 80 years.
That's pretty fascinating; I'd love to hear more about this.
The engines ran fine, they were taxiing it for takeoff, then the airplane caught fire and burned to the ground. It'll make you cry.
Is this really because of the digitization of things, or because modern jet engines are just that complex in every respect?
I mean, the act of making the turbine blades only uses some of the most advanced metallurgy techniques known to man. It's not just putting things together the same way you can install a transmission to a chassis or a hard drive to a computer.
As a counterpoint, though - I bet you could make a jet engine in a machine shop. It would be terribly inefficient, loud, and unreliable, but there is really nothing stopping you from making something that can produce the thrust necessary for flight (up to a point). After all, the people in the '40s were blessed with even fewer resources and less information than we have now, and they got it to work.
It's indeed not about computers. Jet engines in the 1950s were designed without computers.
I do know of a couple from-scratch replicas of Me-262s that were built. The only differences were:
1. no machine guns
2. modern instruments
3. modern helicopter turbine engines were used
Nobody wanted to fly with the jet engines of the war years. I recall those engines had a life expectancy of 20 hours. (Or maybe it was 2 hours, not sure.) It was mainly the metallurgy that did them in.
I think they strengthened the nose gear, too, as it had an ugly tendency to collapse on landing.
Correlation between age and presence of global variables?
As to cars, I'm pretty sure you're right. Although in the future it will be a lot easier to get custom electronics made (PCB assembly is now very cheap compared to even just a few years ago), the software would require reverse engineering the rest of the car, i.e. thousands and thousands of man-hours of design. IIRC the first car to use a CAN bus appeared around 1991.
I also worry about computer-controlled cars being too easy to "fix" - if a mechanical part fails, the driver is in theory capable of adapting to it and pulling over safely, but what happens when your DIY electronic power steering algorithm fails?
The editor I use is https://github.com/DigitalMars/med which is translated to D from the older C version https://github.com/DigitalMars/me
I don't care for IDEs because they only work on one machine (I develop on several). ME works identically on all of them, and is easy to port.
I keep thinking of transitioning to vim, but never get around to it. The text editor is not the gating factor to my productivity, anyway. Most of my coding sessions consist of simply staring at the code. What helps most is lots of windows open on a BFM (Big Frackin Monitor). I want a wall sized monitor with a retina display.
It is the reason to continue to have phonograph records; sound quality has nothing to do with it.
I think the main attributes of the program that contributed to the longevity were:
- no ORM or other complex library to interact with SQL, just the basic SQL client (rough sketch below)
- used a very simple schema
- documented everything in the system in comments
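In that spirit, the whole data layer can be little more than a commented schema file fed to the stock client - a rough sketch, using SQLite purely for illustration (the original could have been any SQL database):

$ cat schema.sql
-- customers: one row per customer, nothing clever
CREATE TABLE customers (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT
);
$ sqlite3 app.db < schema.sql
$ sqlite3 app.db "SELECT name FROM customers WHERE email LIKE '%@example.org';"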
Some engineers who have been at the company for decades say that a lot of our legacy codebase was cutting edge at one point, and that their peers would have written it differently had they known the software would still be in use decades later. Safer, proven, simpler constructs with minimal dependencies seem to be the way.
How would you write your code today if you knew it would be your last commit and the code would still be in use in 30 years?
I keep that habit currently by separating work into virtual machines. Storage is cheap and I can come back to my project tomorrow or in 2050 with an amd64 emulator. It is also easy to back up or archive - just rsync the image to a NAS or burn it to a DVD.
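Roughly, the whole workflow is just (image and host names invented, using plain qemu as the amd64 emulator):

$ rsync -a project-2019.img nas:/backups/vm-images/
$ qemu-system-x86_64 -m 2048 -drive file=project-2019.img,format=raw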
Those images don't stay bootable, for different reasons:
* Media changes - Try booting from your tape, or your floppy disk set.
* Unsupported new hardware - Remember how your Linux boot partition needed to be at the beginning of the HDD? Or how Windows setup would not recognize SATA drives unless you added drivers from a special floppy disk?
* It boots, then gets stuck or goes blank on you - display driver issues being a common cause of this.
I assume some common formats do not change, but it is good to keep some side-notes on how to run the thing and what hardware to emulate.
I use Linux because it is open source and boots on the broadest hardware range possible - I am sure that in a few decades it will still run on an emulator of the most popular PC platform of 2010.
The risk is that the components of the reproducible build may no longer be available 50 years from now. But rolling your own archival is not bulletproof either, for the same reasons that untested backups aren't bulletproof.
My answer to your last paragraph is: Test your backups! If you don't, then you don't actually have backups; you just have a nice dream.
Generally: minimize dependencies. External library or API dependencies? Versions can drift, the system can change out from under you in incompatible ways. That goes for the OS, too, of course. Data dependency that you aren't 100% in control of? Same. All are forms of state, really. Code of the form "take thing, do thing, return thing, halt" (functional, if you like—describes an awful lot of your standard unixy command line tools) is practically eternal if statically compiled, as long as you can execute the binary. Longer, if the code and compiler are available.
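As a concrete check of that last point - a sketch with a hypothetical tool name, and the output is roughly what glibc's ldd prints for a static binary:

$ gcc -static -O2 -o mytool mytool.c
$ ldd mytool
        not a dynamic executable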
This. It doesn't mean go overboard with NIH, but you have to evaluate and select your dependencies judiciously. It's not about developer productivity with these types of products.
Also, make as much of your program configurable as possible so you can tweak things out in the field. For example, if you have a correlation timeout, make that configurable. But don't go overboard with that either. :)
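One low-tech way to get that, for example in a shell wrapper around the daemon (the variable name is invented for the example):

# use the operator's value if set, otherwise default to 30 seconds
CORRELATION_TIMEOUT="${CORRELATION_TIMEOUT:-30}"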
Of course, this is just a good choice regardless. Still, it shocks me how often people will choose libraries and frameworks that require very opinionated structure on large swathes of code, rather than having well defined minimal touchpoints.
Separate the engine from the I/O and user interface. This is also the key to porting it to different systems.
For example, if your engine #includes windows.h or stdio.h, you're doing it wrong.
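A rough sketch of what that separation can look like on disk (file names purely illustrative):

engine.c    - pure computation, no I/O or OS headers at all
engine.h    - the only interface the front ends ever see
main_cli.c  - command-line front end: read input, call the engine, print results
main_gui.c  - GUI front end: same engine, different presentation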
Scribed in gold with a Rosetta code and complete instructions to recreate the computer it runs on. That ought to last a while.
Or write everything in lisp.
I foresee that in the coming years I too will long for "old" technology and devices that just do what they are advertised to do, unencumbered by data harvesting and loads of hidden "features" that are not really there for my benefit. I can already relate to the farmers wanting old school tractors that just do the job.
This model works well, and if something breaks it's trivial to fix. Also, it means non-modal physical controls for absolutely everything. I've driven newer cars, and there's nothing in them that I miss when I go back to my own car.
A lot of functions are controlled by separate hardware modules. The turn signal relays and clicker, for example, are contained inside the emergency flasher button, and swapping it out takes less than a minute. Same for the windshield wipers (relay box under the steering wheel), radio (standard double-DIN), rear-window heater (relay and timer inside the dashboard button), window and lock controller (part of each window motor), and the headlight switch (a switch on the dash that physically switches the light circuits).
Not even Apple CarPlay, which I see as largely useless: my windshield-mounted Garmin GPS unit and a magnetic smartphone mount, paired with the Bluetooth-enabled stock radio, do everything CarPlay does, but better.
However, the software is closed source and the licenses can't be renewed; in the end, that's what makes it unmaintainable.
Is there an older piece of software that we still routinely use?
which underlies the LaTeX macro set that is still in use for academic papers.
Also, TeX is a good early example of a complex system that was intended to have freely available source code (before the term "open source" and before GNU).
When he created TeX, Knuth said there was a lack of example source code that could be freely examined by students and others in the field, so he released the code and in fact published it in book form along with the source for Metafont and the code for the Computer Modern fonts.
vi is from 1976 
IBM VM dates back to 1972. IBM's other mainframe operating systems trace their lineage back to OS/360, which is 1966. The XEDIT editor is 1980.
I believe codebases for all of the above are still in active development and are certainly in active use.
Stuff like cat, sed, awk, od, dc, head, vi, echo, sh, uniq, diff, sort, and so on.
I suspect, due to its widespread use in embedded systems of all sorts, there will be things still running MS-DOS 50 years from now.
Pretty sure SCO Unix (yes, you can still buy it) is System V tar. Probably Solaris, too.
I recall AIX tar is...something else, none of the above. I don't recall the details.
A tar command appeared in Seventh Edition Unix, which was released in January 1979. There have been numerous other implementations, many of which extended the file format. John Gilmore's pdtar public-domain implementation (circa November 1987) was quite influential, and formed the basis of GNU tar. GNU tar was included as the standard system tar in FreeBSD beginning with FreeBSD 1.0. The tar shipped in modern FreeBSD (bsdtar) is a complete re-implementation based on the libarchive(3) library; it was first released with FreeBSD 5.4 in May 2005.
To know if something will work in 50 years you have to look at the weakest links and that's not necessarily the language.
If in the meantime Bash has fallen out of favor, have fun identifying what all these cryptic commands do without an interpreter available to check your assumptions about what they're doing.
Yes, granted, Bash is more likely to stay, since it has stayed for a long time now. But given a future where there's neither Bash nor Python anymore, I'd prefer porting the Python script.
But I'd sooner port a Python 2 script from a few years ago to Python 3 than try to understand a Bash script from a few years ago.
2. Bash is not that complex of a language; it mostly just has somewhat arcane shorthands, like `$@` and `-n` and such. So, yes, you need the manual, but it's not a lot of mental effort.
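To spell those two out - a trivial sketch, nothing more:

#!/bin/sh
# "$@" expands to all of the script's arguments, each one individually quoted;
# -n inside [ ] tests whether a string is non-empty.
for arg in "$@"; do
    [ -n "$arg" ] && printf '%s\n' "$arg"
done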
Well, Python 2 code will likely just work too.
> 2. Bash is not that complex of a language; it mostly just has somewhat arcane shorthands, like `$@` and `-n` and such. So, yes, you need the manual, but it's not a lot of mental effort.
If you're only counting bash qua bash then yes, but in practice most of the logic of a "bash script" will be in (non-standardised) commands that are invoked from it rather than the script itself.
Certainly with Python being one of the top languages in the world today, the pressure to support backwards compatibility is much higher than it was back in 2007.
We are in the very infancy of computer programming. Computers will likely be around for centuries, maybe even thousands of years. The best time to remove historical baggage and get everyone to switch was 10 years ago, the second best is now. IPv6 shows that if you "treat users nicely" you might not live to see the results.
And obviously we benefit too because we get to use a better language.
I don't believe it would take that much work to keep Python 2 in maintenance mode, and I also think that if the PSF asked for a company to officially take over, they would easily find one.
Things like pytorch (I only want gcc-7.3.1) and tensorflow (I want JAAAVA) won't be so resilient obviously.
It's a dynamically linked executable, just like so many other programs.
$ file /usr/bin/grep
/usr/bin/grep: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, stripped
A miserable pile of libraries!
Good software is good. Well, uh, yeah. Anything else?
Write software that people actually want?
Well gosh darn, why didn't I think of that!
Also, GNU grep is not antifragile. I learned that when I upgraded my PCRE to a new major version and suddenly grep failed. grep is used by configure scripts, so without a working grep you can't compile a new grep. To me that is the very opposite of antifragility.
Bootstrapping is a very common problem with basic tools.
This mentality is the reason I chose the most compatible and time-proven formats as the base of my project: Text files, and PGP. For glue, I chose SQLite, Perl, and HTML.
For a simulation of the first 20+ years of use, I made it compatible with older browsers: Mosaic 1.0 for now, and I plan to test on older platforms in the future.
Those things were monstrously durable.
Truth is, the 2009 Mazda 3 was actually one of the first from the BL-Generation, while the 6 is the last one from the GH-Generation, which is from 2008.
So, yeah - it feels like an older car, although it's technically newer.
Which is fine until you get in an accident. Then you'd be wishing you had bought a newer, safer car....
Just like VWs and other small cars, they're great for driving around the city but not ideal for extended high-speed travels. You would be far worse off on a motorcycle, in any case.
Youtube is full of videos of folks buying nearly new cars that were totaled by insurance (usually due to accidents that triggered the airbags) for pennies on the dollar and fixing them up.
* Only use technology that has already existed for at least 10 years and is widely used: bash, python3, C, and Linux
* No external libraries, only the python standard library and the C standard library
* No monolithic design; it's a mix of about 30 standalone utilities that I can rewrite gradually if, for example, python3 ever falls out of fashion.
* Keep the line count low
* Rely on proven abstractions like processes, files, pipes, the environment list, etc. (composition sketched below)
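The composition then stays deliberately boring - the utility names below are invented, but the shape is the point:

$ ./extract-entries data/ | ./filter-year 2019 | ./render-report > report.txt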
Linux-packaged software will survive, because it's based on a system of lots of mirrors. But most new software isn't packaged, it's just available as source modules on one or two sites, or in GitHub releases. Who's mirroring all that?
I'm not willing to build personal stuff on unstable or proprietary ground either.
To me it's like building your house on an iceberg or in a prison.
For server-side software we already have some relatively safe ground in that regard, but for GUIs I only know of frameworks with no clean API to abstract away their implementations, and which come with many dependencies and idiosyncrasies that can be hard to implement efficiently on top of other libraries. OpenGL is an API, but it's complex (designed for efficient 3D) and doesn't cover everything (windowing, mouse/keyboard events, basic primitives, text, images, etc.).
That's why I designed BWD:
Currently I'm building a toolkit on top of it (BWD just provides a graphics canvas; it doesn't have "components").
I want something more powerful (in a typed programming language capable of efficient data manipulation, rather than a text-based scripting language) and more flexible (a basic API with the fewest constraints, not a toolkit with fixed design and implementation choices), etc.
Very appealing, but long-lasting software --- grep, SQL, java --- is maintained long after "shipped".
The thing that makes it easy for you to create better software, also makes it easy for someone else to do the same to you.
They can take your idea, with a slightly different perspective, that makes it better for more people (even if "worse" in terms of what you aimed at). Which is probably what you did to someone else in the first place.
I have a Mathematica binary for Linux from 2000 that still runs mostly-OK. It's statically linked, and can't find fonts (because the distributions moved the location).