It's a very nice read with many good points, but anyone with some experience in IT projects could argue with it. The author takes one side without any self-criticism.
It is true that configure scripts probably do some useless things, "31,085 lines of configure for libtool still check if <sys/stat.h> and <stdlib.h> exist, even though the Unixen, which lacked them, had neither sufficient memory to execute libtool nor disks big enough for its 16-MB source code", etc. But then what is the alternative? Having every programmer who wants to release some software write a configure module from scratch each time? This is called code reuse, and yes, it's not perfect, but it saves time: by not reinventing the wheel again and again, by reusing something that is stable and has been around for some time. Such a thing probably generalizes over many architectures and does some useless work, but then again, who cares about some extra 5-10 seconds of the "configure" command, when you are covered for all those strange corner cases it already handles?
You could start out removing the autocrap checks for <sys/stat.h> and <stdlib.h>.
Then you could eliminate all the other autocrap checks which come out one and the same way on every single OS in existence.
And in all likelihood, you would find out that you don't actually need autocrap at all, because the remaining two checks can be done with an #ifdef instead.
I built a fair bit of OSS on SUA back when that existed. Autoconf projects were wonderful: download the tarball, if the tarball is older than the OS I'm building on then replace config.sub, ./configure, make, make install. All standard, all scriptable; never once had a problem with a project that used autotools (or, in fairness, CMake or SCons - all the big reasonably standardized build systems work).
People who'd "kept it simple" like you suggest were the bane of my life. I spent more time debugging each of those builds than all of the builds that used actual tools combined.
(Ironically enough "all projects must use autotools" seems like a quite cathedrally attitude though)
Last I heard, Microsoft Windows became obsolete something like ten years ago. Is there any operating system other than UNIX left out there that your application would have to be agnostic to?
Everything else consists of non-portable APIs that don't necessarily have anything to do with UNIX.
It doesn't matter that the kernel is UNIX like, if what is on top of it isn't.
And anyone that only knows GNU/Linux as UNIX should read "Advanced Programming in the UNIX Environment" by W. Richard Stevens/Stephen A. Rago, to see what it actually means to write portable UNIX code across POSIX implementations.
Android? In this context, that's UNIX. MacOS X is UNIX. Even the z/OS on the mainframe has a UNIX03 compliant UNIX subsystem. You were saying...?
GUI is dead; if your application doesn't run on the server and display the results either on the command line or in a web browser, you're doing it wrong.
> Android? In this context, that's UNIX. MacOS X is UNIX. Even the z/OS on the mainframe has a UNIX03 compliant UNIX subsystem. You were saying...?
So you use VI and Emacs on Android, generate PostScript and troff files, configure /etc/passwd and /etc/init.d?
What does your Android .profile look like?
Yes, z/OS has a POSIX subsystem; it also doesn't support anything besides CLI, daemons, and batch processing.
Mac OS X is a certified UNIX; however, none of the APIs that matter (you know, the ones written in Objective-C and Swift) are UNIX.
> GUI is dead; if your application doesn't run on the server and display the results either on the command line or in a web browser, you're doing it wrong.
Better tell all of those that earn money targeting infotainment systems, medical devices, factory control units, GPS units, iOS, Android, game consoles, smart watches, VR units, POS, ... that they are doing it wrong.
So you use VI and Emacs on Android, generate PostScript and troff files, configure /etc/passwd and /etc/init.d?
I don't use Android, because it's a ridiculously hacked-up version of GNU/Linux (as if being based on GNU/Linux isn't bad enough).
Have you spawned a shell on it? The filesystem is a royal mess, the likes of which I've never seen before. Could I run vi and groff and even SVR4 nroff on it? Yes, if I wanted to waste my time with it, I could.
What does your Android .profile look like?
I didn't touch .profile because I don't care for bash one bit, but it was there.
However, in this context, it's still UNIX. A hacked-up, ugly UNIX severely mutilated to run on mobile telephones and tablets, but conceptually UNIX nevertheless (honestly, I have never seen anything as hacked-up and mutilated as Android, and you can bet that in 30+ years of working with computers, one sees all kinds of things).
Depends on if his employer decides to start targeting OS/2 ATM's, THEOS desktops, non-IBM mainframes, non-POSIX RTOS's, or the lonely market for Amiga's. ;)
Hey, that's cheating if you're putting all the OS-specific functions in their own app that moves data to/from those. It's what high-security did for Ada, Java, and UNIX runtimes on separation kernels. Significant performance penalties in many desktop applications, though.
Desktop is dead!!! The '90's of the past century called and said they want the desktop back!
People don't want a clunky computer any more; except for computer people, I don't know anybody from general population who has one. I'm offended that we're even wasting time discussing desktop anything!
People just happen to carry their desktops in their pockets
You mean they carry their portable UNIX servers in their pockets with them. Since they all come with a web browser, there's your application's or your server's front end.
and use this thing called apps on them.
I have a few of those on my mobile UNIX server as well. Stupidest thing I've ever seen or used, "apps". What for, when they could have used a web browser to display their front ends, or could have run on a backend server and just sent the display to the web browser? Most of those "apps" I use won't function without an InterNet uplink anyway... pure idiocy.
It's only ad absurdum if you're completely unaware of the fact that UNIX (in this case not GNU/Linux, but illumos / SmartOS) is a high performing operating system with extensive facilities for preemption and traceability, which makes it ideal for web applications, and at scale, too. Haven't heard of SmartOS yet, have you, since you claim UNIX unfit for cloud applications?
The 90's desktop market was more interesting. Yet, you must have never met anyone doing applications that require fast CPU's and plenty RAM. Or looked at the desktop sales figures that are well above zero.
Hell, I built one for a guy's gaming rig a little while back. That replaced his hand-me-down from a company designing & installing sprinkler systems. Why did he have that? They had just bought new desktops for all their CAD users. Lots of CAD users out there probably have one too. ;)
The 90's desktop market was more interesting. Yet, you must have never met anyone doing applications that require fast CPU's and plenty RAM.
Plenty of RAM? Yes, but on supercomputers. My machines were sgi Origin 2000's and 3800's running a single compute intensive application doing finite element analysis and using 16 GB of RAM, across all the CPU's in the system. A single calculation would usually take a month.
On the desktop, you couldn't be more wrong: I was part of the cracking / demo scene, and we literally counted clock cycles in order to squeeze every last bit of performance in our assembler code, me included.
I'm jealous that you got to play with SGI Origins. I wanted one of those or an Onyx2, but they cost too damn much. At this point, though, you're partly supporting my claim: certain workloads necessitate either a desktop, a server, or a bunch of them. These include artists, scientists, designers of various equipment, gamers, well-off laypersons wanting instant response time, privacy nuts needing cores for VM-based browsing, and so on. Not relegated only to "computer people" as you falsely claimed.
One can also look at the numbers. Last year, over 17 million PC's were sold in the US. Think the buyers were really all computer people? Even with a 3-year refresh cycle, on the low end, that'd be an estimate of around 50 million computer people in this country buying desktops over 3 years. Think they're really that big a demographic?
Well, I'd argue that all those people you listed are either professionals in diverse fields, or enthusiasts. If you take 17 million PC's sold, just in the United States, that's 17 / 300 * 100 = 5.66% of the population. And I was conservative in using 300 million as the total U.S. population, when I've read it's more like 321 million, so what does that tell you?
But if you look at the number of PC's sold year over year, the number is dwindling at the rate of roughly 15% - 18% per year. Look, for example, under the "Global Computer Sales" column, here:
The average layperson doesn't want a computer any more, and the sales reflect that. A tablet or a mobile telephone with a web browser is pretty much all they need, and the web can and does now deliver pretty much any kind of application they could ever need or want. And that's precisely where most of the sales of desktops were. Professionals using computer aided design and people like you and me are few and far between, in comparison.
On an sgi related note, I myself owned several Octanes and even an sgi Challenge R10000 (with a corresponding electricity bill). I must have torn and rebuilt that Challenge four or five times, just for fun. My primary workstation for years (which I fixed and put together myself) was an sgi Indigo2 R10000, with an hp ScanJet II SCSI scanner, a 21" SONY Trinitron, and a Plextor CD-RW SCSI drive, back in the day when CD-RW was "the thing". With 256 MB of RAM when most PC's had something like 16 or 32 MB, it was a sweet setup. Ah, IRIX 6.5, how much I miss thee...
Is that one of those failed Microsoft tablet thingies? Why would anyone care about a GUI in the 21st century, when everything runs either on stdout/stderr or on the web?
Anyway, the answer to your question is iPad Pro by Apple Computer. It runs an operating system called "iOS" which is a heavily customized FreeBSD on top of a custom CMU Mach kernel. And it's UNIX03 compliant. UNIX! It's everywhere!
> Anyway, the answer to your question is iPad Pro by Apple Computer. It runs an operating system called "iOS" which is a heavily customized FreeBSD on top of a custom CMU Mach kernel. And it's UNIX03 compliant. UNIX! It's everywhere, it didn't go away, and it won't die!
I'm aware of it, it's not good enough. Its UI is terrible when you need to work with multiple applications, it's a pain to customize anything and even more of a pain to run your own programs.
I don't, what for? All I'd need is SSH to connect to my illumos based systems, and a web browser to use my applications. Compiling things on a lone desktop like back in the '90's? No, that's what I did when I was a kid and didn't know any better. I have infrastructure for that now. Cross compilers, too.
> who cares for some extra 5-10 seconds of the "configure" command
For me, it's closer to a minute. "configure" is good enough that it does the job, and it's hard to replace. "configure" is bad enough that I loathe it with emotions that words cannot describe. Its design is terrible. It's slow. It's opaque and hard to understand. It doesn't understand recursion (module code? pshaw!)
automake is similarly terrible. I looked at it 20 years ago and realized that you could do 110% of what automake does with a simple GNU Makefile. So... that's what I've done.
I used to use libtool and libltdl in FreeRADIUS. They gradually became more pain than they were worth.
libtool is slow and disgusting. Pass "/foo/bar/libbaz.a", and it sometimes magically turns that to "-L/foo/bar -lbaz". Pass "-lbaz", and it sometimes magically turns it into linking against "/foo/bar/libbaz.a".
No, libtool. I know what I'm doing. It shouldn't mangle my build rules!
Couple that with the sheer idiocy of a tool to build C programs which is written in shell script. Really? You couldn't have "configure" assemble "libtool.c" from templates? It would only be 10x faster.
And libltdl was just retarded. Depressingly retarded.
I took the effort a few years ago to replace them both. I picked up jlibtool and fixed it. I dumped libltdl for just dlopen(). The build for 100K LoC and ~200 files takes about 1/4 of the time, and most of that is running "configure". Subsequent partial builds are ~2s.
If I ever get enough time, I'll replace "configure", too. Many of its checks are simply unnecessary in 2016. Many of the rest can be templated with simple scripts and GNU makefile rules.
Once that's done, I expect the build to be ~15s start to finish.
The whole debacle around configure / libtool / libltdl shows that terrible software practices aren't new. The whole NPM / left-pad issue is just "configure" writ large.
Actually, the configure debacle doesn't have anything to do with terrible software practices. All of Sun, AT&T, HP, SGI, IBM, DEC, the BSD guys, the semi-embedded guys, and everyone else had the best architects they could get. They were (and are) brilliant and did brilliant things. Kemp is one of them, for example. Heck, you can complain about Microsoft and Apple, but you cannot say they're incompetent.
Unfortunately, there are two problems.
1. They were all operating under different requirements.
2. They were all absolutely convinced that they were the best in the business and that they were right.
As a direct result, those of us who got to deal with more than one of the resulting systems want to beat them all to death with a baseball bat with nails driven into the end.
> Actually, the configure debacle doesn't have anything to do with terrible software practices.
I don't mean that reason to use configure is bad. There are many different systems, and being compatible with them all requires some kind of check / wrapper system.
I mean that the design of "autoconf" and the resulting "configure" script is terrible. Tens of thousands of lines of auto-generated shell scripts is (IMHO) objectively worse than a collection of simple tools.
See nginx for a different configure system. It has a few scripts like "look for library", and "look for header file". It then uses those scripts multiple times, with different input data.
In contrast, configure uses the design pattern of "cut & paste & modify". Over and over and over and over again. :(
Here's an autoconf-compatible and much, much shorter configure system that isn't expanded from macros and remembers that bash has functions. Look, you can actually maintain it!
echo "This configure script requires a POSIX-compatible shell"
echo "such as bash or ksh."
echo "THIS IS NOT A BUG IN LIBAV, DO NOT REPORT IT AS SUCH."
Most of libav's configure script is dependency trees for all the codec flags you can turn on/off, so without those it's quite compact. x264 reuses it at 1,500 lines:
Mostly, he's lamenting the need for it. The fact that after 3 decades this thing that every modern OS needs isn't a standard.
It actually is kind of silly that you can't depend on this stuff being abstracted, but must individually test for it rather than asking a reference on a given system.
And yet when it comes to browser compatibility, it's encouraged, nigh necessary, to test for capabilities rather than check version strings and assume that the version string is saying something meaningful about the environment.
"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36", indeed.
Insane complexity in build systems just makes your life sad. Trust me, I know -- I write java for a living. We have ivy, maven, ant, gradle, sbt, leiningen, and I'm sure a few more.
I write Scala for a living. Once you embrace the straitjacket of maven you never have to worry about your build again. You write a simple declarative definition, where you just list your project's name/version/etc. and your dependencies. Then you build it. When you want to do something, e.g. releasing, you find the standard plugin for doing that, use it, and deal with its defaults. When you decide you absolutely have to pass different compiler arguments depending on the phase of the moon, slap yourself until it passes. When you actually want to write a "reasonable" custom build step that makes sense, writing it as a plugin is not hard and makes it much easier to use good practices.
I write scheme not for a living. My build script is a scheme file or a makefile, depending on the project.
The build complexity is one of the reasons I stay away from Java: If these people think they need XML for builds, what other needlessly complex horrors have they perpetrated? And that sort of thing.
The Maven POM syntax might look bloaty to modern post-XML eyes, but it's not actually complicated: a typical POM only specifies the name of the project and its dependencies. The rest is typically all done by convention. It's also purely declarative so ordering doesn't matter.
M4 and sh are very concise languages. Nonetheless autotools is orders of magnitude more complex than Maven. You really can't compare at all.
At any rate, if you want a more concise syntax there is gradle (but it's a bit slower as it's actually executing a real scripting language) and, perhaps a nice middle ground, a thing called Polyglot Maven which is the same build engine but with a variety of non-XML syntaxes. The YAML one is quite nice:
To this effect, I use leiningen (which is mostly a frontend atop Maven's dependency management) even when I'm building a Jython or a Java project. If there is something funky I need to do as a build/deploy step, I'd rather be writing it in Clojure than Java, and mostly I just want to paste [com.example/foo "1.5.0"] into my :dependencies.
I'm beginning to think that build scripts should be written in the language they are building, or a more expressive language. So many seem to go in the other direction.
I disagree. The worst problems I see with build scripts are that they are trying to do too much, create branches/tags of themselves, be environmentally aware, etc.
I really do agree that complexity is the enemy, but people keep shoving complexity into build scripts against better judgment. I've given up and would rather deal with a complex ruby script than a makefile of equal intrinsic complexity.
That looks nice. It combines strong, static analysis with simple DSL and underlying power in language if necessary. Looks like The Right Thing approach in action.
You actually NEED arbitrary code execution during builds. Allow me to explain...
Let's say we have a make format called dmake. It invokes $CC with the specified arguments for each file, and links them together into a binary/so/whatever, putting it into the build directory and cleaning artifacts. Okay.
Now say that you start a new project in rust. Well, crap, dmake doesn't work. You have to use rdmake, which is built by different people, and uses a more elegant syntax - which you don't know.
Then you write Haskell, and have to use hdmake - which of course is written as a haskell program, using a fancy monad you don't know, and python has to use pydmake, and ruby has to use rbdmake, and scheme has to use sdmake, and lisp has to use ldmake, and asm has to use 60 different dmakes, depending on which asm you're using.
Instead, we all use make. Make allows for arbitrary code to be executed, so no matter what programming environment you use, you can use a familiar build tool that everybody knows. Sure, java has Ant, Jelly, Gradle and god knows what else, and node has $NODE_BUILD_SYSTEM_OF_THE_WEEK, but even there, you can still use make.
You haven't countered the parent's point at all. You could just as easily say that the common subset of SQL is implemented extremely differently in SQL Server, Oracle, Postgres, etc., and that therefore declarative SQL has no advantages over imperative C API's for database engines. Funny stuff.
Let's try it then. The declarative build system has a formal spec with types, files, modules, ways of describing their connections, platform-specific definitions, and so on. Enough to cover whatever systems while also being decidable during analysis. There's also a defined ordering of operations on these things, kind of like how Prolog has unification or old expert systems had RETE. This spec could even be implemented in a reference implementation in a high-level language & test suite. Then, each implementation you mention, from rdmake to hdmake, is coded and tested against that specification for functional equivalence. We now have a simple DSL for builds that checks them for many errors and automagically handles them on any platform. Might even include versioning with rollback in case anything breaks due to inevitable problems. A higher-assurance version of something like this:
Instead, we all use make. Make allows for arbitrary code and configurations to be executed, so no matter what configuration problems you have, we can all use a familiar build tool that everybody knows. That's the power of generic, unsafe tools following Worse is Better approach. Gives us great threads like this. :)
From the perspective of security, make is not great, but there's always a more complicated build, requiring either generic tooling, or very complex specific tooling. This is why the JS ecosystem is always re-inventing the wheel. If you design your build tool around one abstraction, there will always be something that doesn't fit. What will happen if we build a tool akin to the one I described is that it will grow feature upon feature, until it's a nightmarish mess that nobody completely understands.
>You could've just as easily said the common, subset of SQL could be implemented extremely different in SQL Server, Oracle, Postgres, etc. Therefore, declarative SQL has no advantages over imperative, C API's for database engines. Funny stuff.
No, that's not my point, my point is that a build tool that meets parent's requirements would necessarily be non-generic, and that such a tool would suffer as a result.
>Instead, we all use make. Make allows for arbitrary code and configurations to be executed, so no matter what configuration problems you have, we can all use a familiar build tool that everybody knows. That's the power of generic, unsafe tools following Worse is Better approach. Gives us great threads like this. :)
Worse is Better has nothing to do with this. Really. Make is very Worse is Better in its implementation, but the idea of generic vs. non-generic build systems, which is what we're discussing, is entirely orthogonal to Worse is Better. If you disagree, I'd recommend rereading Gabriel's paper (that being Lisp: Good News, Bad News, How to Win Big, for the uninitiated). I'll never say that I'm 100% sure that I'm right, but I just reread it, and I'm pretty sure.
"No, that's not my point, my point is that a build tool that meets parent's requirements would necessarily be non-generic, and that such a tool would suffer as a result."
A build system is essentially supposed to take a list of things, check dependencies, do any platform-specific substitutions, build them in a certain order with specific tools, and output the result. Declarative languages handle more complicated things than that. Here's some examples:
I also already listed one (Nix) that handles a Linux distro. So, it's not theory so much as how much more remains to be solved/improved and if methods like in the link can cover it. What specific problems building applications do you think an imperative approach can handle that something like Nix or stuff in PDF can't?
Didn't know that. Interesting. It looks like an execution detail. Something you could do with any imperative function but why not use what's there for this simple action. Nix also manages the executions of those to integrate it with their overall approach. Makes practical sense.
"It's fully generic."
It might help if you define what you mean by "generic." You keep using that word. I believe declarative models handle... generic... builds given you can describe about any of them with suitable language. I think imperative models also handle them. To me, it's irrelevant: issue being declarative has benefits & can work to replace existing build systems.
So, what's your definition of generic here? Why do declarative models not have it in this domain? And what else do declarative models w/ imperative plugins/IO-functions not have for building apps that full, imperative model (incl make) does better? Get to specific objections so I can decide whether to drop declarative model for build systems or find answers/improvements to stated deficiencies.
That wasn't what the original post by ashitlerferad was calling for. I have no problem with generic declarative-model build systems that can be used for anything. However, the original call was for build systems which don't require arbitrary code execution. A generic build system must deal with many different tools and compilers, and thus REQUIRES arbitrary code execution: somewhere, there's got to be a piece of code telling the system how to build each file. And if you don't build that into the build system proper, you wind up either integrating everything into the core, or adding an unwieldy plugin architecture and winding up like grunt/gulp and all the other node build systems. Or you could just allow for arbitrary code execution and dodge the problem altogether. This is possible in a declarative system, but it's a lot harder to do, and it means at least part of your system must be imperative.
It seems some kind of arbitrary execution is necessary. I decided to come back to the problem out of curiosity to see if I could push that toward declarative or logic to gain its benefits. This isn't another argument so to speak so much as a brainstorm pushing envelope here. Could speculate all day but came up with a cheat: it would be true if anyone had replaced make or other imperative/arbitrary pieces with Prolog/HOL equivalents. Vast majority of effort outside I/O calls & runtime itself would be declarative. Found these:
Add to that Myreen et al's work extracting provers, machine code and hardware from HOL specs + FLINT team doing formal verification of OS-stuff (incl interrupts & I/O) + seL4/Verisoft doing kernels/OS's to find declarative, logic part could go from Nix-style tool down to logic-style make down to reactive kernel, drivers, machine code, and CPU itself. Only thing doing arbitrary execution, as opposed to arbitrary specs/logic, in such a model is what runs first tool extracting the CPU handed off to fab (ignoring non-digital components or PCB). Everything else done in logic with checks done automatically, configs/actions/code generated deterministically from declarative input, and final values extracted to checked data/code/transistors.
How's that? Am I getting closer to replacing arbitrary makes? ;)
...I'm not sure I totally understand. Here's how I'd solve the problem:
Each filetype is accepted by a program. That program is what we'll want to use to compile or otherwise munge that file. So, in a file somewhere in the build, we put:
*.c:$CC %f %a:-Wall
*.o:$CC %f %a:-Wall
And so on. The first field is a glob to match on filetype, %f is the filename, %a is the args, and the third field is default args, added to every call.
Target all is run if no target is specified. The first field is the target name. The second field is list of files/targets of the same type, to be provided to compiler on run. It is assumed the target and its resultant file have the same name. The last field is a list of additional args to pass to the compiler.
This is something I came up with on the spot, and there are certainly holes in it, but something like that could declarativise the build process. However, this doesn't cover things like cleaning the build environment. Although that could be achieved by removing the resultant files of all targets, which could be determined automatically...
There you go! Nice thought experiment. Looks straightforward. Also, the individual pieces, as far as parsing goes, could be auto-generated.
Far as what I was doing, I was just showing they'd done logical, correct-by-construction, generated code for everything in the stack up to the OS, plus someone had a Prolog make. That meant just about the whole thing could be done declaratively and/or correct-by-construction, with the result extracted with basically no handwritten or arbitrary code. That's the theory, based on worked examples. A clean integration obviously doesn't exist. The Prolog make looked relatively easy, though. The Mercury language would make it even easier/safer.
Yep. If you're running it for a build, you're running unprivileged - chrooted, jailed, or zoned if you want to be really safe - and if you're running it for install, then you trust the software in any case. And because makefiles are fairly transparent, you can check what the install is doing beforehand.
XML files cannot be easily processed with standard UNIX tools like grep, sed, and AWK. XML requires specialized libraries and tools to process correctly, making it an extremely poor choice for... well, just about anything. It's a markup format for text, not a programming language.
Building software is a programmatic process. No XML please! We're decidedly not on Windows, and since I have the misfortune of fitting such square pegs into round holes, please don't use XML for applications which must run on UNIX. It's a nightmare. It's horrible. No!!!
There is no particular relationship between Windows and XML. And just to play devil's advocate, is the lack of XML support in grep, sed, and awk a problem with the data format or with the tools? Why can't we have new standard tools that operate on hierarchical formats such as XML / JSON / YAML? Current standard Unix tools have plenty of flaws and as forward thinking developers we shouldn't be afraid to replace them with something better.
I have noticed a particular relationship between Windows, Java, and XML: all Java programmers nowadays seem to come from Windows (and then I end up with ^M CR characters in all the text files, even shell scripts!), use Java, and write configuration in XML.
YAML doesn't need any special tools - it's ASCII and can easily be processed with AWK, for example.
I don't know about you, but the last thing I want is to have to have a whole new set of specialized tools, just so somebody could masturbate in XML and JSON.
XML is a markup language. That means it's for documents, possibly for documents with pictures, perhaps even with audio. It's not and never was meant for storing configuration or data inside of it. XML is designed to be used in tandem with XSLT, and XSLT's purpose is to transform the source XML document into (multiple) target(s): ASCII, ISO 9660, audio, image, PDF, HTML, whatever one writes as the transformation rules in the XSLT file. XML was never meant to be used standalone.
If you really want to put the configuration into an XML file, fine, but then write an XSLT stylesheet which generates a plain ASCII .cf or .conf file, so its processing and parsing can be simple afterwards. XML goes against the core UNIX tenet: keep it simple.
Do you like complex things? I do not, and life is too short.
If you must have structured data, use a lisp program. Congratulations on using a format that was designed to be executable. And if it's a build tool, you better believe it's executable. I suspect that Annatar is a Murray Hill purist (I don't know for sure), so he may disagree with me.
Of course, like any real programming language, it's hard to process with regex, but then again, I don't want to process makefiles with regex. And you might have some luck coaxing AWK or the SNOBOL family to parse it, and it would be far easier than doing the same with XML.
>please don't use XML for applications which must run on UNIX. It's a nightmare. It's horrible. No!!!
I'd disagree with you there. DocBook, HTML, and friends, are all good applications of XML (or near XML), doing what XML was designed for: Document Markup.
Seriously people, when you're writing a program in a language that has "Markup Language" in the name, does that not ring any alarm bells?
No, I wrote that XML for use in applications is bad, as it cannot be easily processed with standard UNIX tools. And it's most definitely bad for building software, as it is limited by what the programmer of the build software thought should be supported. A really good example of that is ANT/NANT. make, on the other hand, doesn't limit one to what the author(s) thought should be supported. Need to run programs in order to get from A to B? No problem, put whatever you need in, and have it build you the desired target.
First off, I simply used some of PCRE for the syntax, as it's what I'm familiar with. \w could be easily replaced, and non-greedy matching is a relatively common extension.
As for when your record spans multiple lines, with recursive structures: the previous regex is for extracting simple atomic data from a JSON file, which is usually what you want in these cases anyway. If not, the json(1) utility can, I believe, extract arbitrary fields, and composes well with awk, grep, etc.
Yes, the json utility can process a JSON file into key:value pairs. Now ask yourself: if you end up with key:value pairs on stdout, why couldn't that have been the format in the first place? Why artificially impose not one, but two layers of complications (one as JSON, the other as the specialized json tool to process JSON)? Why not just keep it simple and go directly to a flat file ASCII format to begin with?
Well, it means not rolling your own parser. But that's not hard. The real advantage is when you actually ARE dealing with structured data, with nested objects. Most standard UNIX formats are bad at this, and sometimes you find it necessary.
Also, because JSON is so common, you get really good tooling for handling structured data by default, instead of kinda-okay tooling for 50 different slightly-incompatible formats. 10 operations on 10 datastructures vs 100 operations on 1, and all that.
But for unstructured data, or for one-level key/value data, JSON is overkill. You can use DSV, like this:
>who cares for some extra 5-10 seconds of the "configure" command
Building freebsd-7 ports on my Athlon felt like "forever and 2 more days" back then. If it is not possible to get rid of autoconf/configure with all its obsolete checks, can we at least PLEASE stop doing the same thing again and again, 220 times, once for each small package in an enormous dependency list? Caching, anyone?