As I discovered in practice, it seems to be a trend with authors of such small-size software to also forgo any kind of documentation: what options are supported by specific commands, what behaviours they exhibit (which sometimes differ between GNU and BSD). The ‘usage’ messages are also terse and cryptic to the max. Doubly baffling when the audience of the software is supposedly corporations building their userspace. Do the authors sell the docs on the side, or what?
Anyway, for users trying to do stuff with these utils, the available paths seem to be either finding StackOverflow answers from others who already messed with their own systems and survived, or being the first to do the trial-and-error.
This is what I think PowerShell has contributed. You literally can't make a utility that doesn't have documentation, because it's generated from the code.
It appears that each command is documented at the top of its respective source file. Though I may be misunderstanding and you mean that that documentation is insufficient.
Regardless, I agree that there should be a proper man page with more detail.
It's been said that the best documentation is source code. For UNIX I have always found this to be solid advice. What's nice about small programs, such as this one, is that it's more feasible to use the source code as documentation.
"Anyway, for users ..."
Ah, that's me. I am a user. I personally love terse and cryptic, i.e., minimising the number of keys I have to press.
IME, multicall binaries like the OP presume some pre-existing knowledge of how to use the utilities they contain. It seems nonsensical to complain about missing documentation given the purpose of a multicall binary is to conserve space.
With BSD, the utilities in crunched binaries are generally the same ones, i.e., built from the same source code, as the ones in the userland. With GNU busybox, they are not, and one needs to read the source code to understand the differences.
Even if I can figure out how to use a program from reading documentation I still look at the source code before using it for the first time.
I keep wondering what it is that makes C programmers especially antisocial and obnoxious. Is it like in that joke:
A father calls his small son to him, pours a shot of vodka and tells him to drink it. The son drinks, then makes a face and says in disgust: “Eurgh, it's disgusting!”
The man replies: “Yeah, you probably thought your dad drinks honey, did you?”
That quote simply means that you shouldn't write spaghetti code, and that another _programmer_ should be able to read it.
Quite clearly, it doesn't carry a meaning of: "hey, every user now has to become a proficient reader of whatever languages this software is written in."
Ah yes, the famed clarity and readability of C, where people are so entrenched in the belief that the compiler limits identifiers to seven characters or whatever, and in the aversion to typing out a single word in full, that they still build whole new languages with module, function and variable names looking like alphabet vomit. And with ungoogleable (ironically) language names to top it off.
I exclude man pages when I create systems with crunched binaries where I want to save space. The man pages are always available online. It's like djb's programs. They just refer the user to his website. I am not a programmer. I am the user. The small systems I create are for me and no one else, so naturally I make them according to personal preferences.
> It's been said that the best documentation is source code
fuck you. it's a myth perpetuated by lazy programmers, who can't be bothered to spend an hour or two writing a comprehensible usage manual, while otherwise EVERY USER IS GUARANTEED to spend MORE just figuring out how to BEGIN using his crap. It's only forgivable for the simplest of self-explanatory /API/ calls, but /nothing/ in userspace.
Posting like this will get you banned here, so please don't do that.
If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful. In particular, we don't want flamewars, and you've unfortunately already posted quite a few flamewar comments. Can you stick to thoughtful, curious comments instead? That's the kind of conversation we're trying for here. Note that that includes being respectful to whomever you're talking with, regardless of how wrong they are or you feel they are.
Documentation != user guide or manual. Manuals are part of the documentation. And it is still true that the source code is the best documentation; ideally the source code can be used to automatically generate the docs. But overall documentation and things like ADRs will hardly ever find their way into the source, so they need to be written externally.
I too am unskilled at navigating unfamiliar codebases and demand CliffsNotes because my time as a user is more valuable than yours as a programmer. These gift horses are AWFUL
> The toybox build produces a multicall binary, a "swiss-army-knife" program that acts differently depending on the name it was called by (cp, mv, cat...). Installing toybox adds symlinks for each command name to the $PATH.
> The special "toybox" command treats its first argument as the command to run. With no arguments, it lists available commands. This allows you to use toybox without installing it, and is the only command that can have an arbitrary suffix (hence "toybox-armv5l").
So it's busybox, rewritten or forked for some reason?
Edit: found it elsewhere in the thread. It's rewritten because they didn't like GPL for religious reasons. (Busybox is a completely separate binary from the rest of the system so there is no reason not to ship GPL-licensed software that I can see, but yeah if you've been bitten by making a closed source work derived from GPL once, I guess you don't want to try it again and Android has a lot of vendors with closed source components.)
Choosing to rewrite software because of a license does not have to be "religious". There are very real reasons why the license matters, especially when it comes to GPLv3 and companies that hold patents.
This is one of the reasons why Google Android, and probably many others, are using Toybox for their userland. Because it's distributed under a BSD license.
Took all the fun out of it (reminded me of "SCO Disease")
Did toybox as a hobby for a bit, no real expectation anybody use it.
Even came back to busybox for a while, on and off, circa 2008-2011 sent toybox patch to busybox, fixed their vi, nbd-client...
Also, I think it's worth listening to the first 10 minutes as an intro, and to the brief on licensing from the 5-minute mark onward, where he explains how BusyBox's licensing was then used by other people as a "beating tool" to get hardware specs.
> Toybox's main goal is to make Android self-hosting by improving Android's command line utilities so it can build an installable Android Open Source Project image entirely from source under a stock Android system.
Worth mentioning that toybox replaced Android's toolbox back in 2015 (with the release of Android 6.0). Toolbox utilities were based on their NetBSD versions, whereas toybox utilities work more like the common GNU coreutils.
I think that comment is easier to parse if you replace "people" with "corporations". Toybox not using the GPL is an important advantage for Android vendors, not Android end users.
As a user, the GPL doesn't do anything to restrict you. It only comes into play when you want to redistribute software.
Hating the idea of intellectual property doesn't make it go away. Neither do permissive licenses. Which is why your "ceteris paribus" statement is effectively worthless.
> And the "companies will be forced to give back to the project" argument never made sense to me.
Care to explain where you think it falls apart?
So far, it simply seems like you're fed up with the system and would rather rage in denial than try to improve the system or use it to your advantage. That may give you some degree of satisfaction, but it's silly for you to assert that everyone benefits from the same. And if you really just want to stop caring about copyright, I don't see why you'd bother advocating for MIT or BSD over GPL instead of just turning to casual piracy like most people who can't be bothered to respect software copyright.
As a user, a permissive license means "no source for you", because the vendor is not obligated by a permissive license to share source with the end user. For the end user, there is no difference between permissive and proprietary licenses.
The GPL license is the only license which protects end-user rights to see and modify source code of the software they use.
The purpose of a permissive license is to encourage proprietary derivatives that come with intellectual property restrictions, EULAs, SaaS with vendor lock-in, etc.
The purpose of a copyleft license is to encourage derivatives that don't come with such restrictions.
Copyleft therefore helps reach a local maximum of freedom for everyone as a function of restrictions on redistribution.
Landley's major issue with GPL is that there isn't "a" GPL anymore. To paraphrase a talk (or multiple talks) I've seen of his: "the GPL", even up to GPLv2 was seen as a universal receiver of source code. For the most part it didn't matter which free/open code was linked to it, GPLv2 could be compatible with it.
Enter GPLv3 which wasn't even compatible with the unmodified GPLv2 and to Landley this made it no longer a universal receiver. Couple that with his disagreement over using the GPL in BusyBox to leverage source code from vendors, he created ToyBox and the 0BSD license. He sees the 0BSD license as a universal donor since GPL is no longer a universal receiver.
I didn't know about the receiver-vs-donor concept, but my first reaction is that the "universal receiver" seems to signal the ambition to gather all free software and make it optionally-free.
I mean for now there is a fashion among contributors to keep software open-source, but they don't like it as an obligation. (Where "contributors" include ICs, but are really mainly businesses these days.) And the notion of "universality" is in fact more about forsaking my right to require the software to remain free. Why would I need that again? It seems that when thinking about progress/ROI open-source turned out to work very well as a fashion, so what's wrong with it being a legal obligation then?
Does the toybox project mix their code with GPLv3+ code?
The weird argument above was that there are apparently too many variations of the GPL (even though there are many, many more BSD and MIT variants). So I said, stick to a single GPL variant. And then you ask if you can mix GPL versions together.
Android has a policy of "no GPL in user space". I think Google makes this concession to the desires of phone manufacturers (Samsung, LG, etc.), but someone else probably knows more ...
Similarly, Apple has a policy against shipping GPLv3 code. When bash upgraded to GPLv3 from GPLv2 ~10 years ago, Apple stopped upgrading it, and then eventually migrated to zsh, which is MIT licensed.
The issue is that GPL is a "viral" license, which hasn't been tested all that much in court, but they're erring on the safe side. They don't want to mix their proprietary code with GPL code.
The code in question was Java code, which has been licensed under (among possibly other licenses) the GPL. The terms of the GPL have not been an issue in the case, but GPL's "virality" AFAIK is premised somewhat on things like API copyrightability.
But that aside, does the Open Source community really want to push for the legal principle that just because you write an independent program which uses a particular API, the license infects across the interface? That's essentially interface copyrights, and if say the FSF were to file an amicus curiae brief supporting that particular legal principle in a kernel modules case, it's worthwhile to think about how Microsoft and Apple could use that case law to f*ck us over very badly.
Apart from the "universal donor" vs. "universal receiver" argument explained in a sibling comment, Mr. Landley was apparently also involved (as a plaintiff) in the busybox lawsuits. He has explained in various talks etc. that, in his opinion, those lawsuits accomplished nothing except to drive away corporate users that had just started to dip their toes into the water with open source.
So one motivation for creating toybox was apparently that he wanted it as a busybox alternative for users afraid of lawsuits.
Perhaps we don't want those kinds of companies that ignore software licenses? Do the companies think that if they shipped unlicensed Windows with their phones, Microsoft would say "this is fine"?
Bottom line is if you're shipping software with your phone that you didn't write, maybe spend the really minimal amount of time to make sure you're in compliance with the license, especially when the license (GPL) makes no onerous demands at all except you might need to put up an apache server somewhere with the source on it.
I'm not disagreeing with you. I think you should follow the licenses of whatever software you're shipping, whether that software is closed-source proprietary software or open source, or something else.
Unfortunately many companies seem to think that because they downloaded the software for free from the internet, they don't need to care about what the license text says. And if they disregard what some random person on the internet whines about them not following some license, a lawsuit seems perfectly in order.
Now one could argue, as Landley seems to be doing, that a permissive license is more attractive to corporations as there is less risk that some mistake somewhere along the way gets them sued. Then one can of course counter that argument by asking whether such users are beneficial to open source in the longer term, or is the open source community just a bunch of suckers doing free work for corporations without getting anything in return.
I don't understand this comment. Google have done countless things both for and against open software.
Any business that wants to use GPL software has a problem.
It's not that the GPL is bad either, or that it can't be used for commercial purposes; it's simply that the traditional business model doesn't include giving anything to anyone else.
So like I said, google "mit vs gpl" to get countless articles and discussions explaining all about that. That is the answer to "what's the problem with GPL here?"
This project is 0BSD, not MIT, but MIT will yield more hits, and it is the same fundamental issue.
Red Hat sells GPL'ed software for billions. Oracle tried to do the same. Amazon created a whole ecosystem (AWS) around GPL'ed code. Even M$ embraces GPL.
Red Hat sells support; their software is free of charge. This is the closest thing to a viable business model around GPL code, and if you look at most companies doing this they are struggling. Red Hat was lucky/smart enough to get into the government space and sell based on their rock-solid security. If your GPL software is unappetizing or irrelevant to the government world, you can't get that revenue stream going.
Amazon "commoditizes their complement" (https://news.ycombinator.com/item?id=25476266) by making open-source easy to use, and hopefully you'll exit through the gift shop and drop some chunky spend on their hosting services.
If your code is the actual valuable thing (for instance, a spreadsheet program that makes Excel look like a joke), then giving the code away for free doesn't give you any path to make money.
Oh right, that's the other business model that probably works. It does diminish your ability to take outside contributions, because you'll need a CLA. (IANAL)
toybox is pretty cool, but there is one big problem (though I am biased): the `bc` it has is old and outdated and not much better than the GNU `bc`.
The last update was several years ago because Rob wanted to maintain the version in toybox himself. It has not gotten much love since.
Source: I am the author of the `bc` in toybox. My current version [1] is much faster and has more math libraries and extensions. Even Android uses my current version rather than the version in toybox.
Edit: mistakenly said that toybox has my `dc`. It does not. Busybox does, and I mixed them up. I've corrected the above.
I have a (very) educated guess, but it's still only a guess.
I think there were two reasons.
The first is historical practice. POSIX themselves say that they don't codify new standards. Instead, they codify existing practice as standards. In practical terms, this means that they look at what all implementations that they care about do, figure out what is common among them, and codify that as a standard.
As far as I can tell, at the time that POSIX was first codified, there was at least one `bc` implementation that only accepted single letter names. So that's what they codified.
Some evidence for this reason: in the Rationale section of the `bc` standard [1], they mention historical practice and implementations several times.
(As an aside, the reason that that implementation only accepted single letter names is because it was a compiler for `dc`, and `dc` only accepts single letter names because of its reverse polish notation and "one letter per command" design.)
The second reason, while related to the fact that historically, `bc` was a frontend to `dc`, is slightly different: it's easier to implement that way.
If you look at the original source of the original Robert Morris and Lorinda Cherry `bc` (which I have because I'm a nerd), its parser only looked for the second letter of each keyword. They could do that because all other identifiers would be single letters. It worked, and they also probably wanted to spend little time on it since most of the users would be programmers like themselves who could deal with such limitations easily, though that is an assumption on my part. (Someday I would like to talk to Morris and Cherry about it, as well as Philip Nelson, the author of the GNU `bc`.)
By the way, the original Morris and Cherry implementation was the implementation in use at the time of codifying POSIX that only allowed single letter names. Its direct descendant is still in use today as the `bc` for the Plan 9 operating system.
Thank you very much for this thorough reply. That is probably it.
Now, suppose I want to write a POSIX shell script that does trigonometric calculations. Should I actually use `bc`? Is there a "translator" that reads `GNU bc` or your `bc` for example, and converts it into POSIX-compliant bc? (meaning converting the symbols to one letter in a way that does not produce conflict, declaring all used variables in a function in the first line (auto), etc)
> Now, suppose I want to write a POSIX shell script that does trigonometric calculations. Should I actually use `bc`?
Yes, you should; `bc` includes trigonometric functions even in POSIX, accessible via the `-l` flag.
However, I'm not sure why you would; if you know you have a `bc` implementation other than a strict POSIX `bc`, I would just use that, even in a POSIX shell script.
POSIX `bc` is extremely limited; it does not even have `else` for `if` statements! It also does not allow you to use comparison operators outside an `if` statement or loop header. It does not have `continue`.
On the other hand, my `bc` is known to work in just about every POSIX system, and there are ports for it in OpenBSD and NetBSD. It is also the system `bc` in FreeBSD for FreeBSD 13. It also builds on Linux for either glibc or musl, and there are packages for it in Arch AUR, Gentoo, and other smaller distros. And it's easy to build from source with zero dependencies.
In short, if you're going to write a shell script to do math, I'd personally find it more appealing to actually check for and install my `bc` than to use strict POSIX.
> Is there a "translator" that reads `GNU bc` or your `bc` for example, and converts it into POSIX-compliant bc? (meaning converting the symbols to one letter in a way that does not produce conflict, declaring all used variables in a function in the first line (auto), etc)
For the record, my `bc` and GNU `bc` still require the auto list. That's part of the standard, so we cannot really get away from it.
But as for the existence of a translator, no, there is not a translator. I could probably write one, and putting it in my `bc` would be easiest since my `bc` already has a parser.
However, I do not think that would be a good use of time; translating some features, especially `continue`, would be tough. I would have to turn the loop into two nested loops and, whenever the inner loop exits, distinguish an exit caused by a `continue` from an actual exit; on a `continue`, the outer loop would simply go around again.
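To make the two-loop idea concrete, here is a rough sketch of the control flow. It is written in Python rather than `bc`, purely as an illustration; the function names and the `done` flag are made up for the example, and it is not taken from any `bc` implementation. The inner loop runs at most once per pass, so a `break` out of it plays the role of `continue`, while the `done` flag marks a real exit.

```python
def sum_odds_with_continue(n):
    """Reference version: uses `continue` to skip even numbers."""
    total = 0
    i = 0
    while i < n:
        i += 1
        if i % 2 == 0:
            continue
        total += i
    return total


def sum_odds_two_loops(n):
    """Same loop rewritten without `continue`, using two nested loops."""
    total = 0
    i = 0
    done = False  # hypothetical flag: True only on a real loop exit
    while not done:
        # The inner loop body runs at most once per outer iteration.
        while True:
            if not (i < n):
                done = True  # actual exit: the loop condition failed
                break
            i += 1
            if i % 2 == 0:
                break  # emulated `continue`: `done` stays False
            total += i
            break  # normal end of the body; the outer loop goes around again
    return total
```

Even in this tiny case the rewrite needs an extra flag and an extra level of nesting, which gives a feel for why doing it mechanically for arbitrary loops is not attractive.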
Also, some features just could not translate, I'm afraid. My `bc` has a full PRNG, which can generate numbers of arbitrary size and precision. There just would not be a way of translating that feature.
And putting the translator in my `bc` would sort of be useless; if you installed it for the translator, why not just use it?
If this message sounds annoyed, it's because I'm annoyed at POSIX, not you. The fact that they have not even standardized `else` really gets my goat; all implementations, besides the Plan 9 one, have `else`, even one that translates for `dc`. There's no excuse for not having it. And they also seem to limit it on purpose, with not allowing comparison operators outside of `if` and loops. (Oh, and you can only use ONE operator per `if` and loop!) Grr...
tl;dr: Yes, use `bc` for trigonometry in shell scripts, but use a full-featured `bc` if at all possible.
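As a small illustration of the point about `-l`: the POSIX math library only gives you sine `s(x)`, cosine `c(x)` and arctangent `a(x)` (plus `l`, `e` and `j`), so things like tangent or pi have to be built from those. The sketch below drives `bc -l` from Python; the helper names `bc_program` and `run_bc` are made up for the example, and in a shell script the same program text would simply be piped into `bc -l`.

```python
import shutil
import subprocess


def bc_program(expr, scale=10):
    """Build bc input: set the number of decimal places, then evaluate expr."""
    return f"scale={scale}\n{expr}\n"


def run_bc(expr, scale=10):
    """Evaluate expr with `bc -l`; return None when no bc is on PATH."""
    bc = shutil.which("bc")
    if bc is None:
        return None
    result = subprocess.run(
        [bc, "-l"],
        input=bc_program(expr, scale),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Expressions that stay within the POSIX math library:
PI = "4*a(1)"            # pi via the arctangent of 1
TAN_OF_ONE = "s(1)/c(1)"  # tangent is not provided, so divide sine by cosine

if __name__ == "__main__":
    print(run_bc(PI))
    print(run_bc(TAN_OF_ONE))
```

Both expressions use only what strict POSIX `bc` offers, so they should behave the same on a fuller-featured implementation.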
I've bookmarked your gitea bc repo, not just for bc, but as an example of a neatly managed FOSS project, like providing performance comparison, fuzzing, project structure, the fact of using self-hosted gitea itself, etc
Skimming through [1], I see neither of Winget [2] nor Chocolatey [3] have any `bc` implementations as of yet. Maybe you can send them a PR? It may help adoption in other distros in the future.
> I've bookmarked your gitea bc repo, not just for bc, but as an example of a neatly managed FOSS project, like providing performance comparison, fuzzing, project structure, the fact of using self-hosted gitea itself, etc
Thank you! It's good to know that my project is actually good; sometimes I wonder if I am just full of myself. XD
> Skimming through [1], I see neither of Winget [2] nor Chocolatey [3] have any `bc` implementations as of yet. Maybe you can send them a PR? It may help adoption in other distros in the future.
I will do that. However, I might wait until I accept a PR that was just made [1] that will fix issues on Windows. I am not good at Windows, so there are problems.
Look what I've found! Lorinda Cherry herself explaining how to write a "talking calculator" (in 1982!) using dc and Unix pipelines:
https://youtu.be/XvDZLjaCJuw?t=827
Rob has a big goal in conflict with my goals: he wants as few lines of code as possible. Obviously, the performance and extra features require more lines of code.
I think that with "as few lines of code as possible" what he wants is auditability. Rob mentions it as something of a feature:
https://youtu.be/MkJkyMuBm3g?t=744 At about the 12-minute mark, he mentions how people used BusyBox in the US Army because they wanted to audit all the code, and this was his realization.
I think it's a worthy goal. But surely if bc is missing some useful features there is probably no harm having multiple implementations.
I think it's a false goal if it comes before correctness. Auditability is great, but it's kinda useless to say, "Yep, I know your code is wrong."
Unfortunately, it seems to be Rob's number one goal over even correctness. (Busybox is even worse about binary size.) And with the way Rob asked for me to change things, it actually made things less readable, which is an important part of auditability. So he and I have a fundamental disagreement over it.
Busybox's bc and dc are also based on mine, by the way. And no, many people don't include them in builds, but that doesn't mean it doesn't have them. The reason is probably because bc is a much bigger utility than most utilities in busybox and "bloats" the binary size. ("Bloat" is a relative term. The busybox version is probably less than 100 kb.)
While toybox does not have my dc, I don't know what you mean by toybox not having a bc. It does, and it is based on mine. It's not in the community directory, it's in the pending directory (bc.c), which just means it's not enabled by default. You can enable it if you wish. Rob himself does because it's what allows toybox to have all of the utilities necessary for building Linux. (Yes, Linux requires bc to build.)
Edit: I've corrected my parent comment because you are right that toybox does not have my dc. But it does still have my bc.
> Yeah, I meant that currently, there is no way to do non-trivial mathematical calculations with either of default builds of busybox* [1] or toybox [2]
Yeah, that is a fair assessment; thank you for clarifying.
Rob has said that when he goes through my `bc` and ensures it is up to his standards, he'll take it out of `pending` and put it in `posix`.
Honestly, I don't know why he won't just accept me as the maintainer of it; I know it's already up to his standards because my `bc` is better tested than the rest of toybox combined. And I've fuzzed it to the point where I am confident that it has no memory bugs.
Landley's blog is interesting, too. http://landley.net/notes.html He often describes the step-by-step development process for building the Toybox tools.
> The toybox build produces a multicall binary, a "swiss-army-knife" program that acts differently depending on the name it was called by (cp, mv, cat...). Installing toybox adds symlinks for each command name to the $PATH.
> The special "toybox" command treats its first argument as the command to run. With no arguments, it lists available commands. This allows you to use toybox without installing it, and is the only command that can have an arbitrary suffix (hence "toybox-armv5l").
Space and time saving measure. Instead of hundreds of programs you have one "box", which can do it all. Program names are the same but they all point to this box. The box reads the original command line and acts accordingly. If the line starts with "cat" it acts like a cat. If it starts with "cut" it acts like a cut.
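A rough sketch of that dispatch, in Python rather than C and not taken from the toybox source: the box name `box`, the two applets and the `dispatch` function are all hypothetical stand-ins. It only shows the idea from the README quote, namely that the program looks at the name it was invoked as, and that the box's own name takes its first argument as the command (or lists the commands when there are no arguments).

```python
import os


# Hypothetical applets, standing in for real utilities such as cat or cut.
def applet_echo(args):
    return " ".join(args)


def applet_basename(args):
    return os.path.basename(args[0])


APPLETS = {"echo": applet_echo, "basename": applet_basename}

BOX_NAME = "box"  # hypothetical name of the multicall binary itself


def dispatch(argv):
    """Pick an applet based on the name the program was invoked as."""
    name = os.path.basename(argv[0])
    args = argv[1:]
    # Only the box itself may carry an arbitrary suffix, e.g. "box-armv5l".
    if name.startswith(BOX_NAME):
        if not args:
            # No arguments: list the available commands.
            return " ".join(sorted(APPLETS))
        # Otherwise the first argument names the command to run.
        name, args = args[0], args[1:]
    if name not in APPLETS:
        raise ValueError(f"unknown command: {name}")
    return APPLETS[name](args)
```

With symlinks named `echo` and `basename` pointing at the one program, `dispatch(sys.argv)` would pick the right applet from the invocation name alone, and `box echo hi` reaches the same code without any symlink installed.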
Small footprint basic system utils, good for constrained systems and early boot, basically. Where utilities themselves are very small, it avoids executable 'boilerplate' that would take up space.
Compiling several tools into a single binary doesn't violate that guideline: each tool still does one thing. Being single-binary is an implementation detail.