I stumbled over .*? since ? usually means optional, but it turns out that directly after a quantifier like * it means lazy. The $$ would be makefile escaping. Dropping the grep and changing the regex slightly, I just appended this to an overengineered makefile:
# Help idea derived from https://marmelab.com/blog/2016/02/29/auto-documented-makefile.html
# Prints the help-text from `target: ## help-text`, slightly reformatted and sorted
.PHONY: help
help: ## Write this help
	awk 'BEGIN {FS = ":.*#+"}; /^[a-zA-Z_*.-]+:.*## .*$$/ {printf "%-30s %s\n", $$1, $$2}' $(MAKEFILE_LIST) | sort
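For anyone who wants to poke at that awk program outside of make, here is a rough sketch that runs it on a made-up makefile fed through stdin. The target names and help texts are hypothetical, and `$$` is written as `$` because this is plain shell rather than a make recipe.

```shell
# Hypothetical sample makefile contents, only for this demo.
sample_makefile() {
  cat <<'EOF'
.PHONY: build test
build: deps ## Compile everything
test: build ## Run the tests
deps:
CFLAGS := -O2 ## not a target, so it is skipped
EOF
}

# Same awk program as the help target above, reading stdin instead of
# $(MAKEFILE_LIST). Only lines shaped like "name: ... ## text" are printed.
help_demo() {
  awk 'BEGIN {FS = ":.*#+"}; /^[a-zA-Z_*.-]+:.*## .*$/ {printf "%-30s %s\n", $1, $2}' | sort
}

sample_makefile | help_demo
```

Only the build and test lines should come out. The field separator swallows everything from the first colon through the ## marker, so prerequisites such as deps never reach the output.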
Sed confuses me more than awk but you're right. That would also remove the only use of awk in my makefile (sed is there already for hacking around spaces in filenames).
Whitespace-padding output in sed is probably horrible; column looks simpler than printf via bash or trying to use make's $(info).
Anyhow, sed is a marvelous utility. Unfortunately, most people never learn its real power. The language is very simple, but the documentation is terrible. The Solaris on-line manual pages for sed are five pages long, and two of those pages describe the 34 different errors you can get. A program that spends as much space documenting the errors as it does documenting the language has a serious learning curve.
Appreciated, reading through it. I suspect much of the pain in the sed experience can be attributed to it using POSIX basic regular expressions (BRE) by default. It was about a decade after first discovering sed that I realised passing -E was really important.
It is difficult for newcomers to guess that "extended regular expressions" (what -E enables) refers to the barely usable subset of what most people call "regular expressions", and that the default basic flavour is terrible in comparison to either.
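A small sketch of the difference, assuming a sed that accepts -E (GNU and BSD sed both do). The input string is arbitrary. In the default basic syntax the grouping parentheses have to be backslash-escaped, while with -E they are written bare.

```shell
# Default (basic) syntax: groups are \( ... \).
swap_bre() { printf 'foobar\n' | sed 's/\(foo\)\(bar\)/\2\1/'; }

# Extended syntax via -E: groups are plain ( ... ).
swap_ere() { printf 'foobar\n' | sed -E 's/(foo)(bar)/\2\1/'; }

swap_bre   # prints barfoo
swap_ere   # prints barfoo
```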
edit: alright, yes, one can program in that. Sed can recurse.
.PHONY: help3
help3:
	sed -nE 's/^([a-zA-Z_*.-]+):.*## (.*)$$/\1 :\2/ p' \
	    $(MAKEFILE_LIST) | \
	sed -E -e ':again s/^([^:]{1,16})[:]([^:]+)$$/\1 :\2/ ' -e 't again ' |\
	sed -E 's/^([^ ]*)([ ]*):(.*)$$/\1:\2\3/' |\
	sort
The first invocation keeps only the lines of interest; the second one space-pads the target name. That works by putting the colon before the help text and repeatedly inserting a space before the colon for as long as one to sixteen non-colon characters precede it. Since {1,16} still matches at exactly sixteen, the loop stops once there are seventeen. The third invocation moves the colon back next to the target name.
Combining the -n/p filtering with commands that act on every line is a stumbling block for merging the multiple invocations into one, but I expect it to be solvable.
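The padding loop can be tried on its own. This is only a sketch: the two input lines are made up, the width is shrunk to 8 to keep the output short, and the label sits in its own -e, which is an assumption about what non-GNU seds want (the one-line `:again s/...` form may only parse in GNU sed).

```shell
# Insert a space before the colon while 1..8 non-colon characters precede it.
# "t again" jumps back to the label whenever the s command changed the line.
pad_demo() {
  printf '%s\n' 'help :Write this help' 'sedhelp :Use sed' |
    sed -E \
      -e ':again' \
      -e 's/^([^:]{1,8})[:]([^:]+)$/\1 :\2/' \
      -e 't again'
}

pad_demo
```

Both lines should end up with nine characters before the colon, one more than the upper bound in the braces, because a prefix of exactly eight still matches and gets one last space.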
After a slightly dubious use of time, I can confirm that column is not necessary. I also noticed that the original version missed double_colon:: style targets. I fear sort is not necessary either, but one has to draw the line somewhere.
HELP_PADDING := 30
.PHONY: awkhelp
awkhelp: ## Write this help using awk
	@echo "awkhelp:"
	@awk 'BEGIN {FS = ":.*#+"}; /^[a-zA-Z_*.-]+:.*## .*$$/ {printf " %-'$(HELP_PADDING)'s %s\n", $$1, $$2}' \
	    $(MAKEFILE_LIST) | \
	sort
.PHONY: sedhelp
sedhelp: ## Write this help using sed
	@echo "sedhelp:"
	@sed -E \
	    -e '/^([a-zA-Z_*.-]+::?[ ]*)##[ ]*([^#]*)$$/ !d # grep' \
	    -e 's/([a-zA-Z_*.-]+:):?(.*)/ \1\2/ # drop :: and prefix pad' \
	    -e ':again s/^([^#]{1,'$(HELP_PADDING)'})##[ ]*([^#]*)$$/\1 ##\2/ # insert a space' \
	    -e 't again # do it again (termination is via {1, HELP_PADDING})' \
	    -e 's/^([^#]*)##([^#]*)$$/\1\2/ # remove the ##' \
	    $(MAKEFILE_LIST) | \
	sort
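The first two sed expressions from sedhelp can be checked in isolation. A minimal sketch with invented input lines, with `$$` written as `$` since this is shell rather than make, and with the trailing comments dropped:

```shell
# Expression 1 deletes every line that is not "target: ## text" or
# "target:: ## text" (the grep replacement). Expression 2 drops the second
# colon of a double-colon rule and prefixes one space.
filter_demo() {
  printf '%s\n' \
    'all: ## Build everything' \
    'clean:: ## Remove build output' \
    'CC := gcc' \
    'test: all' |
    sed -E \
      -e '/^([a-zA-Z_*.-]+::?[ ]*)##[ ]*([^#]*)$/!d' \
      -e 's/([a-zA-Z_*.-]+:):?(.*)/ \1\2/'
}

filter_demo
```

Only the all and clean lines should survive, each with a leading space and a single colon. Note that this filter wants the ## right after the colon, so a line with prerequisites before the ## is dropped, unlike in the awk version.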
Looks like 4.3, but I don't think it matters. awk vs gawk/nawk might be significant though; it's gawk 5.2 on the machine I ran this on.
The match with substr is interesting. It's more complicated than setting the field separator to something like :|#+ but should mean that a : in the help text works. For something one only writes and debugs once, it's probably better to do the complicated thing that always works.
gawk's match() can write the capture groups to an array via its third argument; that's possibly more legible (and slower? It should be slower than the leading non-capturing // pattern).
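For reference, one way a match-plus-substr version could look in plain POSIX awk, without the gawk array. This is a guess at the general shape, not the code under discussion; the input lines, the `## ` marker lookup via index() and the padding width of 12 are all assumptions for the demo.

```shell
# match() sets RLENGTH to the length of "name:", so substr() can cut the
# name without touching the field separator. index() finds the first "## "
# and everything after it is the help text, colons included.
match_demo() {
  printf '%s\n' 'serve: build ## Serve on http://localhost:8000' 'build:' |
    awk 'match($0, /^[a-zA-Z_*.-]+:/) && (i = index($0, "## ")) {
           printf "%-12s %s\n", substr($0, 1, RLENGTH - 1), substr($0, i + 3)
         }'
}

match_demo
```

The serve line should come out with its URL intact, and the bare build: line should be skipped because index() returns 0 for it.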