
Modernizing “less” - zdw
http://garrett.damore.org/2014/09/modernizing-less.html
======
oblio
It seems to me that many core tools could use such updates. They're widely
used yet most of them predate modern development techniques and tools such as
widespread linter usage, unit tests, peer reviews before committing,
appearance of various stable and portable libraries or even compilers,
security scans.

I wonder what other nasty things would appear after a serious code review of
the GNU and BSD core code bases.

I do agree that the code has probably been read hundreds of times and executed
millions of times, but I doubt there have been many formal attempts at
improvement, and the overall methodology resembles brute forcing to me.

I could be wrong.

~~~
kev009
FreeBSD has always done peer reviews which cover the entire base system, and
it's now mechanized a bit:
[http://julipedia.meroh.net/2014/05/code-review-culture-meets-freebsd.html](http://julipedia.meroh.net/2014/05/code-review-culture-meets-freebsd.html).
Automated testing (kyua) and CI (jenkins) are also in progress.

~~~
_delirium
They do security scans, yes, but not really full-system code-quality review
down to the level of every system utility (esp. not those developed primarily
elsewhere). At least in the specific case of 'less', it's almost just a
formality that it's even in the FreeBSD SVN tree, since the only activity is
occasional re-imports of the upstream version:
[https://svnweb.freebsd.org/base/head/contrib/less/](https://svnweb.freebsd.org/base/head/contrib/less/)
Afaict, this Illumos initiative is the first attempt in years by anyone to
review/clean up the internals.

------
dredmorbius
While we're talking less(1) features, one I stumbled across a few years ago
when wishing really hard that less supported an interactive regex line filter,
was its interactive regex line filter:

    &pattern

Which will: Display only lines which match the pattern; lines which do not
match the pattern are not displayed. If pattern is empty (if you type &
immediately followed by ENTER), any filtering is turned off, and all lines are
displayed.

I'd prefer this followed a syntax closer to mutt's filters, and that the
patterns were editable (e.g., typing '&' during a filter would show the
_currently_ extant filter for modification), but it's handy.
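
For anyone who wants the semantics spelled out, here is a rough Python sketch of the behaviour described above. It is not how less implements the feature; `filter_lines` and the sample log lines are made up for illustration.

```python
import re


def filter_lines(lines, pattern):
    """Keep only lines that match pattern; an empty pattern turns filtering off."""
    if not pattern:
        # Mirrors typing '&' followed immediately by ENTER: show everything.
        return list(lines)
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# Hypothetical sample input.
log = ["error: disk full", "info: started", "error: timeout"]

if __name__ == "__main__":
    for line in filter_lines(log, "^error"):
        print(line)
```

Calling `filter_lines(log, "")` returns every line again, which is the "filter off" case.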

~~~
dbdr
> I'd prefer [...] that the patterns were editable (e.g., typing '&' during a
> filter would show the currently extant filter for modification)

If you type '&' then the up arrow, you'll get the previously entered
pattern(s), which you can then edit.

~~~
dredmorbius
Bless you. I didn't know that.

Which is why I share tips like this -- it's almost always a win.

------
JoshTriplett
> Make less use getopt() instead of its byzantine option parser (it needed
> that for PC operating systems. We don't need or want this complexity on
> POSIX.)

This is the kind of thing that always astonishes me to see in a codebase: why
reinvent something rather than just finding and including a compatibility
implementation? Just grab an appropriate getopt.c and compile it in if the
platform doesn't have one, then let the rest of the code pretend every
platform has one. (Preferably an implementation of getopt_long; a quick search
turned up some licensed under 3-clause BSD.)
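
To show how little calling code is left once a getopt-style interface is available, here is a small sketch using Python's standard `getopt` module, which mirrors the C API (including `getopt_long`-style long options). The option letters and the `parse_args` helper are illustrative assumptions, not a statement of less's actual option table.

```python
import getopt


def parse_args(argv):
    """Parse a few illustrative pager-style options with getopt-style rules."""
    # "p:" means -p takes an argument; "pattern=" is the long-option equivalent.
    opts, files = getopt.getopt(
        argv, "iNp:", ["ignore-case", "line-numbers", "pattern="]
    )
    settings = {"ignore_case": False, "line_numbers": False, "pattern": None}
    for flag, value in opts:
        if flag in ("-i", "--ignore-case"):
            settings["ignore_case"] = True
        elif flag in ("-N", "--line-numbers"):
            settings["line_numbers"] = True
        elif flag in ("-p", "--pattern"):
            settings["pattern"] = value
    return settings, files


if __name__ == "__main__":
    import sys

    print(parse_args(sys.argv[1:]))
```

Unknown options raise `getopt.GetoptError`, so the error handling also comes from the library instead of hand-written code.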

~~~
roeme
less was started (~ '83) before getopt() was made available to the 'general'
public ('85).

My guess is nobody bothered to replace these parts, since options were only
added gradually, if at all. Coincidentally, I'm in the same predicament with a
tool a coworker of mine (initially) wrote: convoluted option parsing, to say
the least, but I'm too busy fixing other parts or adding proper functionality
to replace it.

Sometimes, 'less is more'.

~~~
AceJohnny2
You couldn't help it, could you? :)

~~~
dredmorbius
Many moons ago I mentioned the story of "more" and "less" to our (female) DBA,
who was unfamiliar with the commands, and the expression "less is more".

"It's right in the manpage, actually". "No." "Yes, I'll send it to you."

And so I did, with the subject line "man less".

She sat at a desk right in front of mine, and I detected a somewhat painful
silence as the email arrived. And realized I'd just inadvertently commented on
her social life (confirmed through later conversations).

------
_delirium
Interesting! From what I'm reading so far, this seems not to be Illumos-
specific (despite being motivated by Illumos's needs), but rather a cleanup
that'd be applicable to any POSIX-like system. A fork of less that assumes
POSIX-like functionality and cleans up a lot of things accordingly does seem
like a worthwhile project. A bit more "unixy" design that uses the system
versions of available functionality (like globbing and UTF-8!) should also
reduce the risk of weird bugs & inconsistencies with how the rest of the
system operates.

------
comboy
I've switched from less to most* some time ago, no idea how it looks
underneath, but it works great for me. It seems to be available for most (tee
hee) distributions.

* [http://linux.die.net/man/1/most](http://linux.die.net/man/1/most)

~~~
txutxu
I used 'most' a few years ago because of the windowing support (this was
before 'screen' got support for vertical splits). I also loved that 'most'
colorized manpages when other pagers didn't.

I stopped using it about 4 years ago, when someone told me on IRC that it
was unmaintained and had some bugs.

Note that I never noticed any bug in my time as a 'most' user, even if they
may exist, and indeed it's still available in at least all Debian versions.

It's curious that distributions are able to "maintain" a package (maybe even
with custom patches) which is not maintained or updated upstream, for years.

It's encouraging to see actions like the one performed by this IllumOS
developer. If a program is open source, we can fork, improve and share. Or, as
users, we can take a look at the code when making choices; lots of people
forget this in favor of search engine recommendations.

~~~
e12e
The level of love packages receive varies -- I'm a great fan of Debian, but in
the case of "most" support appears to be less than stellar from the Debian
side:

[https://bugs.debian.org/cgi-bin/pkgreport.cgi?which=maint&data=mako@debian.org&archive=no&raw=yes&bug-rev=yes&pend-exc=fixed&pend-exc=done#_1_4_5](https://bugs.debian.org/cgi-bin/pkgreport.cgi?which=maint&data=mako@debian.org&archive=no&raw=yes&bug-rev=yes&pend-exc=fixed&pend-exc=done#_1_4_5)

~~~
voltagex_
I think you mean
[https://bugs.debian.org/cgi-bin/pkgreport.cgi?package=most](https://bugs.debian.org/cgi-bin/pkgreport.cgi?package=most)
- which doesn't look too bad.

~~~
e12e
Well, yes. In either of the links -- I was thinking about the two normal-
priority forwarded bugs for "most" (files starting with dash, sigpipe) and
their age...

------
reconbot
It's crazy to me that something as core as less has such bit rot. I'm aware
GNU rebuilt a bunch of tools a while ago and added a slew of common features
such as `-h` for human readable mode, and `--` long arguments. Does this
rewrite have a name yet? and can it be brought into the core project at all?
(or the GNU project?)

~~~
jordigh
Being part of the GNU project doesn't mean it's going to be maintained. For
example, bzr supposedly is a core GNU package, and it's dead.

Being a GNU package just means you agree to behave in a GNUly way. It's a very
informal and easily-granted qualification.

------
nn3
A faster string search would be nice. I used to (and still sometimes do) use
less to analyze large trace files. With a few hundred MB the searching becomes
a real bottleneck.

It couldn't be that hard to do a Boyer-Moore search for non-regex substrings.
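
As a sketch of the idea, here is the Horspool simplification of Boyer-Moore (bad-character shifts only, no good-suffix table) in Python. This is not code from less, and `horspool_find` is a made-up name. It works on `bytes` or `str`.

```python
def horspool_find(text, pattern, start=0):
    """Return the index of the first match of pattern at or after start, else -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return start if start <= n else -1
    # Bad-character table: for each symbol in the pattern except the last,
    # the distance from its rightmost occurrence to the end of the pattern.
    shift = {sym: m - 1 - i for i, sym in enumerate(pattern[:-1])}
    pos = start
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        # Skip ahead based on the symbol under the window's last position;
        # symbols not in the pattern allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1


if __name__ == "__main__":
    trace = b"abcabcabd"
    print(horspool_find(trace, b"abd"))
```

The win comes from the skip step: on a mismatch the window can jump up to `len(pattern)` positions instead of one, so long literal patterns scan large files while touching only a fraction of the bytes.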

~~~
porker
Would using ag help? No longer maintained AFAIK, but fast for me.

~~~
chrismonsanto
where did you get that idea? last commit was 7 days ago, no mention of
abandonment

~~~
porker
From the status of the Ubuntu ppa, and (I thought) comments on his site.
Thanks for pointing out that's not the case.

------
kazinator
Nobody needs POSIX conformance in an interactive pager; that's just
conformance for the sake of conformance. Or do they? What is the economic
justification ("business case") for working on POSIX compliance in a "more"
command?

You should never invoke "more" directly in a script anyway, but rather observe
the PAGER environment variable, and fall back on a plain "more" only if that
isn't set. (Speaking of which, PAGER _isn't_ described in POSIX, oops!)

If the user wants the pager to exit when the last line is reached, the user
can specify the necessary option in PAGER, if their pager supports it. PAGER
just has to be properly expanded: treated as a command, not a command name.
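
A minimal sketch of that lookup in Python, assuming shell-like word splitting is close enough (programs that follow the shell convention would hand the string to `sh -c` instead). `pager_command` is a hypothetical helper name.

```python
import os
import shlex


def pager_command(environ=None, default="more"):
    """Return the pager as an argv list, treating PAGER as a command line."""
    environ = os.environ if environ is None else environ
    value = environ.get("PAGER", "").strip()
    # Split into words so options such as "less -E" survive;
    # fall back to plain "more" when PAGER is unset or empty.
    return shlex.split(value) if value else [default]


if __name__ == "__main__":
    print(pager_command())
```

The resulting list can be passed straight to something like `subprocess.run(pager_command(), input=text, text=True)`.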

------
noonespecial
Some yaks look really good bald. Well done.

------
gatehouse
Seems weird to hear about less being used "to meet POSIX" requirements, since
I use it at least 10 times a day -- interesting read though.

------
clarry
Cool, but terminfo and curses are another bunch of things I'd like to see go..
or at least delegated from the core of every fullscreen terminal application
to a special compatibility layer (e.g. tmux or screen) for those using
terminals which don't speak ANSI.

I've looked at it before, and I agree less is a mess.

------
onan_barbarian
I got excited to click on this, imagining that someone was modernizing less.
Which is true in a narrow sense, but am I the only one to feel a lingering
disappointment that we run shells in xterms (or slightly modified equivalents)
that emulate ancient terminals, then implement pagers to page through screen
after screen?

Can anyone point to work that starts with the _combination_ of the two
following propositions:

1. User interface elements invented since the 1970s are a pretty neat thing.

2. Text-based shells and the command line are also a pretty neat thing.

~~~
rogual
There was TermKit
([https://github.com/unconed/TermKit](https://github.com/unconed/TermKit)) but
the author has stopped working on it.

~~~
onan_barbarian
Yes. His sentiments here
[http://acko.net/blog/on-termkit/](http://acko.net/blog/on-termkit/) are very
close to mine: "It makes me wonder, when sitting in front of a crisp, 2.3
million pixel display (i.e. a laptop) why I'm telling those pixels to draw me
a computer terminal from the 80s".

I'm not crazy about every last aspect of his design there - but it's a start.

------
miah_
I just use 'view'; it's the same as running 'vim -R'.

~~~
_delirium
I find 'view' fine for things like looking at a logfile, but not great for one
of my more common use-cases for a pager, which is looking at something from
stdin that's either large or slow-to-produce. You have to wait until vim reads
in the entire stream before you can do anything with it. With 'less' you can
immediately navigate/search/etc. while the stream is still coming in.

~~~
paddyoloughlin
Also, the default action of the cursor keys is more useful for paging in less
than in view. In view, cursor keys move the cursor; in less, they scroll the
screen.

And less has built-in tailing which you can start and stop at any time. That's
its killer feature for me.

On the other hand, vim/view can have some nice syntax highlighting for syslog
format log files. I haven't found that enough to switch though.

------
snoopybbt
Modernizing less to me means using most:
[http://www.jedsoft.org/most/](http://www.jedsoft.org/most/)

------
nn3
Also it would be nice if the original author could put this work up as a
tarball with standard make files somewhere so that other OS can benefit.

------
jwr
Thank you for this effort. It is badly needed.

------
stuaxo
These changes really need to be upstreamed...

~~~
paddyoloughlin
It's probably not that simple. The article indicates that the refactor removed
support for a lot of old platforms which the project may very well prefer to
keep.

------
Tloewald
Cool, but I was hoping he meant less the CSS extension :-)

~~~
dredmorbius
Ambiguity in program / feature names is a constant (and growing) issue. Though
less(1) _is_ the rather more established of these two instances.

~~~
Tloewald
Well, it's certainly _older_. It's not really so ambiguous since the CSS
version is "LESS". Speaking as someone who cut his teeth on the command line,
I clicked the article thinking that it was referring to LESS since the idea of
modernizing less never occurred to me (even though it's a much older piece of
technology; LESS is much more broken...)

------
0xeeeeeeee
A lot of people are talking about revamping these old programs. I don't see
what the problem is with less, cat, vim, etc...

I've never been using a program and wished for better functionality. Even when
this was all new to me, it was never a problem figuring these out, using them,
and I was always satisfied with them..

So...here's the question. I don't think these are broken, so what are you
fixing?

~~~
coldtea
> _So...here's the question. I don't think these are broken, so what are you
> fixing?_

First, it's not just about something not working. It's about creating tools
that are extensible and understandable and hackable. Open Source is not just
about "working", it's about being modifiable by the end user. All this cruft
(a mess of 200 obsolete architectures, dead code and deprecated library
support that nobody used since 1988) works against that goal.

Second, there are things that would be essential for some people, like
international users (e.g. proper multibyte support), that cannot be added due
to a dependency on some custom methods of handling encodings. That's not some
wishy-washy magical unicorn feature request, it's essential for the main
operation of what less does for those that have to deal with these encodings.

Third, there's nothing wrong in taking pride in finely crafting your tools.
UNIX is supposed to be made of things that "do one thing and do it well". Less
having its own UTF-8 support breaks this division of responsibility. We have
libraries for that. Same for getopt vs. its custom options parsing.

~~~
clarry
What's the library for UTF-8?

~~~
_delirium
At least for programs written in C, most (all?) modern Unix-like platforms
should include the functionality in the base install. On the language side,
C89 requires support for wide and multibyte characters in a conforming libc
implementation. And POSIX furthermore requires a locales/iconv system to
specify and convert between encodings. Neither of those strictly require that
UTF-8 be one of the supported encodings (C89 predates Unicode), but any
reasonably modern implementation will include Unicode locales. And if it
doesn't, I think at this point you can just consider that to be the system's
problem: the current assumption for POSIXy programs is that they will use the
system locales, not try to implement their own encoding machinery.
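
As a loose analogue of what that looks like from a program's side, here is a Python sketch that takes the encoding from the locale and uses the runtime's incremental decoder, so a multibyte character split across two reads is still decoded correctly. In C the corresponding call would be `mbrtowc`. `decode_chunks` is a made-up helper, and deferring to `locale.getpreferredencoding` is an assumption about what "use the system locale" should mean here.

```python
import codecs
import locale


def decode_chunks(chunks, encoding=None):
    """Decode byte chunks to text, tolerating characters split across chunks."""
    if encoding is None:
        # Assumption: use whatever encoding the current locale reports.
        encoding = locale.getpreferredencoding(False)
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        text = decoder.decode(chunk)  # buffers any incomplete trailing bytes
        if text:
            yield text
    tail = decoder.decode(b"", final=True)  # flush, replacing a dangling partial char
    if tail:
        yield tail


if __name__ == "__main__":
    data = "naïve café".encode("utf-8")
    # Split in the middle of the two-byte 'ï' to mimic a read boundary.
    print("".join(decode_chunks([data[:3], data[3:]], "utf-8")))
```

The program carries no encoding tables of its own; it only has to keep decoder state between reads.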

------
hayksaakian
Note: this is not referring to the compile-to-CSS language named Less

-----

Downvotes for clarifying an ambiguity, really?

~~~
justincormack
What does it add to the discussion?

~~~
dredmorbius
Clarity, without having to follow through to the article.

The lack of any context for HN posts (also reddit link shares) is a
significant disadvantage of both sites. I've always been partial to Slashdot's
link summaries, and wish that style were more widely used. See also Jakob
Nielsen and microcontent.

