One of my pet peeves is "The Useless Use of cat Award". Someone awarded it to me as a teenager in the late 90s and I've been sore ever since.
Yup, it's often a waste of resources to run an extra 'cat'. It really demonstrates that you don't have the usage of the command receiving the output completely memorized. You know, the thousand or so commands you might be piping it into.
But, if you're doing a 'useless' use of cat, you're probably just doing it in an interactive session. You're not writing a script. (Or maybe you are, but even still, I bet that script isn't running thousands of times per second. And if it is, ok, time to question it).
So you're wasting a few clock cycles, on a computer that does a few billion of them per second. By the time you finish explaining the 'useless' use of cat to someone, the time you wasted telling them why they are wrong is greater than the total time their lifetime usage of cat was ever going to waste.
There's a set of people who correct the same three pairs of homophones that get used incorrectly, but don't know what the word 'homophone' is. (Har har, they're/their/there). I put the people who are so quick to chew someone out for using cat in the same batch as the people who do this: what if I just want to use cat because it makes my command easier to edit? I can click up, warp to the front of the line, and change it real quick.
I do “useless” use of cat quite often because, in my brain, the pipeline naturally starts with “given this file”, so it makes the pipeline more consistent e.g. `cat f | a | b | c` rather than `a < f | b | c` where one must start with the first “function” rather than with the data. I see starting with `cat` analogous to the `->` thread macro in Clojure, `|>` pipe in Elixir, and `&` reverse application operator in Haskell. If bash permitted putting the filename first, I’d stop using `cat`; alas, it does not.
> One of my pet peeves is "The Useless Use of cat Award". Someone awarded it to me as a teenager in the late 90s and I've been sore ever since.
Wear it as a badge of honor! It marks you as a person who puts clarity, convenience and simplicity before raw performance. I can't think of a single case when that bit of performance matters.
Needless to say, I'm happily using cat (uselessly) myself and have no plans to convert.
> It marks you as a person who puts clarity, convenience and simplicity before raw performance.
This. As noted, even in scripts it usually makes more sense since the result is a pipeline that's easier to read, annotate and modify.
Case in point:
cat file.txt \
| sed '1s/^\xEF\xBB\xBF//' `# Strip UTF-8 BOM at the beginning of the file` \
| ...
Specifying a file name would only make the "black-magic-line" of `sed` more complicated while also making it more complicated to modify the pipeline itself. Now, if I want to skip that step to test something, I don't have to figure out if/how the next command takes an input file (or, ironically, replace `sed` with `cat`).
> It really demonstrates that you don't have the usage of the command receiving the output completely memorized.
No, it demonstrates that you don't have redirection memorized, and don't know that you can place it anywhere in the command line, including on the left.
> So you're wasting a few clock cycles
Keystrokes too:
cat x | cmd
< x cmd
It's also possible that cmd may detect that its standard input is connected to a real file, and take advantage of being able to lseek() the file descriptor. For instance say that x is an archive and cmd is an extractor. If cmd needs to skip some indicated offset to get to the desired payload material, it may be able to do it efficiently with an lseek, whereas under cat, it has to read all the bytes and discard them to get to the desired offset.
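A quick way to see the difference (a sketch; any regular file works in place of /etc/hosts, and python3 is just a convenient way to inspect the file descriptor):

```shell
# stdin is the file itself: the consumer could lseek() to any offset
python3 -c 'import sys; print(sys.stdin.buffer.seekable())' < /etc/hosts
# stdin is a pipe: the consumer must read and discard bytes to skip ahead
cat /etc/hosts | python3 -c 'import sys; print(sys.stdin.buffer.seekable())'
```

The first prints True, the second False — that False is why an extractor behind cat has to stream everything.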
I still prefer to have cat there because it is interchangeable with other output-producing commands and it can handle globs. In an interactive session I iterate on the last command many times, and if I decide to filter stuff can just replace cat with grep or if I decide to pull from a directory of files can add a glob, if compressed it turns to zgrep or zcat etc. With redirects I'd have to change the structure of the pipeline which wastes mental effort. IMO.
> the time [used to] explain the 'useless[ness]' .. of cat to someone .. is greater than the total time that their lifetime usage of cat was going to waste
If you look for situations like this they are surprisingly common.
I don't often (err, ever…) reply without reading further but this time I must, because: I've never heard this turn of phrase "useless use of cat" and it turned my brain upside-down for a moment, because: "interactive" is precisely how I learn and do and I suppose it was a nice reminder that sometimes really big (read: useless) things are actually kinda small (useful) and vice versa.
Well, you waste an entire fork and exec, so I believe you are underestimating the time by a few orders of magnitude. Also, it's almost always grep following the cat, so it's not much to memorize.
But it's well worth wasting a process to have a nice pipeline where each command does a single thing so you can easily reason about them.
It's a lot more than the few extra cycles to spin up the process - it's also an extra copy of all the data. Usually that's also not much, but occasionally it's everything, as the consuming program can seek in a file, but not in a pipe, so might otherwise only need a tiny bit of the data.
It totally makes sense: after all the electron apps/docker containers/statically linked utilities/microservices running on my machine, needlessly running cat might be the straw that breaks the camel's back.
It is not called "useless UUoC comment" (UUUoCC) without justification. ;)
From a personal taste perspective, I'm not a fan of either. Having a floating "<" at the start of a line just isn't my cup of tee. Not dealing with explicit stdin/stdout just makes my code easier to read. And especially considering the post's advice is about reading logs, a lot of the post is very likely built around outage resolution. Not the time I want to be thinking "oh yeah `tr` is special and I need to be explicit" -- nah, just use cat as a practice. And no, I'm not going to write `grep blah < filename` as a practice just because of commands like `tr` being weird.
But honestly, if it's such a big deal to have a cat process floating around, there are probably other things you should be concerned about. "Adds extra load to the server" points to other problems. If perf matters, CPU shielding should be used. Or if that's not an option, then sure, there's some room for trifling around, but if you're at a point where you're already running a series of pipes, a single cat command is beans compared to the greps and seds that come after it.
My biggest quality of life improvement for understanding logs has been lnav (https://lnav.org/) -- does everything mentioned in this post in a single tool with interactive filtering and quick logical and time based navigation.
Until now I thought logview.el[0] is the bee's knees, but now I can feel feature envy set in. There are some seriously powerful ideas listed on the lnav page, and it's also the first time I saw SQLite virtual tables used in the wild.
At my current company, we're using pdsh with tail -f piped into sed to follow logs from multiple remote servers at the same time, label them and colorize them. Works okay. Decent solution without needing to install much other software. Not my favourite because it doesn't deal well with SSH sessions timing out and leaves some tail -f processes hanging, and some other quirks. But out of laziness and risk of breaking things in prod, we haven't tried experimenting with much else.
It works by uploading a stub that is an "actually portable executable" to the remote over ssh. lnav then talks to the stub to sync files back to the local host. I don't know the scale of the hosts or files, but if they're not too big, it might work fine.
Huh, I almost posted a duplicate recommendation. My only complaint with lnav was that it had to be built from source on Linux and the build was frigging huge. Apparently they have a pre-compiled linux-musl binary now.
I love lnav and use it constantly, but it crashes a lot. I do wish there was something like lnav that was a little simpler to use, and written in a more resilient way that crashed less.
I can cut lnav some slack for the crashes because identifying and parsing arbitrary log formats seems like a messy problem. Still it shouldn't crash 1/3rd of the time I use it.
Sorry for the crashes :( I've been trying to improve its internals more than adding features as of late. If you haven't already, please file bugs on GitHub and/or submit crash logs to the mailing list. (I've taken a break from working on it lately, so if you've done that and I haven't gotten back, I apologize.)
Yes! lnav (https://lnav.org) is phenomenal.
Embedded SQLite...
easily scriptable...
OOB log formats galore, or define your own...
it's a mini ETL powertool that scales to at least a few million rows and runs in your terminal. Maintainer's a friendly dude, too.
I don't think multitail really understands logs like lnav does; it's just following the last lines in the file. For example, if you try to follow multiple files in multitail like so:
$ multitail /var/log/install.log /var/log/system.log
You get a view with the tail from one file followed by the tail from the other; they are not collated by timestamp. In contrast, if you do the same thing in lnav:
$ lnav /var/log/install.log /var/log/system.log
You will get a single view with all of the log messages from both files and they're sorted by their timestamps. Here is all of what lnav is doing:
* Monitoring files/directories for updates
* Decompressing files/archives
* Detecting the log format for each file
* Creating SQLite vtables that provide access to log messages
* Building an index of all the log messages in all the files, so you can jump to a certain point in time or to the next/previous error message
* Displaying all log messages with syntax highlighting
I migrated from multitail to lnav. Turned out to be a no-brainer.
I second the above, just one pain point with multitail to add. I often page/search/filter in the scrollback buffer (I typoed "bugger" - Freudian slip?) and in multitail the scrollback is a separate window with a frame and everything, which is a pain (copying whole lines using mouse includes the frame, ugh). The filtering/searching being a separate pain.
One thing I used in multitail and not sure if I migrated wholly to lnav was log file syntax highlighting using regexes.
Our Easy Data Transform software is intended for data wrangling and desktop ETL. But it has lots of features useful for browsing and manipulating log files, including:
* powerful filtering (including with regex)
* smooth scrolling of millions of rows of data
* support for csv, text, xml, json, Excel and other formats
One thing I've done to identify infrequent log entries within a log file is to remove all numbers from a file and print out a frequency of each. Basically just helps to disregard timestamps (not just at the beginning of the line), line numbers, etc.
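A minimal version of that pipeline (app.log is a stand-in for whatever file you're inspecting):

```shell
# Strip every digit, then count how often each resulting line "shape"
# occurs; rare shapes (candidate anomalies) float to the top.
sed 's/[0-9]//g' app.log | sort | uniq -c | sort -n | head
```

Flip the final sort to `sort -rn` if you want the most common shapes instead.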
Brilliant hack. I've used just about all the tricks from the blog and many of the comments here, but never this one. I've stripped timestamps for sure, but never considered all numerics. Nice one !
I love this tip for all kinds of diffing. I'm so sure I'm going to use it that I already assigned it an alias to strip numbers from whatever's on my Mac clipboard:
alias numberless="pbpaste | sed 's/[0-9]//g' | pbcopy"
It's probably worth exploring making that a frequency bucketing pipe that dashboards N minute intervals .. so that operators can see any abrupt changes.
1) Fuck grep, use ripgrep, especially if you have to scour over an entire directory.
2) Get good with regex, seriously, it will shave hours off your searching.
3) For whatever application you are using, get to know how the logging is created. Find the methods used where said logs are made, and understand why such a log line exists.
4) Get good with piping into awk if needed if you need some nice readable output.
Piping into AWK feels like a misuse of the tool in all but the simplest of cases. Don't forget that you can write complete AWK scripts and invoke them from a file!
I started using ack about 14 years ago, so there's a lot of inertia there. It is slower than the others, but at the time it was so much faster than grep for searching through source trees.
Honestly, the most amazing thing I did with logs was learn how to do subtraction. Any time you have multiple instances of a thing and only some of them are bad, you can easily find the problem (if anyone bothered to log it) by performing bad - good.
The way you do this is by aggregating logs by fingerprints. Removing everything but punctuation is a generic approach to fingerprinting, but is not exactly human friendly. For Java, log4j can use class in your logging pattern, and that plus log level is usually pretty specific.
Once you have a fingerprint, the rest is just counting and division. Over a specific time window, count the number of log events, for every fingerprint, for both good and bad systems. Then score every fingerprint as (1 + # of bad events) / (1 + # of good events) and everything at the top is most strongly bad. And the more often it's logged, the further up it will be. No more lecturing people about "correct" interpretations of ERROR vs INFO vs DEBUG. No more "this ERROR is always logged, even during normal operations".
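A hedged sketch of that scoring in awk — good.log and bad.log are hypothetical dumps from a healthy and an unhealthy instance, and the fingerprint here is simply each line with its digits stripped:

```shell
awk '
  { key = $0; gsub(/[0-9]+/, "", key) }   # fingerprint: drop the numbers
  FNR == NR { good[key]++; next }         # first file: counts from the good system
  { bad[key]++ }                          # second file: counts from the bad system
  END {
    for (k in bad)                        # score = (1 + bad) / (1 + good)
      printf "%.2f\t%s\n", (1 + bad[k]) / (1 + good[k]), k
  }' good.log bad.log | sort -rn | head
```

Fingerprints that only ever show up on the bad instance end up at the top of the list.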
Isn’t this basically what structured binary logs are? Instead of writing a string like `Error: foo timeout after ${elapsed_amt} ms` to the log, you write a 4-byte error code and a 4 byte integer for elapsed_amt. I know there are libraries like C++’s nanolog that do this for you, under the hood.
Maybe this is a silly question, but is there much value in a 4-byte binary code compared to a human-readable log with human-readable codes? Maybe size, but logfmt especially is not much less compact than binary data.
One thing I didn't see was how to use grep to view the lines before and after a match:
grep regex /var/log/logfile -A5  # to view the next 5 lines
grep regex /var/log/logfile -B5  # to view the previous 5 lines
grep regex /var/log/logfile -C5  # to view the 5 lines before *and* after the match
This is super handy to find out what happened just before a service crashed, for example.
Loosely related: a few years ago I wanted a simpler alternative to some of the more feature-full log viewers out there so I threw together a tiny (50kb) app that might be useful to some folks in here.
All it does is consistently colors the first field in a line from stdin so you can quickly see which log lines have the same first field.
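For the curious, the core idea can be sketched in a few lines of awk (my sketch, not the app's actual implementation): hash the first field to pick one of six ANSI colors, so the same field always gets the same color.

```shell
awk '{
  h = 0                                   # cheap hash of the first field
  for (i = 1; i <= length($1); i++)
    h += index("abcdefghijklmnopqrstuvwxyz0123456789-_.", substr($1, i, 1))
  printf "\033[%dm%s\033[0m%s\n", 31 + h % 6, $1, substr($0, length($1) + 1)
}'
```

Pipe any log stream into it; lines sharing a first field (a hostname, a PID, a service name) line up visually.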
As much as I approve of the skillset to analyze local logs, beyond a relatively small scale (10-20 systems) a decent central log aggregation like OpenSearch or ELK just brings so much value, even on 1-3 nodes. It'd be one of the first changes I make to an infrastructure because it's so powerful.
And it's not just log searching and correlation value. At work, the entire discussion "oh but we need access to all servers because of logs" just died when all logs were accessible via one web interface. I added a log aggregation and suddenly only ops needed access to servers.
Designing that thing with accessibility and discoverability in mind is a whole 'nother topic though.
Just to expand on this more - you don't have to use ELK which is pretty resource and maintenance heavy, there are much easier and faster alternatives that don't need attentive care.
Throw Loki+Grafana on a single VM somewhere (or run it on your Kubernetes/Nomad/ECS/etc. cluster) and it will get you very far, as long as you plan ahead a bit (most notably, indexing happens at ingestion, so you need to have an idea of what you want from your logs or your queries will be slower).
Or use a SaaS like logz.io, AWS OpenSearch, Datadog, etc. Most support OpenTelemetry now, so switching data ingestion is quite easy (unlike dashboards and alerts).
It makes sense to have centralised logs pretty much as soon as you outgrow the "everything runs on this one box and my DR plan is a prayer" stage, IMO.
Agreed! Centralizing logs is so helpful, but you don't know it until you've done it. Too many people rely upon grep when a handy tool is just a download away. Plug for said tool: https://log-store.com
We went to the trouble of setting up ELK and I was excited, but I never use it any more. You can't scroll/search through the logs as fast as vi and grep. You have to click to see more log lines. And I can write an alias to tail a log so I don't have to log into a UI.
Glad to see lnav (https://lnav.org) already getting some love in the comments. Hands-down the most reliable source of "thank you! I wish I'd known about this tool sooner!" responses, even from v experienced sysadmins / SREs.
That gets rid of the cat and three greps. Both POSIX and GNU encourage grep -E to be used in preference to egrep.
A pcregrep utility also used to exist, if you want expansive perl-compatible regular expressions. This has been absorbed into GNU grep with the -P option.
> A pcregrep utility also used to exist, if you want expansive perl-compatible regular expressions. This has been absorbed into GNU grep with the -P option.
'pcregrep' still exists. But with PCRE2 supplanting PCRE, it is now spelled 'pcre2grep'.
I don't know the precise history of 'grep -P' and whether 'pcregrep' was actually absorbed into it, but 'pcregrep' is its own thing with its own features. For example, it has a -M/--multiline flag that no standard grep (that I'm aware of) has. (Although there are some work-arounds, e.g., by treating NUL as the line terminator via the -z/--null-data flag in GNU grep.)
Oddly, there are pcre2 packages in RedHat/Alma 9, but they do not include a pcre2grep.
GNU grep is also linked to pcre, not pcre2.
# pcre2grep
bash: pcre2grep: command not found...
# yum install pcre2grep
Last metadata expiration check: 1:58:58 ago on Tue 13 Dec 2022 11:45:44 AM CST.
No match for argument: pcre2grep
Error: Unable to find a match: pcre2grep
# yum whatprovides pcre2grep
Last metadata expiration check: 2:09:25 ago on Tue 13 Dec 2022 11:45:44 AM CST.
Error: No matches found.
# rpm -qa | grep pcre2 | sort
pcre2-10.40-2.0.2.el9.x86_64
pcre2-syntax-10.40-2.0.2.el9.noarch
pcre2-utf32-10.40-2.0.2.el9.x86_64
# which grep
/usr/bin/grep
# ldd /usr/bin/grep | grep pcre
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007efc473c4000)
I once compared the speed of these two approaches, rather accidentally. I did output colorization by adding ANSI sequences. I thought, of course one process must be more efficient than a pipe of processes. After the rewrite, I was disappointed by the slowdown and reverted back to the pipe.
PS I checked back and I used sed rather than grep. I think the result would hold for grep, but the moral is that you should verify rather than assume.
I have around 50 seds in the pipe, running in parallel (which is what makes it faster), it would have been a half of that when I tried the rewrite.
> you’ll get overwhelmed by a million irrelevant messages because the log level is set to INFO
I know this happens, but I think it's because programmers are abusing INFO. In principle it's reserved for messages that are informative at a level sys admins and a few others can make sense of and use. Unfortunately abuse often leads to "We turned INFO off" making it much harder to diagnose things after the fact.
I think there should be a counterpart to "log analysis", which is "logging strategies for your app": WHAT to log and WHEN.
Stuff like: if you are exposing an HTTP endpoint, you should log the request URL and the time it took to serve it. Or if you are invoking an external service, you should log the response time of that service.
And you should produce a single line of output for each request that identifies all of the pertinent information. You can have more than one, for e.g. a thread dump, but there should be one that provides a complete summary. I've lived with apps that logged at each stage of the process as separate lines, and that's just not useful data when grepping for anomalies.
I hypothesize it's not useful because you're using grep. If you use a tool that can show you multiple lines all tied by a request ID, it becomes much more helpful.
I am quite often looking for patterns across thousands of requests. As an example, one thing I inherited didn’t even log how long each request took to serve. Sure, you find out how long a single request took, by comparing the first and last log entry, but that’s just not useful 99% of the time.
When I was training sysadmins back in the dark ages, one of the rules I taught was: know what good looks like in your logs. If you are scanning hundreds of lines of logging under duress to find a smoking gun, and you don't know the difference between what the logs normally show, and what you are seeing, you'll waste a lot of time.
Corollary is that good day logs should be minimal and "clean", e.g not logging a lot, or, logging nice and predictably (which makes them easy to strip out via grep -v, etc.)
Yes, always include a request id in every request structure you create and include it also in the response and print it. It would seem something obvious that everyone does by default but instead, no, it's not so obvious it seems.
Not so obvious. How to implement it without passing request id to all the functions, when they are unrelated to request/http ? Especially in languages without thread-locals such as javascript?
Many frameworks solve this with logger context. Add the properties you want to the logging context and all future logs in that context will have that property.
I recently used clickhouse-local to do some log analysis on a lot of elastic load balancer logs (~10s of GBs) and it was spectacular.
In short, you can add clickhouse-local to a shell pipeline and then run SQL queries on the data. An example from the docs:
$ ps aux | tail -n +2 | awk '{ printf("%s\t%s\n", $1, $4) }' \
    | clickhouse-local --structure "user String, mem Float64" \
        --query "SELECT user, round(sum(mem), 2) as memTotal
                 FROM table GROUP BY user ORDER BY memTotal DESC FORMAT Pretty"
I wrote https://github.com/ljw1004/seaoflogs - an interactive filtering tool, for similar ends to what's described here. I wrote it because my team was struggling to analyze LSP logs (that's the protocol used by VSCode to communicate with language servers). But I made it general-purpose able to analyze more log formats too - for instance, we want to correlate LSP logs with server logs and other traffic logs.
(1) I wanted something where colleagues could easily share links in workplace chat with each other, so we could cooperatively investigate bugs.
(2) For LSP we're often concerned with responsiveness, and I thought the best way to indicate times when viewing a log is with whitespace gaps between log messages in proportion to their time gap.
(3) For LSP we have lots of interleaved activity going on, and I wanted to have visual "threads" connecting related logs.
(4) As the post and lnav say, interactivity is everything. I tried to take it a step further with (1) javascript, (2) playground-style updates as you type, (3) autocomplete which "learns" what fields are available from structured logs.
My tool runs all in the browser. (I spent effort figuring out how people can distribute it safely and use it for their own confidential logs too). It's fast enough up to about 10k lines of logs.
Keep access logs, both when a service receives a request and finishes a request.
Record request duration.
Always rotate logs.
Ingest logs into a central store if possible.
Ingest exceptions into a central store if possible.
Always use UTC everywhere in infra.
Make sure all (semantic) lines in a log file contain a timestamp.
Include thread ids if it makes sense to.
It's useful to log unix timestamp alongside human readable time because it is trivially sortable.
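For example, a log line might carry both renderings of the same instant:

```shell
# Epoch seconds sort trivially; the ISO 8601 rendering keeps it readable.
echo "$(date -u '+%s %Y-%m-%dT%H:%M:%SZ') service started"
```

A plain `sort -n` then orders the file correctly, with no timestamp parsing needed.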
Use head/tail to test a command before running it on a large log file.
If you find yourself going to logs for time series data then it is definitely time to use a time series database. If you can't do that, at least write a `/private/stats` handler that displays in memory histograms/counters/gauges of relevant data.
Know the difference between stderr and stdout and how to manipulate them on the command line (2>/dev/null is invaluable, 2>&1 is useful), use them appropriately for script output.
Use atop, it makes debugging machine level/resource problems 10 fold easier.
Have a general knowledge of log files (sometimes /var/log/syslog will tell you exactly your problem, often in red colored text).
This needs to be used carefully and deliberately. This is the style of command that can test your backups. This style of command has caused multiple _major_ outages. With it, you can find a needle in a haystack across an entire fleet of machines quickly and trivially. If you need to do more complex things, `bash -c` can be the command sent to ssh.
I've had an unreasonable amount of success opening up log files in vim and using vim to explore and operate on them. You can do command line actions one at a time (:!$bash_cmd), and you can trivially undo (or redo) anything to the logs. Searching and sorting, line jumping, pagedown/up, etc, diffing, jump to top of file or bottom, status bar telling you how far you are into a file or how many lines it has without having to wc -l, etc.
Lastly, it's great to think of the command line in terms of map and reduce. `sed` is a mapping command, `grep` is a reducing command. Awk is frequently used for either mapping or reducing.
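As a sketch of that framing (access.log and its line format are hypothetical): sed maps each line down to one field, then sort | uniq -c reduces identical values to counts.

```shell
# map: rewrite "GET /path 200 1234" down to just the status code
# reduce: aggregate identical codes into counts
sed -E 's/.* ([0-9]{3}) [0-9]+$/\1/' access.log | sort | uniq -c | sort -rn
```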
Some of these are KPIs (Key Performance Indicators). What we did at a previous job was to have a system like Etsy's statsd [1] (it's an easy system to implement), and it made it easy to add statistics like latency of requests, number of errors, just about anything that could be measured, without excessive overhead (in terms of source code).
Can Amazon do this? They use UTC and your local browser’s time seemingly randomly depending on AWS service, and it drives me nuts. They usually (not always) put the timezone next to it, but why can’t they just have a mandate that it either is or is not UTC?! (The worst one is that the Lambda console is UTC but cloudwatch isn’t, so you think you haven’t received a request in hours but then you did)
Surprised to see that under the section "correlate between different systems" tracing isn't mentioned as an alternative approach. That's what tracing is: logging across different systems and getting that structure all stitched together for you.
A few weeks ago I had a Windows installer that was silently failing when upgrading from an older version (installation from scratch worked without issues). And as Windows install logs aren't exactly easy to read, I was stumped, until I took an upgrade log from an older, working build, stripped all date information from both files, and compared them, checking all the sections that differed. Eventually I found a line indicating that a colleague had forgotten about a limitation when dealing with MSPs (don't delete components on minor upgrades). But I didn't throw any stones, as I've made the same mistake twice, one and two years ago...
One of my favorite tricks is to use a visual difftool.
Copy good log into left panel. Copy bad log into right panel. Quickly show which lines are new, which are missing and which are out of order. Obviously ignore the timestamps ;)
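With process substitution (bash/zsh), the timestamp-ignoring can happen inline rather than via copy/paste into panels:

```shell
# Strip digits from both logs on the fly; diff then shows only
# structural differences, not timestamp noise.
diff <(sed 's/[0-9]//g' good.log) <(sed 's/[0-9]//g' bad.log)
```

(good.log and bad.log are placeholders for your known-good and failing runs.)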
I do this all the time (PyCharm has an amazing diff tool hidden under Cmd-Shift-A, Show Diff). With the command from above that removes all the numeric characters it's going to be even easier to compare, no more timestamps!
One thing we did at my previous job was to add a "trace flag" to each account. Normally, they log nothing about a transaction (other than it happened), but if the trace flag is set, then a lot of information is logged about the transaction. Also, this trace flag is propagated throughout the distributed system they have, so we can trace the action across the network.
One thing that's improved my log analysis is learning awk (well, actually Perl, but I think 95% of what I do with Perl I could also do with just awk). Often the most useful way to look at logs is statistically, and awk can quickly let you aggregate things like time-between-statements, counts, rates, state transition probabilities, etc. for arbitrary patterns.
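For instance, a one-liner for time-between-statements, assuming (hypothetically) that the first field of each line is an epoch timestamp:

```shell
# Print the gap in seconds between consecutive lines matching /pattern/;
# pipe into sort -n | tail to surface the longest stalls.
awk '/pattern/ { if (prev) print $1 - prev; prev = $1 }' app.log
```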
Also, sometimes logs stretch across multiple lines, and the other lines won't have the identifier you are searching for. For example, Java stack traces. In that case if you are stuck parsing unstructured logs, the simplest thing to do is to look at the entire file and search for the timestamp that found the first line.
I feel like this is a huge anti-pattern. Use a hosted service that does all of this for you, and then have a whole query language, build alerts, graphs, etc based on these results.
It's not super cheap, but it's 10x cheaper than wasting dev time in the terminal. (Sumologic, splunk are the two I can vouch for)
I found the histogram technique to be really helpful. Slight mod - I tend to sort reverse at the end of the pipeline (sort -rn); then |head is often more useful.
It's also good to have histograms by hour or day. I've hacked up scripts to do this but I should really make something better!
Re timing, logs like nginx access logs have their timestamp from when the request completed, not when the request came in. That's a significant difference for long duration (~10s+) requests, and matters when trying to correlate logs or metrics to a request.
I saw a tip a while back about not needing to keep adding "| grep -v stuff" and instead using "grep -v -e stuff -e stuff2". I remember getting it to work on Linux, but last I tried it on macOS I didn't have much luck.
We've added a log tailing feature into our product UI which also has a basic find/filter. It's been enormously useful for cases where something weird happens as you can immediately access the last few mins of logs.
I'm interested in this idea as well. Seems like it would be useful for detecting unusual issues assuming you have a large enough data set of "normal" runs.
I've been mulling this idea over in my head as well. I have a fleet of PCs out in the wild, all running the same software. It would be nice to have an easy way to detect strange behavior like processes that are continually respawning / segfaults / crashes / etc, without explicitly writing a bunch of search terms.
Yours is an unhelpful and non-constructive comment. Clearly a lot of people have been getting something out of the content in this post, as it's started several discussions.
People on HN have varied level of skills, and this is a well structured introduction to diving into logs. It already started conversation about better tooling. Let's celebrate today's lucky 10000 https://xkcd.com/1053/ rather than talk something down for being basic.
In addition to these, one of my favorites is a Perl one-liner that generates time deltas from a regex pattern of interest. Then I plot it using gnuplot. It seriously helps to 'see' the events with timing in a chart and allows you to do a quick visual search for problem areas.
I guess it depends, but would you not encounter issues like permission blocks, or there being so much content in the logs that it is slow to find what you need?
Do we still use utilities like grep for searching logs? Are these when we cannot stream logs to tools like Splunk & Loggly and use their search services?
One problem is that you start using these services and then the bill arrives and you then spend all your time removing stuff from the logs to keep the bill down.
All of our logs are in Kibana, but sometimes I'll `kubectl logs pod > tmp; grep pattern tmp` because Kibana's search and filtering is often annoying. Actually, I'll usually open the logs in vim and use its search, which also get me the ability to eg delete every line not containing the pattern and then search only things matching the first pattern. I'm going to try lnav as mentioned in this thread but I've gotten by fine with judicious use of grep
Sorry. I did say, it is a pet peeve.