
Weird how syslog doesn't de-duplicate these identical log messages. Or maybe it does, but not enough.



Respectfully, logs shouldn't be "smart", they should just log, that's it.


BSD syslogd definitely does; perhaps whatever stripped-down one ASUS is using doesn't. For example:

  Dec  2 01:09:41 hostname syslogd: last message repeated 10 times
The threshold's pretty low, though, and most of the "repeated" messages say "repeated 1 times".


Does it store timestamps for those repeated occurrences? I wouldn't want my logs to "helpfully" coalesce multiple identical messages into a single one. For example, if I saw the equivalent of the text below in a program/OS while facing some issues, I'd be remarkably pissed.

  [2023-05-18 07:12:21] [E] Invalid event 0x1AF2 received from ...
  [2023-05-18 09:44:01] [I] Last message repeated 10 times
I care less about how many times the message was repeated - I care about timestamps, which I might want to correlate to other activities.


No, it's just a plain syslogd dating back to the 4.2BSD days. It aggregates at 30, 120, and 600 second intervals according to the source. Within that threshold I wouldn't care too much. If I really needed timestamps with better than thirty-second precision I probably wouldn't be using syslogd.

In any new-ish production system I'd probably want to use anything other than syslogd anyways.

https://github.com/freebsd/freebsd-src/blob/main/usr.sbin/sy...
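
For anyone curious what that aggregation looks like, here's a minimal Python sketch of the idea, not the actual syslogd code. The 30/120/600 second intervals come from the comment above; the `RepeatCompressor` class, its fields, and the exact flush rules are my own simplification for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Back-off intervals in seconds, as quoted in the comment above.
REPEAT_INTERVALS = (30, 120, 600)


@dataclass
class RepeatCompressor:
    """Hypothetical, simplified model of "last message repeated N times"."""
    out: List[Tuple[float, str]] = field(default_factory=list)
    prev_msg: Optional[str] = None
    prev_count: int = 0      # suppressed repeats not yet summarized
    last_emit: float = 0.0   # time of the last line actually written
    level: int = 0           # index into REPEAT_INTERVALS

    def _flush(self, now: float) -> None:
        # Write one summary line for all suppressed repeats, if any.
        if self.prev_count > 0:
            self.out.append((now, f"last message repeated {self.prev_count} times"))
            self.prev_count = 0

    def log(self, now: float, msg: str) -> None:
        if msg == self.prev_msg:
            # Identical to the previous message: count it, do not write it.
            self.prev_count += 1
            if now - self.last_emit >= REPEAT_INTERVALS[self.level]:
                self._flush(now)
                self.last_emit = now
                # Back off to the next, longer interval (capped at the last one).
                self.level = min(self.level + 1, len(REPEAT_INTERVALS) - 1)
        else:
            # A different message: summarize pending repeats, then write it.
            self._flush(now)
            self.out.append((now, msg))
            self.prev_msg = msg
            self.last_emit = now
            self.level = 0
```

Feed it the same message once per second and a summary line only appears when the current interval expires or a different message arrives. The individual timestamps in between are gone, which is the fidelity loss being argued about here.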


That's... good to know. I never realized anyone is doing something like this, ever. It breaks my trust in software logs in general - I'll be sure from now on to understand how any given program handles logging, before making assumptions relevant to troubleshooting.


In many cases logging is asynchronous, so depending on many factors, host utilization among them, entries may arrive with a delay, and the order of events as read from the logs may not make sense. If you need that precision, you can surely engineer or configure your system for it.


If I'm running my own distributed system for some business reasons, sure.

If I'm dealing with equipment failures, bugs in third-party software, or other such random tech bullshit, as an individual or a team, then I don't know in advance when and what precision I'll need.


In a reasonable system, you might be able to change some function in one place, perhaps even in the running system, and get the precision or detail you need. You might take out the big guns, e.g. dynamic tracing using BPF to attach probes at the right places.


If your logs are being spammed with the same message, while nothing else logworthy is happening, how many timestamps do you need in one hour?


Most of the time, I'd assume all of them that I can store (deduplication is fine, if I can recreate the raw data afterwards). Which is a lot, because they should compress well (in the limit, approaching the same size as a deduplication solution).

Sometimes those data points don't matter, e.g. if they're generated by some program stuck in an infinite loop. But in other cases they do: if each message is caused by some event, like another program doing some processing or the user pressing a key, then timestamps will be useful to identify the exact cause (e.g. logs only happen when process X is processing mouse input, or when the user presses one of 20 specific keys on their keyboard, or only when my microwave oven is running).
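
To make the "deduplication is fine if I can recreate the raw data" point concrete, here's a rough Python sketch under my own assumptions: records are `(timestamp, message)` tuples, and `dedup`, `expand`, `raw_size` and `compressed_size` are hypothetical helper names, not anything from syslogd. It collapses runs of identical messages but keeps every timestamp, so the original stream can be rebuilt, and it lets you compare that against simply compressing the raw text with zlib.

```python
import zlib
from itertools import groupby
from typing import List, Tuple

Record = Tuple[int, str]  # (timestamp, message); an assumed layout


def dedup(records: List[Record]) -> List[Tuple[str, List[int]]]:
    """Collapse consecutive identical messages, keeping every timestamp."""
    return [
        (msg, [ts for ts, _ in run])
        for msg, run in groupby(records, key=lambda r: r[1])
    ]


def expand(groups: List[Tuple[str, List[int]]]) -> List[Record]:
    """Rebuild the original record stream from the deduplicated form."""
    return [(ts, msg) for msg, stamps in groups for ts in stamps]


def _as_text(records: List[Record]) -> bytes:
    # One "timestamp message" line per record, like a plain log file.
    return "\n".join(f"{ts} {msg}" for ts, msg in records).encode()


def raw_size(records: List[Record]) -> int:
    """Size in bytes of the log written out verbatim."""
    return len(_as_text(records))


def compressed_size(records: List[Record]) -> int:
    """Size in bytes of the same log after zlib compression."""
    return len(zlib.compress(_as_text(records)))
```

With this, `expand(dedup(records)) == records` holds for any input, so nothing is lost, and comparing `raw_size` with `compressed_size` on a log full of repeats shows how much a general-purpose compressor already recovers without discarding timestamps.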


It helps to know exactly at what interval things are occurring, and you lose fidelity with this style of logging.


> BSD syslogd definitely does

…by default, but it can be disabled:

  -c      Disable the compression of repeated instances of the same line
          into a single line of the form "last message repeated N times"
          when the output is a pipe to another program.  If specified
          twice, disable this compression in all cases.
* https://man.freebsd.org/cgi/man.cgi?query=syslogd


It seems that the binary in question just writes the logfile directly and does not use syslog.



