For people who are new(ish) to tcpdump: have you heard of libnet and libpcap? You can basically build your own tcpdump! :D
I was amazed at how fast packets fire when you program it yourself in C.
https://github.com/the-tcpdump-group/libpcap <-- CAPturing packets
https://github.com/sam-github/libnet <-- sending packets
Libnet tutorial that I used religiously:
Wireshark starts to break down at a certain point. When that happens, I've found scapy (a Python pcap parser) very useful. Once that breaks down, a good option is reading the hex directly in C.
Once that breaks down... Well, there be dragons.
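For a taste of what "reading the hex directly" looks like, here's a minimal sketch using only Python's stdlib (the same idea as doing it with structs in C). The packet bytes below are hand-crafted for illustration, not captured traffic:

```python
import struct

# A hand-crafted 20-byte IPv4 header (illustrative values only):
raw = bytes([
    0x45, 0x00, 0x00, 0x28,   # version=4, IHL=5 (20 bytes), total length=40
    0x1c, 0x46, 0x40, 0x00,   # identification, flags=DF
    0x40, 0x06, 0x00, 0x00,   # TTL=64, protocol=6 (TCP), checksum (zeroed)
    0x0a, 0x00, 0x00, 0x01,   # src 10.0.0.1
    0x0a, 0x00, 0x00, 0x02,   # dst 10.0.0.2
])

# Network byte order: B=1 byte, H=2 bytes, big-endian ("!")
ver_ihl, tos, total_len, ident, frag, ttl, proto, cksum = struct.unpack("!BBHHHBBH", raw[:12])
src = ".".join(str(b) for b in raw[12:16])
dst = ".".join(str(b) for b in raw[16:20])

print(ver_ihl >> 4, (ver_ihl & 0x0F) * 4, ttl, proto, src, dst)
# → 4 20 64 6 10.0.0.1 10.0.0.2
```

Obviously a real sniffer has to deal with the link-layer header, options, fragmentation, and so on, but the core loop really is just unpacking bytes at the right offsets.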
tshark -i eth0 -Y ip.addr==184.108.40.206
Instead of having to remember tcpdump's own fiddly syntax. Given that most of my protocol debugging work is done inside Wireshark, I'm much more fluent in that filter language.
Note that the above snippet is a display filter, so it's capturing all the packets on the wire, doing a full dissection, and then running them through a filter before printing them; if you're trying to capture on a saturated 1G link you might need to fall back to the capture filters which are simpler and faster (and I believe also use tcpdump's syntax).
Sometimes "ease of use" is really just "what I'm used to". (Not saying Wireshark isn't more expressive or more powerful...it probably is. But, I usually have a very basic use case, and tcpdump can do it with a couple of flags and an address or port.)
Its creator was one of the guys behind Wireshark. They're pitching sysdig as strace + tcpdump + htop + iftop + lsof + ...awesome sauce.
They also have csysdig, a curses UI for sysdig, which is similar to htop.
For sure. There's no better way of learning the protocols than writing your own network sniffer. Even writing a crappy one like the one that a few folks and I put together in college will teach you a lot about how each protocol behaves and what to look for: https://github.com/carlosonunez/nbfm-sniffer
ssh root@HOST tcpdump -U -s0 -w - 'not port 22' | wireshark -k -i -
I think when Dick Sites from Google says he has a rule of "stay below 1% overhead" when analyzing traffic on datacenter nodes, he wants you to select the right tool for the job. In the context of tcpdump, you can run it with so many options, which makes it very powerful. But that power is dangerous in the hands of a novice user. A simple error in how you run it (maybe missing filters or too wide an address space) can cause you to shoot well above that theoretical 1% limit. But that's not the tool's fault IMO.
Anyway, this seems more theoretical, because in practice I'd prefer a hardware-based network tap and analyze that, without creating any risk to the live traffic.
However, one of the worst things that tcpdump does is put the NIC into promiscuous mode. On a physical NIC, this can be VERY expensive and may involve bouncing the link (behind your back) and dropping packets. At the very least, it can wreak havoc with steering filters on some NICs. To avoid this, use the -p option, which tells tcpdump not to put the NIC into promiscuous mode.
Another issue with tcpdump on an endstation is caused by stateless offloads like checksum offload and offloads like TSO on the send side, and GRO / LRO on the receive side. Because the BPF filters are applied between the network stack and the device driver, you may notice tcpdump / Wireshark complaining about bad checksums on transmit -- this is likely due to checksum offload. And you may see gigantic frames (way larger than the MTU). This is due to GRO/LRO on receive, and TSO on transmit. You can disable stateless offloads (ethtool -K on Linux, ifconfig on BSD), but that will slow the entire system down.
I'll ask amazon to install it when I need it...? They probably won't mind.
So if it were my datacenter, I'd have a rule of no analysis of non-tap data.
Sure, you can find out where your VM is currently physically hosted, then tap the 4x10Gbit that come out of that hypervisor. Your network stream is probably somewhere in there.
But it is mixed with a lot of other data, and you could miss the bits that go directly to other VMs on the same hypervisor. Also, it is encapsulated, split up into multiple streams, multiple virtual networks, etc. If the VM is moved (for load balancing or maintenance or whatever reason), your tap becomes useless and you have to chase your VM.
This is a ridiculous rule, you are cutting off your nose to spite your face. The saying "perfect is the enemy of good" comes to mind.
To your above point
>maybe missing filters or too wide an address space
There is merit in just capturing a shitload more than you need (provided of course, you're not trying to cap a full 10Gbit) because it is often the filters that are the cause of any "performance" issues, however you define that.
EDIT: whether you allow any access to a node for any purpose other than the software that was meant to run on it would probably depend on what damage is done if that node goes down. If the damage is a blip in a statistic and you can live with that, fine, but that's not always the case.
Almost the only time you would want a physical tap is if you need a permanent tap capturing everything over the long term, often for the purpose of running it through an IDS/IPS, and even then SPAN/RSPAN/ERSPAN works pretty well.
Even in those industries you mention, the rest of the time you are usually doing a capture to troubleshoot something, so you don't need to run things for long, nor does any "perf issue" (which is overstated) really matter, since things are possibly already halfway to fucked. And the compliance argument doesn't hold either vis-a-vis installing tcpdump: your processes and policies would (or should) be written to allow for debugging.
If you're not a network engineer, you'd be amazed at how many times we get issues escalated to us exclaiming that "it's the firewall" and demanding that we fix it.
It's usually not the firewall, though, and unfortunately it falls on us to prove that that's the case. It's not always easy but, luckily, tcpdump and friends allow me to show that and I can punt the issues back to where they came from.
(Several years ago, I was able to prove it was a customer's on-premise firewall -- managed by them -- and not our firewall based upon the packet timestamps and this little thing known as "the speed of light".)
It's worth noting that the optimizer is also absolutely necessary, since the design of the pcap language is such that very simple filters are easy, while anything even remotely complicated becomes very verbose and repetitive.
I had a job, about 15 years ago, hacking on a customized version of libpcap (mostly to do filter merges). There is a surprising amount of stuff going on there.
But after that common starting point they quickly developed lives of their own, and should not be treated as a single unit. You'll get things like pflua as a completely distinct implementation of the filtering language compiling to lua instead of BPF, pfmatch with language extensions that are not really compilable to BPF, seccomp and other uses of BPF in the Linux kernel for things that have nothing at all to do with packet processing, a (e)BPF code generator backend in LLVM, and so on.
No longer true in modern Linux.
You can do a hell of a lot with BPF, and it's not like work to extend its functionality is slowing down, either. We've had tons of features implemented in the 4.x kernel family, stack walking is likely to be implemented in the future, etc.
When you have people like Alexei Starovoitov and Brendan Gregg saying it can do all sorts of things including "crazy stuff", I think we've moved beyond 'extremely simple and limited'
If you want a true MITM proxy then I recommend Burp
netsh trace start capture=yes IPv4.Address=10.2.0.1
netsh trace stop
Article about Wireshark :)
My usage of Wireshark is rather sporadic, so I appreciate the traffic drill-down I can do without any advance knowledge of the protocols I have captured.
Give it a filter (BPF) and the pattern you're looking for, and off you go: $ ngrep -W byline -d en0 "INVITE" port sip and host sip.phone.tld
The author mentions the pcap filter language, but one of the misfeatures of Wireshark is that it has a different filter language for the GUI filter box.
BTW, if you want constant monitoring for your applications, you can buy a shark appliance that you can always go back to when you need to investigate issues.
I have not used tcpdump a lot, but I believe there is nothing it can do that tshark cannot.
Tshark can capture only the portion of traffic you want via filters, which can reduce the size of your pcap files considerably and possibly has less of an impact on performance.
When using tshark, make sure you capture the traffic to a file too, so you can go back and look at something that happened x seconds or minutes ago.
I understand the security concerns, but being able to debug with tshark (or an equivalent tool) is a very good reason for not using HTTPS internally.
How about going two months without a security vulnerability in packet processing logic? Drop slightly fewer packets? (Yes, dumpcap too)
> good reason for not using HTTPS internally.
One of the nice things about HTTPS is that it's just an SSL connection with HTTP going over it. It's not hard to use socat to create something which listens for TCP connections on one port, wraps them in an SSL layer, and sends that on to another host:port. This allows you to remove the S from HTTPS: you then speak HTTP to it, and hence can use tcpdump (etc.) to dump the traffic.
sudo tcpdump -s 0 -A 'tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x47455420'
This is useful for troubleshooting outbound requests that your backends are making. I've had the interesting logic explained to me but can't remember the details.
So, start with the magic number. If you look up those 8-bit ASCII codes, you'll see that it spells out GET followed by a space, which should give a clue as to how it's working. It will capture a lot of HTTP requests, but it may not get them all.
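If you want to convince yourself of how the two halves of that filter work, here's a quick illustration (the filter itself is pure BPF; Python is only used here to do the arithmetic):

```python
# The magic constant is just the ASCII bytes for "GET "
# (0x47='G', 0x45='E', 0x54='T', 0x20=' ').
print((0x47455420).to_bytes(4, "big"))   # → b'GET '

# The offset expression: tcp[12:1] is the byte holding the TCP data-offset
# nibble in its high 4 bits, counting the header length in 32-bit words.
# Masking with 0xf0 and shifting right by 2 is the same as (>> 4) * 4,
# i.e. the header length in bytes -- so tcp[len:4] reads the first four
# payload bytes and compares them against "GET ".
doff_byte = 0x50                          # data offset = 5 words (typical, no options)
print((doff_byte & 0xf0) >> 2)            # → 20 (byte offset of the payload)
```

That's also why it can miss requests: a GET split across segments, or buried deeper than the first payload bytes, won't match.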
But they're still fairly stateless... without some extra scripting you cannot do a query for "TCP connections with more than 3 restarts".
Regarding the other topic, nothing stops someone from capturing the packets into a generic DB (SQLite or BerkeleyDB) to run such queries. Looks like a weekend project!
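A rough sketch of that weekend project using Python's built-in sqlite3 module; the schema, event names, and sample rows are all made up for illustration (in real use you'd populate the table from a pcap parser such as scapy), but it shows how a stateful query like "connections with more than 3 restarts" becomes trivial once the data is in SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tcp_events (
    src TEXT, dst TEXT, dport INTEGER, event TEXT, ts REAL)""")

# Pretend these rows came from a capture; 'restart' marks a re-sent SYN.
sample = [
    ("10.0.0.1", "10.0.0.9", 443, "syn",     1.0),
    ("10.0.0.1", "10.0.0.9", 443, "restart", 2.0),
    ("10.0.0.1", "10.0.0.9", 443, "restart", 3.0),
    ("10.0.0.1", "10.0.0.9", 443, "restart", 4.0),
    ("10.0.0.1", "10.0.0.9", 443, "restart", 5.0),
    ("10.0.0.2", "10.0.0.9", 443, "syn",     1.5),
]
conn.executemany("INSERT INTO tcp_events VALUES (?,?,?,?,?)", sample)

# The stateful query that a packet-at-a-time BPF filter can't express:
rows = conn.execute("""
    SELECT src, dst, dport, COUNT(*) AS restarts
    FROM tcp_events WHERE event = 'restart'
    GROUP BY src, dst, dport HAVING COUNT(*) > 3""").fetchall()
print(rows)   # → [('10.0.0.1', '10.0.0.9', 443, 4)]
```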
See http://superuser.com/questions/904786/tcpdump-rotate-capture... for an example of the required syntax.
Note that tcpdump's default behaviour here is better than Wireshark's; last I checked Wireshark just crashes when your capture file exceeds the available RAM. Again, you can enable file rotation, but many don't realize this until they have been bitten by this attempting to do an overnight capture of a production issue...
However, I haven't seen a need to use tcpdump in a while, since my problem domains have been quite different; my focus back then was primarily network monitoring. Usually, performance problems where I have worked have been easy enough to identify at a higher layer (e.g. n+1 select issues with SQL).
I am grateful for the tool.
It gives you very deep visibility into the supported protocols, dumps easy-to-parse log files by default (see e.g. https://www.bro.org/sphinx-git/httpmonitor/index.html for HTTP information) -- and it is fully scriptable.
(Disclaimer: I am involved with the project.)
I understand the importance of privilege separation, but I miss the feature.
Obligatory disclaimer: I'm one of the mitmproxy authors - happy to answer any questions.
windump is the windows version of tcpdump: https://www.winpcap.org/windump/
I haven't used it yet but it's libpcap based so I can't imagine it being too different. It has to be at least 2000x better than the piece of shit Microsoft Network Monitor (it's like Wireshark, except so much worse...oh, and it doesn't do promiscuous mode)