
What's wrong with pcap filters? - luu
https://www.snellman.net/blog/archive/2015-05-18-whats-wrong-with-pcap-filters/
======
jsnell
So, a couple of extra anecdotes from after I published that post.

I linked that post to our internal chat, and our support manager was all "Oh
wow, that VLAN section totally explains why those filters at customer X two
weeks ago didn't work". This is 4 years into trying to make sure that everyone
using the filters to configure our system knows about this pitfall. They knew
about a couple of failure modes in the list, but not all of them :-(

An hour after that I had a discussion with someone who knows the language very
well, and we were puzzled about why the silly "look for a specific TCP option"
filter was compiling to something that looked like it wouldn't work with IPv6.
Turns out that while "tcp" works with IPv4 and IPv6, "tcp[]" works only on
IPv4. He knew it but had forgotten, while I'd never parsed the documentation
correctly to know about it. (There's a comment in the code generator
suggesting that this is maybe not how things should work).
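
To make the difference concrete, here is a minimal Python sketch of what a byte-indexing filter has to work out by hand. It is not libpcap's code generator, and the function names are made up for illustration. It assumes a raw IP packet with no link-layer header, ignores IPv4 fragments, and does not walk IPv6 extension headers. The IPv6 branch is roughly what a hand-written filter such as `ip6[6] == 6 and ip6[40+13] & 0x02 != 0` checks.

```python
TCP_PROTO = 6  # IANA protocol number for TCP


def tcp_header_offset(ip_packet: bytes):
    """Return the byte offset of the TCP header in a raw IP packet, or None."""
    version = ip_packet[0] >> 4
    if version == 4:
        # IPv4: header length is variable, given by the IHL field in 32-bit words.
        if ip_packet[9] != TCP_PROTO:
            return None
        return (ip_packet[0] & 0x0F) * 4
    if version == 6:
        # IPv6: fixed 40-byte header. This sketch only handles the case where
        # the Next Header field points straight at TCP (no extension headers).
        if ip_packet[6] != TCP_PROTO:
            return None
        return 40
    return None


def tcp_syn_set(ip_packet: bytes) -> bool:
    """True if the packet carries TCP and the SYN flag (0x02) is set."""
    offset = tcp_header_offset(ip_packet)
    if offset is None:
        return False
    return bool(ip_packet[offset + 13] & 0x02)  # flags live in TCP byte 13
```

The point is only that the TCP offset is computed differently per IP version, so an indexing expression that assumes the IPv4 layout silently misses IPv6 traffic.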

------
lukego
We have three main use cases for pcap filters (based on pflua) in Snabb
Switch:

First is ad-hoc filtering in applications. "If this is an IPv6 control message
for the address of interest then put the packet on this queue." The filter
notation is convenient for this and if the filters run as fast as hand-written
code then why not use them.

Second is end-user configuration. For example we have a "snabbnfv traffic"
program that connects a 10G NIC to several QEMU virtual machines with Virtio-
net. We support defining stateful packet filters by pcap filter expressions.
This is a powerful tool that is already documented and familiar so it is easy
for us to use in the interface. It is a power tool though and you can easily
write an expression that does not do what you mean. (Happens to me too.)

Third is as an intermediate language. For example, we translate OpenStack
"Security Group" rules (ACLs) into pcap filter expressions for the snabbnfv
traffic process. This allows OpenStack to provide a simple and limited
interface to the user and for us to map that onto our more general mechanism
underneath. I see this as future proof in terms of supporting more front-ends
besides OpenStack and also supporting extensions for special needs e.g.
matching on exotic protocol headers.
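
As a rough illustration of the intermediate-language idea (this is not the actual snabbnfv translator), here is a hedged Python sketch. The rule dictionary layout and the function names are assumptions chosen for the example, and every rule is treated as an ingress rule.

```python
def rule_to_pcap(rule: dict) -> str:
    """Translate one hypothetical ingress ACL rule into a pcap filter expression."""
    parts = ["ip6" if rule.get("ethertype", "IPv4") == "IPv6" else "ip"]

    protocol = rule.get("protocol")
    if protocol:
        parts.append(protocol)  # e.g. "tcp" or "udp"

    prefix = rule.get("remote_ip_prefix")
    if prefix:
        parts.append(f"src net {prefix}")

    low, high = rule.get("port_range_min"), rule.get("port_range_max")
    if low is not None and high is not None:
        if low == high:
            parts.append(f"dst port {low}")
        else:
            parts.append(f"dst portrange {low}-{high}")

    # Parenthesise each rule so the rules can be OR-ed together safely.
    return "(" + " and ".join(parts) + ")"


def rules_to_pcap(rules: list) -> str:
    """A packet is allowed if it matches any one rule."""
    return " or ".join(rule_to_pcap(rule) for rule in rules)


if __name__ == "__main__":
    example = [
        {"protocol": "tcp", "remote_ip_prefix": "10.0.0.0/8",
         "port_range_min": 22, "port_range_max": 22},
        {"ethertype": "IPv6", "protocol": "udp",
         "port_range_min": 8000, "port_range_max": 8080},
    ]
    print(rules_to_pcap(example))
```

The front-end stays small and restricted, while anything the filter language can express remains available underneath.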

So I am pretty happy with the current state of affairs. I do think there is a
lot of potential for pflua to move things forward and I am looking forward to
seeing more projects pick it up so that we can cooperate on this.

If you are tempted to adopt pcap filters then check out pflua from Igalia:
[https://fosdem.org/2015/schedule/event/packet_filtering_pflu...](https://fosdem.org/2015/schedule/event/packet_filtering_pflua/)

------
orf
Sounds like a more succinct language that compiles down to (unreadable) pcap
filters could be made, no?

~~~
eridal
call it _spcap_ .. the .io is available ;)

~~~
bostik
I know that's a low-brow joke, but I hope you know that the history of
S-prefixing a complex beast to suggest a "simple" approach is not exactly
encouraging.

SMTP - Simple Mail Transfer Protocol. Simple does not rule out hideously
complex.

SNMP - Simple Network Management Protocol. Let's not go there...

I shiver at the thought of a Simple Packet Capture Language. It might just end
up looking like someone summoning the Ancient Ones.

EDIT: ditched the word "rewrite". It's not exactly suitable here.

~~~
bch
Can't believe SOAP (simple object access protocol) didn't lead your list...

------
est
For me, the only problem is that ss, tcpdump and pcap all have different
syntaxes, and it is a PITA to remember which is which.

~~~
tptacek
tcpdump and libpcap have different syntaxes? When did that start? tcpdump's
filtering language is libpcap's.

~~~
est
Sorry, I meant Wireshark. You can use a filter like this in the Wireshark GUI:

    ip.addr==127.0.0.1

But not in tcpdump:

    $ sudo tcpdump 'ip.addr==127.0.0.1'
    syntax error.

This works in tcpdump:

    sudo tcpdump 'host 127.0.0.1'

But not in the Wireshark GUI.

~~~
matthiasl
It's a bit more complicated than that.

Wireshark allows two different filtering languages in two different places,
used at different times.

Your first example ("ip.addr==127.0.0.1") is a display filter. You enter those
in the text field at the top of the GUI.

Your third example, 'host 127.0.0.1' is a 'capture filter'. You can enter
capture filters in Capture/options.

~~~
est
I can understand that, but why did we invent two different DSLs? I suppose
they overlap to a large extent!

~~~
matthiasl
There are different trade-offs for the capture filter language and the display
filter language.

The capture filter language considers short and bounded runtime (no loops) to
be paramount. Roughly, when you're capturing, it's important that your filters
eat a limited amount of resources. Juha talks about this in the article.

The display filter language abandons the careful runtime limits in favour of
being more powerful. You can go higher up in the stack, you can use regexes,
etc. That's acceptable because you're usually doing it offline.

The conclusion of Juha's article is, roughly, "the libpcap DSL is sometimes
frustrating, but I'm not aware of a clearly better alternative". I agree. I'd
love it if someone invented something prettier but still solid.

------
lmz
Unrelated: the previous post on that blog about Finnish election candidate-
matching websites is quite fascinating:
[https://www.snellman.net/blog/archive/2015-05-11-okcupid-for...](https://www.snellman.net/blog/archive/2015-05-11-okcupid-for-voting-the-finnish-election-engines/)

------
k_sze
Wow. Thankfully I have never had to deal with pcap myself.

That vlan stuff just looks completely b0rken. Having the effect of `vlan`
spill over an `or` or parentheses reeks of basic language design failure.
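
For anyone who wants to see the spill-over in miniature, here is a toy Python model. It is not libpcap: it only understands an OR of the primitives `vlan` and `ip`, and the function names are invented for illustration. It assumes, as the pcap-filter documentation describes, that the first `vlan` keyword shifts the decoding offsets for the remainder of the expression.

```python
ETH_VLAN = 0x8100  # 802.1Q tag ethertype
ETH_IPV4 = 0x0800  # IPv4 ethertype


def compile_or(terms):
    """Compile an OR of primitives into (offset, expected ethertype) checks.

    Toy model: the offset shift is decided at compile time, left to right,
    so a `vlan` term moves the ethertype offset for every later term.
    """
    shift = 0
    checks = []
    for term in terms:
        if term == "vlan":
            checks.append((12 + shift, ETH_VLAN))
            shift += 4  # later terms assume a 4-byte VLAN tag is present
        elif term == "ip":
            checks.append((12 + shift, ETH_IPV4))
        else:
            raise ValueError(f"unsupported term: {term}")
    return checks


def matches(checks, frame: bytes) -> bool:
    """True if any compiled check finds its ethertype at its fixed offset."""
    return any(
        int.from_bytes(frame[offset:offset + 2], "big") == expected
        for offset, expected in checks
    )


def untagged_ipv4_frame() -> bytes:
    """A minimal Ethernet frame carrying an (empty) IPv4 header, no VLAN tag."""
    ethernet = bytes(12) + ETH_IPV4.to_bytes(2, "big")
    ipv4 = bytes([0x45, 0x00, 0x00, 0x14]) + bytes(16)
    return ethernet + ipv4


if __name__ == "__main__":
    frame = untagged_ipv4_frame()
    print(matches(compile_or(["ip", "vlan"]), frame))
    print(matches(compile_or(["vlan", "ip"]), frame))
```

In this model `ip or vlan` matches the untagged frame while `vlan or ip` does not, because the `ip` test after `vlan` looks four bytes further into the frame.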

