
Packet Capturing MySQL with Rust - gbuehler
http://www.agildata.com/packet-capturing-mysql-with-rust/
======
burntsushi
> Enter regex macros! While it is presently slower, and requires a Rust
> nightly, it has the very appealing property that if your regex is not a
> correct expression, your program won’t compile!

To be clear, it's not just slower, it's _much_ slower. See the benchmark
comparison here:
[https://gist.github.com/b0f6a17744dd1df60752b6e8ced47afd](https://gist.github.com/b0f6a17744dd1df60752b6e8ced47afd)
<-- That's why the `regex!` macro isn't even in the docs any more.

It looks like `regex!` is the only thing preventing your project from
compiling on stable Rust, right? FWIW, the Clippy lint tool will check your
`Regex::new` calls at compile time for you (assuming it's a string literal,
which it is in your case).

Also, I'd recommend not using `*` as a version constraint in your
`Cargo.toml`. You do have a `Cargo.lock` so it's not as bad, but with better
version constraints, you'll be able to run `cargo update` and get semver
compatible updates.
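
For reference, a bare version string such as "1.2.3" in `Cargo.toml` is treated by Cargo as a caret requirement, and that is what decides which releases `cargo update` may pick up. Here is a minimal sketch of that rule for full three-part versions, ignoring pre-release tags. The `Version` alias and `caret_matches` function are hypothetical names for illustration, not part of Cargo:

```rust
/// A version as (major, minor, patch). Hypothetical helper type.
type Version = (u64, u64, u64);

/// Returns true if `candidate` satisfies the caret requirement `^req`.
/// Pre-release and build metadata are deliberately ignored in this sketch.
fn caret_matches(req: Version, candidate: Version) -> bool {
    // A compatible update is never older than the requested version.
    if candidate < req {
        return false;
    }
    match req {
        // ^0.0.z only allows exactly 0.0.z.
        (0, 0, _) => candidate == req,
        // ^0.y.z allows patch updates within 0.y.
        (0, minor, _) => candidate.0 == 0 && candidate.1 == minor,
        // ^x.y.z allows any later release with the same major version.
        (major, _, _) => candidate.0 == major,
    }
}
```

With `*`, by contrast, every published version matches, including ones with breaking changes.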

~~~
placrosse
Both good suggestions for improvement, Andrew. I saw the performance notes,
and admit I was a little torn. Quite simply, the regex! macro was interesting
for the reason stated, and I left it in there for the purpose of showcasing
something (a little bit) unique in Rust.

Regarding the asterisk for versioning in Cargo.toml, I also agree. When
quickly putting things together, I usually start with it just to see if the
default version pulled works. The great utility of Cargo.lock, effectively
storing the working versions of all the crates, allows scraping the versions
out of there at any time, and putting them into the .toml.

I hope you noticed the extensive links in the post, as one of the goals was to
bring more people into the Rust ecosystem. The Spyglass utility does work
quite well. None of us claimed it has reached a state of absolute perfection,
so your comments are appreciated (and pull requests will be as well)!

Thank you.

~~~
burntsushi
No worries! And yes, regex! is a pretty cool thing to showcase---it's a pity
that it is so slow. :-( Very nice project though! :-)

------
jsnell
I feel kind of uncomfortable with that regular expression for scrubbing data.
It seems to be fail-open rather than fail-closed, and it clearly does not cover
the full lexical structure. For example, hex numbers and MySQL's disgusting
hex-encoded strings, including the numeric digits, are not caught by any of
those cases, and thus would leak in full. There's also the possibility of
string escaping with backslashes being turned off with a config setting, which
would screw up the escape handling in the regular expression.

Am I missing some subtlety that makes it safe?
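
To illustrate the fail-open versus fail-closed distinction: a fail-closed scrubber keeps only input it positively recognizes and masks everything else. The following is a rough sketch under stated assumptions, not Spyglass's implementation and not a full MySQL lexer. `scrub` is a hypothetical function, and the punctuation allowlist is an arbitrary choice for the example:

```rust
/// Replace every literal in `sql` with `?`. Whenever the input cannot be
/// classified with confidence, mask the rest of the statement (fail closed).
fn scrub(sql: &str) -> String {
    let chars: Vec<char> = sql.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_ascii_alphabetic() || c == '_' {
            // Keyword or identifier: read the whole word.
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            // A word glued to a quote is treated as a literal prefix such as
            // x'4D79'; drop it and let the quote branch mask the literal.
            if i < chars.len() && chars[i] == '\'' {
                continue;
            }
            out.extend(chars[start..i].iter());
        } else if c.is_ascii_digit() {
            // Decimal, float, or 0x... hex number: mask the whole run.
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '.') {
                i += 1;
            }
            out.push('?');
        } else if c == '\'' || c == '"' {
            i += 1;
            let mut closed = false;
            while i < chars.len() {
                // Whether a backslash escapes depends on server settings, so
                // the end of the string is ambiguous: stop and mask the rest.
                if chars[i] == '\\' {
                    break;
                }
                if chars[i] == c {
                    i += 1;
                    if i < chars.len() && chars[i] == c {
                        i += 1; // doubled quote is an escaped quote
                        continue;
                    }
                    closed = true;
                    break;
                }
                i += 1;
            }
            out.push('?');
            if !closed {
                return out;
            }
        } else if c.is_whitespace() || "(),.;=<>!+-*/%`".contains(c) {
            out.push(c);
            i += 1;
        } else {
            // Unknown input: mask everything that remains.
            out.push('?');
            return out;
        }
    }
    out
}
```

The trade-off is that such a scrubber sometimes throws away harmless text (anything after an ambiguous string, for instance), but an unanticipated construct results in lost detail rather than leaked data.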

~~~
placrosse
I have no doubt that there are some cases which won't match. This particular
utility does not need to be 100% correct on every possible corner case to
produce the desired result. That said, all improvements, whether pull requests
or posted suggestions, are much appreciated.

My comment regarding the regex was about the very high number of cases that
_are_ correctly handled, with such a small amount of code.

Thank you for your comments. I appreciate it.

------
sciurus
VividCortex has an agent that works similarly, which I believe they've written
in Go using libpcap. It would be nice if they open-sourced it.

[https://www.vividcortex.com/resources/network-analyzer-for-m...](https://www.vividcortex.com/resources/network-analyzer-for-mysql)

------
Arcsech
Interesting - I hadn't seen libpnet before. I was recently working on an
experimental project doing deep packet inspection in Rust using libpcap,
which doesn't have very mature Rust bindings yet - the basics work, but it's a
bit rough around the edges. libpnet looks like it has a much nicer Rust
interface, and does some more things for you as compared to libpcap, which
gives and takes &[u8]s and nothing else.

However, libpnet doesn't have two very useful things, as far as I can see:
Reading/writing packet capture files, and the ability to use BPF filters. The
first in this case might be useful mainly for testing, but the latter seems
like it might simplify a fair amount of their code.
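
As a rough indication of how much work the capture-file part is: the classic pcap format is a 24-byte global header followed by a 16-byte header in front of each packet. Below is a minimal writer sketch using only the standard library. The function names are hypothetical and not part of libpnet or libpcap, and the sketch covers only the classic little-endian, microsecond-resolution format with the Ethernet link type:

```rust
use std::io::{self, Write};

/// Link-layer type for Ethernet in the pcap format.
const LINKTYPE_ETHERNET: u32 = 1;

/// Write the 24-byte pcap global header (little-endian).
fn write_pcap_header<W: Write>(w: &mut W, snaplen: u32) -> io::Result<()> {
    w.write_all(&0xa1b2_c3d4_u32.to_le_bytes())?; // magic number
    w.write_all(&2u16.to_le_bytes())?; // format version, major
    w.write_all(&4u16.to_le_bytes())?; // format version, minor
    w.write_all(&0i32.to_le_bytes())?; // timezone offset, normally 0
    w.write_all(&0u32.to_le_bytes())?; // timestamp accuracy, normally 0
    w.write_all(&snaplen.to_le_bytes())?; // maximum captured length
    w.write_all(&LINKTYPE_ETHERNET.to_le_bytes())
}

/// Write one packet: a 16-byte record header followed by the frame bytes.
/// This sketch assumes the frame was captured whole (no truncation).
fn write_pcap_packet<W: Write>(
    w: &mut W,
    ts_sec: u32,
    ts_usec: u32,
    frame: &[u8],
) -> io::Result<()> {
    let len = frame.len() as u32;
    w.write_all(&ts_sec.to_le_bytes())?;
    w.write_all(&ts_usec.to_le_bytes())?;
    w.write_all(&len.to_le_bytes())?; // bytes stored in the file
    w.write_all(&len.to_le_bytes())?; // bytes originally on the wire
    w.write_all(frame)
}
```

Any `Write` implementor works as the sink, so the same functions can target a `File` for real captures or a `Vec<u8>` in tests.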

~~~
Roxxik
I was just thinking about writing a minimal traffic-analyzer and libpnet looks
way more suitable for this task than libpcap.

And adding the functionality for a pcap like fileformat doesn't seem that
difficult.

The filters are a major pain point. I don't know how libpcap handles this, but
at least it says it won't copy packets from kernel space to user space that
don't match, thus avoiding a lot of overhead. Maybe it's possible to introduce
some rusty kind of filtering in libpnet, too.
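
As a sketch of what such filtering could look like in user space, here is a hand-written check roughly equivalent to the BPF expression `tcp port 3306` (3306 being MySQL's default port), restricted to untagged Ethernet frames carrying IPv4. The function name is hypothetical and this is not libpnet API. Unlike a kernel-side BPF filter, it only runs after the packet has already been copied to user space, so it does not save that overhead:

```rust
/// Returns true if `frame` is an untagged Ethernet frame carrying an
/// IPv4/TCP segment whose source or destination port equals `port`.
/// VLAN tags, IPv6, and IP fragments are not handled in this sketch.
fn is_tcp_port(frame: &[u8], port: u16) -> bool {
    const ETH_LEN: usize = 14; // destination MAC, source MAC, EtherType
    if frame.len() < ETH_LEN + 20 {
        return false; // too short for Ethernet plus a minimal IPv4 header
    }
    // EtherType 0x0800 means IPv4.
    if u16::from_be_bytes([frame[12], frame[13]]) != 0x0800 {
        return false;
    }
    let version = frame[ETH_LEN] >> 4;
    // The header length field counts 32-bit words.
    let ihl = ((frame[ETH_LEN] & 0x0f) as usize) * 4;
    let protocol = frame[ETH_LEN + 9];
    if version != 4 || ihl < 20 || protocol != 6 {
        return false; // not IPv4, malformed header, or not TCP
    }
    let tcp = ETH_LEN + ihl;
    if frame.len() < tcp + 4 {
        return false; // TCP ports are missing
    }
    let src = u16::from_be_bytes([frame[tcp], frame[tcp + 1]]);
    let dst = u16::from_be_bytes([frame[tcp + 2], frame[tcp + 3]]);
    src == port || dst == port
}
```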

Going to log into Github now and see if I can do something.

EDIT: fixed spelling

~~~
ffk
If you want to avoid libpnet or libpcap, you can use socket and recv directly.

Here's a quick example demonstrating socket & recv capturing all packets on
all interfaces.

[https://gist.github.com/fkautz/0104084fd79cee5608d8e3fc6e729...](https://gist.github.com/fkautz/0104084fd79cee5608d8e3fc6e729a0f)

------
Roxxik
> To run Spyglass, you need extra permissions above that of a normal user in
> order to capture network traffic at the data-link layer, below IP, and
> without having to alter or interfere with the regular data flow between the
> client app and database servers. We recommend running it using “sudo.”

Wouldn't it be better to use some kind of privilege separation? I think there
is a reason Wireshark does this... And even saying Rust is a safe language
won't save you from programming errors; it just makes them more difficult to
make.

~~~
placrosse
Thank you for the suggestion. Spyglass went from concept to a working product
which met the project goals, in a little over 5 weeks.

And you're correct, it won't save you from all programming errors. It does,
however, make it far more difficult to accidentally encounter whole classes of
them which constitute, on average, quite a high percentage of debug time in
other systems languages.

------
scurvy
Why are you not encrypting your MySQL connections with SSL? If you're in the
cloud, you absolutely should be encrypting. Even if you're in your own colos,
you should be encrypting (in case of inter-colo queries). Seriously, why
aren't you encrypting this traffic? Query intelligence isn't a valid excuse.
Turn on query logs instead. Percona has shown that the logging impact is very
minimal (even if the link is 7 years old now) [0].

[0] [https://www.percona.com/blog/2009/02/10/impact-of-logging-on...](https://www.percona.com/blog/2009/02/10/impact-of-logging-on-mysql%E2%80%99s-performance/)

------
gtirloni
I hope to one day understand how a post, with 4 points only, by a newly
created account, gets promoted to the front page.

~~~
dang
Please don't post comments like this. If you're worried about voting on a
story, send an email to hn@ycombinator.com and we'll look into it. (In this
case, the voting looks largely legit. Rust is popular on HN these days, so
that may be why.)

Oh, and nothing is wrong about posts by new accounts making the front page. We
welcome new users!

