
Improving testing by using real traffic from production - LeonidBugaev
http://leonsbox.com/blog/2013/06/04/improving-testing-by-using-real-traffic-from-production/
======
lazyjones
Good idea to use tcpdump in order to record the requests as they happen (not
as they're processed, i.e. from the logs, as other replay tools do), but
perhaps using it in promiscuous mode is unnecessary if the listener runs on
the production servers. Oh, and bonus points for using Go.

~~~
LeonidBugaev
Thanks. I was thinking of using raw sockets instead of tcpdump, but decided to
leave that for further releases.

------
lawl
This is very cool. I could have used this in my last project, where in our
load testing with JMeter everything was fine and in production our nodes
would fly out of the cluster like flies. I haven't looked deeply at it but I
think session support would be very important, so that you don't just forward
the requests but take care to maybe rewrite the session cookie to the
server-assigned one and, when rate-limiting, prefer to keep existing user
sessions. From my experience the problems we had in production were often
caused by some weird steps users took, which you might not catch if you don't
make sure to keep the users' sessions exactly in sync.
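
One way to read "prefer to keep existing user sessions" when only part of the
traffic is forwarded: decide per session instead of per request, so a session
is either replayed completely or not at all. The following is a minimal sketch
of that idea, not something the tool is known to do; keepSession and the
percent parameter are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// keepSession reports whether every request of the given session should be
// replayed when only `percent` (0 to 100) of the traffic is forwarded.
// The decision depends only on the session ID, so it is the same for every
// request in that session. (Hypothetical helper for illustration.)
func keepSession(sessionID string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(sessionID)) // Write on a hash.Hash never returns an error
	return h.Sum32()%100 < percent
}

func main() {
	// The session IDs are made up.
	for _, id := range []string{"alice", "bob", "carol"} {
		fmt.Println(id, keepSession(id, 50))
	}
}
```

Hashing keeps the choice stable without storing any per-session state, at the
cost of not being able to favour sessions that are already in progress.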

~~~
LeonidBugaev
Yeah, this version lacks smart rate limiting. But I want to implement
"buffering", so requests will be sent in the same order, without loss, from a
buffer. Something like video streaming. Session loss in this case will happen
only when I need to refill the buffer.
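
A rough sketch of what such a buffer could look like, assuming a bounded
first-in-first-out queue; replayBuffer, offer and drain are hypothetical names,
and dropping requests while the buffer is full is only one possible reading of
the "re-fill" case.

```go
package main

import "fmt"

// replayBuffer is a hypothetical bounded FIFO of captured requests.
// Requests are represented as plain strings to keep the sketch short.
type replayBuffer struct {
	ch chan string
}

func newReplayBuffer(size int) *replayBuffer {
	return &replayBuffer{ch: make(chan string, size)}
}

// offer stores a request without blocking the capture side.
// It returns false when the buffer is full, which is where a gap
// (and therefore a broken session) would appear.
func (b *replayBuffer) offer(req string) bool {
	select {
	case b.ch <- req:
		return true
	default:
		return false
	}
}

// drain returns up to n buffered requests in arrival order, without blocking.
func (b *replayBuffer) drain(n int) []string {
	out := make([]string, 0, n)
	for len(out) < n {
		select {
		case r := <-b.ch:
			out = append(out, r)
		default:
			return out
		}
	}
	return out
}

func main() {
	b := newReplayBuffer(2)
	fmt.Println(b.offer("GET /a"), b.offer("GET /b"), b.offer("GET /c"))
	fmt.Println(b.drain(5))
}
```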

~~~
lawl
Or when the session times out on the server side because it's rate limited.
I'm just talking about a use case I would have had. We had a session timeout
of 10 minutes there. But I realize this might be an edge case.

Keep up the good work.

------
peterwwillis
Does this tool include features not found in tcpreplay?

------
sparrovv_
You wrote that staging should be the same as production env (and I agree), so
how does it help with catching new errors if they were already caught and
reported on production?

Of course we could run that against our test environment, but then I'd be
afraid of getting too many errors that were caused by the changes we want to
release.

So is there something obvious that I'm missing?

------
kohlerm
Is your application completely stateless? I'm asking because load test tools
that are based on HTTP replaying always support rewriting rules, because in
stateful web applications the server may pass a cookie (or something similar)
to the browser, which is then used in the next requests.
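
As an illustration of such a rewriting rule, here is a minimal sketch that
remembers which session cookie the replay target issued for a recorded session
and swaps it into later requests. The cookie name session_id, the sessionMap
type and the example values are all assumptions made for the sketch.

```go
package main

import (
	"fmt"
	"net/http"
)

// sessionCookie is a hypothetical cookie name; real applications differ.
const sessionCookie = "session_id"

// sessionMap maps a session ID seen in recorded traffic to the ID the
// replay target issued for the same user.
type sessionMap map[string]string

// learn stores the target's Set-Cookie value for a recorded session ID.
func (m sessionMap) learn(recordedID string, resp *http.Response) {
	for _, c := range resp.Cookies() {
		if c.Name == sessionCookie {
			m[recordedID] = c.Value
		}
	}
}

// rewrite swaps the recorded session cookie for the mapped one, if known.
// All other cookies are passed through unchanged.
func (m sessionMap) rewrite(req *http.Request) {
	cookies := req.Cookies()
	req.Header.Del("Cookie")
	for _, c := range cookies {
		if id, ok := m[c.Value]; ok && c.Name == sessionCookie {
			c.Value = id
		}
		req.AddCookie(c)
	}
}

// exampleRewrite runs one learn/rewrite round trip on made-up values and
// returns the resulting Cookie header.
func exampleRewrite() string {
	m := sessionMap{}
	resp := &http.Response{Header: http.Header{}}
	resp.Header.Add("Set-Cookie", "session_id=new456; Path=/")
	m.learn("old123", resp)

	req, err := http.NewRequest("GET", "http://staging.example/cart", nil)
	if err != nil {
		return ""
	}
	req.AddCookie(&http.Cookie{Name: sessionCookie, Value: "old123"})
	req.AddCookie(&http.Cookie{Name: "theme", Value: "dark"})
	m.rewrite(req)
	return req.Header.Get("Cookie")
}

func main() {
	fmt.Println(exampleRewrite())
}
```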

------
jrochkind1
Nice! This is something I've wanted to do for a while, but whenever I started
down the path of hacking together my own tool, I ended up giving up after a
bit of work, realizing it was a bit more complicated to do well than I
anticipated and I didn't have time to do it right.

------
jondot
Great tool. I was just about to do such a thing - thanks! I actually planned
on a reverse proxy in Go, but seeing now that you've implemented it with
tcpdump - any insights on the two ways to do this?

~~~
a-priori
I think tcpdump is the better solution since it's less intrusive into your
production environment. Even if the tool suddenly crashes, it will not
interrupt the flow of requests to your servers.

Something similar, like using libnetfilter_queue, would also be good.

------
psychometry
This wouldn't be effective if any of the changes you're trying to test
involve changes to URLs.
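
One possible mitigation, not something the post describes, would be to map old
paths to new ones before replaying. A minimal sketch, assuming regexp-based
rules; rewriteRule, rewritePath and the routes are invented for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// rewriteRule is a hypothetical mapping from an old URL pattern to a new one.
type rewriteRule struct {
	pattern *regexp.Regexp
	replace string
}

// rewritePath applies the first matching rule to the path.
// Paths that match no rule are returned unchanged.
func rewritePath(path string, rules []rewriteRule) string {
	for _, r := range rules {
		if r.pattern.MatchString(path) {
			return r.pattern.ReplaceAllString(path, r.replace)
		}
	}
	return path
}

// exampleRules returns a made-up rule set: /users/<id> moved to
// /v2/accounts/<id> in the release under test.
func exampleRules() []rewriteRule {
	return []rewriteRule{
		{regexp.MustCompile(`^/users/(\d+)$`), "/v2/accounts/$1"},
	}
}

func main() {
	fmt.Println(rewritePath("/users/42", exampleRules()))
	fmt.Println(rewritePath("/health", exampleRules()))
}
```

This only covers renames that can be expressed as a pattern; requests whose
parameters or semantics changed would still not replay meaningfully.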

------
rocco
OT: Any idea what software was used for the diagram? (It seems to be embedded
with JavaScript.)

~~~
LeonidBugaev
<https://www.draw.io/>

------
Swizec
Is it just me or is the site down?

~~~
LeonidBugaev
Looks like it's just you: <http://www.downforeveryoneorjustme.com/leonsbox.com>

The site is hosted on GitHub Pages.

------
aetimmes
s/Chief/Chef/g. :)

