Ask HN: How do you continuously monitor web logs for hack attempts?
477 points by sandGorgon 4 months ago | 224 comments
What is the generally accepted best practice for monitoring web logs for anomalous accesses?

Do you guys just throw Cloudflare in front and forget about it? Or do you have engineers who work like data scientists and eyeball the logs?

I have heard suggestions of using a firewall - but I'm genuinely curious how security-focused companies discover things like "oh, we got attacked by Bitcoin miners from North Korea". Are there sophisticated tools that do this for you, or is there a generally accepted practice that has evolved for even regular engineers to do this?

P.S. I'm personally more interested in an API-focused answer, but I guess the same thing applies for websites.

The answer isn't quite "don't bother" but it's definitely not "install these tools and get started".

Most of the things you can easily set up to watch for security events are looking for problems you simply shouldn't have.

For instance: you can set up fail2ban, sure. But what's it doing for you? If you have password SSH authentication enabled anywhere, you're already playing to lose, and logging and reactive blocking isn't really going to help you. Don't scan your logs for this problem; scan your configurations and make sure the brute-force attack simply can't work.
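For the SSH case specifically, the fix is a few lines of server config rather than any log tooling - a minimal sketch (check your distro's defaults before copying):

```
# /etc/ssh/sshd_config
PasswordAuthentication no
ChallengeResponseAuthentication no
PermitRootLogin prohibit-password
```

Once passwords are off entirely, the brute-force attack fail2ban watches for has nothing to brute-force.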

The same goes for most of the stuff shrink-wrap tools look for in web logs. OSSEC isn't bad, but the things you're going to light up on with OSSEC out of the box all mean something went so wrong that you got owned up.

Same with URL regexes. You can set up log detection for people hitting your admin interfaces. But then you have to ask: why is your admin interface available on routable IPs to begin with?

What you want is an effectively endless sink for developer logs (Splunk is fine but expensive, ELK is fine but complicated to manage), and you want to add structured instrumenting logs to your applications in all the security-sensitive places, like authorization checks (which should rarely fail), and especially a very solid, very long-term audit trail of admin-like actions, so you have some prayer of tracking insider misuse. You could instrument your SQL queries and catch any query failures --- which should also be rare. That kind of stuff. I don't think there's a good shrink-wrap package that does that.
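As a minimal sketch of what "structured instrumenting logs in security-sensitive places" can mean in practice (Python here; the event names and fields are purely illustrative, not any standard schema):

```python
import json
import logging
import sys

# Emit machine-parseable audit records to stdout; in a real deployment
# this stream would feed your log sink (Splunk, ELK, etc.).
logging.basicConfig(stream=sys.stdout, level=logging.INFO, format="%(message)s")
audit = logging.getLogger("audit")

def check_access(user, resource, action, allowed):
    """Wrap an authorization decision and log every denial.

    Authorization checks should rarely fail during normal use, so each
    'authz_denied' event is individually worth looking at.
    """
    if not allowed:
        audit.warning(json.dumps({
            "event": "authz_denied",
            "user": user,
            "resource": resource,
            "action": action,
        }))
    return allowed
```

The point is that the application itself knows which failures are anomalous; a shrink-wrap log scanner doesn't.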

But really: focus on removing attack surface, rather than continuously monitoring it.

I came here to add an important detail about WAFs that's missing in this thread. For context, I am the original author of ModSecurity (but haven't worked on it since at least 2009). I wrote ModSecurity for three use cases that still make a lot of sense; they are: 1) full HTTP transaction logging (meaning request and response bodies), 2) rule-based analysis of HTTP traffic (largely to trigger logging and otherwise highlight anomalies), and 3) virtual patching.

The first two use cases are all about traffic visibility, which has been, and remains a big problem. If we're talking about improving security, the third use case—virtual patching—is the exciting part. The story goes something like this: you discover a vulnerability in your code but you can't fix it immediately, for whatever reason (take your pick). In practice, it may be weeks until the problem is fixed. However, if you have ModSecurity installed, you can write a virtual patch for your problem, often to fix it 100% reliably.

However, it's very important to understand that virtual patching doesn't mean just slapping together some rule that appears to be doing the job. Blacklisting doesn't work. To properly virtual-patch, you have to fully understand the vulnerability and the entry point to the application. You have to carefully whitelist all parameters in that particular location. In other words, you have to almost replicate the work that should have been done by the developer.
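To make that concrete, a whitelist-style virtual patch might look something like this in the ModSecurity v2 rule language (the endpoint and parameter names are purely illustrative):

```apache
# Hypothetical virtual patch: on /account/view, the "id" parameter
# must be a short decimal number; anything else is refused.
SecRule REQUEST_URI "@beginsWith /account/view" \
    "id:10001,phase:2,deny,status:403,log,msg:'virtual patch: invalid id',chain"
    SecRule ARGS:id "!@rx ^[0-9]{1,10}$"
```

Note that this encodes positive knowledge of what the input must look like, not a blacklist of known-bad strings.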

Even in this context, it's not all rosy. Virtual patching requires a great deal of expertise to do properly. It's almost always better to spend your time fixing the actual problem. Sometimes, however, that's not possible to do in a reasonable time frame, and that's when a WAF (according to my definition), can be a life saver.

Thanks for this. I admire ModSecurity! But the subtext of my comment is the needs of typical HN people, most of whom manage applications they wrote, for small companies. Overwhelmingly, those people should spend any time they would spend on things like ModSecurity instead shoring up the security of their code.

TLDR: Ivan, thank you for everything you've provided to the IT community.

- - - -

There's few people I am more pleased to see commenting here on a topic like Sec more than yourself Ivan, your work's exceptional. And the tools you've provided like Hardenize and SSL Labs have saved me countless hours.

Your books have been an immense source of knowledge for me (and I'm relatively a newcomer by my colleagues standards).

Really pleased to see you here on HN. Though, I'm not entirely surprised. The best of the best seem to haunt this place.

The parent posted what is pretty much the right answer, but to add:

* Your logs should be there to help you find out what happened in the event of a compromise, not to detect it in progress.

What OP should be monitoring is the activity that occurs on the system - logins, processes spawned (e.g. php or mysql suddenly spawning shells for no reason), changes to content, resource volume spikes and as the parent said, configuration changes.

Much of this can be done with a variety of tools but there isn't really a one-size fits all approach. I'd start with resource spikes as it's easiest and work up to an OSSEC configuration reflecting the actual environment.

Also by way of middle ground on the log front, Graylog2 is a happy medium between Splunk and ELK, but it's better to log fewer (but more useful) things than a load of noise.

> What OP should be monitoring is the activity that occurs on the system - logins, processes spawned (e.g. php or mysql suddenly spawning shells for no reason),

In my experience, monitoring this takes about as much effort as adding a new apparmor / selinux / firejail / ... rule which would block it in the first place. I mean, with processes spawned we're almost at auditd level already.

>logs are not there.... to detect an attack in progress

I vehemently disagree with this assertion. I also disagree with tptacek's assertion that any admin interface should be inaccessible from the public Internet. That's just not how a lot of small businesses, non-profits, etc. operate. 2FA? Sure. Geo-fencing? Keeps out noise. Basic auth / some front-end to minimize app-specific attack surface? Perhaps.

Saying "you shouldn't look at logs to detect anomalous activity before the fact" is awful. So is "if you have an admin interface available publicly, you're screwed".

My saying is "Security is a spectrum". A bank should not have their firewall admin interface publicly available. A community newsletter blog would probably be alright.

I don't know what the maturity cutoff you're using here is, but I'll draw my line at "deploying on AWS". If you're on AWS, you have really no business exposing an admin interface on a routable address. Deploy it on an internal address and have admins use a VPN to reach it.

There are organizations on AWS that have publicly-routable admin interfaces. They should all be working on plans to migrate away from that, and, in the meantime, they should be extra careful about authentication and monitoring.

If you're too small to have a non-routable admin interface, you are definitely too small to pay for security products.

Mixing discussions between people makes it harder to respond to your comments. It's better to have one comment aimed at me and one aimed at tptacek.

As for the content of your comment, it's hard to distinguish what's aimed at me and what's aimed at him, so I'll assume everything aside from your first sentence is aimed at him.

> But really: focus on removing attack surface, rather than continuously monitoring it.

You buried the lede on this one.

This applies if you are developing your own software.

If you do security for an enterprise with dozens or hundreds of apps that you don't develop yourself, you might not have the luxury of only trying to reduce attack surface, or instrumenting 3rd-party supplied code.

Yes, that's true.

I use some custom fail2ban rules on my web servers, for example too many hits on url X in a 10 minute window = firewalled for a day. Helps stop brute forcing on web apps e.g. Wordpress and can cut down massively on the traffic the server is having to deal with.
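That kind of jail is a filter plus a jail stanza - a sketch with illustrative file names, log paths, and regex (fail2ban's `<HOST>` tag marks the IP to ban):

```ini
# /etc/fail2ban/filter.d/wp-bruteforce.conf
[Definition]
failregex = ^<HOST> .* "POST /wp-login\.php

# /etc/fail2ban/jail.local
[wp-bruteforce]
enabled  = true
port     = http,https
logpath  = /var/log/apache2/access.log
findtime = 600     ; too many hits within a 10-minute window...
maxretry = 20
bantime  = 86400   ; ...gets the IP firewalled for a day
```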

I also have some mod_rewrite rules in place to 403 some URLs that bots currently hit, mostly because I don’t want to spend 200ms of PHP time for the bot to discover that its exploit didn’t work.
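Those rewrite rules can be as simple as (the URL patterns here are just examples of common probe targets):

```apache
# Reject common bot probes before PHP is ever invoked
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/(wp-login\.php|xmlrpc\.php|phpmyadmin) [NC]
RewriteRule .* - [F,L]
```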

I accept that when you’re hosting websites for a load of other people you’re going to need backups too; that’s the bottom line.

I don't even give them the courtesy of a 403. That's more information than they deserve. HAproxy severely throttles their requests, at no cost to my web servers, and eventually stops responding entirely.
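In HAProxy terms that can be a per-IP stick-table plus a tarpit rule - a sketch, with the thresholds picked arbitrarily:

```haproxy
frontend web
    bind :80
    # Track per-source-IP HTTP request rate over a 10-second window
    stick-table type ip size 100k expire 10m store http_req_rate(10s)
    http-request track-sc0 src
    # Abusers get held open (tarpit) instead of a helpful error page
    http-request tarpit if { sc_http_req_rate(0) gt 20 }
    default_backend app
```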

Converting our WordPress sites to static sites greatly simplified and improved our infrastructure.

Yeah, I find young, naive devops people implementing security monitoring and stuff like fail2ban, but in the long run it reduces productivity. Stuff like fail2ban can block legitimate traffic (it has blocked me on many occasions when trying to work from home).

When I was young I too used to try to implement detection and banning, but I realised it's a waste of time. Key-only access to ssh, plus making sure you at least subscribe to security updates for all packages, is enough.

For our own app we went the ELK route due to the number of log messages coming from our own application. But yes, as you said, it's a pain to manage. Creating Kibana graphs/views/dashboards for each type of log is also painful.

A good service, I believe, would be someone offering a turnkey/shrink-wrapped managed ELK for self-hosting: not only managing the system side of it, but also offering services to help create custom Kibana graphs.

Just set a short ban time in fail2ban. I have it set to 5 minutes. The aim is just to slow down the kiddies so that my logs don’t fill with a zillion pointless attempts.

And/or just reconnect through another server or VPN.

Yes, but my aim is more to create a molasses pit to slow down the log fill rate rather than to ban anyone for long. If I manage to lock myself out I just get up and get a drink of water.

Using a short ban time keeps the load of running fail2ban to a minimum. I at one point was banning the kiddies/bots for a week and 40% of the server load was chewed up just running fail2ban.

Fail2ban has issues when you have a lot of IPs blocked in memory.

The solution I came up with is to use a custom ban action that adds an IP to the blocked ipset list with a given TTL, but to set fail2ban to think that the IP only has to be banned for 5 seconds or so.

The result is that ipset purges the IP from the list after the TTL expires, but fail2ban's working set is a 5-second rolling window of blocked IPs.

This works great and has been used in production for a couple of years.
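The setup described above can be sketched as a custom action whose unban step is a no-op, so ipset's own TTL does the real expiry (file and set names illustrative):

```ini
# /etc/fail2ban/action.d/ipset-ttl.conf
[Definition]
# the set is matched by: iptables -I INPUT -m set --match-set f2b-ttl src -j DROP
actionstart = ipset create f2b-ttl hash:ip timeout 86400 -exist
actionban   = ipset add f2b-ttl <ip> timeout 86400 -exist
# leave unban empty: fail2ban forgets the IP after its short bantime,
# but ipset keeps blocking until the TTL expires
actionunban =
```

The jail then sets something like `bantime = 5` so fail2ban's in-memory state stays tiny.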

Sounds like a great solution. Have you written about this in any detail?

  > so that my logs don’t fill with a zillion pointless attempts
What about moving SSH to a different port?

I do this, but it is the port scanning that I am tracking. It is hard to find your ssh port if you get banned for port scanning.

When your server is down and you need to ssh in, an arbitrary 5-minute wait is going to frustrate you.

You don't really need fail2ban at all. Just don't log failed password attempts. Key access only, no one is going to get in.

I am not worried about people getting in via ssh, I am worried about some zero-day on some other service.

Do you really need the space or is it just some anxiety issue / ocd issue?

If a sufficiently interested botnet discovers your SSH server, it'll hammer your server with tens of connections a second, from many different IPs. That doesn't take up enough bandwidth to be noticeable, but it can fill up the disk when you're least expecting it, which can cause all sorts of trouble.

Of course there are ways to deal with it, but that's just extra work.

Fill up a terabyte? Doubt it, especially since logs are compressed nightly.

What does your comment have to do with mine? You seriously registered an account just to reply to me? What's going on?


> What you want is an effectively endless sink for developer logs (Splunk is fine but expensive, ELK is fine but complicated to manage

I'm pretty happy with Sumo. A couple of things I wish they'd change, but their core competency is searching logs and metrics and they do that really well. It is paid (there is a free tier but we outgrew it quickly) but it's well worth it IMHO.

> But really: focus on removing attack surface, rather than continuously monitoring it.

So much this. Logs are great for telling you that something bad has already happened and what it was, both for security and for faults. Reducing the attack surface will get you more mileage from a security POV (again, in my opinion). Logs need to be there, and some basic trawling of them is also necessary, but that is purely reactive.

Is there a course (or set of courses), books or other learning material you recommend for web developers to get to a competent level in terms of building, deploying and maintaining secure applications (including infrastructure)?

Edit: updated to be more clear

Get more into DevOps/SRE/Ops/SysAdmin/WordOfTheWeek where building secure infrastructure is part of their job. Writing secure applications is one thing, deploying them on secure infrastructure is another.

Thanks for the tips!

Yeah, I should have been more explicit. When I say build secure applications, I imply the full stack including infrastructure. I updated my comment accordingly.

>focus on removing attack surface, rather than continuously monitoring it.

Amen to that. I stopped caring about these sorts of entries in my HTTP logs when I realized these automated attacks target software packages written for language runtimes I don't even have installed. I'm sure not losing any sleep at night that someone is trying to hit phpMyAdmin or wp-admin on my server that doesn't even have a PHP interpreter.

A while ago I was looking for a very lightweight system to set up a small CMS with no bells & whistles (Pico, Grav etc), but all of them seem to require PHP. Are there any alternatives around which could also dodge most common attacks by not having installed most common interpreters?

Look into static site generators... you basically write your site offline, and run a script to deploy it to a static web server (or mostly static). In the end, a web server that doesn't run any code outside delivering content is the most secure web server. It'll be fast, and (usually) near zero attack surface.

>But then you have to ask: why is your admin interface available on routable IPs to begin with?

You have railed against network isolation as defense in other threads. Locally-scoped IPs, VPNs, and ACLs are all about network isolation. Care to shed some light on the inconsistency?

I did?

Removing attack surface makes it viable to continuously monitor. I don’t think it is an either/or.

Just out of curiosity has anyone with a cryptographically strong ssh password ever been hacked via the password authentication path unless they gave this password to someone else?

I don't know what "cryptographically strong" means in the context of a password, but I do know people with strong SSH passwords that got owned up, yes. Don't use "cryptographically strong" passwords with SSH. Use keys, disable passwords.

I mean a password with enough entropy that also hasn't been used before and is not shared with anyone.

Do you know how they were owned? Is there a write up anywhere as this is a super interesting topic.

> but I do know people with strong SSH passwords that got owned up

By having their password brute forced, or some other means?

So for example, a 16 character random password is still vulnerable?

This is what I would like to know. If strong ssh passwords can be broken then there is a bigger issue than just turn off ssh password log in.

I am pretty sure that 16-character random passwords are safe from dictionary attacks, and I suspect you and I are using the same process for protecting these passwords. tptacek is leery of our process, while you and I are leery of the certs getting out of our control without our knowledge. Security is a handful of trade-offs.

For passwords "cryptographically strong" normally means a password with appropriate complexity and sometimes unpredictability requirements, since entropy is almost nonexistent in passwords.

Something out of pwgen or diceware would do, though. If an application has length requirements that preclude the latter, make sure they are liable if you lose something because they prevented you from using a secure password. (No one can remember a short secure password for more than a small handful of applications.)

It is good advice, but the OP is asking about web log files, not ssh or any other logs. You cannot remove the attack surface when the app/service is served via http.

If you use a service such as Cloudflare, you can restrict the ip range, ensuring that all requests go through that service (which probably has more experience with malicious requests than the website owner).
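With nginx, for example, that restriction is a handful of allow/deny lines; the ranges below are from Cloudflare's published list, which you should fetch and keep current rather than trusting this sketch:

```nginx
# Only accept connections arriving via Cloudflare's edge
allow 173.245.48.0/20;
allow 103.21.244.0/22;
# ...remaining ranges from https://www.cloudflare.com/ips/ ...
deny  all;
```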

You cannot remove the attack surface when the app/service is served via http

Remove? Probably not. There are many options for minimizing the attack surface with http.

Good answer, I only came to add that OSQuery should be included in your tool box. Much easier to query for indicators of a problem, especially at scale, than it is to grep through logs and reconstruct an event. This is still a detective tool that would not likely catch an incident in real time, so the preventative measures listed are still best practice, but it should be included as a tool to fall back on.

Thanks for this. Just removed ssh login from a server.

The suggestion was to only allow logins with ssh keys - not disable ssh logins entirely.

In an ideal world, literally no SSH at all. That's doable in container cattle fleet networks.

In a realistically achievable world for virtually every startup, no routable SSH. No security rule allows 22/tcp from a routable address into a VPC. You want to SSH into a host, you VPN to your network first. This is how most organizations with security teams set up SSH.

If you can't get there, at least get to jump box SSH --- 22/tcp is allowed to exactly one isolated host from the outside world (some people filter source addresses to their jump boxes, I don't bother); to get to real hosts, you bounce through that one.
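The jump-box pattern is a one-liner per host block in OpenSSH client config (ProxyJump, OpenSSH 7.3+); hostnames here are made up:

```
# ~/.ssh/config
Host jump
    HostName jump.example.com
    User ops

# Internal hosts are only reachable by bouncing through the jump box
Host 10.0.*.*
    ProxyJump jump
```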

A network where you can randomly connect to the sshd on real servers is one most security professionals will assume isn't being run competently.

Ja, should have been more clear. Removed password login while maintaining existing keys.

I think you meant “Splunk” :)


this is essential, love the analogy here: https://youtu.be/NUXpz0Dni50?t=14m13s

Security monitoring is useful and part of a defense-in-depth strategy. So is fail2ban and many other things.

"Defense in depth" is just a way security product salespeople get customers to buy more stuff. If you remove attack surface altogether, you don't need more depth of defense.

> "Defense in depth" is just a way security product salespeople get customers to buy more stuff.

"just" that? That's a very bold statement that most of the security community would disagree with.

An example: you don't monitor network traffic on the border because you trust your server not to be compromised. Then one gets compromised and the incident goes undetected due to lack of monitoring.

> If you remove attack surface altogether

That's a big if. Services can have a complex life, with software and configuration changing often. Unless you pull the plug, your attack surface is never going to be zero and, on top of that, there's no guarantee that it stays small. Hence organizations usually have different people taking care of different layers.

Or, you know, install these tools and get started:

   - Nikto (https://cirt.net/nikto2)
   - A comparison of tools from 2017 (http://sectoolmarket.com/price-and-feature-comparison-of-web-application-scanners-unified-list.html)

  Web Application Firewalls:
   - ModSecurity (https://geekflare.com/install-modsecurity-on-nginx/)
   - Enable OWASP rules in ModSecurity (https://geekflare.com/modsecurity-owasp-core-rule-set-nginx/)
   - Vulture (https://www.vultureproject.org/)
   - Pay $500 to have SpiderLabs configure ModSecurity for you (https://ssl.trustwave.com/web-application-firewall)
   - Nginx (https://www.nginx.com/products/nginx-waf/)
   - Azure (https://azure.microsoft.com/en-us/blog/azure-web-application-firewall-waf-generally-available/)
   - AWS (https://aws.amazon.com/waf/)
   - Symantec (https://www.symantec.com/products/web-application-firewall-reverse-proxy)
   - Incapsula (https://www.incapsula.com/website-security/web-application-firewall.html)
   - Smoothwall (holy cow, they're still around? https://us.smoothwall.com/utm/)
   - Shadow daemon (probably defunct, https://shadowd.zecure.org/overview/introduction/)
   - IronBee (same, https://github.com/ironbee/ironbee)

  Hardening guides:
   - Apache (https://www.tecmint.com/apache-security-tips/)
   - Nginx (https://geekflare.com/nginx-webserver-security-hardening-guide/)

Don't do most of this.

Care to elaborate?

You almost certainly don't want to set up a WAF.

Don't use any of those scanners, all of which look almost exclusively for vulnerabilities you don't have. Do look into getting a Burp license for every developer on your team.

By all means, harden Nginx. Don't pay SpiderLabs to do it for you.

There are good burp alternatives too :)

There definitely are! Your stuff included! Burp is just the industry standard.

So is it "fix whatever Burp points out" ? Is that a good first step ? Especially for api endpoints (versus websites)

You'd ideally want a dev process where all your developers were capable of conducting integration test runs with Burp; you'd probably set up a checklist of authz tests, and set up some custom wordlists for Intruder for database, filesystem, and DOM injection vectors.

Thanks. This was a super useful suggestion. I can now Google a lot of stuff on my own.

This is brilliant.

Thanks !

A WAF will stop a lot of the stupid junk that a kid with a "how 2 hax0r" book can make work a disturbing amount of the time.

If any of that stuff is actually working on your site, not having a WAF is the least of your problems. Don't waste time with this stuff.

I assume you are writing in an attempt to persuade.

Your message is being lost/ignored as a result of your delivery.

We want to understand your message, but you aren't giving enough background. I don't trust your message because there is no supporting background. If you're going to tell me to not do something, either:

A) provide a compelling alternative (not doing $it is not a compelling alternative to $it in nearly all cases), or

B) explain in detail why you shouldn't do $it, and provide more details if asked.

I want to believe what you are saying and be convinced by it, please help me. (Note: I'm not trying to be a dick, I've just known a lot of people who are perpetually ignored because of this communication style and take forever to learn that lesson because no one bothers to say anything)

I think tptacek's implicit point is this: all the problems you're trying to address with the tools listed are the problems you shouldn't be having in the first place, so, to quote, "focus on removing attack surface, rather than continuously monitoring it".

In this instance, I'm just interested in getting the right answer written down and moving on.

tptacek is using an appeal to authority given his standing as a security expert on this forum. That's the only reason his comment containing no justification is being taken seriously.

It is also being taken seriously because 1) he is correct, and 2) those of us who do this for a living are literally sick to fucking death of having to explain to management, developers, and sysadmins that there is no silver bullet that you can buy that will fix the mistakes you should have cleaned up five years ago. You still have to do the work. If you need to hear this ("reduce attack surface rather than spend time/money on monitoring attack surface") more than once you are beyond help and it is time for us to move on to people for whom there is at least a possibility that they won't be p0wned in the next 12 months.

This isn't a religious/esoteric forum. Technical statements should be backed by proper reasons, regardless of who is stating them. If you are sick of giving technical reasons, then why bother posting on HN?

You're relatively new, but after a while, you get to learn the credibility of some HN'ers by their usernames. Often, you can click their username to find previous comments from the same user where they expounded on the technical details, particularly for subjects that come up here a lot. I don't blame them for not wanting to go into detail each and every time, especially when their reputation gives their comments credibility in place of details.

I'd also like to add that you may want to work on the tone of your comments if you'd like to get enough karma to really participate in the community. Your comment history seems a bit confrontational and argumentative. There are better, friendlier ways to make the same points. I'll defer to the guidelines[0] to explain how you may improve your communication skills.


I'm quite aware of the credibility of some HN users. That doesn't give them the right to give arrogant or non-argument-backed answers. Besides that, there is no reason to base your argument on the age of my account.

You may have missed my edit to my previous comment regarding the tone of your comments. I suggest you go back and check it out. (Yet another sign of the age of your account, which is most definitely relevant.)


Since you're relatively new here, you may not be aware that there are levels of karma that unlock new ways to interact with the site. I'll omit the technical details, since they're readily available with a little research.

I hope your day gets better.

No, not because I'm relatively new here, but because I don't put much value in the karma system and what it offers me, as I wrote before. But thank you, and no offense! Hope you have a better day, too!

Ignore the karma system then. How's your current approach working out for you?

Because there are people who are willing to believe that something is good or bad without needing to understand the rationale behind it? Most people making the case for WAFs aren't posting detailed rationales of why WAFs are a good way to spend your resources either.

Example, same author: Cryptographic Right Answers.

(Disclaimer: I work with tptacek.)


You are being extremely rude. Nobody on this thread has been as personally abrasive as you have. If you have a technical argument to make, make it; nobody is going to call you names for doing that. Technical arguments, even those you don't feel are adequately supported with evidence, are not insults. But "whataboutism" and "appeals to authority" and "religion" and "arrogance" are insults, and speak to motives and intents, which are things you have no business bringing into this discussion. Stop.

My posts aren't meant to be rude. I feel very sorry if I hurt your or others' feelings. I'm not making a point about your general credibility and technical statements - quite the opposite - but some of your answers today just seemed ignorant to me (and probably to some others). If you're giving a simple statement without background (just based on your opinion) and you're asked, in a friendly way, for more information, then IMO you should either not answer or give back more information, but not just continue giving answers without background. That's all, and it's not meant to be rude. Just my opinion on your way of posting today, as others have given me theirs.

No one means to be rude. Rudeness is the byproduct of a lack of self awareness and concern for social mores. Gp (and others in this ridiculous subthread) are trying to give you pointers that will improve the quality of your interactions with this community, which is ironic given the comments that sparked this discussion.

Nope. I upvoted it before even seeing who wrote it. The advice is absolutely spot on. Harden your servers, don't just install a load of tools that look for intrusions after-the-fact and call it a day. I'm not sure why some people aren't getting it.

It's not that he owes a justification or an explanation, or that readers can legitimately feel entitled to one. Paid support is another matter.

If you think I'm wrong, rebut.

The fan boys came out to rep you, but they're wrong.

First of all - it takes less than an hour to set up a WAF, and it doesn't worsen your security position. Not only is it free, it's fast and easy and will demonstrably be better than no security at all.

Saying "don't put up a firewall because it isn't perfect" is stupid. You don't have to be a security guru to get that, which obviously several people have, or they wouldn't all be giving you crap about it.

Securing your application long term is a complex, expensive, time consuming process. Suggesting that spending a couple months to be "serious about security" is somehow better than first taking one hour to do the most bare minimum effort is ridiculous.

If I have a computer on the internet, I put a firewall up. Is it defense in depth? No. But that doesn't mean it won't stop the most basic, most common attacks, and it takes no time and no skill to apply it.

I don't know what weird thing you have against WAFs other than apparently they aren't cool or sophisticated enough, but 1) they remove attack surface, 2) they filter the most common drive-bys, 3) they can be both passive and active, 4) they're free, 5) they're easy, 6) they're quick to deploy.

How is it 2018 and I have to explain to you of all people that the web equivalent of iptables might be a good idea if you have zero security?

> WAFs (...) 1) they remove attack surface

They do what you configure them to do. Without knowledge of what's allowed and what's not, they're going to have either false positives or false negatives. And if you have that knowledge, you can fix those issues in the web app rather than hide them with the WAF. A lot of preconfigured WAFs will remove specific matches you don't care about anyway rather than attack surface.

> 2) they filter the most common drive-bys

Common as in known proxy or webapp exploits? If that's the case, why not patch them? (There's a reason to use WAF if you can't patch but that costs time/money to get right and test)

> 3) they can be both passive and active

I'm not even sure what passive WAF means... just logging?

> 4) they're free

Only for very small traffic.

> 5) they're easy 6) they're quick to deploy.

See point 1. If you can't put the right restrictions in the app, why do you think you can put them in a WAF? If they're not in the app in the first place, why do you think this way is quicker? It's not a magic box that will know your requirements. You need a proper setup, or the only thing you achieve is removing obviously silly cgi-bin requests from your log files.

I'm not even against WAFs. I just think the gain from them is between none and minimal if you reduce it to "turn it on, it's cheap and easy". It's really similar to iptables - you can turn it on, but why are you doing it? (On a server, anyway.) If you have a specific reason/route to protect - great. If not, make sure you bind non-public services to local interfaces and you don't need iptables.

I'm going to assume that whenever you say firewall you're talking about a WAF, since you mention "web equivalent of iptables". Disclaimer: I also work with 'tptacek, but opinions my own.

> it takes less than an hour to set up a WAF

Not if you want it to do anything useful. Default WAF rules are mostly useless and come with bypasses -- kinda like XSS filters in browsers, and browsers are in a way better position to judge if XSS is going on or not. You can sort-of get useful WAF rules if you encode most of the invariants of your system in your WAF -- but your app is in a strictly better position to do that sort of validation for you instead.

> and it doesn't worsen your security position.

Yes, except for the shiny new TLS MITM, the pathological parser performance cases, the new attack surface in that WAF itself (hands up if you've seen XSS in a WAF interface) and the continued operational cost of having new infrastructure that sees plaintext... Sure! And then I'm being generous in assuming we're not talking about a virtual appliance or as I like to call them "mystery *nix box you can't patch or audit".

> Securing your application long term is a complex, expensive, time consuming process. Suggesting that spending a couple months to be "serious about security" is somehow better than first taking one hour to do the most bare minimum effort is ridiculous.

Also, unlike WAFs, it's effective.

To be clear: I am not conceding your point about cost. It does not take "months" to identify systemic appsec issues. It also does not take an hour to configure a WAF and have it do something vaguely useful for your application.

Anecdotal evidence: Latacora literally does this systemic-discovery-and-remediation-plan thing for a living as a part of our engagement. "Months" is not our timeline for any kind of systemic appsec issue like SQLi or XSS.

> ... the web equivalent of iptables might be a good idea

Are you really sure you want to stand by that equivalence? How often does one come across iptables bypasses?

> Are you really sure you want to stand by that equivalence?

Yes. Both require some knowledge of how to use the tool and rules, but both are not difficult for even a novice to pick up quickly. Not to mention both can be done by either a dev or a sysadmin in the cloud right now, neither of them require in-depth knowledge of the code, and the scope of the change is simple enough that it can be tested and applied quickly.

> "Months" is not our timeline for any kind of systemic appsec issue like SQLi or XSS.

The discussion was about a dev or sysadmin fixing all this themselves ("taking appsec seriously" etc). Not just running Burp on their website. After they identify the vulns they have to have meetings to discuss fixes, figure out how to patch them and test that it works, go through QA and then roll into production. A fix may involve several systems and most people's deploy process is painful. I've never seen that take less than three weeks, and six is more reasonable. Can you do it in less time? Of course. Does that actually happen? Not without paying a consultancy $300k to hold your hand for two weeks.

It is completely plausible to get a WAF up and running and have it provide real protection immediately. CloudFlare's is probably the best example of this, considering the extensive add-ons they provide based on the intrusions they've seen. I don't like CloudFlare's TLS MITM either, but tens of thousands of their customers don't seem to mind.

Just like iptables, ModSecurity doesn't do anything at first. You have to actually make some decisions. Does that mean you shouldn't spend the extra 15 minutes to read a manual and enable some protection?

Earlier it was suggested that protecting against Drupal or Wordpress vulns is unnecessary. I disagree! There's no reason not to protect against drive-by vulns. Filters for XSS/SQLi, unnecessary HTTP features, malicious bots and scanners, directory traversal, RCE, etc are well worth preventing whenever possible.

As for false positives, they're not as bad as they used to be. There are thorough guides designed to help you identify and remove them from your site. https://www.netnea.com/cms/apache-tutorial-7_including-modse... https://www.netnea.com/cms/apache-tutorial-8_handling-false-... For a sysadmin this may take more time, just like it would if you were using iptables to secure a black box. A dev can change the filters immediately to suit their app.

But even if you don't do any active filtering, the passive filters can at least alert you that something is happening so you can go fix it. This is so standard that Kubernetes includes it as an option in their Nginx Ingress Controller. This guide can get you up and running on your existing K8s install in 10 minutes: https://karlstoney.com/2018/02/23/nginx-ingress-modsecurity-...

Might I suggest a little research, e.g., reading http://cs.unc.edu/~fabian/course_papers/PtacekNewsham98.pdf

This points out how WAF/NIDS is just a speed bump for a serious attacker.

Point by point, you may conclude that: 1) they don't really remove attack surface; 2) it's unclear whether they really filter the common drive-bys, and what value there is if they do; 3) see the paper for detail on the passive/active differences; 4) zero cost to purchase, but what's the cost to maintain, run, and keep up to date? 5) see #4; 6) even if you have a large web farm?

This paper was written 20 years ago.

I already addressed this. The WAF is a temporary solution. It is not targeted at "serious attackers", it is targeted at drive-by scans and generic exploits.

Nobody said "don't put up a firewall". By all means, set up the rules on your VPC so that only 80/443 hits your ALB and nothing else is routable. But don't futz with iptables on your hosts or god help you setting up a separate AMI for some 3rd party firewall.

I don't. I was explaining why a comment lacking justification wasn't being buried.

It's not an appeal to authority when you're really an expert.

Or do you go to the doctor and expect them to justify your treatment down to the last detail? You sound like you'd talk to your doctor about $overpriced_drug.

And still there are doctors who prescribe homeopathy and other ineffective "medicine".

It's just reasonable to ask for rational arguments. That's the whole point of the rants here.

I posted this elsewhere in the thread. To be convinced, read this frightening paper: http://cs.unc.edu/~fabian/course_papers/PtacekNewsham98.pdf

> We want

Please speak for yourself alone. tptacek's comments in the thread are quite clear.

I'm not trying to be a dick

You are succeeding without trying. You're essentially complaining some free advice does not fit your exact expectations and specifications. Which is fine, lots of free advice doesn't. Trying to pass this complaint off as some kind of favour to the person giving you free advice, though, is... less than great.

No, tptacek just doesn't give proper answers here because he thinks he can afford it with his status here. He could have just left links to further links/literature.

There's nothing 'improper' about the answers, you just don't like how they're presented. As I said, that's a totally fair objection. It's just that nobody owes you free advice in the precise format you find acceptable. That's usually a feature of paid advice.

> If any of that stuff is actually working on your site, not having a WAF is the least of your problems.

Is this advice assuming that it's devs who want the WAF, rather than ops or management?

For devs clearly it's not what they should be focusing on. But saying something isn't worth doing because it shouldn't work is a terrible security principle.

I wrote a longer comment here, but decided it wasn't helping.

So, I'm just going to go with: no, if you're really building an application and care about application security, no, you shouldn't waste time setting up a WAF tool designed to help 10,000 employee insurance companies make sure their Wordpress marketing site isn't exposing some vulnerability from 3 years ago.

Not wasting time with dumb stuff like this isn't "terrible security"; every dumb thing you do exacts a cost from good things you could be doing instead.

Hit the nail on the head. Those suites of tools offered for existing vulnerable software is usually written for platforms and configurations that shouldn't be applicable to 99% of companies.

It's like throwing gauze over an open wound. Some companies don't care enough to put thought into healing the wound altogether, whereas some just wrap it in gauze and hope they don't bleed out. And it usually works - at least for a while, until you get big enough/become a lucrative enough target to a persistent hacker.

It isn't a case of "more security tools == better security", it's a case of "why do I need these tools in the first place". If you aren't analyzing attack surfaces and building security into your software development process, I can only hope the code you write isn't accessible externally.

* using 'you' incorrectly here, speaking to those looking to apply these tools to their setup.

To illustrate the principle of removing attack surface with your example of a WP site, instead of parking a WAF in front of it, create a static mirror and move the WP site to your internal network.

> setting up a WAF tool designed to help 10,000 employee insurance companies

What's wrong with that use case though? If you get hired as the new CTO, you're not even going to know all the managers, let alone all the developers.

Clearly auditing all code, doing developer training, etc. is the right long-term approach. But putting a security culture in place and actually getting things right in this kind of environment is a multi-year process.

There's nothing wrong with that use case. I don't love WAFs really in any setting --- I tend to agree with the security nerds I hang out with who call it "the AV of appsec" --- but at least it makes sense in that context.

I'm writing for an audience of people whose businesses own the applications they're deploying. If you control code, you should have at least the security staff required to cut out the extremely low-hanging fruit attack surface that things like WAFs address. Most startups, even with 20+ developers, don't have any full-time security people, and that's fine, because even at the ~20 engineer scale, it's still perfectly feasible to cut out all the WAF-addressed attack surface for your environment.

On rare occasions I'll write advice or give an answer from the perspective of enterprise secops, but unless I say I'm doing that, I'm never really writing from the enterprise perspective. Reddit r/netsec is the right place for that POV.

"Don't invest in a home security system because your windows should be completely shatterproof"

"Don't invest in a home security system before you make sure your house has a door"

Something tells me those taking objection to my comment are missing the point that I am criticizing the parent....

If your counterexample just reinforces the parent's erroneous point that WAFs only protect specific applications with known vulnerabilities, and that not all websites have "doors", then you are missing the point that many WAFs have detectors for fairly generic SQL injection and path traversal techniques that could very well find problems with even custom application endpoints....

I'm one of a whole bunch of reviewers for Black Hat USA, which is the largest vulnerability research conference in the industry. For the past several years, if you were to submit a universal WAF bypass talk, enabling you to evade detection by every WAF on the market for every common attack, there is a really good chance that talk would not be accepted --- too boring. Almost certainly a WAF bypass that evaded all detection by a single WAF vendor wouldn't make the cut.

So, no, not really interested in generic SQL detection rules in WAFs.

More like, "Don't invest in home security, because your most precious valuables are stored in a safety deposit box at a bank, and everything else is covered by home insurance."

A WAF will also add an attack surface, just like anti-virus software before them.

Company X, a content publisher, has a wordpress site with a bunch of plugins. They hear that the security record of such a setup is less than ideal. They buy a WAF solution to protect them.

Company Y, also a content publisher in the same space, realizes that they don't need dynamically generated content. They don't need tons of JS. They start publishing static content and what little dynamic functionality they have is well-compartmentalized, with interactions to the outside world carefully audited.

Company Y could also get a WAF, but why? It increases the attack surface, Company Y probably don't have the time/expertise to audit it, it is not clear what benefits it will have.

In addition to the points discussed..

If you follow the Geekflare link to modsecurity, and then follow the link there to a download page, there's this warning:

>NOTE: Some instabilities in the Nginx add-on have been reported (see the Github issues page for details). Please use the "nginx_refactoring" branch where possible for the most up to date version and stay tuned for the ModSecurity version 4.

That's not something I would suggest throwing in production, given that warning has been there for years.

The post itself describes Shadow daemon as "probably defunct".

Ironbee's last commit was two years ago and I get an nxdomain looking for their website.

AWS WAF is both expensive and extremely manual. You won't get anything out of it without a major labour investment.

Nginx's built in WAF is only in the pro edition, and outside the price of many people.

This whole discussion, however, depends on your customer. I have an agency I support where I have to provide a report on every single hack attempt. That means if my Rails application gets hit with said three-year-old Wordpress exploit, and the web server returns a 404, I have to report on how we "blocked" it. And no, "not applicable" is not an answer.

So we throw money at commercial products. And sometimes I recommend doing so. But like security can involve just being aware of your risks, this use of a WAF is about being aware of the real problem you're solving, which is a paper work based one.

The problem with throwing tools at the thing is you have no idea what it's doing or how it works or IF it works, and it becomes forgotten about. But it's better than nothing in the short term while you figure out how to secure your application. Slap ModSecurity in front of your app while you go over OWASP and managing cloud infrastructure securely.

I assume that I suck at security, everything I have ever built is badly compromised and I'm too dumb to know it. I'm so bad at security that evil criminals actually celebrate my birthday.

Since I'm this bad, I assume that any monitoring tool I could install on the boxes themselves, up to and including all my logs, is compromised. I'll usually still install monitoring tools for compliance, basic diligence and because they make wicked graphs that impress my superiors.

Since I'm that bad, I'd rather not invest 40 hours a week combing through fake logs unless I'm validating that I've taken care of the basic security fundamentals for my threat level.

Rather, I like to make sure that my security fundamentals are taken care of so that I can stop the simple attacks. If my threat/compliance model dictates, from there I'll possibly step things up into a web application firewall (WAF).

But, through all of this, I know that there are people out there who are so much smarter than me that the best I can do is little more than an annoyance for them. So, again, I like to assume that I'm badly compromised. This implies that I'm very careful about what data I keep, and I take efforts to protect my data at rest.

I break into stuff professionally.

There is nothing I love more than the "just throw {Cloudflare,Akamai,Fastly,etc} in front of it" crowd. If you are going to deploy a WAF [1] it _needs_ to be at the edge of your datacenter or cloud environment. No matter what fancy feature your provider wants to sell you to fix this, there is a way to bypass it.

* Make sure you are logging centrally, and logging the right things.

* Make sure someone is looking at the logs. It doesn't have to be a full time person, but it needs to be someone's job.

* Depending on how big you are, hire people to break in on some regular basis. It could be anything from hiring a cheap contractor to run a bunch of automated tools, to full on 10 person adversarial teams. They will find some stuff for you to fix, but more importantly you will know whether the first two points are working based on what you see.

1. https://www.owasp.org/index.php/Web_Application_Firewall

I disagree.

If your WAF is the edge of your datacentre, it will likely be run by some separate team with a generic ruleset, literally MITMing all your connections, but more importantly it will be attackable via your company VPN or some other side channel, like B2B connections, that you probably have no control over.

Your WAF, if you choose to have one, should sit directly in front of your app and have rules tailored to your app. Even those on VPNs, or in the datacentre, should be forced to traverse the app's WAF.

That said, the WAF logic is largely what you're going to do in your input validation layer, so unless you feel you need to double-bag it to mitigate unknown framework or library issues, or are having another person write the rules, you're just repeating work aimlessly.

When some dickhead manager buys an insecure HTTP service they want on the web, you should probably quit. Failing that, put a WAF in front of it.

It is up to you to define where your perimeter is, and put your protections there.

I generally recommend having the WAF live in front of apps, but with a small gap that allows your scanners to still access the app. It is important to still be identifying vulnerabilities that your WAF might block, because they aren't perfect.

> Make sure you are logging centrally, and logging the right things.

Could you please elaborate on what some of the right things generally are?

> Depending on how big you are, hire people to break in on some regular basis.

Just curious, how is this covered from a legal standpoint? Do you ask clients to sign some consent document to release you from liability for damage from hacking by automated tools?

For work I am doing, most of the "automated tools" are things I wrote myself and I know what they do. It is up to the engineers working on the engagement to exercise care and avoid causing disruptions to the target or doing things they can't reverse easily.

I like to remind people that having someone on retainer that breaks your stuff with an automated tool is better than having some random guy on the internet break your site with the same tool. Either way it is eventually going to happen, but at least I will immediately notice and contact the right folks to help get it fixed.

> to full on 10 person adversarial teams

Okay, this sounds incredibly interesting. Are there any open source groups that do this "for free", sorta? Not asking to get free work - asking so I could possibly jump in and learn.

If you're interested in learning the kind of particular skills used, there's a whole subculture in computer security having to do with Capture the Flags (CTFs), for a rather broad concept of "flag".

https://microcorruption.com/ is a quite a good one to get started with. (Because embedded systems are smaller and do less, they are easier to reason about, which makes it easier to see vulnerabilities.)

See also: https://www.reddit.com/r/securityCTF/

More broadly, the (with-permission) attackers are known as "the Red Team", and the concept is used against computer networks as well as in literal war-games by the US government.

There are websites that let you practice. hackthebox.eu is really popular. And there is a bit of a test to even get access to that.

Wow! Thanks for this! I "signed up" and it looks very interesting so far

Take the OSCP course, it is relatively cheap for the value of the content and it is one of the few certs in security that is respected.

From your profile it looks like you work for a big company. Ping some people in security and find out who runs your "Red Team" (if you have one). Make friends with them.

We call it hacking forums/teams. The main motivation is some ideological/political belief, though.

imho try to hack yourself first

Speaking of "Hack Yourself First", there's this completely non-ideological (though somewhat commercial - it advertises the creator's Pluralsight course) website where you can learn and practice intro hacking skills:


In short - security is such a big and multifaceted topic that it’s hard to know where to start, but I think that learning Kali Linux tools would cover a lot.

I started working at a cybersecurity company - our product is ridiculously stupid and I try to work as little as possible. In the time I'm not working I try to at least learn stuff, and since I'm at a cybersecurity place I sometimes try to learn about cybersecurity. It seems that even though for-profit security companies are making crazy dough, a lot of open source solutions out there do the same thing.

There are so many attacks - they can be anywhere from the application level down to physical devices like routers having some kind of exploit. For any type of attack there are one or more good open source tools that can be used to execute or defend (depending on what color hat you have), and a ton of them come standard on Kali Linux. But before you learn these you have to understand networks. As a Python/Django dev I didn't really have a deep knowledge of this, and still don't, nor have I done more than dabble with any of these tools.

But I know that there are a lot of videos on how to use them, including this 15-hour (!) one: https://m.youtube.com/watch?t=1177s&v=vg9cNFPQFqM - if you were to go through that you'd probably be a security wizard.

Thanks for the youtube link. Seems interesting. I am going to watch it in full.

We also use Fail2Ban like a poor man's Web Application Firewall - for a Rails app you know you'll never have legit traffic to anything 'cgi' or 'php' or 'exe', so crafting a couple of regexps to match those and then ban them works well.
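As a sketch of what that can look like - the filenames, log path and thresholds below are assumptions, and this presumes a standard nginx/Apache combined access-log format:

```ini
# Hypothetical filter: /etc/fail2ban/filter.d/rails-noscript.local
# Matches requests for script extensions a Rails app will never serve.
[Definition]
failregex = ^<HOST> .* "(?:GET|POST|HEAD) \S*\.(?:php|cgi|exe)\b
ignoreregex =

# Hypothetical jail section, e.g. in /etc/fail2ban/jail.local
[rails-noscript]
enabled  = true
port     = http,https
filter   = rails-noscript
logpath  = /var/log/nginx/access.log
maxretry = 1
bantime  = 1800
```

The filter on its own does nothing until a jail references it; maxretry = 1 bans on the first hit, which is only sane because legitimate traffic should never match.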

We use CloudFlare so we feed the IP bans into CloudFlare's firewall through their API as well as local IPTables. (We also use CloudFlare's WAF - we like redundancy in our paranoia.) If you had more demanding requirements you could look at mod_security for Apache or Nginx.
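One way to push a ban into CloudFlare is their v4 API's IP Access Rules endpoint. A minimal Python sketch, hedged: the endpoint shape is from CloudFlare's public API, but the zone ID, token, and example IP are placeholders you'd supply from your own account:

```python
import json
import urllib.request

API = "https://api.cloudflare.com/client/v4"

def build_block_rule(ip: str, note: str = "banned by fail2ban") -> dict:
    """Payload for CloudFlare's IP Access Rules endpoint."""
    return {
        "mode": "block",
        "configuration": {"target": "ip", "value": ip},
        "notes": note,
    }

def ban_ip(zone_id: str, token: str, ip: str) -> dict:
    """POST the ban to CloudFlare for one zone; returns the parsed response."""
    req = urllib.request.Request(
        f"{API}/zones/{zone_id}/firewall/access_rules/rules",
        data=json.dumps(build_block_rule(ip)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A fail2ban action script can call ban_ip with the banned address; the same payload builder makes the later purge script straightforward to write.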

We also implement rate-limiting using Fail2Ban, which helps to trip the dumber automated scans.

Every time Fail2Ban blocks an IP address, it emails our admin team with the matching log entries and a whois on the IP address, which we then eyeball to make sure there's nothing weird going on.

TBH, easily 90% of the IP addresses we block got caught looking for 'wp-admin/admin.php', which is probably a good reason to never use WordPress.

Fail2ban is, for all practical purposes, resource-depletion protection against bots that pose no threat but still generate work for the server. For this purpose you do sacrifice a bit of I/O performance, but hopefully won't get those spikes that bring things down.

It is not a replacement for the actual security of keeping things updated and having good passwords/client certificates.

Just be aware when crafting those regexes that user-generated URLs (e.g. slug-based content paths, or even just query-string params if one is very careless) can thereby be used to build a denial of service attack against internal infrastructure, third-party integrations/webhooks, and even regular customers.
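A small illustration of that failure mode, with a hypothetical ban pattern: a user-generated slug that merely contains ".php" trips a naive regex, while anchoring on the extension at the end of the path avoids that particular false positive. (This doesn't solve the whole problem - an attacker who controls slugs can still craft a path ending in ".php" - but it narrows the window considerably.)

```python
import re

# Naive ban pattern: trips on ".php" anywhere in the request line.
naive = re.compile(r'\.php')
# Safer: only match a script *extension* at the end of the path
# (followed by a query string, whitespace, or end of string).
anchored = re.compile(r'\.(?:php|cgi|exe)(?:\?|\s|$)')

# A legitimate, user-generated slug that merely mentions ".php":
legit = 'GET /blog/migrating-away-from.php-frameworks?page=2 HTTP/1.1'
# An actual drive-by probe:
probe = 'GET /wp-admin/admin.php HTTP/1.1'

assert naive.search(legit)         # innocent visitor would get banned
assert not anchored.search(legit)  # anchored version lets them through
assert anchored.search(probe)      # ...while still catching the probe
```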

Reminds me that back in the day we used to be able to train email sanitizing appliances for false positives by sending them EICAR or GTUBE with spoofed headers.

We have definitely run into this, albeit only very very rarely. It's not viable for someone to build a usable DoS against us as an attack: our offices are on a whitelist and our set up prevents a viable attack against our internal infrastructure.

But yeah, like every tool, there's potential risks/down sides/things to watch for. There's no silver bullets!

We also use fail2ban. It serves us well. When adding a new rule, we don't apply it immediately but collect all the matched requests and IPs. After a week, we review the log and, if no customer traffic was captured, the rule is applied. We ban only for 30 minutes; in my field, web, it takes about two minutes for a bot to disappear once its requests have been blocked. We don't use permanent bans as we haven't had the need for them.

Are these bots causing problems like consuming excess resources? If not it doesn't seem worth the effort to ban them.

These days I only block bots when they're e.g. filling out forms or posting junk comments that a human has to moderate.

Some kid running an automated vulnerability scan can be automatically detected and blocked this way, which can protect you if that scan would have succeeded otherwise. Of course, the proper way would be just to scan all that yourself and actually fix the problems, but that may be too difficult in some cases.

While it's mostly bots trying to attack Wordpress, it's also blocked a few people spidering us aggressively (probably competitors) and other people trying attacks. Once it's scripted through a Chef cookbook, it's applied to all of our infrastructure with almost no effort, including any new boxes we launch. The biggest effort is running a script every few months to clear out the CloudFlare firewall, and I could probably stick that in a cron job, really.

Our technical team is very small and the company entirely dependent on our website and our reputation, so we take multiple approaches to securing the site and servers. Each little thing we add helps my boss sleep better, and that alone is probably worth the initial effort.

I definitely wouldn't do what some people are suggesting, manually reviewing logs/fail2ban matches and blocking IP addresses by hand. That's a waste of effort - and would probably be a full time job, too.

It seems you never had to deal with the amount of crap bots throw at every minor website out there.

If you don't want your logs flooded with vuln. scanning and referrer spam, use fail2ban or a waf.

I have, in fact, been doing this for years. Why should I care about having pristine logs?

Yeah, who cares about the extra disk space, the extra overhead, the waste of resources, filtering through noise to understand some issues with the app, right?

Isn't it lovely when every other referrer is trying to promote some online pharmacy or testing to see if you have a wp vulnerability?

Sure you can filter it, but I'd rather deal with garbage at the source.

A lesson I learned the hard way: if random bots are causing significant resourcing issues, then you need to address that directly. And if their impact isn't significant, why are you spending time on it? The disk space my log files consume is trivial.

Scanning bots consume resources and traffic. I pay for both. They increase my bill so it makes sense to restrict them.

Having gone down this path before, I think time is better spent making your site more efficient so it doesn't matter. Developer time and site complexity are much more important commodities than bandwidth fees.

I usually leave it to the devops to deal with it. If it has to be a developer on my team to do it then you are right, I will spend their time elsewhere.

I like this approach. Do you permaban the IPs? because a ton of ISPs rotate IPs across users

CloudFlare has a maximum number of firewall rules, based on the types of paid accounts you have. Every couple of months we purge the list when it gets close to being full.

99% of IP addresses we block are from outside our target market - we're a very local/regional company - so we don't really have to worry much about blocking legitimate customers. Very rarely our support team will get a request from someone that's been blocked and we'll unban the IP address, or set it to a JS/Captcha challenge at CloudFlare.

> I'm personally more interested in an API-focused answer

OSSEC [1] is my go-to tool for this type of task; it offers a language-agnostic interface to monitor any type of event. I used it along with my team to power the early version of the Sucuri Firewall, which later became GoDaddy's security system. The project has matured a lot and is heavily tested using millions of HTTP requests every day. We also implemented a behavior-based algorithm to detect malicious requests; this part is not open source, but I can tell you that it was relatively easy to integrate it with the core interface.

Splunk [2] is also a good option, slightly more difficult to configure though. I have had the opportunity to meet some of the developers working at this company during a short visit to their Vancouver office and was genuinely surprised by the level of engineering that goes into this product. Definitely worth checking out.

NewRelic [3] while the purpose has nothing to do with security, it also offers an easy-to-use interface to monitor events in the system. With some extra configuration you can make it work as a rudimentary firewall and benefit from all the great features that the product provides.

Disclaimer: I am an active contributor to the core features, but I am neutral on my recommendation.

[1] https://en.wikipedia.org/wiki/OSSEC

[2] https://en.wikipedia.org/wiki/Splunk

[3] https://en.wikipedia.org/wiki/New_Relic

What you need is information; what's easy to get is data. Generally, if you have data on what "normal" looks like and data on the current state, you can view the difference as information that indicates an abnormal state - be that a full/failing disk, or someone downloading all your data via blind SQL injection.
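A toy sketch of that idea, assuming per-interval request counts as the input; the window size and z-score cutoff are arbitrary choices, not recommendations:

```python
from collections import deque
from statistics import mean, stdev

class RateBaseline:
    """Keep a sliding window of per-interval request counts and flag
    intervals that deviate far from the learned 'normal'."""

    def __init__(self, window: int = 60, threshold: float = 3.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold  # z-score cutoff

    def observe(self, count: int) -> bool:
        """Record one interval; return True if it looks anomalous."""
        anomalous = False
        if len(self.history) >= 10:  # need some baseline first
            mu, sigma = mean(self.history), stdev(self.history)
            if sigma > 0 and abs(count - mu) / sigma > self.threshold:
                anomalous = True
        if not anomalous:  # don't let an attack poison the baseline
            self.history.append(count)
        return anomalous
```

Real systems (graylog2 alerts, Splunk's ML functions) do something far more robust - seasonality, multiple signals - but the difference-from-baseline principle is the same.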

That's the theory. I've yet to achieve this in a timely fashion, within budget limitations - but on my list is to have another look at graylog2 for logging (if you have the money, splunk is probably easier - but they do charge for the value they provide).

But I'm hoping to find the time to mash graylog2 together with something like collectd, and maybe a general event collector for exceptions/stack traces. The goal would be to have it all on a shared timeline in a unified interface (this spike of exceptions happened after this series of requests from there, and after the disk went full), and ideally with a sensible way to automate "normal" vs "exceptional" (alert on high CPU, low disk, high network bandwidth, etc).

Sure, people should read logs/alerts - but it's a challenge to filter out the irrelevant (SSH password brute force against a server with only key auth, web app crashing due to a #wontfix (yet) bug) so that the relevant becomes obvious (successful key-based login from a different continent, where none of your team are).
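That filtering step can be sketched as a tiny triage function. Everything here is a hypothetical stand-in: the event shape, the prefix table in place of a real GeoIP database, and the country policy:

```python
# Hypothetical prefix table standing in for a real GeoIP database.
GEO = {"198.51.100.": "DE", "203.0.113.": "US"}
TEAM_COUNTRIES = {"US", "GB"}  # where the team actually logs in from

def country(ip: str) -> str:
    """Crude prefix lookup; a real setup would query a GeoIP database."""
    for prefix, cc in GEO.items():
        if ip.startswith(prefix):
            return cc
    return "??"

def triage(event: dict) -> str:
    """Return 'drop', 'log' or 'alert' for a parsed auth event."""
    if event["type"] == "password_failure":
        return "drop"   # password auth is disabled here; pure noise
    if event["type"] == "key_login" and country(event["ip"]) not in TEAM_COUNTRIES:
        return "alert"  # successful login from somewhere the team isn't
    return "log"
```

The point is that the drop/alert policy encodes facts about your environment (key-only auth, where your team lives) - which is exactly why shrink-wrap defaults are so noisy.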


You don't monitor web logs, you ship the logs to cold storage every [crontab spec units for time :)] and monitor your hosts and network for unusual activity.

If anything happens, THEN you look at the logs. The signal/noise ratio of the average web host logs makes combing through them just not worth it.

There are tons of products out there - have a look at OSSEC (https://www.ossec.net/), it's open source and free.


You shouldn't care about hack attempts. There will be thousands each day. They don't matter.

You should care about your systems not being easy to hack. Unless you're a very high value target then "practice basic security hygiene, install your updates, don't reuse passwords" is enough to achieve that.

Using Splunk to monitor traffic either via web logs or via more detailed packet capture solutions.

Just good web logs are enough to detect all kinds of malicious anomalies.

Splunk machine learning functions can cluster data and detect anomalies in anything without rules or signatures:


I'd recommend only really addressing the most obvious attacks, rather than trying to follow up on everything that even looks like it.

I once got an "emergency" call at 7am (I had been up doing something rather late), about a machine I run attacking this guy. After some digging with him, the machine was our public-service NTP server, and the packets were on udp/123, in response to packets his system was sending out. Ended up with him yelling at me and hanging up, and me shutting down our public NTP server for good.

1) fail2ban. 2) Log everything to an ELK stack (or your preferred alternative) and set up dashboards and reports to surface maliciously crafted URLs. 3) Reject traffic in your web proxy based on a maintained list of hostile URLs, and ban the source of that traffic. Don't send back 500s; just slow down and eventually stop answering the requests. HAProxy lets you do this easily with a few simple rules and a stick table.
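A rough sketch of the HAProxy stick-table approach (the paths, sizes and names are illustrative, not from the comment):

```
frontend web
    bind *:80
    # Remember per-source counters for 10 minutes
    stick-table type ip size 100k expire 10m store gpc0
    http-request track-sc0 src

    # Hypothetical hostile paths; in practice this list is maintained
    # from whatever your ELK reports surface
    acl hostile_url path_beg /wp-admin /phpmyadmin /.env
    http-request sc-inc-gpc0(0) if hostile_url

    # Once a source is flagged, tarpit instead of answering
    # (the delay is controlled by 'timeout tarpit')
    acl abuser sc0_get_gpc0 gt 0
    http-request tarpit if abuser

    default_backend app
```

The tarpit holds the connection open and then rejects it, which wastes the scanner's time instead of handing it a quick 500.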

most successful organizations I've seen don't think about their API/app, but rather their stuff of value: usually data. If someone lifts your entire database, can you spot that on the way out the door? How quickly would you notice 100% of your fattest internet pipes spewing data at some poor soul's website? You mention Bitcoin; how long until you spot all your machines at 100% CPU?

Separating this out:

1. Monitoring logs for security -- people don't, but machines do. Initial security comes more from segmentation architectures (bastions, whitelisting, ..) and network/host intrusion detection systems. Logs do come into play later for app-specific rules (unpatched apps, fraud tipoffs, ...) and eventually automated UEBAs (exabeam, ..) and analyst threat hunting (plug: http://www.graphistry.com !).

2. Forensics for post-breach attribution & containment/mitigation. Let's say something does happen (... it will.) Logs are important here.

-- Threat intel: Known indicators of compromise (IOCs) -- IPs, URLs, downloaded file hashes, ... -- get published. Teams set up always-on correlation rules against their internal logs via, say, Splunk or Hadoop, and on a hit they pull up the associated threat intel. Or they'll get alerts from OpenDNS that an internal device is talking to a known bad actor. That's generally pretty shallow threat intel on the attack (ex: VirusTotal.com entries), but a start. There are multi-step ways to iteratively correlate across these sources, but it takes effort. Attacker attribution happens here, but typically more important is identifying further incident IOCs in case of a bigger breach.

-- IR / DFIR: Teams care more about defense and explaining what happened internally (incident killchain, scope, ...). Say there was a malware alert from AV, if the dwell time was hours or the device important, teams should investigate what else happened for the device/user/etc, or if anyone else got hit. THAT'S where logs really shine. Small/immature teams will check individual tools (firewall device, ...), while more mature teams check centralized logs.

We find manual log analysis shallow in practice for a decently sized org... so another plug for http://www.graphistry.com for setting up more comprehensive multi-step analysis flows for the human side of looking through stuff like malware, phishing, and account hijacking. Tools like ours work on top of log stores like Splunk/ELK/Hadoop/etc. (more scalable) or bring their own DB (less scalable.) This can get pretty intense: some companies log all the way through process activity.

At the extreme end (FBI, Mandiant, ...), companies will quarantine a device for activities like analyzing RAM & disk images.

I sleep a lot better since I use IP whitelists on any port/IP that is not public facing: a central, reliable location with a signed whitelist that the servers check regularly and add to their firewalls. No security measure is bulletproof, particularly against zero days, but that reduces the risk by orders of magnitude.

Nice idea, but how does it improve over "have important stuff on an administrative VPN"? To me the whitelist seems more effort (but we often have non-static IPs at home; with dual stack lite aka native(?) ipv6 plus a private ipv4 behind a huge, ISP-wide NAT making things even worse), and it doesn't give you additional encryption.

Agree. I have scripts regularly updating the whitelist with dynamic IPs every 5 min. But I don't use VPNs for various reasons.

OpenVPN on windows is very slow (unlike on Linux), it's hard to get above 5-7MB/s, which is a problem for large file transfers (and that was pre meltdown and spectre patches). It's also hard to configure, and the server still has WAN connectivity so it's easy to misconfigure it and get no protection. I also found that having an OpenVPN LAN with no gateway in addition to your default LAN on win7 tends to interfere with Chrome on the client, which will occasionally complain that it has no connectivity.

Being a belt and suspenders guy, and only dealing with windows machines, I found that a good solution is to combine it with IPSec, which is straightforward to configure on windows, fast, and adds encryption when the underlying protocol doesn't have any (SMB2). You can configure the firewall to only use IPSec for certain ports. But it's a nightmare to interop with non windows machines.

Marcus J. Ranum's "Artificial Ignorance" concept is pretty good. I did an implementation and have been running it for more than 10 years:


Someone's also done a Python implementation: https://github.com/lchamon/ai

...and it looks like there's one in syslog-ng Premium Edition:


I highly recommend you try out Sqreen. https://www.sqreen.io/ It's basically newrelic for security. Takes 2m to add to your web app and it works super well. It blocked an attack automatically on our service recently!

A tad off-topic, but one thing I've enjoyed doing in the past is giving my attackers something to get distracted on, allowing me to observe their behavior.


Wow, it's interesting to see how bad this hacker is with a shell. Even the "better" attacker from the video you linked in the comments doesn't seem to really know what he is doing at times.

Sadly, I feel not all hackers are dilettantes.

I can not recommend watching this Defcon talk enough: https://www.youtube.com/watch?v=_SsUeWYoO1Y - "You Spent All That Money"

Dude I'm a security consultant for a major vendor and I can't tell you the number of times I've

A) Found an incident while installing new equipment

B) Installed new equipment last year and came back this year because they still got hacked


C) Ripped out their old gear and put in my company's gear after a hack because the sales guy told them they wouldn't get hacked with our (passive, monitor-only) gear.

You need the right tools, no doubt. But you need to understand your needs first. Centralized logging is the first step, but without knowing what you want to get out of it, those logs just sit there unnoticed. Even with intelligence platforms like Splunk or ArcSight, unless you know what you're looking for and write it into rules, the data just sits there.

Knowledge is what matters, the tools just make it easier.

During my time at Wanderu we set up Filebeat on all app machines, forwarding json logs to Kafka, and then on to ELK and Pipelinedb for analysis/alerting/session killing/ etc.

It worked very well, as there are almost always application edge cases that should not be blocked immediately with Cloudflare or at a FW level.

ELK was used for exploration and/or setting alerts on thresholds that we had found/defined beforehand.

Pipelinedb was used to run continuous aggregates on a stream, augment (pop json log lines off a stream, enrich them with Maxmind, etc) logs in "realtime", and immediately surface malicious behavior. We'd then programmatically kill sessions or add Cloudflare rules upstream, based on the behavior that surfaced in a Pipelinedb continuous aggregate.

Wanderu: https://www.wanderu.com/ Filebeat: https://www.elastic.co/products/beats/filebeat PipelinedB: https://www.pipelinedb.com/

I'm curious how you got Filebeat to push to Kafka. Did you write a custom modification to Filebeat/libbeat?

Could something like a poor man's primitive check work?

- Create a custom http header and find a unique way to create/identify it. So theoretically another layer of authorization, just like APP ID for instance.

- Create a middleware that would scan the request and check whether it has the required custom header

- If not, log everything and create a notification (Slack, etc.) and allow/deny accordingly; if it's good, let the request through.
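As a WSGI middleware sketch of that idea (the header name and token are placeholders, and a real deployment would treat the token as a secret, not a constant):

```python
import logging

# Hypothetical shared secret; real deployments would rotate/sign this.
EXPECTED_APP_ID = "my-app-id-token"

class RequireAppHeader:
    """Reject and log any request missing the expected custom header
    before it reaches the application."""
    def __init__(self, app, logger=None):
        self.app = app
        self.log = logger or logging.getLogger("header-check")

    def __call__(self, environ, start_response):
        # WSGI exposes the 'X-App-Id' header as HTTP_X_APP_ID
        supplied = environ.get("HTTP_X_APP_ID")
        if supplied != EXPECTED_APP_ID:
            self.log.warning("missing/bad X-App-Id from %s for %s",
                             environ.get("REMOTE_ADDR"),
                             environ.get("PATH_INFO"))
            start_response("403 Forbidden",
                           [("Content-Type", "text/plain")])
            return [b"forbidden"]
        return self.app(environ, start_response)
```

Point the logger's handler at your notification channel of choice and you have the "log everything and notify" half for free.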

I use a home-grown solution called txrban:


It consists of several processes written in the TXR language. One for monitoring Apache, one for sshd and one for Exim (mail server).

TXR's standard library has "tail streams": streams which automatically follow a rotating log file.

These txrban scripts simply open tail streams on their respective log files, use TXR pattern matching to scan for what they are looking for, and then use functions in the tiny txrban utility library to do the banning. Banning is instigated by issuing iptables commands.

The banning isn't persisted, but when you restart the scripts, they retroactively replay everything by scanning the available logs from the beginning.

Banning events are reported via syslog. A "dry run" configuration variable allows testing for the rules without actually modifying your iptables.

I never packaged this program into a redistributable form. It contains some hard-coded log-file paths that work on my server, and no documentation.
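The core follow-match-ban loop is simple enough to sketch in Python (the regex, threshold, and iptables invocation here are illustrative, not taken from txrban):

```python
import re
import subprocess

DRY_RUN = True  # like txrban's dry-run switch: print, don't touch iptables

# Hypothetical pattern: repeated sshd auth failures from one IP
FAIL_RE = re.compile(r"Failed password .* from (\d+\.\d+\.\d+\.\d+)")
THRESHOLD = 5

def ban(ip):
    cmd = ["iptables", "-I", "INPUT", "-s", ip, "-j", "DROP"]
    if DRY_RUN:
        print("would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, check=True)

def scan(lines, counts=None):
    """Feed log lines (e.g. from a 'tail -F' pipe); ban repeat offenders."""
    counts = counts if counts is not None else {}
    banned = []
    for line in lines:
        m = FAIL_RE.search(line)
        if m:
            ip = m.group(1)
            counts[ip] = counts.get(ip, 0) + 1
            if counts[ip] == THRESHOLD:
                ban(ip)
                banned.append(ip)
    return banned
```

Replaying old logs through `scan` on startup reproduces the retroactive-banning behaviour described above.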

I used to run a web service, an API, that created 3D models of people from photographs. It was a character creation service that made likeness models of real people from a photo. I knew we'd get a lot of hack attempts, and a consultant recommended we get a high end Sonic Wall appliance, a $20K piece of hardware that cleanses all network traffic to and from my data center.

Best decision EVER! We'd get hack attempts, we'd get DoS attacks to the tune of 130K hits per hour, and that Sonic Wall shrugged it all off like nothing. For a while, it looked like every script kiddie was trying to hack into our servers. No one got through, and we couldn't have cared less how or what they were doing in their attempts. We had real work to do, and that did not include fucking around with those idiots.

We also had a block list for customers from China. Not a single transaction from China was legitimate - all forged credit card data - so we just blocked billions of potential customers because not one purchase from there was real.

For me, monitoring the logs is already too late to do anything about a hacking attempt. Usually web servers write logs after the request has been served. One approach (I have seen commercial products do it) is to hook into the web server code and inspect every request in real time, before it hits the application servers, and decide whether it is a legitimate request or not.

It takes a number of malicious requests before someone can find an opening. If you ban them after the 1st or 2nd request, you will be safe. It would be a miracle if some vulnerability scanner found an opening after one request - not impossible, but HIGHLY unlikely.

> If you ban them after the 1st or 2nd request, you will be safe.

Sadly, it is too easy to get a new IP address now. I think as part of our security mindset, we should consider it trivial to scan for several hundred vulnerabilities using relatively unique IP addresses.

I'm not saying that you shouldn't ban traffic as it's required under several compliance programs and it is another form of demonstrated diligence. But, unfortunately, banning individual IP addresses only works against the absolute lowest hanging fruit of the criminal world.

previously worked at a major shared webhost a few years back (which used Apache)

we used mod_security with common rulesets + a custom list which was reviewed/updated based on what hacks were coming in (with hacks both passively monitored via customer/compromised-spam reports and via active scans for malware deposited by attackers, which were then cross-correlated/traced back to application logs to find the entry point/compromise).

support staff was ~600 (including front end call center and billing people), with probably 250 of those doing L2+ support and ~50 doing the security work which fed into the ruleset, for a server farm of ~2-4000ish (and exponentially more sites/accounts since it was shared webhosting). In some ways our network could have been viewed as a giant honeypot because of all of the outdated webapps that were being run by the customers, plus their compromised PC's saving passwords, etc.

I see a lot of mod_security misconfiguration across hosting providers. While I think it is a great tool, these misconfigs make it difficult for me to recommend it.

What is "an attack?" You need to define this better to get more focused answers.

Log monitoring is good for finding bad access patterns, but can't be trusted to find a problem as it happens.

Look for repeated errors. Hitting the same service over and over? Login attempt failures from the same few IPs? It's either a script, or the Bing crawler. Write application-level code and network-level configuration to mitigate the problem.

How about legitimate users doing social engineering attacks on each other? Sybil attacks in ratings or voting? Attempt to reuse mobile purchase tokens on different accounts? Some new SSHD credential leak zero day? Using your cat picture upload service for storing pictures of dogs? Each of these need different approaches, and a good logging system that can change with your needs is super helpful.

But you can't read the logs in real time, looking for bad patterns. That needs customized analytics.

We use a combination of https://www.templarbit.com and Splunk. Templarbit allows us to get real-time notifications when someone is running a vuln scan, violates one of our security headers or is trying payloads that look funny.

Fail2ban looking for web pages that don't exist on your web site (like ../wpadmin) works pretty well at getting rid of script kiddies.

I suppose you could add a honeypot page that took form submissions and looked for any sequence attempting a CSRF, XSS, or SQL attack and ban the source there as well.
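A sketch of such a fail2ban setup (file names, regex, and ban times here are illustrative; fail2ban's shipped apache-* and nginx-* filters are the real starting point):

```
# /etc/fail2ban/filter.d/web-probes.local (illustrative)
[Definition]
# Match access-log lines probing for admin pages this site doesn't have
failregex = ^<HOST> .* "(GET|POST) /+(wp-admin|wp-login\.php|phpmyadmin)
ignoreregex =

# /etc/fail2ban/jail.local (illustrative)
[web-probes]
enabled  = true
port     = http,https
filter   = web-probes
logpath  = /var/log/nginx/access.log
maxretry = 2
bantime  = 86400
```

Since no legitimate visitor requests those paths, even a maxretry of 1 or 2 produces essentially no false positives.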

This is a list of what you can do for application security with Nginx (mostly with open source tools): https://github.com/wallarm/awesome-nginx-security

My talk from Nginx conference: https://www.nginx.com/blog/build-application-security-shield...

Important note. Care about vulnerabilities. Not about attacks. Buy Burp license. Run appsec training for all of your developers; it's easy while you're small.

Disclaimer: I am a co-founder of Wallarm mentioned in preso.

I used to spend a lot of time combing my logs with custom scripts, but honestly I think the effort is better spent elsewhere. There're countless automated attacks a day on a big site but they really aren't anything you need to worry about.

I've learned to at the very least ensure that everything is being properly backed up and that the restore actually works. Testing this regularly puts my mind at ease that even if my site, application, whatever gets destroyed, stolen, etc., I can bring it back to life in a very short time. It's very difficult to ensure that something is 100% secure (I'm not even sure it's possible, since major players like Twitter make mistakes). Of course, this doesn't solve the issue of data being taken, but that's why it's important to understand the data you actually store and whether you truly need to store it, or whether some of it could be handled another way.

There's really no way to eyeball logs in any sort of complex environment. What you're looking for is a system that will automatically analyze logs and/or traffic for malicious events and alert on them - that is, Intrusion Detection Systems (IDS) and Security Information and Event Management (SIEM). For IDS there are a whole bunch of open source solutions. Essentially these systems ingest logs into a centralized location and run them through a rules engine, which gets updated with new rules on a weekly or daily basis.

> What you're looking for is a system that will automatically analyze logs and/or traffic for malicious events and alert on them.

(I don't pick on you, so I hope this comment doesn't come off that way.)

What are "malicious events" and how will you know them when you see them?

I would argue that, instead, your system should automatically analyse logs for benign (a.k.a., completely normal, expected, innocent) events, toss them out, and leave humans to review anything that's left over.

With any new "system", it will take a bit to build up a complete "whitelist" so that the amount of data to review is down to a manageable level but you'll be rewarded with much more meaningful and actionable data when you get there.

> What are "malicious events" and how will you know them when you see them?

I suspect they are simply defined as "rare" events.

Instead of the term "malicious events" he should have used the term "anomalies".

Ha, I remember a while ago doing something similar with our AWS server logs. It turned out we were always under attack, 24/7 - rather amusing, but not very useful apart from scaring the juniors a little bit.

Don't do this in production without backup tools, but we use unsupervised learning for anomaly detection. We're machine learnists so this is natural for us; we originally built the system for a megacorp and made it open source (though not all parts are yet public). The catch is that it's not real time: the system outputs outliers every few hours or daily, depending on the size of your traffic (daily for around 50M proxy logs a day). Outliers are then reviewed by scripts and a few operators/analysts.
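A toy version of that idea using scikit-learn's IsolationForest (the features and numbers are invented for illustration; the commenter's actual system isn't shown here):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy per-client features extracted from access logs:
# [requests/hour, distinct URLs, error ratio]
rng = np.random.default_rng(0)
normal = np.column_stack([
    rng.normal(60, 10, 500),     # typical request rate
    rng.normal(8, 2, 500),       # typical URL spread
    rng.normal(0.02, 0.01, 500)  # typical error ratio
])
scanner = np.array([[4000, 900, 0.7]])  # a vuln scanner stands out

model = IsolationForest(contamination=0.01, random_state=0).fit(normal)
print(model.predict(scanner))  # -1 means outlier -> queue for an analyst
```

No rules or signatures: anything sufficiently unlike the training traffic gets surfaced, which is exactly why a human still has to review the output.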

fail2ban, but I'm not sure that's what you are looking for. You'd have to customize it for an API.

2nd'ing fail2ban, it's simple and awesome.

Beyond that, tripwire or its modern equivalent. Traffic monitoring, "deep packet" inspection on the network side. Run all outbound traffic through a proxy.

Not a lot of people seem to know what tripwire is nowadays.

There is this Linux thing called IMA which in some ways reminds me of tripwire (Integrity Measurement Architecture).

Nah, I've wanted to use fail2ban multiple times and failed.

Unless something has recently changed it is some of the most poorly documented software on the planet.

I’ve never been able to configure the thing or figure out what it’s doing or... really anything besides find the homepage.

It’s also just not worth the trouble. Globally open SSH port? That’s like 1992 level Linux box in your parent’s basement stuff.

While it monitors SSH by default out of the box, people here are talking about creating custom filters which will monitor their Apache or Nginx logs. (Although, that said, we also have it monitor our non-standard SSH port with v. aggressive blocking rules, even though it's also behind a firewall. Defence in depth, and all that.)

The documentation isn't the greatest, but this page explains the concepts well enough and was easy to find using Google.


I understand the concepts but... how do you actually use it?

A filter matches lines in the nginx log using regular expressions. If the line matches, it uses another regexp to extract the IP address, and then calls out to scripts to block that IP address.

I'm not going to post the exact configuration files I use, but the GitHub repo for fail2ban contains examples.


That's a filter that protects Apache against the 'shellshock' (https://en.wikipedia.org/wiki/Shellshock_(software_bug)) vulnerability, for example.

I like Fail2Ban, but as I remember it, my learning process looked a lot like this:

1.) Read through the documentation.

2.) Pour myself a scotch.

3.) Read through more documentation.

4.) Start drinking right out of the bottle.

5.) Try something.

6.) Get an error or experience a spectacular failure.

7.) Use StackOverflow.

8.) GOTO 5.


In good news, it worked - the light went on and now I'm confident that I could adapt Fail2Ban to suit any use. But, in bad news, honestly, I feel your pain and understand where/why you are getting stuck. Fail2Ban is a very complicated, yet extremely useful piece of software.

cd /etc/fail2ban and explore, you have example configs for all popular software and plenty of explanations in their comments

Computers are better at the eyeballs-on-logs tasks. You teach them what is "normal" to be ignored and they may warn you about what is left, even the old logwatch is pretty capable for that.

Of course, there are some basic security practices: a firewall, fail2ban or similar to stop/ban misbehaving IPs, monitoring system metrics to detect abnormal activity/usage (not just for intrusion detection), and maybe feeding logs to e.g. ELK to notice new/unknown patterns.

I use logwatch. It gives a large informative report every day. Mostly it shows me that fail2ban and my secure configuration are doing what they are supposed to do.

There is a product called Unomaly [1] which does exactly what you wrote.

1. https://unomaly.com/

I maybe have a counter-question. Once you identify an attack, what do you do? Firewalling the attacker is an option but not really effective, the attacker might just change IP address and keep going. We've only inconvenienced them a bit. What else can be done?

WAFs like CloudFlare help, but stuff still gets through. Last night an obvious brute force attack went undetected through CloudFlare. I just finished writing a post about it: https://hauptj.com/2018/05/08/brute-force-login-attacks/

I use Fail2ban http://www.fail2ban.org/wiki/index.php/Main_Page for auto blocking.

I take an 80/20 type approach and see what types of access is not covered by current rules, then I add a custom rule to cover that type of access attempt.

The most important things for this are:

- A good idea of what normal traffic and behaviour on the web service look like.

- Good threat intel.

These 2 combined will make it fairly obvious what needs to be done, but obtaining them is not a trivial task; in fact, they are both continuous processes.

I work for: https://www.signalsciences.com/ and our tool is specifically designed for use cases like this. Let me know if you would like more info cody-at-signalsciences-dot-com

For web logs specifically, even if you had a system that would 100% tell you what is a hack attempt and what isn't, you would probably not want to look at the output.

The web server will be hit by thousands of "hack attempts" a day. Most of them will be directed at software you don't even use.

You care about successful hacks, not random attempts (targeted attempts may be different, but good luck finding them in the noise of web logs). Web logs are probably the noisiest, least useful place to look for those (although they can be useful to determine how an attacker got in/where they came from after the fact).

The only thing that could make sense for web logs is looking for brute-force attempts, but that is better done with application-level logging (and countermeasures like CAPTCHAs).

Things you could look at instead are e.g. firewall/netflow logs: If your server is making unexpected outbound connections, there's a good chance it got owned and someone is now downloading their toolkit onto it.

> I'm genuinely curious on how do security focus companies discover things like "oh, we got attacked by Bitcoin miners from North Korea". Are there sophisticated tools that do this for you..

I work in a Fortune 100 Security Operations Center (SOC). We have a dev team that builds tools for our Incident Response team, Threat Intel team, etc, etc and an engineering team to manage those tools. And lots of other teams that specialize on lots of other things (malware reverse engineering, spam/phishing, red team, etc). We have to build a fair amount of our own stuff because we operate at levels above what a lot of commercial tools can do. We have hundreds and hundreds of custom alerts that fire based on various network traffic and sysmon activity that our threat team has written rules around.

This, of course, is out of the question for most.

So, what should you do? First, get your logs all in one place. The ELK stack is probably the easiest way to do that. ES makes things searchable and Kibana is a pretty good UI for a lot of this stuff.

Second, if you aren't monitoring your internal network, you really have no idea what is going on, so look at what you can accomplish quickly with distributions like Security Onion (which includes the ELK stack): https://securityonion.net/

Third, there's a fair amount of Open Source Intel (OSINT) on threats. You'll need to find a way to integrate that information and scan your logs for it. Malware, like Coinminers, will have certain Indicators that will tell you if they are on your network. If you're on the ELK platform, look at something like ElastAlert. You write rules in Yaml and then set up Alerts to fire into your Slack channel or Email you or whatever. There is no real shortcut here.
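To make that concrete, an ElastAlert rule is just a small YAML file; a sketch matching traffic to known-bad IPs (the index, field names, and IOC addresses are placeholders for your own setup and feed):

```yaml
# elastalert rule sketch (index/field names are assumptions)
name: possible-coinminer-beacon
type: frequency
index: netflow-*
num_events: 5
timeframe:
  minutes: 10
filter:
- terms:
    destination.ip: ["203.0.113.7", "198.51.100.9"]  # IOCs from an OSINT feed
alert:
- slack
slack_webhook_url: "https://hooks.slack.com/services/EXAMPLE"
```

The work is in keeping the IOC list current, not in the rule itself.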

Commercial tools that can do a lot of this stuff includes FireEye's TAP, but they definitely aren't free: https://www.fireeye.com/solutions/threat-analytics-platform....

And, as mentioned elsewhere, absolutely remove any attack surface that you can. Scan your IP ranges and domains with tools like Shodan.io, HackerTarget.com's DNS tools (DNSDumpster.com has a nice, free DNS search), etc and make sure you have a handle on what, exactly, you're exposing.

And, finally, if you're big enough then every single day a computer will be compromised in some way. There is no company in the world who doesn't experience this so you need to have a clear set of tools and playbooks to handle that. If you can respond to most incidents in days (not weeks), you'll be better than 90%+ of the companies out there.

Edit: clarified acronym

I'm currently building a SOC for a fairly large company in Germany, and before that I worked for what is probably by now (through M&A) the largest MSSP in the world. I don't know if I agree with you that having an MSSP is out of the question. We had plenty of clients who only paid us maybe 2 to 3 thousand a year, if even that. Most MSSPs will charge on a contract-by-contract basis revolving around log volume and estimated daily event creation. If you are only creating an estimated 0.1 events per day you could very well have a rather cheap contract.

If it costs 2 or 3k a year your MSSP is probably not spending more than a few minutes a year understanding your logs. Realistically at that price the most likely number is zero minutes.

There is room in the industry for players that aggregate logs in interesting ways and can share threat intel and track attack patterns between customers, but a lot of the time with MSSPs you will just find yourself paying for a really expensive splunk instance.

Logs are fairly useless without occasional review (to at least verify you are collecting the things you think you are) and a decent understanding of the infrastructure they support.

I don't know your business model or capabilities though so this might not apply in your case. Hopefully not.

But there are plenty of MSSPs out there that will ingest logs, charge out the %]%{^{ for "monitoring" and add very little value. After the nth time you get alerts for PHP vulns on stuff which is clearly not running it, or something stopped sending logs months ago and no one noticed, or you find that firewalls logs from anything not in an unpublished support list don't really get parsed (yeah they will ingest it but just not tell you that nothing meaningful will happen with it..)...

To be fair, monitoring by third parties is one of those things that you can't "do and forget about it", you need shared responsibility and a clear understanding of who is doing what. None of this is cheap though.


SpectX is a tool that can run fast sql-like queries on top of distributed text files. It was initially built to parse logs by an infosec team. Hope it helps!

It's called SIEM (Security Information and Event Management); there are a lot of companies doing this, providing both cheap and expensive solutions.

OSSEC is a nice FOSS tool for this

OSSEC + syslog; how you display the data (Zabbix, ELK, etc.) is up to you.

Another question- how do you comply with GDPR and do that? What if a black hat sends you a SAR request?

error/exception monitoring
