
TruffleHog – Searches through Git repositories for high entropy strings - ergot
https://github.com/dxa4481/truffleHog
======
infogulch
I wonder what the false positive rate is on perl.

~~~
MildlySerious
If you read the description of how it works, it says it only searches for
base64 and hex blobs, so I doubt it would be any higher than for any other
code.
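
To make the "only base64 and hex blobs" point concrete, here is a minimal
Python sketch of that kind of pre-filter. It is not truffleHog's actual code:
the function name `candidate_runs`, the character sets, and the 20-character
minimum are all assumptions chosen for illustration.

```python
import string

# Assumed alphabets; a real tool may define these differently.
BASE64_CHARS = set(string.ascii_letters + string.digits + "+/=")
HEX_CHARS = set("0123456789abcdefABCDEF")

def candidate_runs(text, charset, min_len=20):
    """Return maximal runs of characters from charset that are at least min_len long."""
    runs, current = [], []
    for ch in text:
        if ch in charset:
            current.append(ch)
        else:
            # A character outside the alphabet ends the current run.
            if len(current) >= min_len:
                runs.append("".join(current))
            current = []
    if len(current) >= min_len:
        runs.append("".join(current))
    return runs
```

On a typical Perl one-liner such as `$x =~ s/^\s+|\s+$//g;` the punctuation
keeps breaking the runs, so nothing long enough is left to score.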

------
nickthemagicman
Hate to be dumb, but what exactly is a 'high entropy string'? Thanks in
advance.

EDIT: I see now.
[http://blogs.splunk.com/2015/10/01/random-words-on-entropy-a...](http://blogs.splunk.com/2015/10/01/random-words-on-entropy-and-dns/)

So you're looking for hashes, gotcha.

~~~
daveguy
A string that is close to random noise. Low repetition. Like what you would
find in a randomly generated cryptographic key.
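
One common way to put a number on "close to random noise" is Shannon entropy
over the characters of the string. The sketch below is a generic version of
that measure, not code taken from the project; the function name
`shannon_entropy` is an assumption.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Bits per character, computed from the character frequencies in s."""
    if not s:
        return 0.0
    n = len(s)
    # Each character with count c has probability c/n and contributes
    # (c/n) * log2(n/c) bits.
    return sum((c / n) * math.log2(n / c) for c in Counter(s).values())
```

A string of one repeated character scores 0 bits per character, while a
string in which every character is different scores the maximum possible for
its length.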

------
bpchaps
I tried something similar to this a while back and found that using Shannon
entropy, though effective with long strings, failed pretty hard with shorter
strings. Lots and lots of false positives.

An alternative that worked surprisingly well (though not without flaws) was
to use a password strength checker similar to Python's passwordmeter library.

POC:
[https://github.com/red-bin/password_finder](https://github.com/red-bin/password_finder)
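
One plausible reason for the short-string false positives: the per-character
Shannon entropy of a string of length n can never exceed log2(n), and any
string with no repeated characters hits that ceiling. The sketch below
illustrates this; it is not taken from the POC above, and the helper names and
sample strings are assumptions.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Bits per character, computed from the character frequencies in s."""
    if not s:
        return 0.0
    n = len(s)
    return sum((c / n) * math.log2(n / c) for c in Counter(s).values())

def max_entropy(length):
    """Upper bound on per-character entropy for a string of this length."""
    return math.log2(length) if length > 0 else 0.0

# Compare an ordinary sequence, a common word, and a password-like string.
for sample in ["abcdefgh", "username", "qX7$kP2!"]:
    print(sample, shannon_entropy(sample), max_entropy(len(sample)))
```

Here "abcdefgh" and "qX7$kP2!" both score exactly 3.0, the maximum for eight
characters, so no threshold on this measure can tell them apart.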

------
ryan_lane
If you're looking for something higher signal and lower noise, see:
[https://github.com/lyft/bandit-high-entropy-string](https://github.com/lyft/bandit-high-entropy-string)

Note that Bandit is for Python, so it'll only match against potential secrets
hardcoded into Python code, but it takes the AST into account, so the signal is
way higher.
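
To illustrate why working from the AST cuts noise, here is a minimal sketch
using only Python's standard `ast` module (Python 3.8+). It is not the Bandit
plugin's API or implementation; the function name `assigned_string_literals`
is an assumption. Only real string literals in simple assignments are seen,
so comments and identifiers never reach the entropy check.

```python
import ast

def assigned_string_literals(source):
    """Yield (name, value, lineno) for simple `name = "literal"` assignments."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Keep only assignments whose right-hand side is a string constant.
        if isinstance(value, ast.Constant) and isinstance(value.value, str):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    yield target.id, value.value, node.lineno
```

Each yielded value can then be scored, and the variable name (for example
one containing "key" or "secret") is available as extra context.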

------
codelion
Interesting project. You can also search for all kinds of suspicious commits
using Commit Watcher:
[https://github.com/srcclr/commit-watcher](https://github.com/srcclr/commit-watcher)

------
empath75
This is really clever. I run a Stash server. Is there a way to get this to run
on all the repos?

------
smoyer
My security officer is going to love this!

