
Gryffin: a large scale web security scanning platform from Yahoo - cnbuff410
https://github.com/yahoo/gryffin
======
cheepin
"At the heart of Gryffin is a deduplication engine that compares a new page
with already seen pages. If the HTML structure of the new page is similar to
those already seen, it is classified as a duplicate and not crawled further."

Does anyone know how they define "similar"? In particular, I'm wondering
whether you have to do any sort of configuration for single-page apps, which
could have remarkably similar markup but completely different
behaviors/vulnerabilities.

~~~
kylequest
Looks like it's simhash:
[http://www.titouangalopin.com/blog/2014-05-29-simhash](http://www.titouangalopin.com/blog/2014-05-29-simhash)
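
For anyone unfamiliar with the idea, here is a minimal sketch of a generic
simhash over a page's tag structure. It is not Gryffin's actual code: the
feature choice (lowercased tag names only), the FNV-1a hash, and the helper
names `features`, `simhash`, and `distance` are all assumptions made for
illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/bits"
	"regexp"
	"strings"
)

// tagRe matches opening HTML tags and captures the tag name.
// Closing tags ("</p>") are skipped because "/" is not a letter.
var tagRe = regexp.MustCompile(`<([a-zA-Z][a-zA-Z0-9]*)`)

// features reduces a page to its sequence of lowercased tag names,
// ignoring text and attributes (an assumed, simplified feature set).
func features(html string) []string {
	var out []string
	for _, m := range tagRe.FindAllStringSubmatch(html, -1) {
		out = append(out, strings.ToLower(m[1]))
	}
	return out
}

// simhash builds a 64-bit fingerprint: each feature's hash votes +1 or -1
// on every bit position, and the sign of each total sets the output bit.
func simhash(feats []string) uint64 {
	var counts [64]int
	for _, f := range feats {
		h := fnv.New64a()
		h.Write([]byte(f))
		v := h.Sum64()
		for i := 0; i < 64; i++ {
			if v&(1<<uint(i)) != 0 {
				counts[i]++
			} else {
				counts[i]--
			}
		}
	}
	var fp uint64
	for i, c := range counts {
		if c > 0 {
			fp |= 1 << uint(i)
		}
	}
	return fp
}

// distance is the Hamming distance between two fingerprints;
// a small distance means the two inputs are structurally similar.
func distance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

func main() {
	a := `<html><body><div><p>hello</p></div></body></html>`
	b := `<html><body><div><p>something else entirely</p></div></body></html>`
	// Same tag sequence, different text: the fingerprints are identical.
	fmt.Println(distance(simhash(features(a)), simhash(features(b)))) // prints 0
}
```

With features like these, two pages that share markup but differ in text or
behavior get the same fingerprint, which is exactly the single-page-app concern
raised above. A real deduplicator would presumably pick a distance threshold
and richer features.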

------
stephendicato
I don't accept "coverage and scale" as the answer to why this was created.
What problem is fundamentally being solved by scanning, or fuzzing, your
web-based applications "at scale"?

------
q4
Can someone explain to a newbie how to use it in a project in practice? I
understand Go basics. The GitHub documentation describes what it does but
not how to use it.

