
CUDA Grep - usgroup
https://www.cs.cmu.edu/afs/cs/academic/class/15418-s12/www/competition/bkase.github.com/CUDA-grep/finalreport.html
======
sjf
They ignored a key detail in their benchmark. They did not include the time it
took to transfer the input file from host memory to the GPU (the GPU cores can
only operate on graphics memory). This is easily going to be the most time
consuming part. They also didn't include any IO operations in their timing, so
instead of a 10x speedup, it is more likely to be a couple of percent faster
as grep is going to be IO bound. This is a neat project, but there is a reason
we are not using the GPU for regex matching.
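
To put rough numbers on that argument, here is a back-of-the-envelope sketch. The function name and every timing below are hypothetical, chosen only to illustrate the shape of the calculation; none of them come from the report.

```python
def end_to_end_speedup(io_s, cpu_match_s, kernel_speedup, transfer_s):
    """Overall speedup once disk I/O and the host-to-GPU copy are counted.

    io_s:           seconds spent reading the input (same on both paths)
    cpu_match_s:    seconds the CPU spends on matching alone
    kernel_speedup: how much faster the GPU kernel is at matching alone
    transfer_s:     seconds spent copying the input into GPU memory
    """
    cpu_total = io_s + cpu_match_s
    gpu_total = io_s + transfer_s + cpu_match_s / kernel_speedup
    return cpu_total / gpu_total


# Hypothetical, I/O-bound workload: 10 s of disk reads, 1 s of CPU matching,
# a 10x faster kernel, and 0.5 s to copy the data to the GPU.
speedup = end_to_end_speedup(io_s=10.0, cpu_match_s=1.0,
                             kernel_speedup=10.0, transfer_s=0.5)
print(f"{speedup:.2f}x")  # prints "1.04x"
```

With these made-up inputs the 10x kernel speedup shrinks to about 4% end to end. If the data is already in memory and the copy is cheap, the same formula approaches the kernel speedup.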

~~~
bkase
One of the authors here:

Yes you are right, you should not replace grep for one-off regexes due to the
latency of the I/O operations. As far as I recall, memory throughput however is
much higher on GPUs than on CPUs (or at least this was true in 2012). Our
intended use cases for this sort of a grep were cases where you are finding
many needles in enormous haystacks such as: Looking for malicious machine code
in executables that you download off of the internet (virus scanning), or
searching for certain genetic sequences in DNA code. I believe we discussed
this in our final presentation, but perhaps we left it out of the written
report.

Yes we probably should have included the numbers to show why you wouldn't want
to use this instead of grep for simple tasks. I am sorry we missed it.

~~~
zeusk
I'm quite certain Windows Defender uses (or used, at some point) GPUs in the
machine to accelerate malware pattern matching. I remember a coworker talking
about a funny bug as a consequence when the graphics driver executable itself
was infected.

~~~
cl0ckt0wer
There's an article on the announcement last year:
[https://arstechnica.com/gadgets/2018/04/intel-microsoft-
to-u...](https://arstechnica.com/gadgets/2018/04/intel-microsoft-to-use-gpu-
to-scan-memory-for-malware/)

~~~
dmix
That is for their enterprise "Microsoft Defender Advanced Threat Protection"
endpoint security product, which scans live memory for hidden malware that never
writes to disk. That would otherwise consume far more CPU than the typical
default Windows security software, which is why they needed the GPU in the
first place. But it's an interesting solution nonetheless.

[https://www.microsoft.com/en-
us/microsoft-365/windows/micros...](https://www.microsoft.com/en-
us/microsoft-365/windows/microsoft-defender-atp?ocid=docs-wdatp-main-
abovefoldlink)

------
bkase
One of the authors here: Happy to answer any questions. Keep in mind this was
a school project from 2012 (Kayvon's 15-418 at CMU, a wonderful class), so
it's been a while.

~~~
Athas
Is the code available? The links are dead.

~~~
bkase
Hmm I'll take a look at that. Yes code is here:
[https://github.com/bkase/CUDA-grep](https://github.com/bkase/CUDA-grep)

------
michaelmior
I'm surprised to see no comparison with ripgrep[0] or The Silver Searcher[1],
both of which claim to be faster alternatives to grep.

[0]
[https://github.com/BurntSushi/ripgrep](https://github.com/BurntSushi/ripgrep)
[1]
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)

~~~
ianhowson
Perhaps more appropriate would be Hyperscan:
[https://www.hyperscan.io/](https://www.hyperscan.io/)

Even though it's now an Intel project targeting CPUs, it grew out of a startup
trying to run regex matching on parallel hardware.

~~~
acuozzo
FWIW, I ported Hyperscan to ARM using SIMDe.

------
User23
One reason that Perl regex is relatively slow is that it isn't actually
regular: it can't be implemented using DFAs or NFAs.

~~~
apetresc
Sure, but it must be relatively simple to identify, pre-compilation, whether the
regex is regular or not; there are only certain constructions that break
regularity. Why couldn't Perl's engine fall back to a more optimized
regular-language-only implementation for those cases, if it has such a big
impact on performance?

~~~
ben509
Theoretically, a regular expression can be expressed in constant space, and
that's profoundly limiting, so anything that requires so much as a stack is
not a regex.

That includes basic things like capture groups. That doesn't mean you can't
use automata in a regex-plus implementation, though.

I can't find the paper, but I think there is a hybrid engine that tries to get
the best of both worlds; constant space where possible and using additional
space to provide more features.

~~~
chaosite
You can implement capture groups with no problem using automata (see re2).

You might have meant to say that you can't use backreferences.
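
A small sketch of the distinction, using Python's `re` module purely for illustration (it is a backtracking engine, not an automaton-based one like re2, and the patterns are made-up examples). The capture group on its own describes a regular language; it is the backreference that forces the engine to compare against what was captured.

```python
import re

# Capture group only: "some a's, a b, some a's". This is a regular language,
# so an automaton-based engine could match it and still report the group.
regular = re.compile(r"(a+)ba+")

# Backreference: \1 must repeat exactly what group 1 captured, so the match
# is a^n b a^n. That language is not regular.
backref = re.compile(r"(a+)b\1")


def matches(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


print(matches(regular, "aaabaa"))   # True: the counts need not agree
print(matches(backref, "aaabaa"))   # False: 3 a's before, 2 after
print(matches(backref, "aaabaaa"))  # True: 3 a's on both sides
print(regular.fullmatch("aaabaa").group(1))  # aaa
```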

------
xvilka
Would be awesome if they used a proper open source software stack, like OpenCL,
not that NVIDIA proprietary crap. Using anything from NVIDIA harms open
source, because until they go bankrupt there won't be FOSS drivers and software
for their GPUs.

~~~
privateSFacct
CUDA is not crap. At the time this paper was written it had been out for a
number of years, had good multi-platform support, and had good support for
nvidia GPUs.

If they had used OpenCL they may have been using an immature product that
might really have been Apple-only at the time (or Apple and IBM?). I don't
know the history, but 8-9 years ago (when the research work may have started)
NVIDIA was pushing the CUDA idea (almost entirely alone - AMD still playing
catchup).

~~~
wyldfire
OCL was pretty mature then. CUDA is not crap but it is lock-in and IMO its
best features were captured by OCL.

------
UK-AL
If anyone is slightly interested in a different approach:
[https://github.com/vqd8a/iNFAnt2](https://github.com/vqd8a/iNFAnt2). I think
it's designed around executing incredibly large regex sets.

------
peterburkimsher
Previous discussion:
[https://news.ycombinator.com/item?id=5803943](https://news.ycombinator.com/item?id=5803943)

------
yzh
I remember several years ago I saw this IOCCC winning entry on regex and
thought "maybe we should write a parallel version." And it's good to see
someone has implemented it now!
[https://www.ioccc.org/2012/hou/hint.html](https://www.ioccc.org/2012/hou/hint.html)

------
nixpulvis
Github link on this page is broken, otherwise this is looking awesome. I'm
excited to try it out.

~~~
figitaki
The correct link is
[https://github.com/bkase/CUDA-grep](https://github.com/bkase/CUDA-grep). I
think he's working on fixing the broken link.

------
ngcc_hk
None of the links work. Is there any source or log on, say, GitHub?

