Hacker News

Where is this useful? The bus speed across to the GPU is so slow, I thought it was only meaningful for near-autonomous operations.


Right now, the bandwidth to modern GPUs is actually pretty decent (16GB/s bidirectional), but the latency is still horrid. This means that you need rather large operations for offloading to pay off. I think doing raid-5 or full disk encryption with large blocks might just barely be worth it.
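
To make the latency-versus-bandwidth trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes a simple model where an offload costs a fixed round-trip latency plus transfer and GPU processing time, compared with processing the same bytes on the CPU. Only the 16GB/s bus figure comes from the comment above; the function name and the latency and throughput values in the usage lines are hypothetical placeholders, not measurements.

```python
def break_even_bytes(latency_s, bus_bps, cpu_bps, gpu_bps):
    """Smallest request size (in bytes) at which offloading beats the CPU.

    Assumed model (a simplification, not a measurement):
      cpu_time     = n / cpu_bps
      offload_time = latency_s + n / bus_bps + n / gpu_bps
    All *_bps arguments are throughputs in bytes per second.
    Returns None when offloading never wins under this model.
    """
    per_byte_saving = 1 / cpu_bps - (1 / bus_bps + 1 / gpu_bps)
    if per_byte_saving <= 0:
        # The bus plus GPU is slower per byte than the CPU alone.
        return None
    return latency_s / per_byte_saving


# Hypothetical inputs: 10 microseconds of latency, 16 GB/s bus,
# 1 GB/s on the CPU, 8 GB/s on the GPU.
size = break_even_bytes(10e-6, 16e9, 1e9, 8e9)
print(size)
```

Under this model the break-even size scales linearly with latency, which is why shrinking the latency (for example with shared memory) matters more than adding bandwidth.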

However, with AMD and Intel integrated GPUs, this is about to change. AMD is doing a lot of work on HSA, which can be summarized as "GPU and CPU share the same memory and can communicate by passing pointers". I can see this kind of work being really useful in the near future.


More and more CPUs have AES instructions, and my old Lenovo IdeaPad has a crypto coprocessor. Do you think the GPU offload will be worth it when the system has hardware-accelerated crypto?


The question should be whether dedicated hardware for accelerated crypto will be worth it when GPU offload is suitable for it.

Although in that particular context (security) you might enjoy the isolation of dedicated hardware as opposed to sharing it with others. The GPU solution though of course has the advantage of being able to adapt to new ciphers etc.


Specialized, single-purpose hardware is right now some 10x more energy-efficient for the same task than a GPU (and some 50x more efficient than a CPU). Given that modern chips are not limited by transistor density but by energy density, we're going to see more special-purpose hardware in our chips, not less.


Adapting/implementing the newest and hottest cipher on the block is not something that the crypto community advocates. Do you really think crypto accelerated hardware is going to fall behind and not support the ciphers that the crypto community (academia/industry) endorses?


I've been waiting for what, a decade, since VIA introduced PadLock to get hardware-accelerated encryption in mainstream CPUs. And, just recently, basic support has been introduced, but only for AES and nothing else (and if I'm not mistaken (I probably am), PadLock is vastly superior to the offerings of AMD and Intel :P).

So yes, crypto-accelerated hardware is behind and does not support the ciphers that the crypto community (academia/industry) endorses, and in all likelihood will never bother to catch up, since doing it on the GPU will be good enough. Even if it takes another decade.


What algos are you missing?

The AES-NI instruction set was proposed in 2008, and the first Intel CPUs with it started shipping almost three years ago.[1] Soekris has had the vpnXXXX crypto accelerators for as long as I can remember.[2]

[1] http://ark.intel.com/search/advanced/?s=t&AESTech=true [2] http://soekris.com/


Blowfish or Twofish wouldn't hurt. But I'd be happy with AES; too bad none of my devices have hardware acceleration for it.

The fact that it was proposed in 2008 is quite telling by itself. And when Intel introduced it, it was in their high-end product lines; it would be a challenge to find processors where AES-NI is less needed.

A suitable integrated GPU would penetrate the market much better and ultimately support products which today use the Atom processor. That is the very same product segment where you can barely use encryption today (in contrast to an i7, which saturates a fast SSD in AES encryption throughput without breaking a sweat, and without hardware acceleration).

A similar solution would also most likely allow me to encrypt files on my phone without a large impact, even if the manufacturer couldn't care less about security features.


That depends on which team you're playing for, no?


I am not trying to be difficult, but I have no idea what you are talking about. I looked through your past comments and you seem to be a competent commenter. Can you clarify what you meant?


Whether you are attacking the encryption, or whether you're a user of encryption; a defender.

If you're attacking, having flexibility is advantageous.


I had thought in the past that storing the index of a database (not the data, just index) on the card and using that to handle complex queries, might be interesting. Not sure if that has a practical, real-world use though.


Not so much RAID5, which is just an XOR operation that is as good as free on a modern CPU, but RAID6 where a more computation-intensive Reed-Solomon code is used.
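
For anyone wondering why the RAID5 case is so cheap, here is a minimal sketch of single-parity XOR, assuming equally sized blocks held as Python bytes. The function names are made up for illustration; this is not how any particular md or kernel implementation is written, and it does not attempt the Reed-Solomon math that RAID6 needs.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of ints per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)
# Pretend the middle block was lost and rebuild it from the rest.
recovered = rebuild_missing([data[0], data[2]], parity)
print(recovered == data[1])
```

Both computing the parity and rebuilding a lost block are the same single XOR pass over the data, which is why there is so little work to hand to a GPU here.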


One situation where you might see immediate results is high-volume routing. I'm thinking of this project [1]. They were using the GPU to saturate multiple 10GbE interfaces with a commodity processor.

[1] http://shader.kaist.edu/packetshader/


I actually spoke to them about doing a similar thing, and they said they were using a GPU because they were pushing lots of sub-1500 MTU packets, and that commodity h/w could probably saturate multiple 10GbE interfaces that were just doing large packets for high-throughput.
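
The packet-size point is easy to check with arithmetic. This is a rough sketch assuming standard Ethernet framing, where each frame carries an extra 20 bytes on the wire (8 bytes of preamble and start delimiter plus a 12-byte inter-frame gap). The function name is my own, and the numbers say nothing about what PacketShader itself measured.

```python
def packets_per_second(link_bps, frame_bytes, overhead_bytes=20):
    """Maximum frames per second on a link of link_bps bits per second.

    frame_bytes is the Ethernet frame size including header and FCS.
    overhead_bytes is the assumed per-frame wire overhead:
    8 bytes preamble/SFD + 12 bytes inter-frame gap.
    """
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


small = packets_per_second(10e9, 64)    # minimum-size frames
large = packets_per_second(10e9, 1518)  # frames carrying a 1500-byte payload
print(int(small), int(large))
```

At 10GbE that is roughly 14.88 million minimum-size frames per second versus about 0.81 million full-size ones, so the per-packet work is around 18 times higher at the same line rate when the packets are small.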


README suggests RAID processing, file system encryption, and AES.


Agreed. I'm also struggling to see what kind of massively parallel operations need to be done in kernel space in the first place.


Maybe it doesn't have to saturate the GPU to be a win. If you can just banish some cache-busting, streaming work, like RAID processing, to a tiny sliver of the GPU, it could be a win.



