We do a lot of work on video encoding. We have had a K80, Titan X(Maxwell), Tita...

slizard · on April 7, 2017

> Do not mix & match consumer-rated gear with 'professional' gear. (i.e. If you put the K80 in a system with a GTX1080, then the Nvidia drivers restrict the number of available processing cores to 2 per device)

Huh? Not sure what exactly you mean by "number of processing cores".

I use two development boxes on a regular basis with Teslas side-by-side with GeForce cards and they all work just fine.

jamesfmilne · on April 7, 2017

The NVENC SDK limits the number of separate H264 video streams you can encode simultaneously to 2 if you have _any_ GeForce hardware in your system.

VA3FXP · on April 7, 2017

I was unintentionally vague. I should have said 'output'. At the bottom of this post I have copy/pasted output from my original tests.

That was not CUDA; the task I was working on specifically (and only) used the NVENC encoder (via ffmpeg). I don't know if the situation has changed, but these were my observations.

All of my tests were done in 2015, so the situation might be different now.

The K80 could output up to 4 "streams" (aka outputs or threads) at once. A 780Ti can only do 2. According to nvidia-smi, the K80 "appears" to be 2 GPUs on one card. You can actually designate which GPU you want to process ffmpeg streams on.

As soon as you had both devices installed in the same PC, the Nvidia drivers limited the K80 so that it too would only output up to 2 streams per GPU.

IIRC, there was even a status message that got displayed when installing the Nvidia binary blob:

paraphrasing from memory from 3 years ago

Warning consumer card detected. Limiting available GPU's

Here is a copy/paste dump of my findings at that time. (The formatting is screwy with the nvidia-smi output.)

=====================================================

Four threads running this:

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 0 -b:v 21700k -b:a 128k -y delete_me<#>.mp4

gives us ~5-6fps

and uses 3105MiB / 11519MiB of GPU RAM (755MiB for each thread)

------------------------------------------------------------

One thread running this:

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 0 -b:v 21700k -b:a 128k -y delete_me1.mp4

gives us ~16-18fps

and uses 755MiB of GPU RAM
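To make the comparison between the two runs above explicit, here is a small arithmetic sketch. It assumes the fps figures are per stream (the post does not say so outright), and the function names `aggregate_fps` and `mem_overhead_mib` are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
# Aggregate throughput = streams * per-stream fps.
# Assumption: the fps numbers quoted in the post are per stream.
aggregate_fps() {
    # $1 = number of concurrent streams, $2 = per-stream fps
    echo $(( $1 * $2 ))
}

# GPU RAM not accounted for by streams * per-stream usage.
mem_overhead_mib() {
    # $1 = reported total MiB, $2 = streams, $3 = MiB per stream
    echo $(( $1 - $2 * $3 ))
}

echo "4 streams on one GPU: $(aggregate_fps 4 5)-$(aggregate_fps 4 6) fps aggregate"
echo "1 stream on one GPU:  16-18 fps aggregate"
echo "Unaccounted GPU RAM:  $(mem_overhead_mib 3105 4 755) MiB"
```

Under that assumption, four streams on one GPU give roughly 20-24 fps in total versus 16-18 fps for a single stream, and the reported 3105MiB is 85MiB more than 4 x 755MiB.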

------------------------------------------------------------

Four threads spread out using both 'GPUs':

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 0 -b:v 21700k -b:a 128k -y delete_me1.mp4

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 0 -b:v 21700k -b:a 128k -y delete_me2.mp4

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 1 -b:v 21700k -b:a 128k -y delete_me3.mp4

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 1 -b:v 21700k -b:a 128k -y delete_me4.mp4

gives us ~11fps

nvidia-smi results:

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2472    C   ffmpeg                                         755MiB |
|    0      2477    C   ffmpeg                                         755MiB |
|    1      2480    C   ffmpeg                                         755MiB |
|    1      2483    C   ffmpeg                                         755MiB |
+-----------------------------------------------------------------------------+
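The four-stream, two-GPU layout above can be generated rather than typed by hand. The following is a minimal dry-run sketch: it only prints the commands and does not run ffmpeg. The function name `build_cmds` is hypothetical, and the ffmpeg flags are copied verbatim from the commands in this post (a 2015-era build), so newer ffmpeg versions may name the encoder or the GPU option differently.

```shell
#!/bin/sh
# Print one ffmpeg command per stream, assigning streams to GPUs in
# contiguous blocks (e.g. 4 streams on 2 GPUs -> GPU 0, 0, 1, 1).
# Dry run only: nothing is executed.
build_cmds() {
    input=$1      # source file
    streams=$2    # number of concurrent encodes
    gpus=$3       # number of GPUs as listed by nvidia-smi
    i=0
    while [ "$i" -lt "$streams" ]; do
        # Integer division maps stream index i to a GPU index.
        gpu=$(( i * gpus / streams ))
        echo "ffmpeg -i $input -c:v nvenc -c:a aac -strict experimental -gpu $gpu -b:v 21700k -b:a 128k -y delete_me$(( i + 1 )).mp4"
        i=$(( i + 1 ))
    done
}

build_cmds 1784457.mp4 4 2
```

With 4 streams and 2 GPUs this prints the same four commands as listed above. To actually run them concurrently, one option would be to append ` &` to each printed line and finish with `wait`.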

kkielhofner · on April 6, 2017

For clarification - is that a Jetson TK1 or TX1?

Nexxxeh · on April 7, 2017

Or TX2 which is Pascal-based and came out last month?

VA3FXP · on April 7, 2017

I have seen that the TX2 was recently released. I let our R&D department know about it. (I don't know if they ordered it or not)

VA3FXP · on April 7, 2017

Nvidia Jetson TX1