
We do a lot of work on video encoding. We have had a K80, Titan X (Maxwell), Titan X (Pascal), 1080, 1080 Ti, and others (including render farms based on GTX 980s).

General thoughts: Don't expect to get _any_ information out of Nvidia unless you are running everything on their hardware compatibility lists (i.e. server-case). Do not mix & match consumer-rated gear with 'professional' gear. (i.e. If you put the K80 in a system with a GTX1080, then the Nvidia drivers restrict the number of available processing cores to 2 per device)

Air-flow: The Teslas run HOT even with a blower attached, and/or installed in the recommended case.

NVENC: the Pascal-based cards' performance is far faster AND better than that of the Kepler-based cards.

For anybody else doing Video encoding work: Grab an Nvidia TK1/jetson dev-kit. This little card is a MONSTER and can handle everything we throw at it without breaking a sweat.




> Do not mix & match consumer-rated gear with 'professional' gear. (i.e. If you put the K80 in a system with a GTX1080, then the Nvidia drivers restrict the number of available processing cores to 2 per device)

Huh? Not sure what exactly you mean by "number of processing cores".

I use two development boxes on a regular basis with Teslas side-by-side with GeForce cards and they all work just fine.


The NVENC SDK limits the number of separate H.264 video streams you can encode simultaneously to 2 if you have _any_ GeForce hardware in your system.


I was unintentionally vague. I should have said 'output'. At the bottom of this post I have copy/pasted output from my original tests.

That was not CUDA; the task I was working on specifically (and only) used the NVENC encoder (via ffmpeg). I don't know if the situation has changed, but these were my observations.

All of my tests were done in 2015, so the situation might be different now.

The K80 could output up to 4 "streams" (aka outputs or threads) at once. A 780 Ti can only do 2. According to nvidia-smi the K80 "appears" to be 2 GPUs on one card. You can actually designate which GPU you want to process ffmpeg streams on.
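To see which GPU indices are available before picking one for ffmpeg, something like the following might help. It is only a sketch: it assumes an nvidia-smi build that supports `--query-gpu` with `--format=csv,noheader`, and the helper names `parse_gpu_list` and `list_gpus` are made up for illustration. The sample string is illustrative, not captured output.

```python
import subprocess


def parse_gpu_list(csv_text):
    """Parse 'index, name' lines into a list of (index, name) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Split only on the first comma so names containing commas survive.
        index, name = line.split(",", 1)
        gpus.append((int(index), name.strip()))
    return gpus


def list_gpus():
    """Ask nvidia-smi which GPUs it can see (needs the NVIDIA driver installed)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_gpu_list(result.stdout)


# Illustrative text shaped like a K80 showing up as two GPUs.
sample = "0, Tesla K80\n1, Tesla K80\n"
print(parse_gpu_list(sample))
```

Each index returned would be a candidate value for the `-gpu` option used in the ffmpeg commands in this post.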

As soon as you had both devices installed in the same PC, the Nvidia drivers limited the K80 so that it too would only output up to 2 streams per GPU.

IIRC, there was even a status message that got displayed when installing the Nvidia binary blob:

(paraphrasing from memory from 3 years ago)

Warning consumer card detected. Limiting available GPU's

Here is a copy/paste dump of my findings at that time. (The formatting is screwy with the nvidia-smi output.)

=====================================================

Four threads running this:

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 0 -b:v 21700k -b:a 128k -y delete_me<#>.mp4

gives us ~5-6fps

and uses 3105MiB / 11519MiB of GPU RAM (755MiB for each thread)

------------------------------------------------------------

One thread running this:

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 0 -b:v 21700k -b:a 128k -y delete_me1.mp4

gives us ~16-18fps

and uses 755MiB of GPU RAM
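A rough way to compare the two runs above is total throughput. The sketch below assumes the quoted fps figures are per thread, which the post does not state, so treat the result as a guess. `aggregate_fps` is a hypothetical helper.

```python
def aggregate_fps(per_stream_fps, streams):
    """Scale a (low, high) per-stream fps range by the number of streams.

    Assumes every stream runs at the same rate, which is an assumption,
    not something measured in the post.
    """
    low, high = per_stream_fps
    return (low * streams, high * streams)


single = aggregate_fps((16, 18), 1)           # one thread on GPU 0
four_on_one_gpu = aggregate_fps((5, 6), 4)    # four threads on GPU 0
print(single, four_on_one_gpu)  # (16, 18) (20, 24)
```

Under that assumption, four threads on one GPU would give somewhat more total throughput than one thread, at a much lower rate per stream.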

------------------------------------------------------------

Four threads spread out using both 'GPUs':

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 0 -b:v 21700k -b:a 128k -y delete_me1.mp4

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 0 -b:v 21700k -b:a 128k -y delete_me2.mp4

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 1 -b:v 21700k -b:a 128k -y delete_me3.mp4

ffmpeg -i 1784457.mp4 -c:v nvenc -c:a aac -strict experimental -gpu 1 -b:v 21700k -b:a 128k -y delete_me4.mp4

gives us ~11fps
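The four commands above differ only in the `-gpu` index and the output name, so they could be generated rather than typed out. This is a minimal sketch that reuses the exact flags from the commands above; `build_command`, `plan_jobs` and `run_jobs` are hypothetical names, and newer ffmpeg builds may use different encoder names or options.

```python
import subprocess


def build_command(src, gpu, out_path):
    """Build one ffmpeg NVENC command with the same flags used in the post."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "nvenc", "-c:a", "aac", "-strict", "experimental",
        "-gpu", str(gpu),
        "-b:v", "21700k", "-b:a", "128k",
        "-y", out_path,
    ]


def plan_jobs(src, gpus, streams_per_gpu):
    """Assign streams_per_gpu jobs to each GPU index, numbering outputs from 1."""
    jobs = []
    n = 1
    for gpu in gpus:
        for _ in range(streams_per_gpu):
            jobs.append(build_command(src, gpu, f"delete_me{n}.mp4"))
            n += 1
    return jobs


def run_jobs(jobs):
    """Start every job at once and wait for all of them; returns exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in jobs]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    # Print the plan only; call run_jobs(...) to actually launch ffmpeg.
    for cmd in plan_jobs("1784457.mp4", gpus=[0, 1], streams_per_gpu=2):
        print(" ".join(cmd))
```

Keeping the planning step separate from `run_jobs` makes it easy to check the GPU assignment before anything is launched.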

nvidia-smi results:

Fri May 1 12:38:21 2015
+------------------------------------------------------+
| NVIDIA-SMI 346.46     Driver Version: 346.46         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 0000:03:00.0     Off |                    0 |
| N/A   69C    P0    67W / 149W |   1581MiB / 11519MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 0000:04:00.0     Off |                    0 |
| N/A   58C    P0    78W / 149W |   1581MiB / 11519MiB |      6%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      2472    C   ffmpeg                                         755MiB |
|    0      2477    C   ffmpeg                                         755MiB |
|    1      2480    C   ffmpeg                                         755MiB |
|    1      2483    C   ffmpeg                                         755MiB |
+-----------------------------------------------------------------------------+
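The per-process figures do not quite add up to the totals reported, so here is the arithmetic spelled out. This is only a sanity check on the numbers quoted in this post. It does not say where the remaining memory goes, and `unaccounted_mib` is a hypothetical helper.

```python
def unaccounted_mib(reported_total, per_process, processes):
    """Memory reported for the GPU minus the sum of the per-process figures."""
    return reported_total - per_process * processes


# Four threads on one GPU: 3105 MiB reported, 755 MiB per ffmpeg process.
print(unaccounted_mib(3105, 755, 4))  # 85

# Two threads per GPU: 1581 MiB reported on each GPU, 755 MiB per process.
print(unaccounted_mib(1581, 755, 2))  # 71
```

Either way, the encode jobs use only a small fraction of the 11519 MiB available on each GPU.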


For clarification - is that a Jetson TK1 or TX1?


Or TX2 which is Pascal-based and came out last month?


I have seen that the TX2 was recently released. I let our R&D department know about it. (I don't know if they ordered it or not)


Nvidia Jetson TX1



