6

I have an Nvidia graphics card, and am using the proprietary drivers.

I looked into the ffmpeg H265 encoders available to me, and found hevc_nvenc. Using hevc_nvenc does in fact use the GPU to encode the video, which massively decreases encode time, but the output file size is considerably larger.

For example (input.mp4 is H264 video with aac audio):

ffmpeg -hwaccel cuda -i input.mp4 -c:v libx265 -c:a libopus -crf 26 libx265_output.mkv

ffmpeg -hwaccel cuda -i input.mp4 -c:v hevc_nvenc -c:a libopus -crf 26 hevc_nvenc_output.mkv

Filesizes are:

input.mp4             351M
libx265_output.mkv    134M
hevc_nvenc_output.mkv 360M
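
To put those numbers in proportion, here is a small sketch that expresses each output as a percentage of the input size. The sizes are copied from the listing above; `pct` is just an illustrative helper name, and the result is rounded down by integer division.

```shell
# Sizes in MB, copied from the listing above.
input=351

# Hypothetical helper: integer percentage of the input size.
pct() { echo $(( $1 * 100 / input )); }

echo "libx265_output.mkv:    $(pct 134)% of input"
echo "hevc_nvenc_output.mkv: $(pct 360)% of input"
```

So the libx265 output is well under half the size of the input, while the hevc_nvenc output is slightly larger than the input.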

ffprobe shows both outputs as hevc encoded, and input as h264.

So why does hevc_nvenc seem to perform so poorly? There must be something I'm missing.

1
  • 1
    you mean "massively decreases encode time" ? Commented Jun 18, 2023 at 4:13

2 Answers

7

Update

The hardware-accelerated encoders do not support Constant Rate Factor (CRF, -crf) for setting the size/quality trade-off. You can check what the encoder supports, e.g. with ffmpeg -h encoder=hevc_nvenc -hide_banner. Dennis Mungai's detailed answer to 'How can I use CRF encoding with nvenc in ffmpeg?' at Superuser suggests using the -cq:v 19 and -rc:v vbr parameters instead to get constant quality with variable bitrate. That is probably worth trying in your case.
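
As a rough sketch of what that could look like for the command in the question: the helper below only prints the ffmpeg command (a dry run) so the flags can be reviewed before running anything. The function name `nvenc_cq_cmd` and the output file name are made up for illustration, and 19 is simply the value from the linked answer, not a recommendation tuned to this input. Check the flags against the ffmpeg -h encoder=hevc_nvenc output of your own build.

```shell
# Hypothetical dry-run helper: prints an ffmpeg command line instead of running it.
# Arguments: input file, output file, optional -cq value (defaults to 19,
# the value suggested in the linked Superuser answer).
# Note: file names are printed unquoted, so quote them yourself if they contain spaces.
nvenc_cq_cmd() {
  input=$1
  output=$2
  cq=${3:-19}
  # -crf from the original command is replaced by -rc:v vbr plus -cq:v.
  printf 'ffmpeg -hwaccel cuda -i %s -c:v hevc_nvenc -rc:v vbr -cq:v %s -c:a libopus %s\n' \
    "$input" "$cq" "$output"
}

nvenc_cq_cmd input.mp4 hevc_nvenc_cq_output.mkv
```

A higher -cq value should give a smaller file at lower quality, so it may take a few trial encodes to land near the libx265 result.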

It is worth noting that this is still different from CRF, because constant quality, as determined by -cq, sets a constant quantization parameter (CQP) instead. Here's a quote from the CRF Guide by Werner Robitza explaining why CRF still saves bits compared to setting a constant QP:

… The quantization parameter defines how much information to discard from a given block of pixels (a Macroblock). This typically leads to a hugely varying bitrate over the entire sequence.

Constant Rate Factor is a little more sophisticated than that. It will compress different frames by different amounts, thus varying the QP as necessary to maintain a certain level of perceived quality. It does this by taking motion into account. …


Original answer

The ffmpeg wiki asserts that this is typical for hardware-accelerated encoding:

Hardware encoders typically generate output of significantly lower quality than good software encoders like x264, but are generally faster and do not use much CPU resource. (That is, they require a higher bitrate to make output with the same perceptual quality, or they make output with a lower perceptual quality at the same bitrate.)

Peter Cordes suggested the following explanation as a part of his answer at Video Production Stack Exchange to 'Why processor is "better" for encoding than GPU?':

My understanding is that the search space for video encoding is SO big that smart heuristics for early-termination of search paths on CPUs beat the brute-force GPUs bring to the table, at least for high quality encoding. It's only compared to -preset ultrafast where you might reasonably choose HW encoding over x264, esp. if you have a slow CPU (like laptop with dual core and no hyperthreading). On a fast CPU (i7 quad core with hyperthreading), x264 superfast is probably going to be as fast, and look better (at the same bitrate).

If you're making an encode where rate-distortion (quality per file size) matters at all, you should use x264 -preset medium or slower. If you're archiving something, spending a bit more CPU time now will save bytes for as long as you're keeping that file around.

1
  • 4
    In a nutshell: Hardware accelerated implementations of compressors are designed for speed. Some features cannot be implemented due to architectural limitations. For this reason, when targeting the same quality, hardware compression will likely result in larger file sizes. Commented Nov 13, 2021 at 14:29
0

ffmpeg -h encoder=hevc_nvenc

gives the actual list of presets your build supports:

  -preset            <int>        E..V....... Set the encoding preset (from 0 to 18) (default p4)
     default         0            E..V.......
     slow            1            E..V....... hq 2 passes
     medium          2            E..V....... hq 1 pass
     fast            3            E..V....... hp 1 pass
     hp              4            E..V.......
     hq              5            E..V.......
     bd              6            E..V.......
     ll              7            E..V....... low latency
     llhq            8            E..V....... low latency hq
     llhp            9            E..V....... low latency hp
     lossless        10           E..V....... lossless
     losslesshp      11           E..V....... lossless hp
     p1              12           E..V....... fastest (lowest quality)
     p2              13           E..V....... faster (lower quality)
     p3              14           E..V....... fast (low quality)
     p4              15           E..V....... medium (default)
     p5              16           E..V....... slow (good quality)
     p6              17           E..V....... slower (better quality)
     p7              18           E..V....... slowest (best quality)
  -tune              <int>        E..V....... Set the encoding tuning info (from 1 to 4) (default hq)
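
For illustration, a preset from that list can be passed with -preset. The sketch below again only prints the command; `nvenc_preset_cmd` and the file names are hypothetical, and p7 is chosen only because the listing above describes it as slowest (best quality). The p1 to p7 names may not exist on older builds, so use whatever your own listing shows.

```shell
# Hypothetical dry-run helper: prints an ffmpeg command using a given NVENC preset.
# Argument: preset name taken from the listing above (defaults to p7).
nvenc_preset_cmd() {
  preset=${1:-p7}
  # -tune hq is the default according to the listing; it is spelled out for clarity.
  printf 'ffmpeg -i input.mp4 -c:v hevc_nvenc -preset %s -tune hq -c:a libopus output.mkv\n' \
    "$preset"
}

nvenc_preset_cmd p7
```

A slower preset affects how hard the encoder works, but it does not set a quality target on its own, so it would normally be combined with rate-control options.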
1
  • 2
    This is probably better as an edit to the other answer as it does not answer the question. Commented Dec 31, 2022 at 18:46
