
I'm working on a media processing SDK with a substantial C++ codebase that currently uses FFmpeg for video decoding on native platforms (Windows/Linux). I need to port this to browsers while preserving both the existing C++ architecture and performance characteristics. The WASM approach is critical for us because it allows leveraging our existing optimized C++ media processing pipeline without a complete JavaScript rewrite, while maintaining the performance benefits of compiled native code.

The Challenge: WebAssembly runs in a browser sandbox that typically doesn't allow direct GPU access, which conflicts with our hardware-accelerated video decoding requirements. Pure JavaScript solutions would require abandoning our mature C++ codebase and likely result in significant performance degradation.

My Questions:

  1. Is it technically feasible to compile FFmpeg with hardware acceleration support (NVDEC/NVENC, VAAPI, VideoToolbox) for WASM targets? Additionally, can the underlying hardware acceleration dependencies (like the CUDA runtime, Intel Media SDK, or platform-specific GPU drivers) be compiled as WASM modules, and would this approach enable hardware acceleration in the browser environment?

  2. Are there any emerging browser APIs or proposals (like WebGPU, WebCodecs API) that could provide a pathway for hardware-accelerated video decoding in WASM modules while preserving our C++ architecture?

  3. Has anyone successfully implemented hardware-accelerated video decoding in a browser environment using WASM, or are there alternative approaches that would allow us to maintain our existing C++ codebase and performance requirements?

Context:

  • Extensive C++ media processing pipeline with FFmpeg 7.1.0
  • Target streams: H.264 and HEVC
  • Performance requirements make software-only decoding insufficient
  • Rewriting the entire codebase in JavaScript is not feasible due to complexity and performance constraints

Any insights, experiences, or alternative architectural suggestions that preserve our C++ investment would be greatly appreciated!

I attempted to compile FFmpeg for WebAssembly using Emscripten with hardware acceleration enabled. My approach was to call the FFmpeg configure script from a Linux bash environment using:

emconfigure ./configure \
  --target-os=none \
  --arch=x86_32 \
  --enable-cross-compile \
  --disable-debug \
  --disable-x86asm \
  --disable-inline-asm \
  --disable-stripping \
  --disable-programs \
  --disable-doc \
  --disable-all \
  --enable-avcodec \
  --enable-gray \
  --enable-avformat \
  --enable-avfilter \
  --enable-avdevice \
  --enable-avutil \
  --enable-swresample \
  --enable-swscale \
  --enable-filters \
  --enable-protocol=file \
  --enable-decoder=h264 \
  --enable-vaapi \
  --enable-hwaccel=h264_vaapi \
  --enable-gpl \
  --enable-pthreads \
  --extra-cflags="-O3 -I$ffmpegIncludesDir -I$libvaIncludesDir" \
  --extra-cxxflags="-O3 -I$ffmpegIncludesDir -I$libvaIncludesDir" \
  --extra-ldflags="--initial-memory=33554432 --no-entry --relocatable -L$ffmpegLibrariesDir -L$libvaLibrariesDir" \
  --nm="emnm -g" \
  --ar=emar \
  --as="$EMSDK/upstream/bin/wasm-as" \
  --ranlib=emranlib \
  --cc=emcc \
  --cxx=em++ \
  --objcc=emcc \
  --dep-cc=emcc

Since I specified x86_32 architecture, I provided the i386 version of the libva (VAAPI) library to match the target architecture. However, the configuration failed with the error "unknown file type: /usr/lib/i386-linux-gnu/libva.so" in the resulting config.log file.

This error suggests that Emscripten's toolchain cannot process native Linux shared libraries (.so files), which makes sense since these are compiled for native execution rather than WebAssembly. The configuration specifically targets VAAPI hardware acceleration for H.264 decoding, but this approach appears fundamentally flawed since VAAPI requires direct hardware access that isn't available in the browser sandbox, and the native libraries cannot be linked into a WASM module.

This experience has led me to question whether the hardware acceleration dependencies can be meaningfully compiled for WASM, or if alternative approaches are needed.

  • Doesn't sound trivial, maybe chrome extension could call out to native exe? or webcodecs api? stackoverflow.com/questions/17025600/… Commented Jul 7 at 13:44
  • Please remember that WebAssembly is just precompiled code that runs alongside JavaScript in the browser. WebGPU (developer.mozilla.org/en-US/docs/Web/API/WebGPU_API) seems like a possible path, but I doubt it would integrate out of the box with accelerated code from FFmpeg, which uses different APIs (like CUDA) for delegating work to the GPU. Commented Jul 20 at 6:52
  • @MaciejAleksandrowicz I personally don't understand the OP's problem here. The post makes a lot of claims about some unknown (what does it do?) mature, substantial and extensive "C++ codebase", but says nothing about a real JavaScript problem. Nine paragraphs later, and not a single "Media processing here means extracting frame thumbnails, so I made a function and the problem is...". It's like they've never actually used the WASM API, or the MediaSource API, or the WebCodecs API... PS: First I thought it was a bot but I think the OP used AI (i.e. is not an English speaker?) Commented Jul 20 at 16:50
  • @MaciejAleksandrowicz, thank you for your suggestion about WebGPU. As far as I understand from WebGPU docs, I can set up a compute shader that evaluates a function directly on the GPU, but I cannot access the embedded h.264/h.265 decoder. So I fear that, currently, WebGPU is not a viable option. Commented Aug 6 at 8:26
  • @VC.One, yes it's the first time I deal with these technologies and I'm sorry if I did not explain myself clearly. Currently, as I wrote in my original post, I'm working on a media processing SDK with a substantial C++ codebase. The company I work for has a relatively new web app and the aim is to integrate the SDK in it so that the existing mature and high-performance c++ codebase can be re-used. The WASM technology looks like the right choice to take advantage of the c++ SDK inside a web browser, but if any other techs I'm not aware of are available, I'm open to other ways to reach the goal Commented Aug 6 at 8:58

1 Answer


Pure JavaScript solutions would require abandoning our mature C++ codebase and likely result in significant performance degradation.

Rewriting the entire codebase in JavaScript is not feasible due to complexity and performance constraints

You don't have to rewrite the whole codebase. Your question says you are decoding H.264 streams, which tells us you are processing .h264 files (or receiving data in that structure).

You only have to write JavaScript code that extracts the byte range of each frame: in Annex B .h264 data, every NAL unit is preceded by a start code ending in the 3 bytes [00] [00] [01].

Use JavaScript or C++ (via WASM) to search the file or byte array, extract a frame, then hand those bytes (inside an array) to WebCodecs for the fastest possible in-browser decoding.

The result is then given back to C++ for the other functions that handle the next "media processing" part.
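The start-code scan described above can be sketched as follows. This is a minimal, hypothetical helper (not the SDK's code) that splits an Annex B byte stream into NAL unit payloads; it handles both 3-byte (00 00 01) and 4-byte (00 00 00 01) start codes:

```javascript
// Split an Annex B H.264 byte stream into NAL unit payloads by scanning
// for start codes. A 4-byte start code is a 3-byte one preceded by 0x00,
// so the leading zero is assigned to the boundary, not the payload.
function splitAnnexB(bytes) {
  const units = [];
  let start = -1; // index where the current NAL payload begins
  for (let i = 0; i + 2 < bytes.length; i++) {
    if (bytes[i] === 0 && bytes[i + 1] === 0 && bytes[i + 2] === 1) {
      // End the previous unit just before the start code (and before a
      // possible extra leading zero of a 4-byte start code).
      const boundary = i > 0 && bytes[i - 1] === 0 ? i - 1 : i;
      if (start !== -1) units.push(bytes.subarray(start, boundary));
      start = i + 3; // payload begins right after the 3-byte start code
      i += 2;        // skip past the start code
    }
  }
  if (start !== -1) units.push(bytes.subarray(start)); // final unit
  return units;
}
```

Each returned unit can then be wrapped (together with any parameter sets) and fed to a WebCodecs decoder, or handed back to C++ through WASM memory.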

PS:
Are you aware that browsers can play H.264 (and, in some browsers, H.265)? If your "decoding H264" actually means "playback of H264", then maybe forget decoding and just mux into MP4, then display that MP4 through an HTML5 <video> tag.

Muxing is fast (almost instant for small files under 20MB) so you can display something in no time.

Are there alternative approaches that would allow us to maintain our existing C++ codebase and performance requirements?

WebCodecs uses hardware-accelerated decoding where available. The browser's own code talks to the GPU and decodes a frame as fast as possible (if the user's device has a GPU that allows it).

When a user provides H.264/HEVC file bytes, instead of sending them to your C++ decoding function, pass them to a JavaScript function that decodes them using WebCodecs.
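A hedged sketch of that JavaScript decoding path is below. `VideoDecoder` and `EncodedVideoChunk` exist only in browsers that implement WebCodecs, so the decoder is created inside a function rather than invoked here; the codec string and the `onFrame` callback name are illustrative assumptions:

```javascript
// Build the descriptor for one encoded access unit. Per WebCodecs,
// `type` must be "key" for IDR frames and "delta" otherwise, and
// `timestamp` is expressed in microseconds.
function makeChunkInit(bytes, isKeyFrame, timestampUs) {
  return {
    type: isKeyFrame ? "key" : "delta",
    timestamp: timestampUs,
    data: bytes,
  };
}

// Create and configure an H.264 decoder (browser-only; not invoked here).
function createH264Decoder(onFrame) {
  const decoder = new VideoDecoder({
    output: onFrame,                 // called with decoded VideoFrame objects
    error: (e) => console.error(e),
  });
  decoder.configure({
    // "avc1.42E01E" = example Baseline-profile codec string. Omitting the
    // `description` field signals Annex B input (in-band SPS/PPS).
    codec: "avc1.42E01E",
    // Optionally ask for a GPU decoder:
    // hardwareAcceleration: "prefer-hardware",
  });
  return decoder;
}

// Usage in a browser (sketch):
//   const dec = createH264Decoder(frame => { /* copy pixels, then frame.close() */ });
//   dec.decode(new EncodedVideoChunk(makeChunkInit(accessUnitBytes, true, 0)));
```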

A decode process returns image data. You did not say whether your C++ decoding gives back YUV or RGB data; I'm not sure what result you expect, but JavaScript will give you RGB. Use it in the next C++ function (usually this "next" function handles the output returned by your FFmpeg decoding).

WASM allows bytes to be passed between C++ functions and JavaScript functions: both read and write the same linear memory. So JS can decode and write the output bytes into WASM memory, and your C++ function then reads WASM memory at the same offset and length to get the decoded bytes from JS.
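That hand-off can be sketched like this. It assumes the Emscripten build exports `malloc`/`free` and a hypothetical C++ entry point `process_decoded_frame(ptr, len)` (declared `extern "C"` with `EMSCRIPTEN_KEEPALIVE` on the C++ side); `Module.HEAPU8` is Emscripten's byte view of the WASM heap:

```javascript
// Copy decoded pixel data from JavaScript into the WASM heap and let the
// C++ side process it. `Module` is the Emscripten module object; the
// `_process_decoded_frame` export is a hypothetical SDK function.
function sendFrameToWasm(Module, rgbaBytes) {
  const ptr = Module._malloc(rgbaBytes.length);          // reserve heap space
  Module.HEAPU8.set(rgbaBytes, ptr);                     // JS -> WASM memory copy
  Module._process_decoded_frame(ptr, rgbaBytes.length);  // C++ reads the same bytes
  Module._free(ptr);                                     // release the buffer
}
```

The C++ side simply receives a pointer into its own address space, so the existing post-processing functions can consume the frame without knowing it was decoded in JavaScript.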


1 Comment

Many thanks for your response, I'll give you further details about my needs. The H.264/H.265 video is encapsulated inside a proprietary streaming protocol and we use the SDK to extract the frames. Once extracted, the frames are decoded using FFmpeg and then some post-processing is applied to the decoded images, so playing back using the HTML "<video>" tag is not an option. Using WebCodecs, as far as I understand, means passing the H.264/H.265 video frame to the JavaScript side for HW decoding and then back to the C++ environment for the post-processing stage. Is that correct?
