I have been told that with GPU APIs like Vulkan and DirectX, when the host is, for example, little-endian and the GPU is big-endian, you can read a 32-bit integer and the driver will make sure that, if necessary, the bytes are swapped.

  1. Is this the case on most or all APIs? CUDA? WebGPU etc.?

Secondly, I have been told that it's safe to read integers like this as long as you are reading whole integers (because the driver does the necessary byte-swapping), but that when you are packing bits, the assurance of a correct bit pattern breaks down. I don't understand why. If the driver ensures that the bytes are swapped when necessary, then when I pack 0xFF on the host and read it back on the GPU, with a little-endian host and a big-endian GPU, the swap should mean that 0xFF on the GPU still gives me the low bits.

  1. Is this correct?
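For what it's worth, the reasoning in the second question is internally consistent: if a driver really did byte-swap every 32-bit word, any bit pattern packed into that word would survive, because reversing the four bytes is exactly undone by the big-endian read. A small Python sketch of that claim (the packed values are made up for illustration):

```python
import struct

def swap32(word: bytes) -> bytes:
    # Reverse the byte order of a single 32-bit word.
    return word[::-1]

# 0xFF in the low byte, plus a value with bits packed at both ends of the word.
for x in (0x000000FF, (0x3 << 30) | 0xFF):
    le = struct.pack('<I', x)                           # host writes little-endian
    as_big_endian = struct.unpack('>I', swap32(le))[0]  # device reads big-endian after a swap
    assert as_big_endian == x  # the whole 32-bit bit pattern is preserved
```

So the breakdown people warn about is not in the swap itself; as the answer below notes, the real issue is whether such a swap happens at all.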

1 Answer

Vulkan requires that the device has the same endianness as the host.

The representation and endianness of these types on the host must match the representation and endianness of the same types on every physical device supported.

This is the first of the Fundamentals requirements in §3.1 of the Vulkan specification.

The Direct3D spec (versions 10 through 11.3) is a lot looser, only discussing endianness with regard to texture formats:

Note, that this table only is true about the effective format definitions for little-endian host CPU systems. The D3D10+ specification for formats has diverged from the D3D9 format definitions, as a response to merging the vertex and texture formats and desiring a cross-endianness solution.

https://microsoft.github.io/DirectX-Specs/d3d/archive/D3D11_3_FunctionalSpec.htm#25.1.1%20Mapping%20of%20Legacy%20Formats

The Direct3D 12 spec mentions endianness even less; its only mention concerns planar depth formats.

https://microsoft.github.io/DirectX-Specs/d3d/PlanarDepthStencilDDISpec.html#depth-plane-1

I couldn't find any mention of endianness in the WDDM documentation (aside from a random mention of how to declare an integer for checking the magic value of an ACPI table).

Overall, you can just assume that the endianness will match the host. You can probably also assume that it will be little endian, since basically nothing in the consumer space uses big endian (I could only find references for some IBM mainframes and spacecraft processors using big endian).
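If you want to verify rather than assume, the host's byte order can be checked at runtime; here is a quick Python sketch (the equivalent check is a one-liner in most languages):

```python
import struct
import sys

# Determine host endianness at runtime, two equivalent ways.
is_little = struct.pack('=I', 1)[0] == 1   # native order: does the low byte come first?
assert is_little == (sys.byteorder == 'little')
print(sys.byteorder)
```

On x86 and on every mainstream ARM operating system this reports 'little', which is why the assumption above is safe in practice.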


As for whether a driver could even feasibly flip the endianness: no, probably not. Modern graphics drivers generally allow directly mapping memory on the device. Even if the driver is performing DMA (or even PIO), it would still have to know the data format of the buffer, which generally isn't known until the buffer is bound to the rendering pipeline with an input layout (and that's to say nothing of storage buffers). The GPU could technically implement this at the hardware level, but at that point it would make more sense to just use the same endianness as the host, which basically has to be little-endian for desktop support; any niche environments would have the GPU developed alongside the CPU.
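The "would have to know the data format" point can be made concrete with a small Python sketch (the buffer contents are made up): byte-swapping is only well-defined per element, so the same four bytes swap differently depending on whether they hold one 32-bit value or two 16-bit values.

```python
import struct

# Hypothetical buffer the driver would have to swap: the host packed
# two 16-bit values, but nothing in the raw bytes themselves says so.
buf = struct.pack('<HH', 0x1122, 0x3344)   # b'\x22\x11\x44\x33'

swapped_as_u32 = buf[::-1]                 # swap as one 32-bit word
swapped_as_u16 = b''.join(buf[i:i + 2][::-1] for i in range(0, len(buf), 2))

# The two interpretations disagree, so a blind swap would corrupt one of them.
assert swapped_as_u32 != swapped_as_u16
```

This is why any swapping would have to happen at a point where the element layout is known, which for arbitrary buffers it generally is not.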

2 Comments

So things are pretty much forced to be little-endian for anything to work? I mean, Vulkan won't work on big-endian GPUs because it basically runs on little-endian hosts, and the same for DirectX?
Unless you're on an IBM mainframe, or a spacecraft, you are pretty much only going to have little-endian systems. Vulkan works fine on little-endian GPUs, so long as the host is little-endian. Same goes for D3D, and since every Windows desktop is little-endian, every GPU with Windows drivers is also likely little-endian (assuming it supports a programmable pipeline).
