Update 2 - cuda::std::bit_cast is here!
The newest version of libcu++ has implemented and backported C++20 bit_cast. It is available in all standard modes and is constexpr given compiler support. It ships with CUDA Toolkit >= 12.8 and is also available from the CCCL repo.
Update 1 - Why you might not want to use --expt-relaxed-constexpr
My view of --expt-relaxed-constexpr has changed after finding some odd behavior, similar to what is described in this issue, in an Nvidia project. That is, Nvidia knows about these problems, which might be the reason the flag is deemed experimental.
While I don't think that using std::bit_cast in particular in device code is problematic, compiling with this flag could cause accidental usage of other constexpr functions that are less basic and less safe. Also note that the flag does not only allow calling constexpr host functions at compile time, as I previously thought, but also at runtime (i.e. with non-constexpr input), which is the cause of these issues. This was probably fine at the time of the flag's introduction, as constexpr functions were very restricted back then, but with newer C++ standards more and more functionality has become available in constexpr functions that is not available in device code and seems to be silently ignored, which is dangerous. With CUDA 12.8, Nvidia has added information regarding this issue to the documentation.
Initial answer - You could use --expt-relaxed-constexpr
Given a host compiler that supports it, you can use std::bit_cast in CUDA C++20 device code (i.e. CUDA >= 12) to initialize a constexpr variable. You just need to tell nvcc to allow it by passing --expt-relaxed-constexpr.
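For illustration, a minimal sketch of what the flag makes possible (file name and kernel are hypothetical; assumes a C++20-capable host compiler):

```cuda
// Compile with: nvcc -std=c++20 --expt-relaxed-constexpr bitcast.cu
#include <bit>
#include <cstdint>

__global__ void kernel(std::uint32_t* out)
{
    // std::bit_cast is a constexpr host function; with
    // --expt-relaxed-constexpr, nvcc allows calling it from device code,
    // here to initialize a constexpr variable at compile time.
    constexpr std::uint32_t bits = std::bit_cast<std::uint32_t>(1.0f);
    *out = bits; // 0x3f800000, the IEEE-754 representation of 1.0f
}
```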
This flag is labeled as an "Experimental flag", but to me that reads more like "this flag might be removed/renamed in a future release" than "here be dragons" in terms of its results. It is also already quite old, which gives me some confidence: see the CUDA 8.0 nvcc docs from 2016 (docs for even older versions are not available online as HTML, so I didn't check further back).
As constexpr code is evaluated by the compiler on the host, independently of the surrounding device context, I would not expect this flag to be some brittle "black magic": it just needs to hand the evaluation off to the host compiler and use the resulting value/object.
Given all this context, I would rather expect the --expt-relaxed-constexpr behavior to become the default in some future CUDA version than to see it vanish without a replacement.
If you don't need constexpr
For anyone who needs a non-constexpr version of bit_cast, see Safe equivalent of std::bit_cast in C++11 (just add __device__).
reinterpret_cast seems to work for type-punning in CUDA (not sure if the compiler just doesn't apply the strict aliasing rule in the absence of __restrict__, or if I simply haven't seen a case where it can fail), but the correct/safe C++ way is using memcpy. In my experience the compiler is able to optimize away the actual memcpy, so one doesn't have to worry about performance.