CUDA Driver API vs. CUDA runtime
- by Morten Christiansen
When writing CUDA applications, you can either work at the driver level or at the runtime level as illustrated on this image (The libraries are CUFFT and CUBLAS for advanced math):
I assume the tradeoff between the two are increased performance for the low-evel API but at the cost of increased complexity of code. What are the concrete differences and are there any significant things which you cannot do with the high-level API?
I am using CUDA.net for interop with C# and it is built as a copy of the driver API. This encourages writing a lot of rather complex code in C# while the C++ equivalent would be more simple using the runtime API. Is there anything to win by doing it this way? The one benefit I can see is that it is easier to integrate intelligent error handling with the rest of the C# code.