FRESH

Hacker News

Home

CUDA Ontology

269 points by gugagore

by w-m

4 subcomments

This is a good resource. But for the computer vision and machine learning practitioner most of the fun can start where this article ends.
nvcc from the CUDA toolkit has a compatibility range with the underlying host compilers like gcc. If you install a newer CUDA toolkit on an older machine, likely you'll need to upgrade your compiler toolchain as well, and fix the paths.
While orchestration in many (research) projects happens from Python, some depend on building CUDA extensions. An innocently looking Python project may not ship the compiled kernels and may require a CUDA toolkit to work correctly. Some package management solutions provide the ability to install CUDA toolkits (conda/mamba, pixi), the pure-Python ones do not (pip, uv). This leaves you to match the correct CUDA toolkit to your Python environment for a project. conda specifically provides different channels (default/nvidia/pytorch/conda-forge), from conda 4.6 defaulting to a strict channel priority, meaning "if a name exists in a higher-priority channel, lower ones aren't considered". The default strict priority can make your requirements unsatisfiable, even though there would be a version of each required package in the collection of channels. uv is neat and fast and awesome, but leaves you alone in dealing with the CUDA toolkit.
Also, code that compiles with older CUDA toolkit versions may not compile with newer CUDA toolkit versions. Newer hardware may require a CUDA toolkit version that is newer than what the project maintainer intended. PyTorch ships with a specific CUDA runtime version. If you have additional code in your project that also is using CUDA extensions, you need to match the CUDA runtime version of your installed PyTorch for it to work. Trying to bring up a project from a couple of years ago to run on latest hardware may thus blow up on you on multiple fronts.

by visarga

3 subcomments

Wondering why a $4T company can't afford a smart installation assistant that can auto-detect problems and apply fixes as needed. I wasted too many days chasing driver and torch versions. It's probably the worst part of working in ML. Combine this with Python's horrible package management and you got a perfect combo - like the cough and the stitch.

by einpoklum

1 subcomments

> CUDA Runtime: The runtime library (libcudart) that applications link against.
That library is actually a rather poor idea. If you're writing a CUDA application, I strongly recommend avoiding the "runtime API". It provides partial access to the actual CUDA driver and its API, which is 'simpler' in the sense that you don't explicitly create "contexts", but:
* It hides or limits a lot of the functionality.
* Its actual behavior vis-a-vis contexts is not at all simple and is likely to make your life more difficult down the road.
* It's not some clean interface that's much more convenient to use.
So, either go with the driver, or consider my CUDA API wrappers library [1], which _does_ offer a clean, unified, modern (well, C++11'ish) RAII/CADRe interface. And it covers much more than the runtime API, to boot: JIT compilation of CUDA (nvrtc) and PTX (nvptx_compiler), profiling (nvtx), etc.
> Driver API ... provides direct access to GPU functionality.
Well, I wouldn't go that far, it's not that direct. Let's call it: "Less indirect"...
[1] : https://github.com/eyalroz/cuda-api-wrappers/

by dahart

2 subcomments

This article has good info, but is the overloading premise slightly contrived? Maybe I don’t talk to enough CUDA beginners. I work with CUDA a lot but I’m not exactly a CUDA expert, and from my perspective, in practice there are default assumptions one can safely make for the base terms, and people do qualify the alternatives almost always. For example, if someone says “CUDA version”, they always mean the toolkit, and never mean compute capability, runtime, or language. The term “driver” when used without qualification always means the display driver, and never means the driver API, there really is no overload there.

by bbx

0 subcomment

For reference: CUDA means "Compute Unified Device Architecture".

by coffeeaddict1

0 subcomment

I wish GPU vendors would stick to a standard terminology, at least for common parts. It's really confusing having to deal with warps vs wavefronts vs simd groups, thread block vs workgroup, streaming multiprocessor vs compute unit vs execution unit, etc...

by pjmlp

0 subcomment

Great overview, with lots of effort place into it.
However, it misses the polyglot part (Fortran, Python GPU JIT, all the backends that support PTX), the library ecosystem (writing CUDA kernels should be the exception not the rule), the graphical debugging tools and IDE integration.

by ArcHound

0 subcomment

That is a great reference, explains a lot of small inaccuracies between various tutorials when you're trying to debug some of these issues. Saved and printed, thanks a lot!

by RYJOX

0 subcomment

Interesting, does this approach change with out-of-order cores? In fact maybe I misunderstand lol

by NullCascade

1 subcomments

What is the cheapest CUDA-enabled VM providers one can use to learn CUDA?

by Nydhal

0 subcomment

This is a classic lesson. You can write almost the same article for Java: language vs bytecode vs JVM vs JDK vs libs ...

by scotty79

0 subcomment

> This article provides a rigorous ontology of CUDA components: a systematic description of what exists in the CUDA ecosystem, how components relate to each other, their versioning semantics, compatibility rules, and failure modes.
That's the first instance in my life when somebody coherently described what the word 'ontology' means. I'm sure this explanation is wrong, but still...

by virajk_31

0 subcomment

thanks for the kernel nomenclatures

by montyanderson

0 subcomment

this is fantastic

by a_state_full

0 subcomment

[dead]

by David-Henrry

0 subcomment

[flagged]

by zvr

2 subcomments

Great explanation!
It should probably also add that everything CUDA is owned by NVIDIA, and "CUDA" itself is a registered trademark. The official way to refer to it is that the first time you spell it out as "NVIDIA® CUDA®" and then subsequently refer to just CUDA.