For me, the hardest part was virtualizing GPUs with NVLink in the mix. NVLink makes isolation harder when you're also trying to preserve performance.
AMA if you want to dig into any of the details.
The Debian package rocm-qemu-support ships scripts that facilitate most of this. I've since generalized this by adding NVIDIA support, but I haven't uploaded the new gpuisol-qemu package [2] to the official Archive yet. It still needs some polishing.
Just dumping this here to add more references (especially the further reading section; the Gentoo and Arch wikis had a lot of helpful data).
[1]: https://salsa.debian.org/rocm-team/community/team-project/-/...
After skimming the article I noticed that a large chunk of it (specifically the bits on detaching/attaching drivers, qemu and vfio) applies more or less to general GPU virtualization under Linux too!
1) Replace any "nvidia" with "amdgpu" for Team Red based setups when needed
2) The PCI IDs are all different, so you'll have to look them up with lspci yourselves
3) Note that with consumer GPUs you need to detach and attach a pair of devices (GPU video and GPU audio); else things might get a bit wonky
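For points 2 and 3, here is a rough read-only sketch in Python of how you could find the devices that travel together. It assumes the standard Linux sysfs layout under /sys/bus/pci/devices; the helper names (`slot_of`, `sibling_functions`, `describe`) are made up for illustration and are not from the article or any package mentioned here. It only reads sysfs and does not unbind or bind anything.

```python
from pathlib import Path

# Standard Linux sysfs location for PCI devices (assumption: Linux host).
SYSFS_PCI = Path("/sys/bus/pci/devices")


def slot_of(bdf):
    """Drop the function number: '0000:01:00.0' -> '0000:01:00'."""
    return bdf.rsplit(".", 1)[0]


def sibling_functions(bdf, all_devices):
    """Return every function in the same slot as bdf.

    On a consumer card this is typically the video function (.0)
    and the HDMI/DP audio function (.1).
    """
    slot = slot_of(bdf)
    return sorted(d for d in all_devices if slot_of(d) == slot)


def list_pci_devices(root=SYSFS_PCI):
    """List PCI addresses known to the kernel, or [] if sysfs is missing."""
    return sorted(p.name for p in root.iterdir()) if root.is_dir() else []


def describe(bdf, root=SYSFS_PCI):
    """Read vendor/device IDs and the currently bound driver (if any)."""
    dev = root / bdf
    driver = dev / "driver"  # symlink to the bound driver; absent when unbound
    return {
        "address": bdf,
        "vendor": (dev / "vendor").read_text().strip(),
        "device": (dev / "device").read_text().strip(),
        "driver": driver.resolve().name if driver.exists() else None,
    }
```

Calling `sibling_functions("0000:01:00.0", list_pci_devices())` (the address is a placeholder, use whatever lspci shows for your card) gives you the set of functions to hand over together, and `describe()` on each one shows whether it is still bound to nvidia/amdgpu or already to vfio-pci.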
Also, how strong are the security boundaries among multiple tenants when configured in this way? I know, for example, that AWS is extremely careful about how hardware resources are shared across tenants of a physical host to prevent cross-tenant data leakage.
Like, it says something about mmapping 256 GB per GPU. But wouldn't that waste 2 TB of RAM? Or am I misunderstanding what "mmap" is as well..
EDIT: yes, it seems my understanding of mmap was off: it uses up address space, not RAM
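If anyone else wants to convince themselves of the address space vs. RAM distinction, a small Linux-only sketch in Python does it. It is not what the article does for GPU BARs; it just maps an anonymous region (scaled down to a few hundred MiB as a stand-in for the 256 GB), then reads the process's virtual size and resident size from /proc/self/statm. The function names are hypothetical.

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")


def vm_and_rss_bytes():
    """Return (virtual size, resident set size) in bytes. Linux only."""
    with open("/proc/self/statm") as f:
        size_pages, resident_pages = f.read().split()[:2]
    return int(size_pages) * PAGE, int(resident_pages) * PAGE


def reserve_and_touch(length, touch_bytes):
    """Map `length` bytes, then write to the first `touch_bytes` of them.

    Returns how much virtual size and RSS grew after each step.
    """
    vm0, rss0 = vm_and_rss_bytes()

    # Anonymous private mapping: reserves address space, no pages yet.
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    vm1, rss1 = vm_and_rss_bytes()

    # Writing one byte per page forces the kernel to back it with real memory.
    for offset in range(0, touch_bytes, PAGE):
        region[offset] = 1
    vm2, rss2 = vm_and_rss_bytes()

    region.close()
    return {
        "vm_growth_after_map": vm1 - vm0,
        "rss_growth_after_map": rss1 - rss0,
        "rss_growth_after_touch": rss2 - rss1,
    }
```

Running something like `reserve_and_touch(256 * 1024**2, 32 * 1024**2)` should show the virtual size jumping by the full mapping right away, while resident memory only grows once pages are actually written to.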