Now, the initial GCC implemented this save to memory with a kind of Duff's device: a computed jump into a block of register-saving instructions, so that only the needed registers were saved. There was no bounds check, so if the argument-count register (AL, the low byte of RAX, which tells a varargs callee how many vector registers were used) was not initialized correctly, it would jump to a random location based on the junk and cause very confusing bug reports.
This bit quite a lot of software that didn't use correct prototypes, i.e. that called stdarg functions through a declaration that didn't mark them as variadic. In 32-bit code, which didn't pass arguments in registers, this wasn't a problem.
Later compiler versions switched to saving all registers unconditionally.
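To make the failure mode concrete, here is a minimal sketch of the kind of function involved. The name sum_doubles and its signature are made up for illustration and aren't taken from any real bug report. With the correct prototype (the one ending in "...") visible at the call site, the compiler knows the callee is variadic and sets AL before the call. The trouble described above would come from a caller in another file that declares it as, say, double sum_doubles(int, double, double); and so never sets AL.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: sums `count` doubles passed as variadic arguments.
 * The "..." in the declaration is what tells callers to follow the
 * variadic calling convention (on x86-64 SysV, that includes setting AL). */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    assert(count >= 0);

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);   /* each argument is read as a double */
    va_end(ap);

    return total;
}
```

Putting the declaration in a shared header and including it in every caller is the usual way to keep the prototype and the definition from drifting apart.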
https://devblogs.microsoft.com/oldnewthing/20040120-00/?p=40... "ia64 – misdeclaring near and far data"
Not that this matters to anyone anymore. IA64 utterly failed long ago.
VLIW architectures still live on in GPUs and special-purpose (parallel) processors, where these sorts of constraints are more reasonable.