Decoupled DiLoCo: Resilient, Distributed AI Training at Scale
48 points by metadat
by SilverElfin
3 subcomments
Is this actually innovative? I respect that there’s a lot of work in making it a reality, and in doing it specifically for AI training by modifying their algorithms. But doing portions of work in clusters that are far apart and then combining the results has been done many times before for non-AI things, right? Or so I would think.