There's still difficulty in finding exactly where the proxy should go, i.e. what step can be approximated without losing fidelity in the output. Apparently, you also need to select auxiliary features to guide training. But if you figure those out, you can replace hours of computation with milliseconds, accurate to the limits of human perception.
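To make the proxy idea concrete, here is a toy sketch. It is not what the authors do: a precomputed table with linear interpolation stands in for a trained network, there are no auxiliary features, and `expensive_shade` and `build_proxy` are hypothetical names invented for illustration. It only shows the shape of the trade: pay for a slow pass once, then answer queries cheaply within some error tolerance.

```python
import math

def expensive_shade(light_angle, n_samples=2000):
    """Hypothetical slow step: midpoint-rule integral of a clamped
    cosine lobe times sin(theta) over theta in [0, pi]."""
    total = 0.0
    for i in range(n_samples):
        theta = (i + 0.5) / n_samples * math.pi
        total += max(0.0, math.cos(theta - light_angle)) * math.sin(theta)
    return total * math.pi / n_samples

def build_proxy(fn, lo, hi, n_knots=33):
    """Precompute fn on a uniform grid (the expensive pass), then
    return a cheap function that linearly interpolates the table."""
    step = (hi - lo) / (n_knots - 1)
    table = [fn(lo + i * step) for i in range(n_knots)]

    def proxy(x):
        # Clamp the query to the tabulated range, then interpolate.
        t = min(max((x - lo) / step, 0.0), n_knots - 1)
        i = min(int(t), n_knots - 2)
        frac = t - i
        return table[i] * (1.0 - frac) + table[i + 1] * frac

    return proxy

# Assumption: the "scene" is fixed, so the table stays valid.
proxy = build_proxy(expensive_shade, 0.0, math.pi)
worst = max(
    abs(proxy(k / 49 * math.pi) - expensive_shade(k / 49 * math.pi))
    for k in range(50)
)
print("max abs error over 50 test angles:", worst)
```

The table is only valid for the function it was built from, which is the same reason a precomputed proxy like this is tied to a static scene: change the scene and the precomputation has to be redone.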
It requires a fairly expensive precomputation pass and only works for static scenes.
Meanwhile, interactive path tracing is fast enough that the scenes they showed would render only slightly slower with it, while being truly interactive and supporting dynamic scenes.
I wish they’d shown this with scenes that don’t fit in GPU memory, to demonstrate the benefit for CPU-only renderers; as it stands, GPU-based renderers would already be fairly fast on these scenes.
The only big thing for me was the multi-view lighting. Going from painted light to light parameters is a neat trick, but it’s been done quite a few times in the past with traditional techniques too.