So my first impression is that either this is a non-evolutionary algorithm masquerading as an evolutionary one and diluting concepts like mutation and crossover that have well-defined meanings, or it is an evolutionary algorithm but you're abusing terminology from other fields (like RL and "rewards") instead. Either way it's a confusing first impression, and one gets the subtle vibe that the word choices are there more to create "buzz" than to create clarity.
(not trying to be dismissive, I genuinely hope this is useful feedback)
The paper does look interesting, and I'll try to read it properly when I have time.