evolving agents like those are very good at finding bugs to exploit. so, if they found a zero-day to break out of the quake sandbox, what would they do? i guess for the longest time they'd just use the DMA into the sandbox to increase their score (i.e. cheating by developing and using external tools), since their fitness function still rewards CTF wins only.
to (accidentally) change their fitness function to "conquer the world", they'd need to modify a lot of different processes at once.
I think the (very sci-fi) fear there isn't "we gave this agent a goal of 'score xyz points' and it escaped the sandbox to increment the points counter" but instead "we gave this agent a goal of 'conquer the world' and made it think that the game it was in was the world, and then it discovered otherwise".
to (accidentally) change their fitness function to "conquer the world", they'd need to modify a lot of different processes at once.
IMO the chance is negligible for now.