Some control problems are difficult to state mathematically. What is the cost-to-go for putting on a sweater? What is the reward function for doing an ollie on a skateboard? If smart people work at it long enough, they will figure out how to solve these problems. But can we get away with less?
It turns out that if you spend a while using your human judgement to choose between two slightly different controllers, you can incrementally improve the controller until it can (almost) do a backflip. Could we do even better by using the power of the crowd?
You can try it at https://da_doomer.gitlab.io/crowd-sourced-locomotion/.
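
For the curious, the "choose between two slightly different controllers" loop can be sketched in a few lines of Python. This is only an illustration and not necessarily how the site does it. It assumes a controller is just a list of numbers, that a "slightly different" controller is made by adding small Gaussian noise, and that the human is replaced by a stand-in judge (`simulated_judge`) that secretly prefers controllers closer to a made-up `TARGET`. All names here are hypothetical.

```python
import random
from typing import Callable, List

Params = List[float]  # assumption: a controller is a flat list of parameters


def perturb(params: Params, scale: float, rng: random.Random) -> Params:
    """Return a slightly different controller by adding Gaussian noise."""
    return [p + rng.gauss(0.0, scale) for p in params]


def improve_by_preference(
    initial: Params,
    prefers_second: Callable[[Params, Params], bool],
    rounds: int = 200,
    scale: float = 0.1,
    seed: int = 0,
) -> Params:
    """Keep whichever of two controllers the judge prefers, round after round."""
    rng = random.Random(seed)
    current = list(initial)
    for _ in range(rounds):
        candidate = perturb(current, scale, rng)
        # The judge only ever answers "is the second one better?".
        # No reward function or cost-to-go is written down anywhere.
        if prefers_second(current, candidate):
            current = candidate
    return current


# Stand-in for the human: prefers the controller closer to a hidden target.
# In the real setting a person would watch both controllers and pick one.
TARGET = [1.0, -2.0, 0.5]


def distance_to_target(params: Params) -> float:
    return sum((p - t) ** 2 for p, t in zip(params, TARGET)) ** 0.5


def simulated_judge(first: Params, second: Params) -> bool:
    return distance_to_target(second) < distance_to_target(first)


start = [0.0, 0.0, 0.0]
result = improve_by_preference(start, simulated_judge)
print(distance_to_target(start), distance_to_target(result))
```

Because a candidate is only kept when the judge prefers it, the controller never gets worse in the judge's eyes. Swapping `simulated_judge` for a function that asks a person (or many people) is the only change needed to put a human in the loop.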