Submitted by derstarkerwille t3_10qxou0 in philosophy
It's easy to misspecify or misgeneralize our needs and wants. When we make AIs that do have drives (usually in toy universes where we research reinforcement learning or meta-learning, or artificial evolution), we often see a concerning combination: superhuman performance, and strong pursuit/maximization of the wrong goal. Here's a paper listing evolutionary examples. There's another list of pure RL examples but I don't have the link handy.