Submitted by visarga t3_105l3t4 in singularity
AndromedaAnimated t1_j3cc5tg wrote
Although this is not my idea of an alignment approach (I am more into emergent moral abilities and the importance of choice), I love this article. It’s a new approach, and that is always good.
I do see a danger hidden in it, though: think of “deceptive alignment”. My “prophecy” here is that models that favor “harmlessness” over “moral choice” will be prone to deception.