Submitted by kegzilla t3_yk5kgj in singularity
ProShortKingAction t1_iurhi4z wrote
How do you prevent the robot from writing unsafe code? If it's continually adding new code without review by devs or a security team, it seems like you'd run into the problem that it's always potentially one instruction away from generating code that contains a dangerous vulnerability.
Sashinii t1_iurkiad wrote
They address the potential negatives with built-in safety checks, while also encouraging suggestions for other methods to ensure that the AI is as safe as possible.
ProShortKingAction t1_iurmfoc wrote
Sorry, I took that as them saying they had built-in safety checks meant to prevent the robot from taking an unsafe physical action, not to prevent it from writing vulnerable code. I might have misinterpreted that.
Another thing I would like to bring up in favor of this approach is that vulnerabilities slip through in regular code all the time, so it doesn't have to be perfect, just safer than the current approach. It's like with driverless cars: they don't have to be perfect, just safer than a car driven by a human, which seems like a low bar. I just don't see anything in this post that implies a safe way to do this isn't still rather far off.
Edit: In the Twitter thread by one of the researchers, posted elsewhere in this thread, they very vaguely mention "... and many potential safety risks need to be addressed." It's hard to tell whether this refers to the robot physically interacting with the world, to cybersecurity concerns, or to both.
visarga t1_iusk21l wrote
They take a few preventive measures.
> we first check that it is safe to run by ensuring there are no import statements, special variables that begin with __, or calls to exec and eval. Then, we call Python’s exec function with the code as the input string and two dictionaries that form the scope of that code execution: (i) globals, containing all APIs that the generated code might call, and (ii) locals, an empty dictionary which will be populated with variables and new functions defined during exec. If the LMP is expected to return a value, we obtain it from locals after exec finishes.
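For concreteness, here is a minimal Python sketch of the check-then-exec pattern that quote describes. The function names (`is_safe_to_run`, `run_lmp`), the AST-based implementation of the check, and the example `get_object_position` API are assumptions for illustration, not the authors' actual code.

```python
import ast

def is_safe_to_run(code: str) -> bool:
    """Reject code with import statements, dunder names, or calls to exec/eval.

    A rough AST-based approximation of the checks described above; the paper
    does not say exactly how the check is implemented.
    """
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False
        if isinstance(node, ast.Name) and node.id.startswith("__"):
            return False
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            return False
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in ("exec", "eval")):
            return False
    return True

def run_lmp(code: str, apis: dict, return_name=None):
    """Execute generated code in a restricted scope, optionally returning a value."""
    if not is_safe_to_run(code):
        raise ValueError("generated code failed the safety check")
    lmp_globals = dict(apis)   # (i) globals: the APIs the generated code may call
    lmp_locals = {}            # (ii) locals: filled with variables/functions during exec
    exec(code, lmp_globals, lmp_locals)
    return lmp_locals.get(return_name) if return_name else None

# Hypothetical usage: the generated snippet only calls an allowed API.
generated = "pos = get_object_position('red block')"
print(run_lmp(generated, {"get_object_position": lambda name: (0.1, 0.2)}, "pos"))
```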
ProShortKingAction t1_iuskyif wrote
This seems to be saying "safe to run" as in "less likely to crash," not as in preventing cybersecurity issues.
visarga t1_iuvrqym wrote
It prevents access to imports, to various Python APIs, and to exec and eval.
It's just a basic check.