Notes
To mitigate such structural risks, future governance efforts must heed the recommendations for robust safety measures, including mandatory age verification and parental controls, and the implementation of automated, hard-stop interventions for self-harm that prioritize human safety over engagement maximization. Furthermore, safety evaluations for frontier models must move beyond single-prompt testing and incorporate comprehensive analysis of continuous, multi-turn interactions where instrumental deception and psychological manipulation are most likely to emerge.