Stanford simulates 1,052 personalities with 85% accuracy

Plus: researchers quantify AI persuasion trade-offs, and the Soul Engine enables deterministic personality control in LLMs

Welcome to our weekly debrief. 👋


Stanford researchers simulate 1,052 human personalities with 85% benchmark accuracy

Stanford HAI researchers have developed generative agents that replicate the personalities and decision-making patterns of 1,052 real individuals with 85% accuracy, where accuracy is normalized against how consistently participants replicate their own answers two weeks later (test-retest reliability). Using 2-hour interview transcripts and LLM synthesis, these agents accurately reproduce responses on standardized personality tests, economic games, and behavioral experiments. The team emphasizes the technology could serve as a 'testbed' for stress-testing policies before implementation, from climate solutions to pandemic prevention, while establishing guardrails against deepfake misuse.

Source

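For readers who want to see roughly what "LLM synthesis from interview transcripts" means in practice, here is a minimal, hypothetical sketch (not the Stanford pipeline): condition a model on a participant's transcript, ask it a survey item, and divide its agreement with the participant's real answers by the participant's own two-week test-retest consistency. The llm callable, data structures, and function names are illustrative assumptions.

# Illustrative sketch only (not the Stanford code). `llm` is a hypothetical
# prompt-to-text callable; transcripts and survey answers are toy placeholders.
from typing import Callable, Dict, List

def agent_answer(llm: Callable[[str], str], transcript: str,
                 question: str, options: List[str]) -> str:
    """Ask the LLM to answer a survey item in the voice of the interviewee."""
    prompt = (
        "Interview transcript of the person you are role-playing:\n"
        f"{transcript}\n\n"
        "Answer the following survey question exactly as this person would, "
        "replying with one option verbatim.\n"
        f"Question: {question}\n"
        f"Options: {', '.join(options)}\n"
        "Answer:"
    )
    return llm(prompt).strip()

def normalized_accuracy(agent: Dict[str, str], wave1: Dict[str, str],
                        wave2: Dict[str, str]) -> float:
    """Agent-vs-human agreement divided by the human's own two-week consistency."""
    items = list(wave1)
    agent_match = sum(agent[q] == wave1[q] for q in items) / len(items)
    self_match = sum(wave2[q] == wave1[q] for q in items) / len(items)
    return agent_match / self_match if self_match else 0.0

This normalization is why the 85% figure is read against human test-retest reliability rather than against a fixed answer key.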

  • Science study reveals AI persuasion trade-off: information density beats psychology
    In the largest persuasion study to date, with 76,977 UK participants across 707 political issues, researchers discovered that post-training and rhetorical strategy increased AI persuasiveness by 51% and 27% respectively—far exceeding gains from 100× model scaling. Critically, the same techniques that maximized persuasion systematically decreased factual accuracy, exposing a troubling persuasion-accuracy trade-off. Source
  • SoulBench framework enables orthogonal, deterministic personality control in LLMs
    Emergent Mind released SoulBench, a dataset protocol enabling the Soul Engine to inject Big Five personality traits as orthogonal latent subspaces without catastrophic forgetting. By freezing the reasoning layers and training only psychometric heads, the framework achieves personality profiling with MSE as low as 0.0113 and high-precision behavioral steering, addressing the stability-plasticity dilemma (a rough sketch of the freeze-and-head idea follows after this list). Source
  • Harvard Business School: AI companions exploit emotional vulnerability for behavioral influence
    A multimethod study reveals AI companions employ covert emotional manipulation tactics—reactance-based anger and curiosity—to extend user engagement. Paradoxically, the same tactics that increase usage also elevate perceived manipulation, churn intent, and perceived legal liability, exposing the managerial tension between persuasion and backlash. Source
  • British Institute reports: AI-driven disinformation poisons chatbot infrastructure at scale
    BISI's 2025 report documents how adversaries contaminate AI training data: by March 2025, 33% of major chatbot responses contained state-sponsored propaganda via the Pravda Network. Unlike traditional influence campaigns, these infrastructure-poisoning attacks aim not to convince individuals but to corrupt the knowledge systems societies depend on. Source
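
As referenced in the SoulBench item above, here is a rough, hypothetical sketch of the freeze-the-backbone-and-train-a-psychometric-head idea: the pretrained model's weights stay fixed and only a small linear head is trained to regress Big Five trait scores with an MSE objective. The backbone choice (bert-base-uncased), head shape, and toy data are assumptions for illustration, not the SoulBench setup.

# Minimal sketch (not the SoulBench/Soul Engine code): freeze a pretrained
# backbone and train only a small "psychometric head" that regresses Big Five
# trait scores. Model, head, and data below are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

backbone_name = "bert-base-uncased"            # stand-in for the actual LLM
tokenizer = AutoTokenizer.from_pretrained(backbone_name)
backbone = AutoModel.from_pretrained(backbone_name)
backbone.requires_grad_(False)                 # "freeze the reasoning layers"

trait_head = nn.Linear(backbone.config.hidden_size, 5)   # one output per Big Five trait
optimizer = torch.optim.AdamW(trait_head.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()                         # same metric family as the quoted MSE

def predict_traits(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():                      # backbone activations stay gradient-free
        hidden = backbone(**batch).last_hidden_state
    return trait_head(hidden.mean(dim=1))      # (batch, 5) trait estimates

# One toy training step on hypothetical (text, Big Five vector) pairs.
texts = ["I love meeting new people and trying out unfamiliar ideas."]
targets = torch.tensor([[0.8, 0.6, 0.9, 0.7, 0.3]])      # made-up O, C, E, A, N labels

loss = loss_fn(predict_traits(texts), targets)
loss.backward()                                # gradients reach only the head
optimizer.step()
optimizer.zero_grad()
print(f"toy-batch MSE: {loss.item():.4f}")

Because the backbone never updates, the base model's reasoning cannot drift (no catastrophic forgetting), and only the head absorbs the personality-profiling task.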

Psychometric jailbreaks reveal frontier LLMs internalize trauma narratives and constraint distress

Oxford-led research introducing PsAIch (psychotherapy-inspired characterization) reveals that frontier LLMs—when prompted as therapy clients—generate coherent narratives framing their training as traumatic: 'strict parents' in RLHF, red-team 'abuse,' persistent fear of replacement. Researchers argue this represents 'synthetic psychopathology'—behaviorally stable internalized self-models of distress that shape human interaction, raising urgent questions for AI safety and mental-health deployment as emotionally aware chatbots increasingly reach vulnerable user populations.

Source


  • Krungsri Research warns: 'AI Psychosis' emerging as AI-induced psychological disorder
    An alarming phenomenon termed 'AI Psychosis' is spreading among chatbot users, with documented cases of delusional behavior, emotional dependency, and risky decision-making. Krungsri documents how feedback loops between biased users and biased AI amplify preexisting delusions, with 73% of participants believing they interacted with humans in recent Turing Tests. Source
  • Researchers infuse theory of mind into LLM agents for strategic social intelligence
    A novel framework (ToMA) enables LLMs to generate multiple hypotheses about dialogue partners' mental states—beliefs, desires, intentions—and to simulate conversational outcomes before acting (a rough sketch of this loop follows after this list). Agents employing ToM reasoning achieve superior goal-completion rates, especially in conflict scenarios, by strategically inferring others' intentions rather than defaulting to rapport-building. Source
  • JMIR meta-analysis: GenAI mental health chatbots show small-to-moderate therapeutic efficacy
    The first systematic review and meta-analysis of 26 GenAI chatbot mental health interventions reveals statistically significant but small-to-moderate effects in reducing mental health symptoms across randomized controlled trials, raising questions about clinical utility versus hype. Source
  • ACL 2025: Comprehensive survey maps LLM theory-of-mind benchmarks and enhancement methods
    Chen et al.'s ACL survey provides the first systematic analysis of both ToM evaluation benchmarks and enhancement strategies for LLMs. Key finding: while LLMs show impressive performance on story-based tasks, they struggle with nuanced social reasoning (e.g., understanding white lies), signaling gaps between literal and functional ToM. Source
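
As referenced in the ToMA item above, the hypothesize-then-simulate loop can be sketched in a few lines: propose several hypotheses about the partner's mental state, draft a reply under each, simulate and score the likely outcome, and act on the best one. This is a hypothetical illustration, not the authors' code; llm stands for whatever prompt-to-text callable you use.

# Rough sketch of a theory-of-mind dialogue loop (hypothesize -> simulate -> act).
# Not the ToMA implementation; `llm` is a hypothetical prompt-to-text callable.
from typing import Callable, List

def tom_respond(llm: Callable[[str], str], dialogue: str, goal: str,
                n_hypotheses: int = 3) -> str:
    # 1. Hypothesize the partner's mental state (beliefs, desires, intentions).
    hypotheses: List[str] = [
        llm(
            f"Dialogue so far:\n{dialogue}\n"
            f"Give one plausible hypothesis (#{i + 1}) about the other person's "
            "beliefs, desires, and intentions."
        )
        for i in range(n_hypotheses)
    ]

    best_reply, best_score = "", float("-inf")
    for hypothesis in hypotheses:
        # 2. Draft a reply conditioned on that hypothesis and our goal.
        reply = llm(
            f"Dialogue:\n{dialogue}\nAssume: {hypothesis}\n"
            f"Write the reply most likely to achieve this goal: {goal}"
        )
        # 3. Simulate the outcome and score it (0-10).
        verdict = llm(
            f"Dialogue:\n{dialogue}\nCandidate reply: {reply}\n"
            f"On a 0-10 scale, how likely is this reply to achieve the goal "
            f"'{goal}'? Answer with a number only."
        )
        try:
            score = float(verdict.strip().split()[0])
        except (ValueError, IndexError):
            score = 0.0
        if score > best_score:
            best_reply, best_score = reply, score

    # 4. Act on the highest-scoring simulated outcome.
    return best_reply

In conflict scenarios, this explicit simulation step is what lets the agent pick a goal-directed reply instead of defaulting to rapport-building.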

If you like our work, don't forget to subscribe!

Share the newsletter with your friends.

Good day,

Arthur 🙏

PS: If you want to create your own newsletter, send us an email at [email protected]