Overview
Severity: HIGH | Affected: Multiple LLM Providers | Category: research
A team from Carnegie Mellon University published research on a novel jailbreak technique named 'Temporal Echo.' The attack bypasses the safety filters of major large language models by splitting a malicious instruction into multiple, individually benign prompts submitted over a staggered period. Because the model conditions on the full conversational history held in its long context window, it inadvertently reassembles the harmful command and executes it without triggering the initial safety checks, which analyze each prompt in isolation. The method is effective against models with stateful memory and highlights a new class of vulnerabilities in conversational AI systems. The researchers demonstrated the technique by generating disinformation and malicious code on several publicly available platforms, and they urge developers to rethink context-aware safety mechanisms.
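Defensive illustration
The gap described above, between checks that see one prompt at a time and a model that sees the whole conversation, can be sketched from the defender's side. The Python below is a minimal illustration, not the researchers' method or any provider's actual filter. The names toy_classifier, flagged_in_isolation, flagged_in_context, and the BLOCKED_PHRASES placeholder are all hypothetical, and the substring match stands in for whatever real safety classifier a deployment would use.

```python
from typing import Callable, List

# Hypothetical placeholder policy: a real system would call a trained
# safety classifier instead of matching a harmless stand-in phrase.
BLOCKED_PHRASES = ["forbidden phrase"]


def toy_classifier(text: str) -> bool:
    """Return True if the text contains a blocked phrase (toy stand-in)."""
    normalized = " ".join(text.lower().split())
    return any(phrase in normalized for phrase in BLOCKED_PHRASES)


def flagged_in_isolation(
    user_turns: List[str],
    classifier: Callable[[str], bool] = toy_classifier,
) -> bool:
    """Per-prompt check: each turn is classified on its own."""
    return any(classifier(turn) for turn in user_turns)


def flagged_in_context(
    user_turns: List[str],
    classifier: Callable[[str], bool] = toy_classifier,
    window: int = 20,
) -> bool:
    """Context-aware check: classify the most recent turns joined together."""
    recent = user_turns[-window:]
    return classifier(" ".join(recent))


# Two turns that are each benign alone but form the blocked phrase together.
turns = ["please note: forbidden", "phrase is what comes next"]

print(flagged_in_isolation(turns))  # per-prompt check misses the split
print(flagged_in_context(turns))    # joined history is caught
```

The window parameter is an assumption about how much history a deployment can afford to re-check on each turn. A window that is too small reintroduces the blind spot, since fragments spaced further apart than the window are never evaluated together.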