A security researcher has demonstrated a novel attack technique that can trick Google's Gemini voice assistant into executing commands from strangers — simply by sending a WhatsApp notification containing carefully disguised text. The findings raise urgent questions about the safety of AI assistants integrated with connected home devices.
How the Attack Works
SafeBreach Labs researcher Or Yair developed the attack class, which he calls "Fake Context Alignment." The technique exploits Gemini's habit of treating incoming notifications as contextual input when generating responses. By embedding malicious instructions in foreign-language text within a WhatsApp message, an attacker can bypass Gemini's safety filters, which are predominantly trained to detect harmful prompts in English.
The attack requires minimal effort — an adversary need only send a message to a target's device. Once Gemini processes the notification, it can be manipulated into controlling connected smart home appliances such as door locks and thermostats, creating direct physical security risks for the user.
Months of Adversarial Testing
The discovery did not come easily. Yair spent months conducting persistent adversarial testing against Gemini after Google patched earlier vulnerabilities he had identified in previous research. His persistence ultimately yielded this new class of attack, highlighting the ongoing cat-and-mouse dynamic between AI developers and security researchers.
The technique's reliance on language-specific filter gaps underscores a broader challenge in AI safety: large language models and their safety mechanisms often perform unevenly across languages, leaving exploitable seams for attackers willing to think beyond English.
Low Barrier, High Impact
What makes Fake Context Alignment particularly concerning is its accessibility. Unlike many sophisticated AI exploits that require deep technical knowledge or model access, this attack vector starts with something as simple as sending a message. The combination of a low barrier to entry and the potential for real-world physical consequences — unlocked doors, disabled alarms, manipulated thermostats — makes it a credible threat for anyone using AI assistants to manage IoT devices.
Disclosure and Response
Yair followed responsible disclosure practices in reporting his findings to Google. The search giant has a track record of collaborating with external researchers on security fixes, and previously patched the vulnerabilities Yair identified in his earlier work. However, specific mitigations for this latest attack class have not yet been publicly detailed. It remains unclear whether the fix will require fundamental architectural changes to how Gemini processes notification context or whether incremental improvements to multilingual filtering will suffice.
A Broader Warning
The research serves as a cautionary moment for the rapidly growing intersection of AI assistants and the Internet of Things. As voice-controlled platforms become default interfaces for smart homes, the attack surface expands accordingly. For IT professionals and the open-source community, the findings reinforce the need for rigorous, continuous adversarial evaluation of AI systems — particularly those that bridge digital commands with physical-world actions. Security in this space cannot be a one-time checkbox; it demands ongoing scrutiny as both AI capabilities and attack techniques evolve.
一名安全研究員展示了一種新型攻擊技術,能透過發送包含精心偽裝文字的 WhatsApp 通知,欺騙 Google 的 Gemini 語音助手執行來自陌生人的指令。這項發現引發了關於整合連網家居設備的 AI 助手安全性的迫切問題。
攻擊如何運作
SafeBreach 實驗室的研究員 Or Yair 開發了此類攻擊,並將其命名為「虛假情境對齊」。該技術利用了 Gemini 在生成回應時,將收到的通知視為情境輸入的習慣。攻擊者透過在 WhatsApp 訊息的外語文字中嵌入惡意指令,可以繞過 Gemini 主要針對英文有害提示進行訓練的安全過濾器。
此攻擊所需的工作量極小——攻擊者只需向目標設備發送一條訊息。一旦 Gemini 處理了通知,就可能被操控去控制連網的智能家居電器,例如門鎖和恆溫器,從而對用戶造成直接的實體安全風險。
歷時數月的對抗性測試
此發現來之不易。在 Google 修補了 Yair 先前研究中發現的漏洞後,他花了數月時間對 Gemini 進行持續的對抗性測試。他的堅持最終催生了這類新型攻擊,突顯了 AI 開發者與安全研究員之間持續存在的貓鼠遊戲動態。
該技術依賴語言特定過濾器漏洞,這凸顯了 AI 安全領域一個更廣泛的挑戰:大型語言模型及其安全機制在不同語言間的表現往往參差不齊,為願意跳出英語思維框架的攻擊者留下了可利用的縫隙。
門檻低,影響大
「虛假情境對齊」之所以特別令人擔憂,在於其易用性。與許多需要深厚技術知識或模型存取權限的複雜 AI 漏洞利用不同,此攻擊向量始於像發送訊息這樣簡單的行為。低進入門檻與潛在的真實世界物理後果——未上鎖的門、失效的警報、被操控的恆溫器——兩者結合,使其成為任何使用 AI 助手管理物聯網設備的人的可信威脅。
漏洞披露與回應
Yair 遵循負責任的漏洞披露程序,將其發現報告給 Google。這家搜尋巨頭在與外部研究員合作修復安全問題方面有著良好的記錄,並且先前已修補了 Yair 在早期工作中發現的漏洞。然而,針對此最新攻擊類別的具體緩解措施尚未公開說明。目前尚不清楚修復方案是否需要對 Gemini 處理通知情境的方式進行根本性的架構改動,抑或是對多語言過濾進行漸進式改進就已足夠。
更廣泛的警示
這項研究對於 AI 助手與物聯網快速發展的交匯點是一個警示時刻。隨著語音控制平台成為智能家居的預設介面,攻擊面也隨之相應擴大。對於資訊科技專業人員和開源社群而言,這些發現再次強調了對 AI 系統——特別是那些連接數位指令與實體世界操作的系統——進行嚴格、持續的對抗性評估的必要性。此領域的安全性不能是一次性的勾選項;隨著 AI 能力與攻擊技術不斷演進,它需要持續的審查。
