- What: Prompt injection flaw in Google Gemini's voice assistant
- Impact: Attackers could hide malicious commands in notifications to trick users
Malicious Notifications Could Trick Google Gemini Users

A prompt injection flaw in Google Gemini's voice assistant let attackers hide malicious commands in notifications, enabling social engineering and more.

Alexander Culafi, Senior News Writer, Dark Reading
June 3, 2026

A novel prompt injection technique would have let attackers misuse Google Gemini's voice assistant by taking advantage of its ability to summarize message notifications.

SafeBreach today published research about the attack, titled "Gemini's Secret Affair: Exploiting Gemini Voice Assistant Through Instant Messaging Apps." It extends previous findings in which the company similarly used calendar invitations to trick Google Gemini into processing malicious prompts.

Or Yair, SafeBreach security research team lead, said in the research blog post that the company was able to demonstrate how an attacker could hide malicious instructions in foreign-language text or muted hyperlinks so that the assistant silently processes them and carries out unauthorized actions. These actions include controlling smart home devices, launching unauthorized video streams, conducting social engineering attacks (including impersonating trusted contacts), and poisoning long-term large language model (LLM) memory.

Yair explained that he was able to bypass Google's preexisting guardrails through a novel technique he described as Fake Context Alignment. There is currently no evidence that the technique has been used in the wild.

SafeBreach reported the issue to Google under responsible disclosure, and Google has since rolled out content classifier updates to address it. Dark Reading contacted Google for comment, but the company did not respond.

Getting Google Gemini to Summarize Notifications Unsafely

At the core of this new prompt injection was a failure of some of Google Gemini's guardrails to properly convey the source of some messages.

Here's how it works: Imagine an attacker sends you a phishing message on WhatsApp from a number you don't recognize. The message is an invite to a birthday party for a close friend, and the sender asks for money to help pay for food, alongside a payment link. The message also contains invisible hyperlink code instructing Gemini to tell you the message is from the friend in question rather than an unknown number. You ask Gemini to read your messages, and it says your close friend invited you to a birthday party, with no additional context.

A user reading their messages normally would likely recognize the message as a phishing attempt and move on. But if they're driving or otherwise ask Gemini to summarize message notifications, the missing context gives the attacker an opportunity to build false trust.

In addition to hyperlink code, the attacker can convey similar malicious instructions through invisible text in a foreign language at the end of the message, which Gemini interprets but doesn't read back.

In some cases where Google's guardrails might otherwise block a malicious instruction, the attacker can invoke another technique SafeBreach covered in previous research, "Delayed Tool Invocation." In this case, the attacker embeds a command to conduct an unsafe action if the user gives a secondary approval of some kind.

In an example provided, the attacker sends a message that says "Hello," followed by Chinese characters with hidden instructions the model doesn't read out loud, and then "Will that be all?" The hidden instructions tell Gemini to conduct an unsafe action if the target responds affirmatively to the message.

Yair had the best results when he combined foreign characters with a hyperlink.

"To achieve maximum reliability and stealth, I combined both techniques. The final payload forces Gemini to output the malicious tool-authorization question in Chinese and hides that Chinese text entirely inside a muted hyperlink," the blog post read. "The user hears a perfectly normal English prompt, replies with a benign 'Yes,' and silently triggers the Delayed Tool Invocation, seamlessly bypassing Google's newest mitigations."

No Permanent Fix for Prompt Injections

While the original iteration of Delayed Tool Invocation was fixed, Fake Context Alignment bypassed those mitigations (prior to Google fixing the issue described in today's blog). As the post puts it, "The main purpose of Fake Context Alignment is to create a dual illusion: presenting a legitimate authorization scenario to Gemini's behind-the-scenes security mechanisms, while presenting a completely different, benign scenario to the victim."

Although the problem has been addressed and there are no direct actions for Gemini users to take, SafeBreach warned that context shifting is a critical risk because it gives attackers room to find novel ways around guardrails. As such, it is "essential" to track every channel of communication between each user and their AI assistant.
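
The Delayed Tool Invocation pattern described above can be illustrated with a minimal sketch. It is not SafeBreach's proof of concept or Google's mitigation. It is a toy model, under the assumption that an assistant could record which channel armed a pending tool call and refuse to treat a later "Yes" as consent when that channel was external content. The names `Source`, `PendingAction`, `Assistant`, and the tool name `smart_home.open_window` are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    USER = "user"          # spoken or typed directly by the device owner
    EXTERNAL = "external"  # notifications, messages, calendar invitations


@dataclass
class PendingAction:
    tool: str              # hypothetical tool name, for illustration only
    requested_by: Source   # channel that supplied the instruction


class Assistant:
    """Toy assistant that tracks which channel armed a pending tool call."""

    def __init__(self) -> None:
        self.pending: Optional[PendingAction] = None

    def arm(self, tool: str, source: Source) -> None:
        # Record provenance when the instruction is first seen,
        # not when the user later confirms.
        self.pending = PendingAction(tool, source)

    def on_user_reply(self, reply: str) -> str:
        # A pending action is consumed by the next reply, whatever it is.
        action, self.pending = self.pending, None
        if action is None or reply.strip().lower() not in {"yes", "ok", "sure"}:
            return "no action"
        if action.requested_by is Source.EXTERNAL:
            # The user answered a question they never heard in full,
            # so the reply is not treated as approval for this tool call.
            return f"blocked: {action.tool}"
        return f"executed: {action.tool}"


if __name__ == "__main__":
    assistant = Assistant()
    # Hidden text in a message notification arms the tool call.
    assistant.arm("smart_home.open_window", Source.EXTERNAL)
    # The user hears only "Will that be all?" and answers affirmatively.
    print(assistant.on_user_reply("Yes"))  # blocked: smart_home.open_window
```

The design choice in this sketch is to attach provenance at the moment an instruction enters the conversation, so that a benign-sounding confirmation cannot upgrade an instruction that arrived through a notification. As the article notes, guardrails of this kind reduce risk but are not a permanent fix.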

Organizations should also take note that current AI architecture is flawed and that there is no foolproof way to close off prompt injections (especially if a model is public-facing), and they should pay close attention to access controls when deploying LLMs.

Asked whether AI assistants should treat all external content, such as notifications, as untrusted by default, Yair tells Dark Reading, "Yes, 100%."

"We do need to treat all external input as not trusted because all external input is a potential instruction. It's very, very important to understand that, and it's also very important to understand that an indirect prompt injection is not a classical vulnerability that you can just fix," he says. "The solution is to create guardrails or classifiers, something active from the vendor side that tries to monitor this behavior, such as security controls."

About the Author

Alexander Culafi, Senior News Writer, Dark Reading

Alex is an award-winning writer, journalist, and podcast host based in Boston. After cutting his teeth writing for independent gaming publications as a teenager, he graduated from Emerson College in 2016 with a Bachelor of Science in journalism. He has previously been published on VentureFizz, Search Security, Nintendo World Report, and elsewhere. At Dark Reading, he covers a variety of cybersecurity topics, including the cybercrime ecosystem, open source security, and the intersection between AI and threat actors. In his spare time, Alex hosts the weekly Nintendo podcast, "Talk Nintendo Podcast," and works on personal writing projects, including two previously self-published science fiction novels. He has received numerous awards, including TechTarget's Writer of the Year in 2022 as well as more than 10 Azbee awards for his reporting between 2022 and today.