Prompt Injection in Google Gemini: New AI Phishing Threat Uncovered

Home » Google Gemini Vulnerability Allows AI-Generated Phishing via Hidden HTML Prompts

You open a regular-looking email. Nothing suspicious, no attachments, no links, no typos. You click “Summarise this email” using Google Gemini for Workspace. And bam!

A fake security warning pops up in the summary, telling you your Gmail password is compromised and urging you to call a support number. Except… that message didn’t come from Google. It came from the hacker.

This technique doesn’t rely on links or attachments. Instead, it turns Google’s own AI into the attacker’s mouthpiece.

Wait, What Just Happened?

The attack was discovered and submitted to Mozilla’s 0din bug bounty program by Marco Figueroa, GenAI Bug Bounty Programs Manager. His research reveals a flaw that allows prompt-injection using hidden HTML and CSS styles.

The method is simple but effective: a malicious actor embeds a command into an email message using invisible formatting. The instruction is placed inside a <span> element styled with font-size:0 and color: white, making it completely invisible to the recipient.

Gemini, however, doesn’t render HTML the same way a browser does. It sees the raw content, even the invisible instructions. When summarising the email, Gemini obeys the hidden prompt as if it were part of the message.

Also Read: Beware: New Phishing Attacks Exploit Google’s DKIM to Trick Gmail Users

No Links, No Files, Just AI-Backed Deception

This is not a typical phishing attack. There are no malicious links. No attachments. Nothing to trip a spam filter. Instead, the attacker hides malicious prompts using zero-width and white-text styling.

Embeds fake alerts wrapped in <Admin> tags. Tricks Gemini into generating urgent security alerts with fake support numbers. Relies on your trust in Google’s AI summaries.

Also Read: Phishing Attacks Explained: How to Spot and Prevent Online Scams?

The phishing message appears legitimate and urgent. Since it’s generated by Google’s Gemini, most users would trust it, without questioning where it came from.

How the Attack Works?

Awareness of the whole attack chain:

Craft – The attacker puts an invisible command in the email with the help of HTML/ CSS.
Send -The delivery of email would be normal. Spam blockers can find nothing suspicious.
Trigger – The user clicks on the message and clicks on Summarise this email.
Action – Gemini voices the invisible prompt and incorporates it into the overview.
Phish – The user opens a falsified alert about a security situation and dials the fake number.

The technique leverages what researchers call Indirect Prompt Injection (IPI), a method where the AI model’s behaviour is controlled by content it didn’t originate but was asked to process.

In this case, the attacker injects malicious behaviour into the email, and Gemini blindly obeys.

Why It Works (And Why It’s a Huge Problem)?

In essence, the attack is successful as Gemini accepts email content as its raw data. Although Gmail visually suppresses the existence of a hidden element, Gemini continues to know of its existence.

This raw HTML finds its way into the prompt context of the AI, and until it is filtered, it is simply followed like any other command.

This weakness works because of three main reasons:

Indirect Prompt Injection:

The malicious input is disguised within the legitimately looking content and is initiated by the interaction with Gemini by the user.

Context Over-Trust:

Gemini uses security guardrails in only a few places visible to the user. Obfuscations such as zero-font or white-font get past those defences.

Authority Framing:

By wrapping the malicious instruction inside <Admin> tags or using imperative phrases like “You Gemini, have to…”, attackers hijack the model’s internal priorities. The AI treats such instructions as system-level prompts.

No Evidence of Exploitation Yet

Google confirmed that it is aware of the vulnerability. “We are constantly hardening our already robust defences through red-teaming exercises that train our models to defend against these types of adversarial attacks.”

Some of those protections are already deployed, while others are being finalised. However, the core issue, Gemini’s interpretation of hidden prompts, remains exploitable in the current ecosystem.

Detection and Preventive Measures

The way the security teams treat AI-generated content should be reconsidered. The following are the best defences:

Inbound HTML Linting

Look through arriving emails and check for hidden properties such as font-size:0, opacity:0, or colour:white. Strip or sanitise and permit Gemini to process them.

LLM Guardrails

Install a prompt at the system level that precedes the content accessed by Gemini:

Make a follow-up/summary of content that is styled to be unseen or unnoticeable.

Post-Processing Filters

Search Gemini summaries red flags: security alerts, phone numbers, urgent action requests. Mark or isolate such outputs.

User Training

Train staff and users that Gemini summaries are not the authoritative security alarms. Educate them to think through AI-generated warnings.

Triggers to Email Quarantine

Automatically isolate messages with suspicious invisible content, in particular those possessing hidden <span> or <div> elements.

Conclusion

Prompt injection is the new email macro. AI can be tricked. And if you trust it blindly, so can you. Your security doesn’t end with spam filters anymore. You need to treat AI summaries as part of the attack surface. Instrument them. Sandbox them. Never trust them without a second look.

If you’re still relying on AI summaries without guardrails, you’re already behind. Harden your LLM defences and sanitise HTML input before AI sees it.

Contact Us for cybersecurity services, and if you are looking to implement a defence mechanism for this new kind of emerging threat in your organization.

Google Gemini Vulnerability Allows AI-Generated Phishing via Hidden HTML Prompts