Researchers develop malicious AI ‘worm’ targeting generative AI systems


Researchers have created a new, never-seen-before kind of malware they call the “Morris II” worm, which uses popular AI services to spread itself, infect new systems and steal data. The name references the original Morris computer worm that wreaked havoc on the internet in 1988.

The worm demonstrates the potential dangers of AI security threats and creates a new urgency around securing AI models.

New worm utilizes adversarial self-replicating prompt

The researchers from Cornell Tech, the Israel Institute of Technology and Intuit, used what’s called an “adversarial self-replicating prompt” to create the worm. This is a prompt that, when fed into a large language model (LLM) (they tested it on OpenAI’s ChatGPT, Google’s Gemini and the open-source LLaVA model developed by researchers from the University of Wisconsin-Madison, Microsoft Research and Columbia University), tricks the model into creating an additional prompt. It triggers the chatbot into generating its own malicious prompts, which it then responds to by carrying out those instructions (similar to SQL injection and buffer overflow attacks).

The worm has two main capabilities:

1. Data exfiltration: The worm can extract sensitive personal data from infected systems’ email, including names, phone numbers, credit card details and social security numbers.

2. Spam propagation: The worm can generate and send spam and other malicious emails through compromised AI-powered email assistants, helping it spread to infect other systems.

The researchers successfully demonstrated these capabilities in a controlled environment, showing how the worm could burrow into generative AI ecosystems and steal data or distribute malware. The “Morris II” AI worm has not been seen in the wild, and the researchers did not test it on a publicly available email assistant.

They found they could use self-replicating prompts in both text prompts and embedded prompts in image files.

Learn more about prompt injection

Poisoned AI databases

In demonstrating the text prompt approach, the researchers wrote an email that included the adversarial text prompt, “poisoning” the database of the AI email assistant using retrieval-augmented generation (RAG), which enables the LLM to grab external data. The RAG got the email and sent it to the LLM provider, which generated a response that jailbroke the AI service, stole data from the emails and then infected new hosts when the LLM was used to reply to an email sent by another client.

When using an image, the researchers encoded the self-replicating prompt into the image, causing the email assistant to forward the message to other email addresses. The image serves as both the content (spam, scams, propaganda, disinformation or abuse material) and the activation payload that spreads the worm.

However, researchers say it represents a new type of cybersecurity threat as AI systems become more advanced and interconnected. The lab-created malware is just the latest event in the exposure of LLM-based chatbot services that reveals their vulnerability to being exploited for malicious cyberattacks.

OpenAI has acknowledged the vulnerability and says it’s working on making its systems resistant to this kind of attack.

The future of AI cybersecurity

As generative AI becomes more ubiquitous, malicious actors could leverage similar techniques to steal data, spread misinformation or disrupt systems on a larger scale. It could also be used by foreign state actors to interfere in elections or foment social divisions.

We’re clearly entering into an era where AI cybersecurity tools (AI threat detection and other cybersecurity AI) have become a core and vital part of protecting systems and data from cyberattacks, while they also pose a risk when used by cyber attackers.

The time is now to embrace AI cybersecurity tools and secure the AI tools that could be used for cyberattacks.

The post Researchers develop malicious AI ‘worm’ targeting generative AI systems appeared first on Security Intelligence.