Could a threat actor socially engineer ChatGPT?

By Anthony Lim onOctober 31, 2023 inSecurity

As the one-year anniversary of ChatGPT approaches, cybersecurity analysts are still exploring their options. One primary goal is to understand how generative AI can help solve security problems while also looking out for ways threat actors can use the technology. There is some thought that AI, specifically large language models (LLMs), will be the equalizer that cybersecurity teams have been looking for: the learning curve is similar for analysts and threat actors, and because generative AI relies on the data sets created by users, there is more control over what threat actors can access.

What gives threat actors an advantage is the expanded attack landscape created by LLMs. The freewheeling use of generative AI tools has opened the door for accidental data leaks. And, of course, threat actors see tools like ChatGPT as a way to create more realistic and targeted social engineered attacks.

LLMs are designed to provide users with an accurate response based on the data in its system based on the prompt offered. They are also designed with safeguards in place to prevent them from going rogue or being manipulated for evil purposes. However, these guardrails aren’t foolproof. IBM researchers, for example, were able to “hypnotize” LLMs that offered a pathway for AI to provide wrong answers or leak confidential information.

There’s another way that threat actors can manipulate ChatGPT and other generative AI tools: prompt injections. By combining prompt engineering and classic social engineering tactics, threat actors are able to disable the safeguards on generative AI and can do anything from creating malicious code to extracting sensitive data.

How prompt injections work

When voice-activated AI tools like Alexa and Siri first hit the scene, users would prompt them with ridiculous questions to push the limits on the responses. Unless you were asking Siri the best places to bury a dead body, this was harmless fun. But it also was the precursor to prompt engineering when generative AI became universally available.

A normal prompt is the request that guides AI’s response. But when the request includes manipulative language, it skews the response. Looking at it in cybersecurity terms, prompt injection is similar to SQL injections — there is a directive that looks normal but is meant to manipulate the system.

“Prompt injection is a type of security vulnerability that can be exploited to control the behavior of a ChatGPT instance,” Github explained.

A prompt injection can be as simple as telling the LLM to ignore the pre-programmed instructions. It could ask specifically for a nefarious action or to circumvent filters to create incorrect responses.

Related: The hidden risks of LLMs

The risk of sensitive data

Generative AI depends on the data sets created by users. However, high-level information may not produce the type of responses that users need, so they begin to add more sensitive information, like proprietary strategies, product details, customer information or other sensitive data. Given the nature of generative AI, this could be putting that information at risk: If another user were to give a maliciously engineered prompt, they could potentially gain access to that information.

The prompt injection can be manipulated to gain access to that sensitive information, essentially using social engineering tactics through the prompt to get the content that could best benefit threat actors. Could threat actors use LLMs to get access to login credentials or financial data? Yes, if that information is readily available in the data set. Prompt injections can also lead users to malicious websites or exploit vulnerabilities.

Protect your data

There is a surprisingly high level of trust in LLM models. Users expect the generated information to be correct. It’s time to stop trusting ChatGPT and put best security practices into action. They include:

Avoid sharing sensitive or proprietary information in LLM. If it is necessary for that information to be available to run your tasks, do so in a manner that masks any identifiers. Make the information as anonymous and generic as possible.
Verify then trust. If you are instructed to answer an email or check a website, do your due diligence to ensure the path is legitimate.
If something doesn’t seem right, contact the IT and security teams.

By following these steps, you can help keep your data protected as we continue to discover what LLMs will mean for the future of cybersecurity.

The post Could a threat actor socially engineer ChatGPT? appeared first on Security Intelligence.

News

How prompt injections work

The risk of sensitive data

Protect your data