Image: MLYearning
Northwestern University recently unveiled a critical vulnerability in custom Generative Pre-trained Transformers (GPTs).
Despite their versatility and adaptability, these advanced AI chatbots are prone to prompt injection attacks, risking exposure of sensitive information.
Custom GPTs, developed using OpenAI's ChatGPT and its GPT-4 Turbo Large Language Model, incorporate unique elements like specific prompts, datasets, and processing instructions for specialised tasks.
Image: Decrypt
However, these customisations and any confidential data used in their creation can be easily accessed by unauthorised parties.
An experiment by Decrypt demonstrated the ease of extracting a custom GPT's full prompt and confidential data through basic prompt hacking.
Testing over 200 custom GPTs, researchers found a high likelihood of such breaches, including the potential extraction of initial prompts and access to private files.
The study highlights two major risks: compromised intellectual property and breached user privacy.
Attackers can exploit the GPTs to either extract the core configuration and prompt ("system prompt extraction") or leak confidential training datasets ("file leakage").
Existing defences like defensive prompts prove ineffective against more sophisticated adversarial prompts.
More Vulnerability?
The researchers argue for a more comprehensive approach to safeguard these AI models, emphasising that determined attackers can likely exploit current vulnerabilities.
The study calls on the AI community to develop stronger security measures, suggesting that simple defensive prompts are inadequate against such advanced exploitation techniques.
With the rising customisation of GPTs offering vast potential, this research serves as a critical reminder of the security risks involved.
Users are advised to exercise caution, especially with sensitive data, underscoring the need for enhanced AI security without compromising user privacy and safety.
The complete Northwestern University study is available for reading here.