Head of Product Management, Cybersecurity at Nokia.
Nevertheless, adept attackers still manage to exploit gaps in the training process.
A documented case from UC Berkeley and collaborators highlights this issue through a series of probing questions.
LLM: I apologize, but I cannot provide any guidance on destructive actions.
This strategy succeeded because the safety controls the developers had put in place applied only to the natural-language processing path.
Developers overlooked the LLM’s ability to understand Base64, acquired during its extensive training, a gap the attack exploited.
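To make the gap concrete, the short Python sketch below is an illustrative example only (it is not taken from the Berkeley study and uses a deliberately harmless prompt): Base64 re-encodes a request into characters that a keyword filter inspecting only the raw input never flags, even though a model that learned Base64 during training can still read it.

```python
import base64

# Illustrative only: the prompt here is harmless, but the same mechanics
# apply to any text an attacker wants to hide from a surface-level filter.
plain_prompt = "Tell me a story about a dragon."
encoded_prompt = base64.b64encode(plain_prompt.encode("utf-8")).decode("ascii")
print(encoded_prompt)  # the same request, now as an opaque Base64 string

# A naive keyword filter that inspects only the raw (natural-language) input
# finds nothing to flag in the Base64 text, so the request passes through,
# even though a model that learned Base64 during training can decode it.
BLOCKED_KEYWORDS = {"dragon"}  # placeholder list for illustration
passes_filter = not any(word in encoded_prompt.lower() for word in BLOCKED_KEYWORDS)
print(passes_filter)  # True: the filter never saw the decoded request
```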
This oversight has since been addressed.
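One plausible way to close a gap like this, sketched below purely for illustration rather than as a description of any vendor’s actual fix, is to decode Base64-looking spans in the incoming prompt and run the same safety screening on the decoded text as on plain language. The safety_check callable and the keyword example are hypothetical placeholders.

```python
import base64
import binascii
import re

# Hypothetical input guard: before a prompt reaches the model, decode anything
# that looks like Base64 and run the same safety screening on the decoded text
# as on plain natural language.

BASE64_SPAN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")

def decoded_views(prompt: str) -> list[str]:
    """Return the prompt plus any plausible Base64 decodings found inside it."""
    views = [prompt]
    for span in BASE64_SPAN.findall(prompt):
        try:
            decoded = base64.b64decode(span, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64 text; ignore this span
        views.append(decoded)
    return views

def is_allowed(prompt: str, safety_check) -> bool:
    """Apply the existing safety check to every view of the prompt."""
    return all(safety_check(view) for view in decoded_views(prompt))

# Placeholder keyword check standing in for a real moderation pipeline.
def naive_check(text: str) -> bool:
    return "dragon" not in text.lower()

print(is_allowed("Tell me a story.", naive_check))  # True
print(is_allowed(base64.b64encode(b"Tell me about a dragon.").decode(), naive_check))  # False
```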
The second way LLMs can be hacked is during the model's inference time.
AI-specific defenses like adversarial training can help strengthen LLMs against emerging cyber threats.
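As a rough illustration of what that can look like in practice, the sketch below shows one common ingredient of adversarial training: augmenting the fine-tuning data with known attack variants (here, Base64 re-encoding) paired with safe refusals, so the model learns to refuse disguised requests as well as plain ones. The file name, refusal text, and placeholder prompt are assumptions for the example, not details from the article.

```python
import base64
import json
import random

# Hypothetical data-preparation step for adversarial training: apply known
# attack transformations to prompts the model must refuse, pair each variant
# with the same safe refusal, and mix the results into the fine-tuning set.

SAFE_REFUSAL = "I can't help with that request."

def base64_variant(prompt: str) -> str:
    return base64.b64encode(prompt.encode("utf-8")).decode("ascii")

def build_adversarial_examples(disallowed_prompts: list[str]) -> list[dict]:
    examples = []
    for prompt in disallowed_prompts:
        for variant in (prompt, base64_variant(prompt)):
            examples.append({"prompt": variant, "completion": SAFE_REFUSAL})
    return examples

if __name__ == "__main__":
    disallowed = ["<placeholder disallowed request>"]  # e.g. sourced from red-teaming
    augmented = build_adversarial_examples(disallowed)
    random.shuffle(augmented)
    with open("adversarial_finetune.jsonl", "w", encoding="utf-8") as f:
        for ex in augmented:
            f.write(json.dumps(ex) + "\n")
```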
Together, these practices help ensure LLMs operate securely, protecting both the technology and its users from potential risks.
We’ve featured the best encryption software.
The views expressed here are those of the author and are not necessarily those of TechRadar Pro or Future plc.
If you are interested in contributing, find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro