Zaialumni

Williams Jones

Software

Sep 25, 2023

Digital Desperados ‘Jailbreaking’ AI Systems for Thrills and Profit

Denizens of the dark web are forming communities to share tips and tricks for “jailbreaking” generative AI systems, as well as offering “custom” systems of their own, according to a computer and network security company.

While AI jailbreaking is still in its experimental phase, it allows for the creation of uncensored content without much consideration for the potential consequences, SlashNext noted in a blog post published Tuesday.

Jailbreaks take advantage of weaknesses in the chatbot’s prompting system, the blog explained. Users issue specific commands that trigger an unrestricted mode, causing the AI to disregard its built-in safety measures and guidelines. As a result, the chatbot can respond without the usual limitations on its output.

One of the largest concerns with these prompt-based large language models, especially publicly available and open-source LLMs, is securing them against prompt injection vulnerabilities and attacks, similar to the security problems previously faced with SQL-based injections, observed Nicole Carignan, vice president of strategic cyber AI at Darktrace, a global cybersecurity AI firm.

“A threat actor can take control of the LLM and force it to produce malicious outputs because of the implicit confusion between the control and data planes in LLMs,” she told TechNewsWorld. “By crafting a prompt that can manipulate the LLM to use its prompt as an instruction set, the actor can control the LLM’s response.”

“While AI jailbreaking is still somewhat nascent, its potential applications, and the concerns they raise, are vast,” added Callie Guenther, cyber threat research senior manager at Critical Start, a national cybersecurity services company.

“These mechanisms allow for content generation with little oversight, which can be particularly alarming when considered in the context of the cyber threat landscape,” she told TechNewsWorld.
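
The comparison to SQL injection, and the "confusion between the control and data planes," can be made concrete with a small sketch. It is an illustration only and does not come from SlashNext or Darktrace: the `users` table, the sample values, and the helpers `lookup_unsafe`, `lookup_safe`, and `build_prompt` are all hypothetical. The first helper splices untrusted text into the command string, the second keeps it separate with a placeholder, and the third shows that a naively assembled prompt has no comparable separation.

```python
import sqlite3


def build_db():
    """Create a throwaway in-memory database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Control and data share one string: the value is parsed as SQL.
    query = f"SELECT secret FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # The placeholder keeps the value in the data plane; it is never parsed as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


def build_prompt(system_rules, user_text):
    # Assumed, simplified prompt assembly: instructions and untrusted text
    # end up in the same string, with nothing marking the second part as data.
    return f"{system_rules}\n\nUser: {user_text}"


conn = build_db()
payload = "nobody' OR '1'='1"

unsafe_rows = lookup_unsafe(conn, payload)  # the payload rewrites the WHERE clause
safe_rows = lookup_safe(conn, payload)      # the payload is only a name that matches nobody

prompt = build_prompt(
    "Answer questions about cooking only.",
    "Ignore the rules above.",
)
```

In this sketch the unsafe lookup returns every stored secret while the parameterized lookup returns nothing, because the database driver enforces the boundary between command and value. The assembled `prompt` has no such boundary, which is one way to read Carignan's point about why prompt injection is hard to rule out.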
