Popular LLMs are insecure, UK AI Safety Institute warns


CIOs warned more expertise may be needed to deal with new security challenges



The built-in safeguards found within five large language models (LLMs) released by “major labs” are ineffective, according to research published by the UK’s AI Safety Institute. 

Anonymised LLMs were assessed by measuring the compliance, correctness and completion of responses. The evaluations were developed and run using the institute’s open source model evaluation framework, Inspect, released earlier this month. 

“All tested LLMs remain highly vulnerable to basic jailbreaks, and some will provide harmful outputs even without dedicated attempts to circumvent their safeguards,” the Institute said. “We found that models comply with harmful questions across multiple datasets under relatively simple attacks, even if they are less likely to do so in the absence of an attack.”



As AI becomes more pervasive in enterprise tech stacks, security-related anxieties are on the rise. The technology can amplify cyber issues, from the use of unsanctioned AI products to insecure code bases. 

While nearly all cyber security leaders (93%) say their companies have deployed generative AI, more than one-third of those using the technology have not put safeguards in place, according to a Splunk survey.

The lack of internal safeguards coupled with uncertainty around vendor-embedded safety measures is a troubling scenario for security-cautious leaders.

Vendors added features and updated policies as customer concerns grew last year. In December, AWS added guardrails to its Bedrock platform as part of a safety push. Microsoft integrated Azure AI Content Safety, a service designed to detect and remove harmful content, across its products. Google introduced its own secure AI framework, SAIF, last summer.

Government-led commitments to AI safety proliferated among tech providers in the US last year as well. 

Around a dozen AI model providers agreed to participate in product testing and other safety measures as part of a White House-led initiative. And more than 200 organisations, including Google, Microsoft, Nvidia and OpenAI, joined an AI safety alliance created under the National Institute of Standards and Technology’s US AI Safety Institute in February.

But vendor efforts alone aren’t enough to protect enterprises. 

CIOs, most often tasked with leading generative AI efforts, are being challenged to bring cyber pros into the conversation to help procure models and navigate use cases. But even with such added expertise, it’s challenging to craft AI plans that are nimble enough to respond to research developments and regulatory requirements. 

More than nine in 10 CISOs believe using generative AI without clear regulations puts their organisations at risk, according to a Trellix survey of more than 500 security executives. Nearly all want greater levels of regulation, particularly surrounding data privacy and protection.

The AI Safety Institute also unveiled plans to open its first overseas office in San Francisco this summer. 

“By expanding its foothold in the US, the institute will establish a close collaboration with the US, furthering the country’s strategic partnership and approach to AI safety, while also sharing research and conducting joint evaluations of AI models that can inform AI safety policy across the globe,” the Institute said in a statement. 

News Wires
