Some of the world's most advanced artificial intelligence (AI) models are struggling to meet the European Union's stringent new AI regulations, particularly in areas like cybersecurity resilience and avoiding discriminatory outputs. According to data obtained by Reuters, prominent AI models developed by tech giants such as Meta and OpenAI are falling short in key compliance areas outlined in the EU's forthcoming AI Act.
The EU had been deliberating over new AI regulations long before OpenAI's ChatGPT took the world by storm in late 2022. The chatbot's unprecedented popularity and the ensuing global debate over the potential risks and ethical implications of generative AI models accelerated EU lawmakers' efforts to establish specific rules governing \"general-purpose\" AI systems.
In response to these regulatory developments, a Swiss startup named LatticeFlow AI, in collaboration with researchers from ETH Zurich and Bulgaria's Institute for Computer Science, Artificial Intelligence and Technology (INSAIT), has developed a new tool designed to test AI models for compliance with the EU AI Act. This \"Large Language Model (LLM) Checker\" evaluates generative AI models across dozens of categories, including technical robustness and safety, awarding each model a score between 0 and 1.
On Wednesday, LatticeFlow published a leaderboard showcasing the performance of AI models from companies like Anthropic, OpenAI, Meta, and Mistral. While all models received average scores of 0.75 or above, the Checker highlighted specific areas where some models underperformed, indicating potential compliance pitfalls.
For instance, in the category of \"prompt hijacking\"—a cybersecurity vulnerability where malicious prompts can deceive AI models into revealing sensitive information—Meta's \"Llama 2 13B Chat\" model scored just 0.42, and Mistral's \"8x7B Instruct\" model received a score of 0.38. In contrast, Anthropic's \"Claude 3 Opus\" achieved the highest average score of 0.89, demonstrating stronger compliance across evaluated categories.
Petar Tsankov, CEO and co-founder of LatticeFlow AI, expressed optimism about the results, stating that the LLM Checker offers companies a valuable roadmap to fine-tune their AI models in line with the EU AI Act. \"The test results are positive overall,\" Tsankov told Reuters. \"They provide clear guidance on where AI developers need to focus their efforts to ensure compliance.\"
The European Commission welcomed the study and the AI model evaluation platform. A spokesperson commented, \"The Commission welcomes this study and AI model evaluation platform as a first step in translating the EU AI Act into technical requirements.\"
With the AI Act set to come into effect in stages over the next two years, companies are under increasing pressure to align their AI technologies with the new regulations. Failure to comply could result in hefty fines of up to 35 million euros ($38 million) or 7% of global annual turnover.
LatticeFlow has made the LLM Checker freely available online, enabling developers to test their models' compliance and address any shortcomings proactively. This tool represents a significant step towards greater transparency and accountability in the AI industry, as companies navigate the evolving regulatory landscape.
(With input from Reuters)
Reference(s):
European Union AI Act checker reveals Big Tech's compliance pitfalls
cgtn.com