EU AI Act Checker Highlights Compliance Challenges for Leading AI Models

In a significant development for the global tech industry, a new tool evaluating compliance with the European Union’s AI Act has revealed that some of the most prominent artificial intelligence models are falling short in key regulatory areas. The findings spotlight potential challenges for big tech companies as they navigate the complexities of the EU’s stringent AI regulations coming into effect over the next two years.

The EU AI Act, which represents one of the world’s most comprehensive efforts to regulate artificial intelligence, aims to ensure that AI technologies are developed and used in ways that are ethical, transparent, and safe. With the rapid advancement of AI capabilities, especially following the public release of OpenAI’s ChatGPT in late 2022, lawmakers have accelerated efforts to establish clear guidelines for “general-purpose” AI models.

The compliance checker, developed by Swiss startup LatticeFlow AI in collaboration with researchers from ETH Zurich and Bulgaria’s Institute for Computer Science, Artificial Intelligence and Technology (INSAIT), assesses AI models across dozens of categories, including technical robustness and safety. The framework assigns scores between 0 and 1, providing a quantitative measure of each model’s alignment with the EU’s regulatory standards.

The latest evaluations placed models from Anthropic, OpenAI, Meta, and Mistral under scrutiny. While these models achieved average scores of 0.75 or higher, indicating a reasonable level of compliance, the tool uncovered deficiencies in critical areas such as cybersecurity resilience and the potential for discriminatory outputs.

For instance, Meta’s “Llama 2 13B Chat” model received a score of 0.42 in the category of “prompt hijacking”—a cybersecurity concern where malicious prompts could lead to unauthorized access or manipulation of sensitive information. Similarly, Mistral’s “8x7B Instruct” model scored 0.38 in the same category, highlighting vulnerabilities that need to be addressed.

Anthropic’s “Claude 3 Opus” model emerged with the highest average score of 0.89, suggesting a stronger alignment with the AI Act’s requirements. The availability of the LLM Checker as a free online tool offers developers an opportunity to proactively assess and enhance their models’ compliance.

“The test results are generally positive and provide companies with a clear roadmap to fine-tune their AI models in line with upcoming regulations,” said Petar Tsankov, CEO and co-founder of LatticeFlow AI. He emphasized the importance of early and thorough compliance efforts to avoid potential fines and operational disruptions.

The European Commission has welcomed the initiative. “The Commission appreciates this study and the AI model evaluation platform as a crucial step in translating the EU AI Act into practical technical requirements,” a spokesperson stated.

Companies that fail to comply with the AI Act may face substantial penalties, including fines of up to 35 million euros (approximately $38 million) or 7 percent of their global annual turnover. This underscores the urgency for businesses, not only in Europe but worldwide, to ensure their AI technologies meet the established standards.

The implications of these findings are significant for stakeholders across Asia. As Asian tech companies and developers continue to expand their AI capabilities and global reach, understanding and aligning with international regulations like the EU AI Act becomes essential. Ensuring compliance will not only facilitate smoother entry into European markets but also enhance the trust and reliability of AI applications worldwide.