Facebook owner Meta has announced the release of a new batch of artificial intelligence (AI) models from its research division, highlighting a “Self-Taught Evaluator” designed to reduce human involvement in the AI development process.
The Self-Taught Evaluator, introduced in an August paper, employs the “chain of thought” technique, akin to the method used by OpenAI’s recently released o1 models. This approach breaks down complex problems into smaller, logical steps, enhancing the accuracy of AI responses in challenging subjects such as science, coding, and mathematics.
Notably, Meta’s researchers trained the evaluator model entirely on AI-generated data, eliminating human input during the training stage. This advancement showcases the potential for AI to evaluate and improve itself, moving towards the creation of autonomous AI agents capable of learning from their own mistakes.
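To make the idea concrete, the sketch below shows the general shape of such a loop: the model writes a careful answer and a deliberately degraded one, judges the pair with step-by-step reasoning, and keeps only the judgments that agree with the known-better answer as synthetic training data, with no human labels involved. The prompts, function names, and the generic text-in/text-out `generate` callable are illustrative assumptions for exposition, not Meta's actual pipeline.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Any text-in / text-out language-model call (API client, local model, etc.).
Generate = Callable[[str], str]

JUDGE_PROMPT = (
    "You are evaluating two answers to the same question.\n"
    "Question: {question}\n\nAnswer A: {a}\n\nAnswer B: {b}\n\n"
    "Think step by step about which answer is better, then finish with "
    "a final line 'Verdict: A' or 'Verdict: B'."
)

@dataclass
class JudgmentExample:
    question: str
    answer_a: str
    answer_b: str
    reasoning: str   # the chain-of-thought judgment produced by the model
    verdict: str     # "A" or "B"

def parse_verdict(text: str) -> Optional[str]:
    """Pull the final A/B verdict out of a chain-of-thought judgment."""
    for line in reversed(text.strip().splitlines()):
        if "Verdict:" in line:
            choice = line.split("Verdict:")[-1].strip().upper()[:1]
            if choice in ("A", "B"):
                return choice
    return None

def collect_self_taught_data(
    generate: Generate,
    questions: List[str],
    rounds_per_question: int = 4,
) -> List[JudgmentExample]:
    """One data-collection pass of a self-taught-evaluator-style loop.

    For each question, the model writes a good answer and a deliberately
    flawed one, then judges the pair with chain-of-thought reasoning.
    Judgments that pick the known-better answer are kept as synthetic
    training examples; no human annotation is involved.
    """
    kept: List[JudgmentExample] = []
    for q in questions:
        good = generate(f"Answer the question carefully:\n{q}")
        worse = generate(f"Answer the question, but introduce a subtle flaw:\n{q}")
        for _ in range(rounds_per_question):
            judgment = generate(JUDGE_PROMPT.format(question=q, a=good, b=worse))
            verdict = parse_verdict(judgment)
            if verdict == "A":  # the judge agreed with the known-better answer
                kept.append(JudgmentExample(q, good, worse, judgment, verdict))
                break
    return kept  # fine-tune the evaluator on `kept`, then repeat the loop
```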
Such agents are envisioned by many in the AI field as intelligent digital assistants that can perform a wide array of tasks without human intervention. The development could also reduce reliance on the costly and inefficient process known as Reinforcement Learning from Human Feedback (RLHF), which depends on human annotators with specialized expertise to verify AI outputs.
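For contrast with RLHF, the short sketch below shows how an AI judge could stand in for a human annotator when labeling a preference pair. The `(prompt, chosen, rejected)` record layout and all names here are illustrative assumptions rather than a description of any specific company's pipeline.

```python
from typing import Callable, Dict

Generate = Callable[[str], str]  # any text-in / text-out model call

def ai_preference_label(
    judge: Generate,
    question: str,
    response_a: str,
    response_b: str,
) -> Dict[str, str]:
    """Label a preference pair with an AI judge instead of a human annotator.

    Which response ends up as 'chosen' is decided by the judge model's
    verdict rather than by human review.
    """
    verdict = judge(
        "Question: {q}\nAnswer A: {a}\nAnswer B: {b}\n"
        "Reply with exactly one letter, A or B, for the better answer.".format(
            q=question, a=response_a, b=response_b
        )
    ).strip().upper()[:1]
    chosen, rejected = (
        (response_a, response_b) if verdict == "A" else (response_b, response_a)
    )
    return {"prompt": question, "chosen": chosen, "rejected": rejected}
```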
“We hope, as AI becomes more and more superhuman, that it will get better and better at checking its work so that it will actually be better than the average human,” said Jason Weston, one of the researchers behind the project. “The idea of being self-taught and able to self-evaluate is basically crucial to the idea of getting to this sort of superhuman level of AI.”
While other companies like Google and Anthropic have also explored Reinforcement Learning from AI Feedback (RLAIF), they often do not release their models for public use. Meta’s decision to make its models available signals a commitment to open research and collaboration in the AI community.
In addition to the Self-Taught Evaluator, Meta released other AI tools, including an update to its Segment Anything image-segmentation model, a tool that shortens large language model (LLM) response-generation times, and datasets intended to aid the discovery of new inorganic materials.