Anthropic's Claude 3 AI model beat GPT-4 in the Chatbot Arena rankings

By: Bohdan Kaminskyi | 29.03.2024, 21:22

Anthropic's Claude 3 Opus large language model has outperformed OpenAI's GPT-4 for the first time on Chatbot Arena, a popular crowdsourced ranking used by researchers to evaluate the capabilities of AI language models.

Here's What We Know

Independent researcher Simon Willison noted that this is the first time the best available models, Opus for complex tasks and Haiku for efficiency, have come from a vendor other than OpenAI.

Chatbot Arena is run by the Large Model Systems Organization (LMSYS Org) and ranks models using subjective evaluations from users who compare the outputs of different language models. This crowdsourced approach helps to work around the difficulty of objectively measuring the performance of AI chatbots.
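To give a feel for how pairwise user votes can be turned into a ranking, here is a minimal sketch that assumes an Elo-style rating scheme. The article does not say which rating method the leaderboard uses, so the formula, the starting rating of 1000, the K-factor of 32 and the model names below are all illustrative assumptions, not LMSYS's actual implementation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after a single user vote.

    score_a is 1.0 if the user preferred A, 0.0 if they preferred B, 0.5 for a tie.
    The K-factor of 32 is an assumed value for illustration.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def rank_models(votes, initial: float = 1000.0, k: float = 32.0):
    """Process (model_a, model_b, score_a) votes in order and return models
    sorted from highest to lowest rating."""
    ratings = {}
    for model_a, model_b, score_a in votes:
        ra = ratings.get(model_a, initial)
        rb = ratings.get(model_b, initial)
        ratings[model_a], ratings[model_b] = update_ratings(ra, rb, score_a, k)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes between two placeholder models.
votes = [("model-x", "model-y", 1.0), ("model-x", "model-y", 0.5)]
for name, rating in rank_models(votes):
    print(f"{name}: {rating:.1f}")
```

In a scheme like this, a model gains more points for beating a higher-rated opponent than a lower-rated one, which is why a newcomer can overtake an established leader once enough users prefer its answers.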

The success of Claude 3 shows the growing competition in the field of AI language models. Some users have already replaced ChatGPT with Claude 3 in their workflows, which may affect OpenAI's market share.

However, OpenAI is expected to release a major new model, the successor to GPT-4 Turbo, this year, possibly in the summer. That release is likely to reshuffle the Chatbot Arena rankings again in the coming months.

Researchers emphasise the importance of diversity among leading vendors in the field, as it helps AI language model technologies evolve and improve their performance.

Source: Ars Technica