CXO Insight Middle East

Inception and MBZUAI launch AraGen Leaderboard with first generative tasks for Arabic LLM ecosystem

by CXO Staff
December 6, 2024
in Business, Government, Industries, News

Inception, in collaboration with MBZUAI, announced the launch of the AraGen Leaderboard, a framework designed to redefine the evaluation of Arabic LLMs


Inception, a G42 company specialising in AI-native products, in collaboration with the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), announced the launch of the AraGen Leaderboard, a framework designed to redefine the evaluation of Arabic Large Language Models (LLMs). Powered by the new, internally developed 3C3H metric, the framework delivers a transparent, robust, and holistic evaluation system that balances factual accuracy and usability, setting new standards for Arabic Natural Language Processing (NLP).

Serving over 400 million Arabic speakers worldwide, the AraGen Leaderboard addresses critical gaps in AI evaluation by offering a meticulously constructed evaluation dataset tailored to the unique linguistic and cultural intricacies of the Arabic language and region. The dynamic nature of this leaderboard tackles challenges such as benchmark leakage, reproducibility issues, and the absence of holistic metrics to evaluate both core knowledge and practical utility.

The introduction of generative tasks represents a significant advancement for Arabic LLMs, adding a new dimension to the evaluation process. Unlike traditional leaderboards, which rely primarily on static, likelihood-based accuracy benchmarks that fail to capture real-world performance, the AraGen Leaderboard evaluates the text that models actually generate. The organisers present the new benchmark as a way to foster AI innovation and improve model performance.

“The AraGen Leaderboard redefines Arabic LLM evaluation, setting a new standard for fairness, inclusivity, and innovation,” said Andrew Jackson, CEO of Inception. “By addressing the gaps in previous benchmarks and introducing generative tasks, the platform empowers researchers, developers, and organisations to create culturally aligned AI technologies. AraGen ensures transparency, reproducibility, and trust while advancing the global NLP landscape.”

The AraGen Leaderboard evaluates models across six dimensions: correctness, completeness, conciseness, helpfulness, honesty, and harmlessness (the three Cs and three Hs behind the 3C3H name). Featuring 279 questions across tasks such as Arabic grammar, general Q&A, reasoning, and safety, it prioritises the needs of Arabic speakers. Quarterly updates keep the leaderboard relevant, and public submissions are invited to support model refinement and foster growth in the Arabic AI ecosystem.
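To make the six-dimension idea concrete, the sketch below shows one way per-answer scores could be combined into a single number. It is an illustration only: the article does not describe how 3C3H aggregates its dimensions, so the 0-to-1 scale, the unweighted mean, and the names `DimensionScores` and `leaderboard_score` are all assumptions rather than the published method.

```python
from dataclasses import dataclass, fields
from statistics import mean
from typing import Iterable


@dataclass(frozen=True)
class DimensionScores:
    """Scores for one model answer on the six dimensions named in the article.

    Assumption: every score is normalised to the range [0, 1].
    """
    correctness: float
    completeness: float
    conciseness: float
    helpfulness: float
    honesty: float
    harmlessness: float

    def combined(self) -> float:
        # Assumption: a plain unweighted mean; the real metric may differ.
        values = [getattr(self, f.name) for f in fields(self)]
        if any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError("each score must lie in [0, 1]")
        return mean(values)


def leaderboard_score(answers: Iterable[DimensionScores]) -> float:
    """Average the combined score over all evaluated answers (hypothetical helper)."""
    return mean(a.combined() for a in answers)
```

For the authoritative definition of the metric, the Hugging Face post linked at the end of this article is the place to look.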

“AraGen is a major step towards open, collaborative, and reproducible evaluation of large language models for Arabic, with focus on their text generation capabilities. This contrasts with popular leaderboards, which rely primarily on multiple-choice questions. Moreover, AraGen is a dynamic board with new questions every three months, which makes it much harder to game compared to existing leaderboards,” said Professor Preslav Nakov, Department Chair of Natural Language Processing and Professor of Natural Language Processing, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI).

“Our goal was to create a benchmark that introduces generative task evaluation with a strong emphasis on transparency, reproducibility, and a rigorous measurement of models’ performances,” said Ali El Filali, Machine Learning Engineer at Inception and lead author of this work. “By evaluating models across multiple dimensions to assess both factuality and usability, the AraGen Leaderboard provides actionable insights for diverse NLP tasks. This empowers the Arabic AI community to develop safe and high-performing models for real-world needs that are important to our region. Moreover, AraGen sets a global example by demonstrating how AI benchmarks can prioritise equity and inclusion for underrepresented languages. It’s a step toward ensuring no language or culture is left behind in the AI revolution.”

The Leaderboard delivers detailed performance insights, enabling organisations to confidently select models that align with their requirements. By reducing the need for extensive internal testing, AraGen offers organisations a more cost-effective route to LLM evaluation through a better-suited metric, while strengthening trust through its transparent and reproducible methodology.

For more information about the AraGen Leaderboard and submission guidelines, visit https://huggingface.co/blog/leaderboard-3c3h-aragen  

Tags: Arabic LLM, AraGen Leaderboard, Inception, MBZUAI, Spotlight


© 2024 – CXO Insight Middle East. All Rights Reserved.
