XpertBench: New Benchmark Reveals 'Expert Gap' in LLMs Across Professional Domains

Source: arXiv cs.AI | Published: 27 Mar 2026 | Added to AI-101: 5 Apr 2026

AI-generated

TLDR

Researchers have released XpertBench, a comprehensive benchmark with 1,346 tasks across 80 categories spanning finance, healthcare, legal services, education, and research.

Results reveal a significant 'expert-gap' in current AI systems, with even leading models achieving only around 66% peak success rates and mean scores hovering near 55%.

Key Takeaways

A new benchmark of 1,346 expert-curated tasks shows that even leading LLMs peak at around 66% success on professional-level work in finance, healthcare, and legal services, with mean scores near 55%.
