AI-101

Holo3 Achieves State-of-the-Art 78.85% on OSWorld Computer Use Benchmark

Source: Hugging Face News | Published: 1mo ago

AI-generated

TLDR

H Company has unveiled Holo3, which sets a new state-of-the-art score of 78.85% on the OSWorld-Verified benchmark for desktop computer use. Holo3 reaches this score with only 10B active parameters (122B total) and runs at a fraction of the cost of large proprietary models such as GPT 5.4 or Opus 4.6.

The model was trained using an 'agentic flywheel' approach built on synthetic enterprise environments, which develops both perception and decision-making capabilities. H Company also designed the H Corporate Benchmarks, a set of 486 realistic multi-step tasks spanning e-commerce, business software, collaboration, and multi-app workflows. The Holo3-35B-A3B weights are openly available under the Apache 2.0 license, and the company is already working toward 'Adaptive Agency', in which models autonomously learn to navigate entirely new enterprise software.

Key Takeaways

  • H Company's Holo3 achieves 78.85% on the OSWorld-Verified benchmark with only 10B active parameters, outperforming much larger models at a fraction of the cost