Microsoft Releases Three New Foundational AI Models for Speech and Images
Source: TechCrunch AI
Published: 1mo ago
AI-generated
TLDR
Microsoft has unveiled three new foundational AI models through its Microsoft AI (MAI) division, which was established approximately six months ago. Together, the models cover voice-to-text transcription, audio generation, and image creation.
MAI is led by CEO Mustafa Suleyman, and its simultaneous release of three models signals an aggressive stance in the competitive AI market. The range of capabilities, spanning audio and image modalities, suggests Microsoft is aiming for broad coverage of high-demand AI applications, positioning itself to compete more directly with established players such as OpenAI.
Key Takeaways
- Microsoft's AI division (MAI) has released three new foundational models for voice transcription, audio generation, and image creation, challenging competitors roughly six months after the division was formed