MAI-Voice-2

Turn text into expressive, natural-sounding speech in seconds.

Features

MAI-Voice-2 produces natural, expressive speech from text or a short reference clip, with built-in guardrails ensuring only authorized, consented voices can be used.

Realistic expression

Organic pacing, tone, and emotional range that sound like a person, not a text-to-speech engine.

Voice: Acacia, Elm, Birch, Grove

Emotion: Joy, Anger, Disgust, Fear, Sadness

Instant voice matching

Capture any voice from a short reference clip, no fine-tuning needed.

Built for long-form

Stable, high-fidelity output that preserves speaker consistency across audiobooks, podcasts, and lectures.

Lectures

Audiobooks

Podcasts

Courses

Documentaries

Natural and expressive across 15 languages

Fluid, emotionally rich speech in 15 languages, without sacrificing quality.

German

Spanish

French

Hindi

Indonesian

Italian

Korean

Dutch

Portuguese

Russian

Thai

Turkish

Vietnamese

Chinese

A white egret stands in a dark, tranquil pond surrounded by lush, colorful foliage and lily pads, with the bird’s reflection visible in the water.

Using the Model

Text-to-speech made natural

Shakespearean Wisdom

Behold the silver wanderer of the reeds, gliding soft upon the mirrored dark. With patient poise it waits between the worlds of water, wind, and whispered evening light. A creature not in haste, yet never still, teaching us grace through every careful step.

Motivational Trainer

Alright, time to focus. Notice how the egret doesn’t rush the moment, it studies it. Every movement is deliberate, every pause intentional. That’s discipline. That’s control. So when the opportunity appears, you can strike without hesitation. Patience earns the catch. 

Sports Commentator

With everything on the line, the egret makes its move! Slow through the shallows… watching… waiting… And it’s a sudden strike! Got it! Incredible precision from the long beak! The fish never saw it coming. What a scene! Complete composure under pressure. A masterclass performance here in the pond tonight.

Performance

Leading in expressiveness and naturalness

Audio sample: Sadness Example

Version Comparison

MAI-Voice-2

Languages supported: 15
Voice cloning: Multilingual
Price: $22 per 1M characters

MAI-Voice-1

Languages supported: 1
Voice prompting / cloning: English
Price: $22 per 1M characters
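
To make the listed price concrete, here is a minimal Python sketch that estimates synthesis cost from a character count at $22 per 1M characters. The function name `estimate_cost_usd` is hypothetical, and the billing details are assumptions: the page does not say whether spaces and punctuation count, or whether rounding or minimum charges apply, so treat the result as a rough estimate.

```python
from decimal import Decimal

# Listed price for both MAI-Voice-2 and MAI-Voice-1: $22 per 1M characters.
PRICE_PER_MILLION_CHARS_USD = Decimal("22")


def estimate_cost_usd(num_chars: int) -> Decimal:
    """Estimate cost in USD for synthesizing `num_chars` characters.

    Assumption: every character is billed at the same flat rate, with no
    rounding, minimum charge, or free tier.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return PRICE_PER_MILLION_CHARS_USD * num_chars / Decimal(1_000_000)


if __name__ == "__main__":
    script = "Patience earns the catch."
    # Assumption: spaces and punctuation are counted as characters.
    print(estimate_cost_usd(len(script)))
    # Half a million characters at the listed rate.
    print(estimate_cost_usd(500_000))
```

Using `Decimal` rather than floats keeps small per-request amounts exact, which matters when many short requests are summed.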

Featured Partner

An older man wearing glasses and a blazer reads a book inside a cozy bookstore filled with shelves and stacks of books.

“One of [the researchers] recorded my introduction and the next thing I knew, he was playing my voice…and the intonation, the pauses…I just thought, wow, that’s quite nice.

“It’s really exciting for me because what we’re about to embark on together is in a way my lifetime’s ambition, which is to bring poetry to everyone.”

– William Sieghart, Ode Founder/Director and author of the “Poetry Pharmacy” anthologies

Dig Deeper

Model card

Blog post

Try MAI-Voice-2

MAI Playground

Experiment with all other MAI models.

Try in Playground

Microsoft Foundry (Azure Speech)

Build and deploy MAI-Voice with Azure Speech.

Try in Azure Speech
