What Is Sarvam AI? The Indian startup that beat Google Gemini and ChatGPT

Sarvam AI's OCR tool shows strong potential for India-specific tasks, outperforming Google Gemini and ChatGPT in reading documents written in native Indian languages. Its Bulbul V3 model has also emerged as a leading AI voice-generation system. Here is what you need to know about Sarvam AI.

Bengaluru-based Sarvam AI has emerged as a major player in India’s AI push, developing language and voice models tuned to Indian languages and use cases. Founded in 2023 by Dr Vivek Raghavan and Dr Pratyush Kumar, the company builds compact, efficient models designed for phones, call systems and local languages rather than only for high-end cloud computers.

WHAT IS SARVAM AI AND WHY IT MATTERS

Sarvam’s aim is practical: build AI that works for India’s linguistic mix and limited bandwidth. Its product line includes small-to-medium-sized language models, speech tools and APIs for speech-to-text and text-to-speech.

The firm argues that careful data curation and task tuning can beat larger, generic models on India-specific problems.

HOW DID IT "BEAT" GOOGLE AND ChatGPT?

The company’s recent launches, notably Bulbul V3 (a text-to-speech system) and Vision (an OCR/vision model), were tested against global systems and came out ahead on targeted, India-focused benchmarks.

In a blind listening study and automated error tests, Bulbul V3 recorded lower error rates on telephony-grade audio and handled numerals, named entities and code-mixed text better than several global TTS systems, according to Sarvam and coverage in national media.

Separately, early tests on document reading in Indian languages showed the company’s Vision tool outperforming generalist models on some India-language OCR tasks.

These wins are task-specific, not blanket judgments of overall AI capability.

HOW INDEPENDENT WERE THE TESTS?

Sarvam says the Bulbul V3 results come from blind listening studies and automated comparisons done with partners and public models.

Media reports note independent listener votes and large sample sizes for some tests, but they also caution that vendor-led evaluations require outside replication for conclusive rankings.

In short, the results are notable but should be read as early evidence, not final proof that one company has dethroned global leaders.

WHAT THIS MEANS FOR USERS AND BUSINESSES

For Indian firms and public services, Sarvam’s tools promise cheaper, local-language voice agents and better OCR for native scripts.

The company has also worked with cloud partners and is part of government talks on sovereign AI, which could accelerate adoption in government and telecom applications. Still, large global models remain stronger on many general tasks; Sarvam’s advantage today is its India focus and efficiency.

Sarvam AI’s recent results are important for India’s tech ecosystem.

The results show that focused engineering and local data can produce models with the caliber to outshine generic systems on real-world, country-specific tasks.

Wider, independent testing will decide how far this lead goes.

- Ends
Published By: Rishab Chauhan
Published On: Feb 8, 2026
