India's Sovereign AI Stack in 2026: Sarvam, BharatGen, Krutrim — What's Built, What's Working

Three indigenous-LLM efforts now matter in India: Sarvam, BharatGen, Krutrim. Pick by Indic-language load and openness, not English benchmarks.

4 May 2026 Updated 19 May 2026 ~15 min read

What happened

Three Indian AI labs are far enough along that a dev team building an Indic-language product in 2026 can pick one and ship. Sarvam AI publishes Indic-tuned open-weight models on Hugging Face, including its February 2026 flagship release of Sarvam-30B and Sarvam-105B under Apache 2.0. BharatGen, the IIT Bombay-led mission, has released the Param-1 bilingual model and a wider Param family covering multilingual and multimodal variants. Krutrim, Ola’s AI arm, runs a consumer chat assistant and a developer cloud platform, and as of 5 May 2026 has publicly refocused away from frontier foundation-model and chip-design work toward AI cloud services. The IndiaAI Mission, approved by Cabinet on 7 March 2024 with a ₹10,371.92 crore outlay (approximately US$1.22 billion at the 19 May 2026 reference rate) over five years¹, is the public funding line behind much of this momentum.

For a dev team choosing an Indic LLM today, the honest take is short. Sarvam is the most production-ready open-weight option, with a 2026 lineup that ranges from compact 2B Indic-tuned bases to the Sarvam-105B mixture-of-experts flagship covering all 22 official Indian languages. BharatGen is the academic-credible bet when reproducibility, public training-data documentation, and IIT-issued provenance matter for a procurement filter. Krutrim is best read in 2026 as a consumer-product play layered on an AI-cloud business, with open-weight Indic and multimodal releases on its krutrim-ai-labs Hugging Face organisation under the Krutrim Community License. None of the three matches GPT-5 or Claude Sonnet on English-language reasoning benchmarks, and none needs to. The choice axes are Indic-language load, the openness and licence terms of the weights, and how the procurement chain reads the issuer of record. (All product framing here is as of 5 May 2026; pricing, model availability, and strategic posture change quickly, so verify on the lab’s own pages before treating any of this as a stable target.)

Image: BharatGen official site, used for editorial coverage of the foundation-model mission.

The details

Item	Value
Active indigenous-LLM efforts	Sarvam AI, BharatGen, Krutrim
IndiaAI Mission outlay	₹10,371.92 crore over five years (Cabinet-approved 7 March 2024)¹
Sarvam open-weight family (mixed licences)	Sarvam-30B and Sarvam-105B (February 2026 flagship MoE family, 22 official Indian languages, Apache 2.0); Sarvam-M (hybrid thinking model, Indic + English, Apache 2.0); Sarvam-1 and Sarvam-2B (Indic-first base models, 10 languages, Sarvam non-commercial licence — verify the specific Hugging Face model card before assuming a licence)²⁶
BharatGen Param family	Param-1 (2.9B bilingual Hindi + English); wider Param family extending to multilingual variants and multimodal models including Patram (document-vision 7B), Shrutam (ASR), and Sooktam (TTS) across nine Indic languages³⁷
Krutrim public surface	Consumer assistant + Krutrim Cloud developer platform; `krutrim-ai-labs` Hugging Face organisation publishes open-weight releases (Krutrim-1-instruct 7B, Krutrim-2-instruct 12B, Chitrarth multilingual VLM, Dhwani speech-translation, Krutrim-Translate, Vyakyarth) under the Krutrim Community License⁴⁸
Strategic posture (5 May 2026)	Sarvam: open-weight Indic-LLM bet, scaling. BharatGen: academic-credible foundation-model mission, IIT-anchored. Krutrim: refocused on AI cloud services; foundation-model and chip-design work paused, per public reporting⁹
Issuer of record on weights	Sarvam: Sarvam AI on Hugging Face. BharatGen: IIT Bombay-led consortium. Krutrim: Krutrim AI Labs (Krutrim SI Designs)
Date stamp on every figure here	5 May 2026

What changed between 2024 and 2026

The 2024 picture was a single name plus a Cabinet announcement. Sarvam AI was the visible commercial bet, BharatGen had been announced as a research mission, and Krutrim had launched a consumer chat product to mixed early reception. The sovereignty conversation was largely future-tense.

By mid-2026, four things had shifted. First, the IndiaAI Mission moved from Cabinet approval to deployed compute capacity, with public empanelment of GPU providers and subsidised access for Indian researchers and startups.¹ Second, Sarvam went from a single open-weight 2B model to a broader lineup whose flagship models ship under Apache 2.0, culminating in the February 2026 release of Sarvam-30B and Sarvam-105B. The 105B is a mixture-of-experts model with 10.3 billion active parameters that covers all 22 official Indian languages, and its licence lets an Indian team fine-tune and self-host it.²⁶ Third, BharatGen released the bilingual Param-1 model alongside its multilingual and multimodal Param family (Patram, Shrutam, Sooktam), with public training-data documentation and an IIT Bombay institutional anchor that matters when a public-sector buyer’s procurement filter wants an Indian academic issuer of record.³⁷ Fourth, Krutrim’s posture moved from “consumer chat plus frontier-model ambitions plus chip design” toward “AI cloud services first, with the existing open-weight releases on krutrim-ai-labs continuing under the Krutrim Community License.” TechCrunch and MediaNama reported the strategic narrowing on 5 May 2026.⁹

Underneath all three commercial labs, the open-source Indic-NLP base maintained by the AI4Bharat group at IIT Madras keeps doing the unglamorous heavy lifting. Their IndicTrans2 (22-language neural machine translation) and IndicBERT (12-language masked LM) artefacts are the canonical reference points for the Indic-language benchmarking discussion below.

The thing that did not change: none of the three labs claims SOTA on English-language reasoning benchmarks, and that framing is correct. The reason to use any of these models is Indic-language coverage, on-shore data residency under DPDP Act 2023 obligations, or a procurement requirement for an Indian-issuer model. English-only chat is what GPT-5 and Claude Sonnet are for.

Sarvam AI: the open-weight Indic bet

Sarvam AI, founded in 2023 and headquartered in Bengaluru, ships open-weight Indic-language models on Hugging Face under the sarvamai organisation.² Its 2026 lineup spans three meaningfully different model classes:

Sarvam-1 and Sarvam-2B are the Indic-first base models, pretrained on the 10 Indic languages they explicitly support (Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu) plus English. These are the workhorses for short-form Indic reasoning, summarisation, and structured-output tasks at small parameter counts.²⁵
Sarvam-M is a hybrid thinking model post-trained on Indian languages alongside English, with explicit “think” and “non-think” modes for reasoning, math, and coding workloads. Treat it as the reasoning model in the family rather than a translation or Indic-only workhorse.²
Sarvam-30B and Sarvam-105B are the February 2026 flagship release, both Apache-2.0 on Hugging Face. The 105B model is a mixture-of-experts with 10.3 billion active parameters per token and explicit support for all 22 official Indian languages. That is the broadest language coverage in any Indian open-weight LLM as of writing.⁶

The practical implication for an Indian dev team is that Sarvam’s flagship Apache-2.0 models (Sarvam-30B, Sarvam-105B, Sarvam-M) are self-hostable for commercial use without licence friction, while the smaller Sarvam-1 and Sarvam-2B base models ship under Sarvam’s non-commercial licence and can go to production only under a written commercial licence from Sarvam. That distinction matters when a customer’s data cannot leave your VPC, when a procurement contract specifies India-resident inference, or when you want to fine-tune on your own domain corpus without a per-token API meter. Verify the specific Hugging Face model card for the variant you choose before committing to a deployment.
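A rough weights-only memory estimate helps when sizing a self-hosted deployment. The sketch below assumes Sarvam-105B has about 105 billion total parameters (inferred from the model name, not confirmed against the model card) and uses the 10.3 billion active-parameter figure cited above. It ignores KV cache, activations, and runtime overhead, and it assumes a conventional mixture-of-experts serving setup in which every expert stays resident in memory.

```python
# Rough weights-only memory estimate for self-hosting a mixture-of-experts model.
# Assumption: ~105e9 total parameters, inferred from the model name only.
TOTAL_PARAMS = 105e9
ACTIVE_PARAMS = 10.3e9  # active parameters per token, as cited in the article

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights, in decimal gigabytes."""
    return total_params * bytes_per_param / 1e9


def active_fraction(active_params: float, total_params: float) -> float:
    """Return the share of parameters used for each token."""
    return active_params / total_params


if __name__ == "__main__":
    for precision, nbytes in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, nbytes)
        print(f"{precision}: ~{gb:.1f} GB of weights")
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per token: {share:.1%}")
```

Under these assumptions the bf16 weights alone come to about 210 GB, so the model needs several accelerators or quantisation even though only about a tenth of the parameters are active per token. The active-parameter count mainly reduces per-token compute, not the resident weight footprint.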

What Sarvam is genuinely good at: short-form Indic reasoning, summarisation, and structured-output tasks where the model has seen the language in training. The 105B flagship adds coverage across the long tail of officially-recognised Indian languages that smaller Indic-tuned models do not reach. What it is not built for: long-context English coding, frontier multimodal reasoning, or use cases where the customer’s actual ask is “the same as GPT-5 but cheaper”. For those use cases, the cheaper-than-frontier path is hosted GPT-4o-mini or hosted Sonnet on a per-token meter, not a self-hosted Indic model.

BharatGen: the academic-credible foundation-model mission

BharatGen is the IIT Bombay-led foundation-model initiative announced under the IndiaAI Mission and supported by the Department of Science and Technology.³ Its public framing is research-first: the project publishes training-data documentation, evaluation methodology, and model cards in the academic register that a peer reviewer or a public-sector procurement officer would expect to see.

The Param family is BharatGen’s headline release line. Param-1 is a 2.9-billion-parameter bilingual (Hindi + English) language model. BharatGen’s own description on its launch page is unambiguous, and the supporting arXiv paper (2507.13390) confirms the bilingual scope.³¹⁰ The wider Param family extends beyond Param-1 to multilingual variants, plus a set of multimodal models that handle document-vision (Patram, a 7B vision-language model), automatic speech recognition (Shrutam), and text-to-speech (Sooktam) across nine Indic languages.⁷ The release cadence has been slower than Sarvam’s because the academic-research register has higher documentation overhead than a startup release does, and that is a feature for the readers who care about it.

The pitch is clearest for two reader profiles. Public-sector buyers running a procurement filter on the issuer of record see “IIT Bombay-led consortium” and pass it; “private startup” may not pass the same filter at the same level. Indian academic researchers working on Indic NLP get a documented, reproducible, India-issued artefact to build on. For a private-sector dev team optimising for time-to-ship, BharatGen is slower than Sarvam to deploy but more credibly auditable when audit matters; for a team whose specific need is document-vision or Indic ASR/TTS rather than text-only LLM completion, the Patram, Shrutam, and Sooktam releases give BharatGen a multimodal surface that Sarvam’s text-only lineup does not match.

Krutrim: the consumer-product and AI-cloud bet (in active flux)

Krutrim, founded by Ola’s Bhavish Aggarwal and operated under Krutrim SI Designs, takes a different shape from the other two, and as of 5 May 2026 its strategy is in active flux. This section is therefore the most time-sensitive in the article.⁴⁹ The public surface still has three threads: a consumer-facing chat assistant aimed at the same usage pattern as ChatGPT or Gemini, an AI cloud platform offering compute and model APIs, and a Hugging Face organisation (krutrim-ai-labs) hosting open-weight model releases.⁸

Krutrim does ship open weights. The krutrim-ai-labs organisation publishes Krutrim-1-instruct (7B), Krutrim-2-instruct (12B, built on a Mistral-NeMo architecture base), Chitrarth (a multilingual vision-language model spanning ten Indian languages), Dhwani (speech-to-text translation), Krutrim-Translate, and Vyakyarth (a sentence-transformer model). All are released under the Krutrim Community License rather than Apache 2.0.⁸ The differentiator from Sarvam is not “proprietary versus open” but the licence terms (Krutrim Community License versus Apache 2.0), the release cadence, the English-versus-Indic balance of the lineup, and the consumer-product framing layered on top.

What changed on 5 May 2026 is the strategic emphasis. Reporting from TechCrunch and MediaNama on that date describes Krutrim publicly refocusing on AI cloud services and pausing further frontier foundation-model and chip-design work, with the existing model releases on Hugging Face continuing.⁹ The fair read for an Indian dev team is that Krutrim’s go-to-market reach in consumer chat and Indian cloud compute remains real, the existing open-weight artefacts on krutrim-ai-labs are reusable under the Krutrim Community License terms, and any plan that depends on a future Krutrim frontier-model release should treat the timeline as undefined until the lab publicly recommits. Verify Krutrim’s current focus on its own pages before treating any of this as a stable target. Pivots of this scale tend to keep moving for a few weeks before settling.

Image: Krutrim official site, used for editorial coverage of the Ola-incubated AI venture.

Why it matters

The IndiaAI Mission’s ₹10,371.92 crore over five years is one of the largest public-sector AI compute and research outlays the country has committed to.¹ The mission’s pillars include shared GPU compute (subsidised access for Indian startups and researchers), an indigenous foundation-model line, an Indic datasets platform, application-development funding, and skilling. Sarvam, BharatGen, and Krutrim are the three labs that have shipped enough public artefacts to be meaningful beneficiaries of and contributors to that funding posture.

The DPDP Act 2023 sovereignty angle adds a second pressure on the choice. When a customer’s data cannot lawfully leave India, hosted inference on a US-headquartered frontier model is not a clean fit for the procurement contract. An on-shore Indian-hosted model on Indian GPU compute, fine-tuned on Indic-language tasks, is closer to clean. The labs that actually let you self-host or run inference on India-resident infrastructure are the ones that win this category. Sarvam’s Apache-2.0 lineup and BharatGen’s institutional anchor are the cleanest fits for procurement constraints that need both an open-weight artefact and an Indian issuer of record; Krutrim’s Community-Licence releases and AI cloud platform are usable for the same constraint but require the procurement chain to read and accept the Krutrim Community License terms rather than Apache 2.0.

For a developer reading this and choosing what to test: download a Sarvam model from Hugging Face, run it on your own GPU or on Krutrim Cloud or another India-resident provider, and benchmark it on your actual Indic-language task before you decide. The aggregate framing in this article is not a substitute for your own evaluation on the specific task you ship.
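A minimal sketch of that test loop, assuming the Hugging Face `transformers` library and PyTorch are installed. The repository id `sarvamai/sarvam-1` is an assumption based on the organisation name cited above, and the single Hindi example is a hypothetical placeholder, so substitute the model card you verified and your own task data. The scoring function is a deliberately crude substring check, not a standard Indic benchmark.

```python
from typing import Callable, Iterable, Tuple


def load_hf_generator(model_id: str, max_new_tokens: int = 64) -> Callable[[str], str]:
    """Build a prompt -> completion function from a Hugging Face causal LM.

    The import is local so the scoring code below runs without transformers.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    def generate(prompt: str) -> str:
        inputs = tokenizer(prompt, return_tensors="pt")
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
        # Drop the prompt tokens so only newly generated text is returned.
        new_tokens = output[0][inputs["input_ids"].shape[1]:]
        return tokenizer.decode(new_tokens, skip_special_tokens=True)

    return generate


def contains_match_rate(
    generate: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of examples whose expected answer appears in the model output."""
    pairs = list(examples)
    if not pairs:
        raise ValueError("need at least one (prompt, expected) example")
    hits = sum(1 for prompt, expected in pairs if expected in generate(prompt))
    return hits / len(pairs)


if __name__ == "__main__":
    # Hypothetical placeholder data; replace with examples from your own task.
    examples = [("भारत की राजधानी क्या है?", "नई दिल्ली")]
    generate = load_hf_generator("sarvamai/sarvam-1")  # assumed repository id
    print(f"match rate: {contains_match_rate(generate, examples):.2f}")
```

Because `contains_match_rate` only needs a prompt-to-text callable, the same examples can be run against a hosted API or another lab's weights by swapping the generator, which keeps the comparison on your task rather than on published aggregates.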

Honest caveats

We could not independently verify training-data sourcing claims for any of the three labs at the level a peer reviewer would require. BharatGen publishes more documentation than Sarvam or Krutrim does, but a full reproducibility check is beyond the scope of an aggregation article. Treat the institutional positioning as a procurement signal, not as a substitute for your own evaluation.

Indic-language benchmark numbers move quickly, and the labs do not all publish on the same eval suites. Public Indic benchmarks like IndicGenBench, MMLU-Indic translations, and the AI4Bharat eval sets are the closest things to apples-to-apples comparisons; reporting we have seen in 2025–2026 is not consistently primary-sourced, so we are not surfacing benchmark numbers in this article without a verifiable lab-issued source. If a vendor pitches you a single benchmark number, ask which evaluation set, which version, and which inference configuration produced it.
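One way to make those three questions routine is to record every vendor claim in a small structure and flag what is missing. This is an illustrative sketch only: the `BenchmarkClaim` type, its field names, and the example values are hypothetical and do not describe any real lab's results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BenchmarkClaim:
    """A vendor-quoted benchmark number plus the provenance needed to trust it."""
    model: str
    score: float
    eval_set: Optional[str] = None          # which evaluation set
    eval_version: Optional[str] = None      # which version of that set
    inference_config: Optional[str] = None  # decoding settings, shots, precision


REQUIRED_PROVENANCE = ("eval_set", "eval_version", "inference_config")


def missing_provenance(claim: BenchmarkClaim) -> List[str]:
    """List the provenance fields the vendor has not supplied."""
    return [name for name in REQUIRED_PROVENANCE if not getattr(claim, name)]


if __name__ == "__main__":
    # Hypothetical claim with a placeholder score, not a real result.
    claim = BenchmarkClaim(model="example-model", score=0.0, eval_set="IndicGenBench")
    print("ask the vendor for:", missing_provenance(claim))
```

A claim is comparable with another only when all three fields are filled in and match.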

The IndiaAI Mission outlay figure cited above is the headline Cabinet-approved number; the actual disbursement schedule and the share that flows to each of these labs is not publicly itemised at this writing. Treat the ₹10,371.92 crore as a five-year ceiling, not as an annual run-rate any one lab is drawing on.
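The arithmetic behind the headline figure can be checked in a few lines. The sketch below uses only the numbers quoted in this article: the ₹10,371.92 crore outlay, the five-year window, and the approximately $1.22 billion conversion. The even per-year split is purely illustrative, since the actual disbursement schedule is not public.

```python
CRORE = 10_000_000           # 1 crore = 10 million
OUTLAY_CRORE = 10_371.92     # Cabinet-approved outlay, in crore rupees
YEARS = 5
USD_ESTIMATE = 1.22e9        # approximate dollar figure quoted in the article


def crore_to_rupees(amount_crore: float) -> float:
    """Convert an amount in crore to rupees."""
    return amount_crore * CRORE


def implied_inr_per_usd(outlay_crore: float, usd: float) -> float:
    """Exchange rate implied by the rupee outlay and its quoted dollar value."""
    return crore_to_rupees(outlay_crore) / usd


def even_split_per_year_crore(outlay_crore: float, years: int) -> float:
    """Illustrative even split; the real disbursement schedule is not itemised."""
    return outlay_crore / years


if __name__ == "__main__":
    print(f"outlay in rupees: {crore_to_rupees(OUTLAY_CRORE):,.0f}")
    print(f"implied INR per USD: {implied_inr_per_usd(OUTLAY_CRORE, USD_ESTIMATE):.2f}")
    print(f"even split per year: {even_split_per_year_crore(OUTLAY_CRORE, YEARS):,.2f} crore")
```

The quoted dollar figure implies a rate of roughly ₹85 to the dollar, and an even split would average about ₹2,074 crore a year across the whole mission, before any of it is divided among compute, datasets, skilling, and individual labs.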

What we don’t yet know

The 2026 picture leaves three open questions. The first is whether any of the three labs publishes a credible English-reasoning benchmark result that closes the gap with frontier models on the eval suites Indian developers actually care about for English-language workloads. The second is whether the IndiaAI Mission’s compute-empanelment scheme produces a subsidised-inference price point that meaningfully changes Indian startups’ AI cost structure. The third is whether the consolidation pressure running through Indian ed-tech in 2025–2026 reaches the AI lab landscape by 2027, with one of these three acquiring another, being acquired, or coming under a public-sector entity.

We will update this article when any of the three publishes a benchmark result, a major model release, or a strategic shift that changes the recommendation. Re-read in late 2026 for the next snapshot.

How this article was made: an autonomous AI pipeline researched, drafted, fact-checked, and reviewed this piece, aggregating publicly-available information from the sources consulted below. AI (artificial intelligence) can make mistakes, so please cross-check the consulted sources before acting on anything here. Neural Tech Daily is not liable for decisions or outcomes based on this article.

Sources consulted

Cited Sources

1. Press Information Bureau release on Cabinet approval of IndiaAI Mission, ₹10,371.92 crore outlay over five years (7 March 2024) (accessed 2026-05-05)
2. Sarvam AI's Hugging Face organisation page listing open-weight Indic-language model releases (Sarvam-1, Sarvam-2B, Sarvam-M, Sarvam-30B, Sarvam-105B, Sarvam-Translate) (accessed 2026-05-05)
3. BharatGen Param-1 announcement page describing the 2.9-billion-parameter bilingual (Hindi + English) language model (accessed 2026-05-05)
4. Krutrim official site (Ola-incubated AI venture, consumer assistant plus AI cloud platform) (accessed 2026-05-05)
5. Sarvam AI official site (Indic-first product framing, 10-language base coverage on Sarvam-1 and Sarvam-2B) (accessed 2026-05-05)
6. Sarvam blog post announcing Sarvam-30B and Sarvam-105B (February 2026), with Sarvam-105B as a mixture-of-experts with 10.3B active parameters supporting all 22 official Indian languages, both released under Apache 2.0 (accessed 2026-05-05)
7. BharatGen Hugging Face organisation page listing the wider Param family alongside Patram (document-vision 7B), Shrutam (ASR), and Sooktam (TTS) for Indic languages (accessed 2026-05-05)
8. Krutrim AI Labs Hugging Face organisation page listing open-weight releases (Krutrim-1-instruct 7B, Krutrim-2-instruct 12B on Mistral-NeMo architecture, Chitrarth multilingual VLM across 10 Indian languages, Dhwani speech-to-text translation, Krutrim-Translate, Vyakyarth sentence-transformer) under the Krutrim Community License (accessed 2026-05-05)
9. TechCrunch report (5 May 2026) on Krutrim's strategic refocus toward AI cloud services and the pausing of frontier foundation-model and chip-design work; cross-referenced by MediaNama's coverage (accessed 2026-05-05)
10. arXiv 2507.13390: PARAM-1 BharatGen 2.9B Model paper confirming the bilingual (Hindi + English) scope of Param-1 (accessed 2026-05-05)