On January 18, 2026, Arab News reported that Pakistani student Taimoor Hassan had launched Qalb, an Urdu‑first large language model he describes as the largest LLM built exclusively for the Urdu language. Trained on roughly 1.97 billion tokens and benchmarked on multiple evaluation suites, Qalb is intended to power future chat apps and sector‑specific localized models for Pakistan and the wider Urdu‑speaking world.
Qalb is a small but important marker of how quickly the LLM wave is localizing beyond English and a handful of major languages. A student‑led team claiming competitive performance on an Urdu‑only model trained on under 2 billion tokens suggests that the tooling, open weights, and know‑how required to build usable language models are diffusing rapidly—even in countries without hyperscale compute. ([arabnews.com](https://www.arabnews.com/node/2629770/pakistan))
Strategically, Urdu is a good test case for “long‑tail” languages: over 200 million speakers, rich literature, but under‑served by mainstream models. If Qalb and similar efforts deliver better cultural grounding, idiom handling and domain adaptation than generic multilingual LLMs, they’ll validate a pattern we’re already seeing in Arabic, Indonesian and African languages: local players fine‑tuning or training specialized models on modest budgets.
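To make the "modest budget" pattern concrete, below is a minimal sketch of what such a localization effort typically looks like with the open Hugging Face stack: continued training of a small open‑weights causal LM on a monolingual corpus. Everything here is an assumption for illustration; the base checkpoint (`Qwen/Qwen2.5-0.5B`), the corpus file `urdu_corpus.txt`, and the hyperparameters are placeholders, and none of it reflects Qalb's actual pipeline, which the report does not describe.

```python
# Hypothetical sketch of low-budget language localization: fine-tuning a
# small open-weights causal LM on a plain-text Urdu corpus. Base model,
# file names, and hyperparameters are placeholders, not Qalb's real setup.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "Qwen/Qwen2.5-0.5B"  # placeholder open-weights checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Any plain-text corpus works; one document per line is assumed here.
corpus = load_dataset("text", data_files={"train": "urdu_corpus.txt"})

def tokenize(batch):
    # Truncate long lines; packing/chunking is omitted to keep the sketch short.
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="urdu-lm",
        per_device_train_batch_size=4,
        num_train_epochs=1,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    # mlm=False gives standard next-token (causal LM) labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

For a model this small, a run like this fits on a single commodity GPU, which is precisely why student teams and local players can now attempt language‑specific models at all; scaling the same recipe to a corpus in the low billions of tokens is a matter of more data and more GPU hours, not different tooling.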
For the AGI race, this doesn’t shift the frontier of raw capability, but it does broaden who gets to participate in building and steering AI. That diversity matters: the more communities that can experiment with their own models, the richer the feedback on alignment, safety and usefulness. It also means that talent in places like Pakistan is getting hands‑on experience with model training, evaluation and deployment—skills that will be directly relevant as global players recruit for the next generation of AGI labs.