NaijaVoices Breaks AI Language Barrier, Empowers Nigerians to Speak to Technology in Native Tongues – Abraham Owodunni

For decades, digital assistants like Alexa and Siri failed to speak the languages of Nigeria’s people. Now that reality is changing, thanks to NaijaVoices, a groundbreaking local-language artificial intelligence speech project that is giving voice technology a truly Nigerian voice.

Founded with the mission to make artificial intelligence more inclusive, NaijaVoices is addressing one of the most significant gaps in the country’s digital ecosystem: the inability of speech AI systems to understand and respond in Nigerian languages such as Yoruba, Igbo, and Hausa.

“For years, everyday Nigerians could not talk to AI systems like Alexa, Siri, in our local languages we actually use at home and in markets,” said Abraham Owodunni, a PhD researcher at The Ohio State University and lead contributor on the project. “NaijaVoices changes that by giving AI the rich, real voices it needs to understand our accents, proverbs, and rhythm.”

The project has released a massive speech dataset of 1,838.5 hours, drawn from 5,455 native speakers and over 645,000 unique sentences. The dataset is openly available on Hugging Face, enabling researchers, developers, and businesses to build AI systems that actually “hear” Nigerians.

More than a technological feat, NaijaVoices is a deeply cultural intervention. Instead of relying on scraped web data, the team worked with local communities to co-create culturally grounded text prompts and record authentic voices across age groups and regions.

“We did something simple but powerful: we built the right data, the right way,” Owodunni explained. “When modern speech AI models are trained on NaijaVoices, they learn our voices, so the accuracy jumps, and the results sound like us.”

The immediate applications are vast and critical. From healthcare, where patients can describe symptoms in their mother tongue, to finance, with secure voice-first verification for non-English speakers, the project is reshaping access to essential services. Emergency services, public information broadcasts, literacy tools, and customer care can now be tailored for the languages people actually use.

The team’s mantra, “our language is our strength,” underscores its approach to data collection and annotation. From writers to facilitators, every contributor is a stakeholder in shaping the final product. According to Owodunni, “That community‑first approach is why the data is authentic and models trained on this dataset work better.”

NaijaVoices isn’t just building data; it’s nurturing local AI talent. The initiative has turned the data creation process into a skills pipeline, equipping writers, engineers, and linguists with practical experience in natural language processing and speech technology.

Its success is also powered by strong partnerships, ranging from Lacuna Fund and Meta to McGill University, Mila, Masakhane, Intron, and other global and African collaborators. “Each partner brings expertise such as compute, funding, linguistics, or deployment so that the final product is both ethical and useful,” said Owodunni.

But the team remains acutely aware of the ethical stakes. “Consent, privacy, and respect come first,” Owodunni affirmed. Voices are never secretly scraped; data is anonymized, and diversity across gender, age, and region is prioritized.

Looking ahead, the project aims to scale its impact across Africa. “NaijaVoices is a blueprint which is properly documented in our research paper, published at Interspeech 2025,” Owodunni said. “The same community‑driven model can be used to build strong datasets for more African languages, so a child in Kano, Kampala, or Kisangani can learn and bank in the language they know best.”

Still, challenges remain. From the cost of storage, compute, and model training to the need for sustained funding and supportive policy frameworks, the journey is far from over. “Policy-wise, we’d love to see public services and major platforms required to offer local‑language support. That one decision would turbo‑charge inclusion and the market for Nigerian‑language AI,” Owodunni noted.

Among the project’s latest initiatives is the NaijaVoices Language Heritage Micro‑Grants, a ₦4 million package aimed at supporting community projects that document and revitalize Nigerian languages.

To aspiring Nigerian AI researchers, Abraham Owodunni offers this advice: “Start with a problem your community feels. Build with people, not for them. Share your results so others can climb higher.”

With plans to expand language coverage and pilot solutions in education and health, NaijaVoices is fast becoming a model for inclusive AI development across the continent.

“The vision is simple,” Owodunni concludes, “if you can speak it in Nigeria, AI should understand it beautifully.”
