Lelapa AI, a South African AI research lab, has pioneered VulaVula – a for-profit language processing tool that translates, transcribes and analyses English, Afrikaans, Zulu and Sesotho.
Data scarcity, ethical concerns
But AI experts say building LLMs in African languages poses significant challenges, ranging from availability of data to ethical concerns over consent, compensation and copyright.
Many African languages are low-resource languages, meaning there is a scarcity of data to train these models effectively – unlike high-resource languages such as English or French.
Michael Michie, co-founder of Everse Technology Africa, an AI startup building intelligence into data protection and privacy, said collecting the data needed to train LLMs also raised ethical questions.
In many African communities, oral tradition predominates, and some communities may not want to share their language to train LLMs, a choice that should be respected.
“There are currently no regulations or laws in African countries that address issues related to consent, privacy and compensation to communities when collecting data to train AI tools – this needs to be addressed,” said Michie.
“There are questions of who owns the language and who benefits. There needs to be guidelines to prevent exploitation and ensure the development of these LLMs benefits the people they are meant to serve,” he added.
Open-source initiatives like Creative Commons, which allow creators to legally share their work under specified conditions such as attribution and non-commercial use, are also not a perfect solution, some AI experts said.
“At the moment there’s this push of saying everything should just be under Creative Commons,” said Vukosi Marivate, associate professor of computer science at the University of Pretoria and co-founder of Lelapa AI.
But if everything is open source, it may be harder to properly reimburse and acknowledge the original contributors to these language models, he said.
“A lot of people are working on LLMs now because of the prestige, that’s where the money is, but we need to make sure that our languages are actually being taken care of.”
Source link: https://www.deccanherald.com/technology/from-swahili-to-zulu-african-techies-develop-ai-language-tools-3069309
Publish date: 2024-06-17 04:26:53
Copyright for syndicated content belongs to the linked Source.