Posted on February 15, 2023
Tech giant Google is not just an internet search engine whose brand name has entered pop culture as a word. The company also funds academic research and scholarships – and it recently awarded the University of Pretoria (UP) three grants totalling $65 000 (over R1.1 million).
About $40 000 (about R720 000) has been allocated to two one-year projects in which Google will collaborate with UP researchers at both professorial and student level. These projects are in the cutting-edge fields of cross-lingual data resources, climate change and artificial intelligence (AI), and are both embedded in the South African context.
The remaining $25 000 (about R450 000) is an unrestricted gift to the University, which has split the amount into grants of R50 000. UP has invited its researchers to apply for this support on condition that their research relates to Google’s priority areas, such as responsible AI.
The two principal investigators at UP who have received research awards of about R360 000 each are Associate Professors Vukosi Marivate and Sonali Das.
Prof Marivate is the Absa Chair of Data Science in UP’s Department of Computer Science. He is already a Google research scholar, and is involved in creating new benchmarks for African languages. Prof Marivate will be working with UP lecturer, coordinator of the MIT in Big Data Science Master’s Degree and Witwatersrand University PhD candidate Abiodun (Abbey) Modupe, as well as computer science master’s student Thapelo Sindane.
They will investigate cross-lingual transfer learning for four South African languages: Sepedi, Sesotho, Setswana and isiXhosa. The research includes creating task-based datasets for these languages, which have many speakers but barely exist on the internet.
The team has been collecting texts, such as translated government speeches, in these four languages. The texts will be labelled based on two criteria: whether words are, for example, the name of a person or place, known as “entity recognition”; and whether they have positive, negative or neutral suggestions, known as “sentiment analysis”. Those two functions will be outsourced, after which Prof Marivate and his team will build an AI model showing the cross-references between the four languages. The excitement of the research lies in exploring uncharted territory.
“The cross-lingual stuff is not something we’re sure about,” Prof Marivate said. “All the languages are different. They have different histories, in some ways; they have different sets of data that might be available to us. We’ll try to come up with machine-learning models that can best transfer knowledge between them. That’s not a known thing – it’s something we’re going to experiment with.”
He compared it to a bilingual dictionary, which is a type of cross-lingual model. But this research is not about creating a bilingual dictionary, nor is it about being given one, he added.
“We will only have the texts in each of the languages,” Prof Marivate explained. “We won’t have any information about how those texts connect to one another. We need to build a mathematical model that figures out these connections.”
Research assistant Sindane said the project will pave the way for all practitioners facing technical challenges with limited data.
“This project could be a catalyst for technological advancements in African languages,” he said. “It could lead to more grammar correction tools and a health system that operates in native languages, for instance.”
Prof Das, a statistician in UP’s Department of Business Management, in the Faculty of Economic and Management Sciences, is leading a project that is investigating climate change information needs in South Africa and opportunities for AI and natural language understanding.
When Prof Das presented a one-minute pitch to Google for consideration, she stressed her experience and her fascination with the potential of working in a cross-disciplinary partnership.
“We are invested in the holistic nature of the problem,” she said. “Interdisciplinary and transdisciplinary is the way to go. Each of us has years of experience in our fields. Now we want to see the impact of bringing different strengths together.”
Her team consists of climate finance expert Professor Rangan Gupta of the Department of Economics, and two of her business management master’s students, Kenny Kutu and Hannah Brown.
Prof Das used to be a principal researcher in statistics at the Council for Scientific and Industrial Research, where her work included assessing risk in different climatic aspects in South Africa.
“I am still invested in finding relationships between climate events and their effects, particularly risk in terms of business decisions, their economic implications and consequent business strategies,” she said.
‘It’s important to make ourselves visible’
Like Prof Marivate, Prof Das stresses the open-ended nature of her team’s research.
“I can’t tell you what information I am looking for because we don’t know,” she said. “What questions and answer pairs are we trying to evaluate, particularly in the era of accessible language models? I really don't know yet.”
They are starting by looking for answers to a few fundamental questions, including the following:
“South Africa is not like the rest of Africa, and it is not one situation,” Prof Das said. “People often think that something relevant for a location in Africa is an African problem in general – but Africa is diverse; it has six time zones. So it’s important to make ourselves visible in the right way to the world outside.”
South Africa is also heterogeneous and has problems other than the consistent problem of climate change, some of which are probably irreversible, she added.
“We have energy issues causing circular problems to the immediate agricultural problems; this has an impact on our food security, crop prices and decisions with regard to what will we grow in the next five years,” Prof Das said.
Google’s role in her project includes guidance on the open-source natural language understanding tools they will use in the research.
Copyright © University of Pretoria 2023. All rights reserved.