While the rest of the world is experiencing the so-called golden age of statistics, the discipline has been declared to be in crisis in South Africa by the Department of Science and Technology (DST) and the National Research Foundation (NRF), said Professor Andriëtte Bekker, Head of the Department of Statistics at the University of Pretoria (UP).
She was speaking at the opening ceremony of the first International Symposium on Computational and Methodological Statistics and Biostatistics hosted by UP recently at its Future Africa Campus.
The main objective of the symposium was to unify fundamental methodological research in statistics with the computational aspects of the modern era. It also served as a platform for collaborating with international experts and paved the way for continued international research collaboration. The programme included 15 speakers from around the globe.
Prof Bekker said that although there was huge demand for highly skilled statisticians, most graduates entered formal employment without completing their postgraduate studies. “A major step in advancing the discipline was the establishment of the South African Research Chairs Initiative (SARChI) Chair in Biostatistics at UP at the end of 2018,” she said.
Keynote speaker Professor Sudipto Banerjee, Head of the Department of Biostatistics at the University of California, Los Angeles, delivered a talk titled ‘High-dimensional Bayesian Geostatistics (on your laptop!)’. He said that with the growing capabilities of geographic information systems and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points.
“Such data arise in diverse disciplines within the environmental and physical sciences. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability in the environmental and physical sciences.”
However, fitting hierarchical spatiotemporal models often involves expensive matrix computations whose cost grows cubically with the number of spatial locations and time points. This computational bottleneck renders such models infeasible for large data sets.
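To make the cubic growth concrete, the following is a minimal sketch in plain Python, not code from the talk. It assumes a dense exponential covariance matrix over equally spaced one-dimensional locations (an illustrative choice) and counts the multiply-add steps in a textbook Cholesky factorisation, the kind of matrix computation that dominates the fitting of such models. The function names are hypothetical.

```python
import math

def dense_exp_cov(points, phi=1.0):
    """Dense exponential covariance matrix for 1-D locations (illustrative assumption)."""
    return [[math.exp(-phi * abs(a - b)) for b in points] for a in points]

def cholesky_with_count(a):
    """Return the lower Cholesky factor of `a` and the number of multiply-adds used."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    ops = 0
    for j in range(n):
        # Diagonal entry: subtract the squares of the entries already computed in row j.
        s = a[j][j]
        for k in range(j):
            s -= low[j][k] * low[j][k]
            ops += 1
        low[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = a[i][j]
            for k in range(j):
                s -= low[i][k] * low[j][k]
                ops += 1
            low[i][j] = s / low[j][j]
    return low, ops

def cholesky_ops(n):
    """Multiply-add count for factorising an n x n covariance matrix."""
    points = [float(i) for i in range(n)]
    _, ops = cholesky_with_count(dense_exp_cov(points))
    return ops

if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, cholesky_ops(n))
```

The count works out to n(n - 1)(n + 1)/6, so each doubling of the number of locations multiplies the work by roughly eight.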
Prof Banerjee and some of his postgraduate students have developed methods for constructing well-defined, highly scalable spatiotemporal stochastic processes. These can be used as ‘priors’ for spatiotemporal random fields within a hierarchical latent process setting and deliver full Bayesian inference.
Scalable spatial process models are especially attractive because of their richness and flexibility, said Prof Banerjee, particularly in the Bayesian paradigm, where they fit naturally into hierarchical model settings.
These models can be described as solutions for spatiotemporal big data. They ensure that the algorithmic complexity is approximately n floating point operations (flops) per iteration, where n is the number of spatial locations.
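One well-known way to obtain a cost that grows roughly linearly in n is to condition each location on only a small, fixed number m of previously ordered neighbours, a Vecchia-type approximation. The article does not say which construction was presented, so treating this as representative is an assumption. The sketch below also assumes one-dimensional ordered locations, zero mean, unit variance and an exponential correlation function, and all names are hypothetical. It evaluates the approximate Gaussian log-likelihood and counts the multiply-adds spent on the small m x m solves.

```python
import math

def exp_corr(a, b, phi=1.0):
    """Exponential correlation between two 1-D locations (illustrative assumption)."""
    return math.exp(-phi * abs(a - b))

def solve_counted(mat, rhs):
    """Solve mat x = rhs by Gaussian elimination; return x and the multiply-add count.

    No pivoting is used, which is acceptable for the small symmetric
    positive-definite systems that arise here.
    """
    m = len(rhs)
    aug = [row[:] + [r] for row, r in zip(mat, rhs)]
    ops = 0
    for k in range(m):
        for i in range(k + 1, m):
            f = aug[i][k] / aug[k][k]
            for j in range(k, m + 1):
                aug[i][j] -= f * aug[k][j]
                ops += 1
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = aug[i][m]
        for j in range(i + 1, m):
            s -= aug[i][j] * x[j]
            ops += 1
        x[i] = s / aug[i][i]
    return x, ops

def neighbour_loglik(points, y, m=3, phi=1.0):
    """Approximate log-likelihood, conditioning each point on at most m predecessors."""
    total, ops = 0.0, 0
    for i, p in enumerate(points):
        nb = list(range(max(0, i - m), i))  # indices of the preceding neighbours
        mean, var = 0.0, 1.0
        if nb:
            c_nn = [[exp_corr(points[a], points[b], phi) for b in nb] for a in nb]
            c_in = [exp_corr(p, points[a], phi) for a in nb]
            w, used = solve_counted(c_nn, c_in)  # kriging weights on the neighbours
            ops += used
            mean = sum(wk * y[k] for wk, k in zip(w, nb))
            var = 1.0 - sum(wk * ck for wk, ck in zip(w, c_in))
        total += -0.5 * (math.log(2.0 * math.pi * var) + (y[i] - mean) ** 2 / var)
    return total, ops
```

Because every location beyond the first m requires the same small solve, the operation count grows by a fixed amount per added location, in contrast to the cubic growth of a dense factorisation.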
“One of the main advantages of these models is that they require standard computing environments such as desktops or laptops using easily available statistical software packages in the R statistical computing environment via the Comprehensive R Archive Network (CRAN),” Prof Banerjee said.
Several packages that automate Bayesian methods for point-referenced data and diagnose convergence of Markov chain Monte Carlo (MCMC) algorithms are easily available from CRAN. Packages that fit Bayesian models include geoR, geoRglm, spTimer, spBayes, spate and ramps, he said.
The symposium was preceded by a Graduate Learning Camp, which aimed to introduce the next generation of young South African academics in statistics to computational and methodological statistics and biostatistics. The camp was reserved for students enrolled for any postgraduate degree in statistics or data science at a South African university who did not yet hold a PhD.
Several workshops were held on how to write scientific papers. They were presented by Professors Alfred Stein of the University of Twente (the Netherlands), Sudipto Banerjee, and Ding-Geng Chen of the University of North Carolina at Chapel Hill (United States), as well as Dr Joe Cappelleri, Executive Director of Biostatistics at Pfizer.
Acting Director of Future Africa, Professor Bernard Slippers, said: “We are living in a rapidly changing world and have to rethink the way we work. We have to move away from silos and focus on how to prepare and equip our students for the future. This is one of the main objectives of the Future Africa Campus – hosting symposiums of this kind in order to develop Africa’s science capacity and to strengthen the continent’s research networks.”