Research in the department is conducted and organized through established platforms. These platforms facilitate research, provide financial support and direct research activities in the department.
Academic staff are expected to participate actively in the research endeavours of the department within the following focus areas:
RESEARCH FOCUS AREAS
Advanced Dynamic Statistical Analysis
SHORT TITLE: Statistical learning
SHORT DESCRIPTION: The main statistical theme of the research is nonparametric predictive modeling of quantitative and qualitative quantities, specifically considering flexible techniques which are typically computer intensive.
PEOPLE INVOLVED: F. Kanfer, S. Millard
SHORT TITLE: Statistical Image Processing
SHORT DESCRIPTION: The core of this research stems from my PhD work on the LULU operators and the resulting discrete pulse transform. Various applications in the field of image processing are now being investigated. These include segmentation, image comparison, object detection and tracking, noise correction and modeling, pattern modeling, robotics, compressed sensing, brain imaging (fMRI images) and nanoscale image analysis, amongst others. Other projects include extreme value modeling of natural disasters as well as statistics in sport.
PEOPLE INVOLVED: I. Fabris-Rotelli, A. Stein (Netherlands), A. Kijko (University of Pretoria Natural Hazard Centre), F. Kanfer, S. Millard
SHORT TITLE: Bayesian network modelling
SHORT DESCRIPTION: The complex problem of environmental management is scoped by both time and geometric dynamics. Bayesian networks provide a probabilistic framework to deal with these dynamics. Because of their conditional independence assumptions, they can handle high-dimensional spaces very well. They also provide a powerful ‘what-if’ analysis interface for managerial and scenario support. The use of Bayesian networks and similar graph theory to approximate Bayes-Nash equilibria is a novel research area. The application field is the computation of optimal pursuit strategies.
PEOPLE INVOLVED: A. de Waal, T. Loots
SHORT TITLE: Topic modelling for short text
SHORT DESCRIPTION: The Latent Dirichlet Allocation (LDA) model is one of the most popular topic models and it makes the generative assumption that a document belongs to many topics. Conversely, the Multinomial Mixture (MM) model, another topic model, assumes a document can belong to at most one topic, which we believe is an intuitively sensible assumption for short text. Based on this key difference, we posit that the MM model should perform better than LDA on short text. Initial results are promising and current research includes a systematic and thorough evaluation framework for topic models as well as computational optimisation of algorithms.
PEOPLE INVOLVED: A. de Waal, J. Mazarura
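The generative difference between the two models in the topic modelling project above can be sketched as follows. This is only an illustrative sketch using the Python standard library. The function names, the toy vocabulary and all parameter values are assumptions made for the example, not the project's implementation.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from a Dirichlet(alpha) distribution."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(weights, rng):
    """Draw a single index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate_mm_document(topic_weights, topic_word_probs, n_words, rng):
    """Multinomial Mixture: one topic is drawn for the whole document."""
    topic = sample_index(topic_weights, rng)
    words = [sample_index(topic_word_probs[topic], rng) for _ in range(n_words)]
    return [topic] * n_words, words


def generate_lda_document(alpha, topic_word_probs, n_words, rng):
    """LDA: a topic mixture is drawn per document, then a topic per word."""
    theta = sample_dirichlet(alpha, rng)
    topics = [sample_index(theta, rng) for _ in range(n_words)]
    words = [sample_index(topic_word_probs[t], rng) for t in topics]
    return topics, words


# Toy example: 2 topics over a 4-word vocabulary (hypothetical values).
rng = random.Random(0)
phi = [[0.7, 0.2, 0.05, 0.05], [0.05, 0.05, 0.2, 0.7]]
mm_topics, mm_words = generate_mm_document([0.5, 0.5], phi, 8, rng)
lda_topics, lda_words = generate_lda_document([0.5, 0.5], phi, 8, rng)
```

In the MM document every word shares a single topic label, whereas in the LDA document the topic label may change from word to word. For a text of only a few words, the single-topic assumption leaves far fewer latent quantities to estimate.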
SHORT TITLE: Quantile-based distribution theory
SHORT DESCRIPTION: The main aim of this project is the development of methodologies for the construction of quantile-based families of distributions. The focus is on the building of new distributional models through the transformation of quantile functions using different kernels. The location, spread and shape of these new distributions are measured and described, where possible, through conventional moments, L-moments and quantile-based measures. The most appropriate estimation algorithm for the parameters of each created distribution is identified.
PEOPLE INVOLVED: P. van Staden, N. Balakrishnan (McMaster University, Canada), H. Boraine (Department of Performance Monitoring and Evaluation (DPME), The Presidency, South Africa), G. Jasso (New York University, United States of America), R.A.R. King (University of Newcastle, Australia), B. V. Omachar.
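As a rough illustration of the L-moments mentioned in the quantile-based distribution theory project above, the first three sample L-moments can be computed from the ordered data via probability-weighted moments. This is a minimal sketch using the standard unbiased estimators. The function name and the toy data are assumptions made for the example, not code from the project.

```python
def sample_l_moments(data):
    """Return the first three sample L-moments (l1, l2, l3).

    Uses the unbiased probability-weighted moment estimators b0, b1, b2
    computed from the ordered sample. Requires at least three observations.
    """
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are required")

    b0 = sum(x) / n
    # i is the 1-based rank of each ordered observation.
    b1 = sum((i - 1) / (n - 1) * xi for i, xi in enumerate(x, start=1)) / n
    b2 = sum(
        (i - 1) * (i - 2) / ((n - 1) * (n - 2)) * xi
        for i, xi in enumerate(x, start=1)
    ) / n

    l1 = b0                      # location
    l2 = 2 * b1 - b0             # spread (L-scale)
    l3 = 6 * b2 - 6 * b1 + b0    # basis for shape (L-skewness = l3 / l2)
    return l1, l2, l3


# Toy data, symmetric around 3, so the third L-moment should vanish.
l1, l2, l3 = sample_l_moments([1, 2, 3, 4, 5])
```

Here `l1` describes location, `l2` spread, and the ratio `l3 / l2` gives the L-skewness used as a shape measure.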
SHORT TITLE: Distribution theory - Bridging the gaps
SHORT DESCRIPTION: The focus of this project is primarily on the study, development and expansion of distributions, and on addressing parametric statistical inferential aspects within both the classical and the Bayesian frameworks. This project aims to bridge gaps in existing theory and improve the modeling of characteristics of processes. This will establish a solid theoretical base for the distributions of interest, enhance understanding in this field and expand the general body of knowledge in this discipline.
PEOPLE INVOLVED: A. Bekker, M. Arashi (Shahrood University, Iran), D. de Waal (University of Free State), F. Marques (Universidade Nova de Lisboa, Lisbon), N. Balakrishnan (McMaster University, Canada), S.W. Human (Nedbank), P.J. Mostert (Stellenbosch University), J. Visagie, J. van Niekerk, J. Ferreira, T. Loots, S. Makgai.
Generalized linear modeling
SHORT TITLE: Estimation procedures when the usual assumptions are violated
SHORT DESCRIPTION: The scenario addressed is that of estimating parameters in the exponential class in the case of general covariance structures. This includes the cases of unequal covariance matrices and small samples (n<p).
PEOPLE INVOLVED: H.F. Strydom
SHORT TITLE: Statistical analysis of grouped data
SHORT DESCRIPTION: The main objective of this research is to provide a theoretical foundation for analysing grouped data, taking the underlying continuous nature of the variable(s) into account. Due to a lack of appropriate statistical techniques to evaluate grouped data, researchers are often tempted to ignore the underlying continuous nature of the data. By implementing the ML estimation procedure of Matthews and Crowther, various exciting applications in the fields of Official Statistics, Insurance and many more can be explored.
PEOPLE INVOLVED: G. Crafford
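The idea of respecting the underlying continuous variable in the grouped data project above can be illustrated with a generic grouped-data likelihood. Each class frequency contributes the probability that the continuous variable falls in that class. This sketch assumes an underlying normal distribution and open-ended first and last classes. It is not the Matthews and Crowther procedure itself, and the function names and data are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal(mu, sigma) variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def grouped_log_likelihood(boundaries, counts, mu, sigma):
    """Multinomial log-likelihood (up to a constant) for grouped normal data.

    boundaries: increasing interior class boundaries (k - 1 values)
    counts:     observed frequencies for the k classes
    The first and last classes are assumed to be open-ended.
    """
    edges = [-math.inf] + list(boundaries) + [math.inf]
    loglik = 0.0
    for lower, upper, count in zip(edges[:-1], edges[1:], counts):
        if count == 0:
            continue  # empty classes contribute nothing
        prob = normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)
        loglik += count * math.log(prob)
    return loglik


# Hypothetical frequencies in the classes (-inf,-1], (-1,0], (0,1], (1,inf).
boundaries = [-1.0, 0.0, 1.0]
counts = [16, 34, 34, 16]
ll_centre = grouped_log_likelihood(boundaries, counts, mu=0.0, sigma=1.0)
ll_shifted = grouped_log_likelihood(boundaries, counts, mu=0.5, sigma=1.0)
```

Maximizing this function over `mu` and `sigma`, for example with a numerical optimizer, gives estimates that use the class boundaries directly instead of treating class midpoints as exact observations.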
SHORT TITLE: Bayesian inference in Econometric modelling
SHORT DESCRIPTION: We consider different situations in which the preliminary test estimator is applicable and evaluate its performance under different loss functions. In situations where under- or overestimation exists, symmetric loss functions do not give optimal results, which necessitates the consideration of asymmetric loss functions. We also investigate situations where prior information about an unknown population parameter is available and could be used in the form of a constraint in the modeling procedure, and situations where a model is specified under uncertainty, so that specific stochastic modeling assumptions must be tested before the appropriate model can be specified.
PEOPLE INVOLVED: J. Kleyn, M. Arashi
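To illustrate why an asymmetric loss function matters in the econometric modelling project above, the sketch below compares squared error loss with the LINEX loss, a well-known asymmetric loss. The source does not say which asymmetric losses the project uses, so the choice of LINEX, the function names and the parameter values are assumptions made for illustration.

```python
import math


def squared_error_loss(estimate, true_value):
    """Symmetric loss: over- and underestimation are penalized equally."""
    return (estimate - true_value) ** 2


def linex_loss(estimate, true_value, a=1.0):
    """LINEX loss: exp(a*d) - a*d - 1, with d = estimate - true_value.

    For a > 0 overestimation is penalized more heavily than
    underestimation of the same size; for a < 0 the reverse holds.
    """
    d = estimate - true_value
    return math.exp(a * d) - a * d - 1.0


# Errors of the same size in opposite directions (hypothetical values).
over = linex_loss(estimate=1.0, true_value=0.0, a=1.0)
under = linex_loss(estimate=-1.0, true_value=0.0, a=1.0)
```

Under squared error loss the two errors cost the same, while under LINEX with `a = 1` the overestimate is the more expensive one.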
SHORT TITLE: STONK
SHORT DESCRIPTION: The focus is on statistical research in sport. The research aims and objectives include the development of an actuarial method based on life assurance valuation techniques for adjusting cricket teams’ scores in rain-affected Twenty20 cricket games, numerical and graphical performance measures for evaluating and comparing batsmen, bowlers and all-rounders in cricket, the construction of a system for ranking rugby teams from multiple leagues, and the optimal allocation of swimmers to relay teams.
PEOPLE INVOLVED: P. van Staden, I. N. Fabris-Rotelli, Ms. Marli Venter (Department of Insurance and Actuarial Science)
SHORT TITLE: Identifying and Evaluating Threshold Concepts in First Year Statistics modules at the University of Pretoria
SHORT DESCRIPTION: In the teaching of Statistics, certain central concepts/topics are experienced as more difficult to comprehend than others, especially within a group of students with diverse mathematical abilities. Misconception of such concepts/topics while studying Statistics at the 100 level, where the foundation of the discipline is laid, is problematic since it might prevent the student from understanding and grasping the core concepts upon which the discipline is developed. These misconceptions will also influence the student's future studies of the discipline, since the student will lack a proper holistic view of both the inner mechanics of the different procedures and techniques and their interrelatedness. Failure to master these concepts/topics could also restrict progression within a course, since in all Statistics courses topics/concepts build on one another. These concepts are referred to as threshold concepts: a threshold concept is a conceptual gateway that opens up a new and previously inaccessible way of thinking, without which one cannot progress in the subject.
PEOPLE INVOLVED: A. Swanepoel, L. Fletcher, J. Engelbrecht (Department of Science, Mathematics and Technology Education), A. Harding (Department of Mathematics)
SHORT TITLE: The effect of interdisciplinary collaboration: Statistics and Academic Literacy
SHORT DESCRIPTION: A collaborative model for inquiry-based project work in Statistics (WST133/143), with the Academic Literacy (LST) department, was introduced in 2014. This research project focuses on feasibility and improvement of the model for future use, lecturer reflection on strengths and weaknesses of the model, students’ experiences of the model and the impact of the collaboration on student learning.
PEOPLE INVOLVED: A.D. Corbett, E. Coetzee, S. Immelman, I. Fouche
SHORT TITLE: Scaffolding learning in an extended learning programme for foundation mathematical statistics
SHORT DESCRIPTION: Action research on the in-practice effect of a variety of teaching and learning methods, as informed by theory, is conducted in terms of its potential to shape the approach to learning of first-year students. An assessment model that specifically targets students’ approach to learning, through appropriate structuring of a variety of assessment opportunities and constructive alignment to desired learning outcomes, is implemented. It is based on a balanced mix of ideas, connections and extension levels of learning and thinking, across a set of activities, including the flipped classroom, blended learning and enquiry-based project work.
PEOPLE INVOLVED: A.D. Corbett, C. Kraamwinkel, E. Coetzee
SHORT TITLE: Investigating different initiatives to address the throughput rate of a large first level Statistics module
SHORT DESCRIPTION: This research area focuses on the educational facets of Statistics. Blended learning models, including the flipped classroom vs. the traditional teaching model, are explored, and various paradigms, e.g. constructivism, which fosters the process of deep learning by employing knowledge scaffolding, are investigated.
PEOPLE INVOLVED: F. Reyneke, L. Fletcher, A. Harding (Mathematics Department)
Statistical Process Control
SHORT TITLE: Advances in Statistical Process Control (SPC)
SHORT DESCRIPTION: In today's technologically advanced era of high-quality manufacturing with the possibility of real-time and on-line monitoring, it is of utmost importance to detect any change in location or change in variation (or, for that matter, any deviation from a specified target, however small it might be) as soon as possible. In the recent literature a number of techniques have proven to be effective in detecting small shifts. These techniques share a common idea, i.e. to use all (or as much as possible) of the historically available information to make an informed and objective decision about the state of a process. It is therefore natural to ask whether these approaches can be improved upon and/or whether an optimal time-weighted procedure exists to detect minute changes. The envisaged research aims to fill this important gap in SPC and focuses specifically on developing a new class of control charting techniques for detecting tiny changes as effectively and quickly as possible.
PEOPLE INVOLVED: S.W. Human, N. Balakrishnan, N. Chakraborty, A. Mijburgh
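The idea in the SPC project above of using historical information to detect small shifts can be illustrated with the classical exponentially weighted moving average (EWMA) chart, one well-known time-weighted procedure. The source does not name EWMA, so it is used here only as a familiar example and not as the new class of charts the project envisages. The function name, the smoothing constant `lam`, the limit width `L` and the data are assumptions.

```python
import math


def ewma_chart(observations, target, sigma, lam=0.1, L=3.0):
    """Compute EWMA statistics with time-varying control limits.

    Returns a list of (z, lower_limit, upper_limit, signal) tuples, where
    z_t = lam * x_t + (1 - lam) * z_{t-1} and z_0 = target.
    """
    z = target
    results = []
    for t, x in enumerate(observations, start=1):
        # Each new statistic is a weighted average of all past observations.
        z = lam * x + (1.0 - lam) * z
        width = L * sigma * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        )
        signal = abs(z - target) > width
        results.append((z, target - width, target + width, signal))
    return results


# Hypothetical process: on target for 5 points, then a shift of one sigma.
data = [0.0] * 5 + [1.0] * 30
chart = ewma_chart(data, target=0.0, sigma=1.0)
signals = [t for t, (_, _, _, s) in enumerate(chart, start=1) if s]
```

A shift of one standard deviation never crosses three-sigma limits on individual observations in this noise-free example, but the accumulated EWMA statistic eventually does, which is the sense in which time-weighted charts are sensitive to small shifts.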
Bayesian Nonlinear Models
SHORT TITLE: Robust Nonlinear Mixed Effects Regression Models in Tuberculosis Research
SHORT DESCRIPTION: Trials of the early bactericidal activity (EBA) of tuberculosis (TB) treatments assess the decline, during the first few days to weeks of treatment, in colony forming unit (CFU) count of Mycobacterium tuberculosis in the sputum of patients with smear-microscopy-positive pulmonary TB. Profiles over time of CFU count have conventionally been modeled using linear, bilinear or bi-exponential regression. Recent research proposes a biphasic nonlinear regression model for CFU count that comprises linear and bilinear regression models as special cases, and is more flexible than bi-exponential regression models. Bayesian nonlinear mixed effects (NLME) regression models are fitted jointly to the data of all patients from various trials, and statistical inference about the mean EBA of TB treatments is based on the Bayesian NLME regression model. The posterior predictive distribution of relevant slope parameters of the Bayesian NLME regression model provides insight into the nature of the EBA of TB treatments; specifically, the posterior predictive distribution allows one to judge whether treatments are associated with mono-linear or bilinear decline of log(CFU) count, and whether CFU count initially decreases fast, followed by a slower rate of decrease, or vice versa. The research primarily includes the investigation of heavy-tailed and skew distributions to develop models that are robust to the occasional outliers seen in the data.
PEOPLE INVOLVED: Divan Burger (University of Pretoria) and Robert Schall (University of the Free State).
Useful general links
South African Statistical Association
Statistics South Africa
Statistics Question Bank
The Joy of Statistics - documentary by Professor Hans Rosling
ICCSSA (Institute of Certified and Chartered Statisticians of South Africa)
SAS Global Forum