Publications

The system architecture for the AWS test bench

Comparison of Cloud-Computing Providers for Deployment of Object-Detection Deep Learning Models

As cloud computing rises in popularity across diverse industries, comparing and selecting the most appropriate cloud provider for a specific use case becomes imperative. This research conducts an in-depth comparative analysis of two prominent cloud platforms, Microsoft Azure and Amazon Web Services (AWS), with a specific focus on their suitability for deploying object-detection algorithms. The analysis covers both quantitative metrics (upload and download times, throughput, and inference time) and qualitative assessments such as cost effectiveness, machine learning resource availability, ease of deployment, and service-level agreement (SLA). By deploying the YOLOv8 object-detection model, this study measures these metrics on both platforms, providing empirical evidence for platform evaluation. Furthermore, this research examines general platform availability and information accessibility to highlight differences in qualitative aspects. This paper concludes that Azure excels in download time (average 0.49 s/MB), inference time (average 0.60 s/MB), and throughput (1145.78 MB/s), whereas AWS excels in upload time (average 1.84 s/MB), cost effectiveness, ease of deployment, breadth of ML service catalog, and SLA. However, the choice between the two platforms depends on how much weight each of these criteria carries under business-specific requirements. Hence, this paper ends by presenting a comprehensive comparison based on business-specific requirements, aiding stakeholders in making informed decisions when selecting a cloud platform for their machine learning projects.

Comparison of different explainability tools

Explainable AI for the Classification of Brain MRIs

Background: Machine learning applied to medical imaging labors under a lack of trust due to the inscrutability of AI models and the lack of explanations for their outcomes. Explainable AI has therefore become an essential research topic. Unfortunately, many AI models, both research and commercial, are unavailable for white-box examination, in which access to a model’s internals is required. There is therefore a need for black-box explainability tools in the medical domain. Several such tools for general images exist, but their relative strengths and weaknesses when applied to medical images have not been explored. Methods: We use a publicly available dataset of brain MRI images and a model trained to classify cancerous and non-cancerous slices to assess a number of black-box explainability tools (LIME, RISE, IG, SHAP and ReX) and one white-box tool (Grad-CAM) as a baseline comparator. We use several common measures to assess the concordance of the explanations with clinician-provided annotations, including the Dice Coefficient, Hausdorff Distance and Jaccard Index, and propose a Penalised Dice Coefficient which combines the strengths of these measures. Results: ReX (Dice Coefficient = 0.42±0.20) consistently performs relatively well across all measures with comparable performance to Grad-CAM (Dice Coefficient = 0.33±0.22). A panel of images is presented for qualitative inspection, showing a number of failure modes. Conclusion: In contrast to general images, we find evidence that most black-box explainability tools do not perform well for medical image classifications when used with default settings.
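For readers unfamiliar with the overlap measures named in this abstract, the following is a minimal sketch of the standard Dice Coefficient and Jaccard Index for two binary masks. It is an illustration only, not the paper's evaluation code: the function names, the representation of a mask as a set of (row, col) pixels, and the toy masks are all assumptions. The Penalised Dice Coefficient is not defined in the abstract, so it is not sketched here.

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A & B| / (|A| + |B|), with masks given as sets of (row, col) pixels."""
    pred, truth = set(pred), set(truth)
    total = len(pred) + len(truth)
    if total == 0:
        return 1.0  # both masks empty: treated as full agreement (a convention, assumed here)
    return 2.0 * len(pred & truth) / total


def jaccard_index(pred, truth):
    """Jaccard = |A & B| / |A | B|, also known as intersection over union."""
    pred, truth = set(pred), set(truth)
    union = len(pred | truth)
    if union == 0:
        return 1.0  # same empty-mask convention as above
    return len(pred & truth) / union


# Hypothetical toy masks: a 2x2 explanation region and a 2x2 annotation
# region that share one column of two pixels.
explanation = {(r, c) for r in (1, 2) for c in (1, 2)}
annotation = {(r, c) for r in (1, 2) for c in (2, 3)}

dice = dice_coefficient(explanation, annotation)
jaccard = jaccard_index(explanation, annotation)
print(f"Dice = {dice:.3f}, Jaccard = {jaccard:.3f}")
```

The two measures are monotonically related (Jaccard = Dice / (2 - Dice)), so they rank explanations identically on a single image but can differ once averaged over a dataset.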

The SpectMatch Semi-Supervised Algorithm

Exploring Semi-Supervised Learning for Audio-Based Automated Classroom Observations

Systematic classroom observation is often used in evaluating and enhancing the quality of classroom instruction. However, classroom observation can potentially suffer from human bias. In addition, traditional classroom observation is too expensive for resource-constrained environments (e.g., Sub-Saharan Africa, South and Central Asia). A cost-effective automation of classroom observation could potentially enhance both the quality and resolution of feedback to the teacher, and hence potentially enhance the quality of instruction. Audio-based automatic classroom observation using supervised deep learning techniques has yielded good results in limited contexts. However, one challenge when using supervised techniques is the high cost of collecting and labelling the classroom audio data. One solution for such data-starved scenarios is to use semi-supervised learning (SSL), which requires significantly less labelled data. This paper explores an audio adaptation of the state-of-the-art SSL FixMatch algorithm to automate classroom observation. An adaptation of the FixMatch algorithm was proposed to automate the coding for the Stallings class observation system. The proposed system was trained on classroom audio data collected in the wild. The supervised approach had an F1-score of 0.83 on 100% labeled data. The proposed FixMatch adaptation achieved an F1-score of 0.81 on 20% labeled data, 0.79 on 15% labeled data, 0.76 on 10% labeled data, and 0.72 using only 5% of labeled data. This suggests that algorithms like FixMatch that use consistency regularization and pseudo-labeling have great potential for automating classroom observation using a small labelled set of audio snippets.
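The combination of pseudo-labeling and consistency regularization mentioned in this abstract can be sketched as follows. This is a simplified illustration of the unlabeled-data loss from the general FixMatch recipe, not the paper's audio adaptation: the function names and the example logits are hypothetical, and the 0.95 confidence threshold is a commonly cited default rather than a value stated in the abstract. In practice the two sets of logits would come from a model applied to a weakly and a strongly augmented view of the same audio snippet.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Return (loss, kept) for a batch of unlabeled examples.

    For each example, the prediction on the weakly augmented view becomes a
    hard pseudo-label only if its confidence reaches `threshold`; the loss is
    the cross-entropy of the strongly augmented view against that label.
    Low-confidence examples contribute zero but still count in the average.
    """
    total, kept = 0.0, 0
    for weak, strong in zip(weak_logits, strong_logits):
        probs_weak = softmax(weak)
        confidence = max(probs_weak)
        if confidence < threshold:
            continue  # too uncertain: no pseudo-label for this example
        pseudo_label = probs_weak.index(confidence)
        total += -math.log(softmax(strong)[pseudo_label])
        kept += 1
    batch_size = len(weak_logits)
    return (total / batch_size if batch_size else 0.0), kept


# Hypothetical logits for two unlabeled snippets and three classes.
weak_logits = [[6.0, 0.0, 0.0], [0.5, 0.4, 0.3]]    # first confident, second not
strong_logits = [[2.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

loss, kept = fixmatch_unlabeled_loss(weak_logits, strong_logits)
print(f"unlabeled loss = {loss:.4f}, pseudo-labels kept = {kept}")
```

Only the first snippet passes the threshold here, so the model is pushed to predict the same class for its strongly augmented view while the uncertain snippet is ignored for this step.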

Investigating the Effect of Patient-Related Factors on Computed Tomography Radiation Dose Using Regression and Correlation Analysis

Computed tomography (CT) is a widely utilized diagnostic imaging modality in medicine. However, the potential risks associated with radiation exposure necessitate investigating CT exams to minimize unnecessary radiation. The objective of this study is to evaluate how patient-related parameters impact the CT dose indices for different CT exams. In this study, a dataset containing CT dose information for a cohort of 333 patients categorized into four CT exams (chest, cardiac angiogram, cardiac calcium score and abdomen/pelvis) was collected and retrospectively analyzed. Regression analysis and Pearson correlation were applied to estimate the relationships between patient-related factors, namely body mass index (BMI), weight and age as input variables, and CT dose indices, namely the volume CT dose index (CTDIvol), dose length product (DLP), patient effective dose (ED) and size-specific dose estimate (SSDE), as output variables. Moreover, the study investigated the correlation between the different CT dose indices. Using linear regression models and Pearson correlation, the study found that all CT dose indices correlate with BMI and weight in all CT exams to varying degrees, whereas age did not demonstrate any significant correlation with any of the CT dose indices in any CT exam. Moreover, it was found that multiple regression models, in which several input variables are considered together, resulted in a higher correlation with the output variables than simple regression. Investigating the relationships between the different dose indices, statistically significant relationships were found between all dose indices. A stronger linear relationship was noticed between CTDIvol and DLP compared to the relationships between each pair of the other dose indices. The findings of this study contribute to understanding the relationships between patient-related parameters and CT dose indices, aiding in the development of optimized CT exams that ensure patient safety while maintaining the diagnostic efficacy of CT imaging.
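As a rough illustration of the two techniques named in this abstract, the sketch below computes a Pearson correlation coefficient and a simple least-squares line for one input variable and one output variable. The function names are hypothetical, and the numbers are made-up toy values chosen for easy arithmetic; they are not data or results from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up toy values (NOT from the study): an input variable such as BMI
# and an output variable such as a dose index.
bmi = [20.0, 24.0, 28.0, 32.0, 36.0]
dose_index = [6.1, 7.9, 10.2, 11.8, 14.0]

r = pearson_r(bmi, dose_index)
slope, intercept = fit_line(bmi, dose_index)

# Coefficient of determination from the residuals of the fitted line.
mean_dose = sum(dose_index) / len(dose_index)
ss_res = sum((d - (slope * b + intercept)) ** 2 for b, d in zip(bmi, dose_index))
ss_tot = sum((d - mean_dose) ** 2 for d in dose_index)
r_squared = 1.0 - ss_res / ss_tot

print(f"r = {r:.4f}, slope = {slope:.4f}, intercept = {intercept:.4f}, R^2 = {r_squared:.4f}")
```

For simple regression with a single input, R^2 equals the square of Pearson's r. With several inputs, as in the multiple regression models the abstract describes, R^2 on the same data cannot be lower than that of any single-input model using one of those inputs, which is consistent with the reported higher correlation.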