SOFTWARE RISK PREDICTION AT REQUIREMENT AND DESIGN PHASE: AN ENSEMBLE MACHINE LEARNING APPROACH

YIBELTAL, ASSEFA

URI: http://ir.bdu.edu.et/handle/123456789/15766

Date: 2023-06

Abstract:

Software development is a structured process for creating and maintaining systems that range from simple applications to complex enterprise software. Even when a well-defined process is followed, unforeseen events can occur at any stage of the software development life cycle (SDLC) and lead to losses or project failure. No software project is immune to such risks, and identifying and predicting them accurately remains a challenge. Most risks arise in the requirements and design phases, where they propagate to later phases and increase economic losses. To address this challenge, this study develops a software risk prediction model using homogeneous ensemble machine learning algorithms, selected for their effectiveness on complex datasets and their high prediction accuracy. We followed an experimental research methodology. Datasets related to requirements and design, totaling around 400 instances, were collected from publicly available sources such as Zenodo and the Harvard education dataset, and were used to train and validate the algorithms. We built four risk prediction models with homogeneous ensembles of decision trees: Gradient Boost, Random Forest, AdaBoost, and bagging, which scored 98.67%, 97.3%, 96.0%, and 96.0% respectively. Gradient Boost was ultimately selected to construct the final risk prediction model because of its superior performance and its ability to handle complex data.
By employing this model, software development organizations can improve their ability to identify and mitigate risks, thereby improving the quality and reliability of their software products.

Keywords: ensemble machine learning algorithms, requirements phase, design phase, software risk prediction.
