Volume 18, No. 6, 2021

ELMA+: An Ensemble Learning Based Model for Accurate Prediction of Software Defects



Recent developments in software engineering technologies have led to rapid growth in data. To handle this growth, many software quality assessments have been introduced to validate designed software. Software defect assessment is one of the most important aspects of software quality, and software defect classification is the basis for effective defect management. This paper presents a novel technique, ELMA+, that predicts the classes of software defects using an ensemble learning approach. First, defect data is collected from a public repository. A simple exploratory data analysis is carried out to count the instances with and without software defects. The SMOTE technique is then applied to preprocess the collected dataset. Class imbalance in the data lowers the participation of the minority classes during training. To strengthen the minority classes and address data ambiguity, the data is oversampled by creating synthetic examples. These synthetic examples are generated in feature space rather than data space: each minority-class sample is combined with points along the line segments joining it to its nearest neighbors. Once the majority and minority classes are properly defined, the oversampled data is sorted out. The scaled features are then fed into an ensemble of three classifiers, namely k-nearest neighbors (k-NN), AdaBoost and Bagging, which take the feature-scaled data as input to classify the software defects. The proposed framework is simulated, and the results show the efficacy of the ensemble classifiers in terms of accuracy, sensitivity, specificity and precision. A comparative analysis is performed before and after applying SMOTE. The results indicate that feeding the feature-scaled data into the ensemble classifiers yields better outcomes.
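The oversampling step and the four reported metrics can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `smote_sketch` and `binary_metrics`, the parameters `k` and `seed`, and the toy data are all hypothetical, and the code only shows the general SMOTE idea of interpolating along the line segment between a minority sample and one of its nearest minority neighbors.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative only: `minority` is a list of equal-length feature vectors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and precision for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Report 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical toy data: four defective (minority) samples with two features.
defective = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
extra = smote_sketch(defective, n_new=6, k=2)
scores = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
```

Because every synthetic point lies on a segment between two existing minority samples, the new examples stay inside the region already occupied by the minority class. In practice a library implementation of SMOTE and of the k-NN, AdaBoost and Bagging classifiers would likely be used instead of hand-written code.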

Pages: 651-667

Keywords: Software engineering; Software defects; SMOTE technique; Feature scaling process; Ensemble classifiers; Oversampling data.
