dc.description.abstract |
Architectural design is the first task after the requirements elicitation and analysis phase. In designing a software architecture, the elicited requirements are the most important architectural drivers. A software architecture can be derived from functional requirements, non-functional requirements, technical constraints, and business constraints. The requirements that are significant for the design of the architecture are called architecturally significant requirements (ASRs). If ASRs are not correctly identified, the resulting architecture will be poor. Wrongly designed software cannot achieve the desired goals and quality, and this may eventually lead to the complete failure of the software. Because of the complex behaviors behind architectural requirements, identifying the correct requirements is difficult even for experienced architects. In this study, we build a machine learning model for the identification and classification of architecturally significant non-functional requirements (ASNFRs) for real-time systems from software requirements specification (SRS) documents. The proposed model identifies whether a non-functional requirement is architecturally significant and classifies the requirement into its non-functional requirement category. We have prepared a dataset from literature, books, online repositories, and real-time system projects. We have used three machine learning techniques: support vector machine (SVM), Naive Bayes (NB), and K-Nearest Neighbor (KNN), each with TF-IDF features and with word2vec embeddings pre-trained on software engineering text. Grid search with cross-validation is used to select the hyperparameter values of each algorithm from predefined candidate values. We have used 10-fold stratified cross-validation to evaluate and compare the models. The ASNFR identification model achieves 88% accuracy using SVM with TF-IDF and 87% using NB and KNN with TF-IDF; with pre-trained word2vec it achieves 73%, 70%, and 75% using SVM, NB, and KNN, respectively.
For ASNFR classification, the proposed model achieves 74%, 72%, and 77% using SVM, NB, and KNN with TF-IDF, respectively, and 53%, 40%, and 39% using SVM, NB, and KNN with pre-trained word2vec, respectively. SVM with TF-IDF performs best for ASNFR identification, and KNN with TF-IDF performs best for ASNFR classification. |
en_US |