Abstract:
Coffee is one of the most widely consumed beverages in the world, with a long history and a cultural significance that makes it part of many daily routines. Quality assessment is vital to the coffee industry because it influences both consumer satisfaction and market value. Traditional assessment relies largely on subjective judgment, which is time-consuming, labor-intensive, and costly. In this study, we develop a model that predicts coffee quality attributes by fusing multiple modalities, combining machine learning with image analysis. The model is trained on a dataset of coffee samples and their corresponding quality attribute values. We explore several regression algorithms, including linear regression and random forests, together with feature selection and transformation techniques such as recursive feature elimination and polynomial feature expansion with scaling. Color, texture, and shape features are extracted from images of roasted coffee beans using histogram of oriented gradients (HOG), gray-level co-occurrence matrix (GLCM), and color descriptors. The dataset is divided into training (80%) and testing (20%) sets, and the model is evaluated using mean squared error (MSE), mean absolute error (MAE), and the coefficient of determination (R² score). Our model achieved an R² score of 0.98754, indicating close agreement between the predicted and actual sensory scores. The low MAE of 0.014088 and MSE of 0.00969 further support the accuracy and reliability of the model's predictions. This automated approach may save time, reduce labor costs, and improve overall efficiency in coffee production and distribution. Furthermore, the model may improve consumer satisfaction by helping to deliver high-quality coffee products to the Ethiopian and international markets.
Keywords: Coffee, Quality Attributes, Linear Regression, Machine Learning, Feature Extraction, Feature Selection
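
To make the evaluation protocol in the abstract concrete, the following is a minimal sketch of an 80/20 split and the three reported metrics (MSE, MAE, and R²), written in plain Python. It is an illustration under stated assumptions, not the study's implementation: the function names, the random seed, and the toy scores are all hypothetical and are not taken from the study's dataset.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test) lists; 20% test by default."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (NOT data from the study).
toy_actual = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.0, 2.0, 3.0, 5.0]

toy_mse = mean_squared_error(toy_actual, toy_predicted)
toy_mae = mean_absolute_error(toy_actual, toy_predicted)
toy_r2 = r2_score(toy_actual, toy_predicted)

# Ten placeholder sample indices split into 8 training and 2 testing rows.
toy_train, toy_test = train_test_split(range(10))
```

An R² close to 1 means the residual sum of squares is small relative to the total variance of the actual scores, which is how the reported R² should be read alongside the MAE and MSE.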