Machine learning for authentication of black tea from narrow-geographic origins: combination of PCA and PLS with LDA and SVM classifiers
DATE: 2024-07
UNIVERSAL IDENTIFIER: http://hdl.handle.net/11093/9105
EDITED VERSION: https://linkinghub.elsevier.com/retrieve/pii/S0023643824006807
UNESCO SUBJECT: 2301 Química Analítica
DOCUMENT TYPE: article
ABSTRACT
This study investigates the feasibility of using UV–Vis spectroscopy coupled with machine learning methods to authenticate tea samples by geographical origin within a narrow longitudinal strip (200 km). Several preprocessing methods, such as standard normal variate (SNV), auto-scaling, multiplicative scatter correction (MSC), mean centring (MC), first derivative, and their combinations, were applied to remove uninformative spectral variation. The partial least squares-linear discriminant analysis (PLS-LDA) model using first derivative spectra achieved 98.0% sensitivity, 99.5% specificity, and a mean accuracy of 98.0%. The partial least squares-support vector machine (PLS-SVM) classifier using first derivative spectra achieved 94.0% sensitivity, 98.6% specificity, and a mean accuracy of 94.0%. These satisfactory results indicate that chemical components of tea that absorb UV radiation, such as polyphenols and chlorogenic and fatty acids, can serve as chemical markers for discriminating tea samples by geographical origin. Therefore, UV–Vis spectral fingerprinting combined with machine learning methods could be a practical, feasible, and simple method for classifying tea by geographical origin within a narrow longitudinal strip.
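As an illustration of two of the preprocessing steps and the reported figures of merit, the following is a minimal sketch and not the authors' implementation. It makes several assumptions: SNV uses the sample standard deviation, the first derivative is a plain finite difference (the abstract does not say which derivative algorithm was used), and sensitivity and specificity are averaged over classes in a one-vs-rest fashion. The function names `snv`, `first_derivative`, and `class_metrics` are hypothetical.

```python
from math import sqrt


def snv(spectrum):
    """Standard normal variate: centre and scale ONE spectrum by its own
    mean and standard deviation (sample std, n - 1; an assumption)."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = sqrt(sum((a - mean) ** 2 for a in spectrum) / (n - 1))
    return [(a - mean) / std for a in spectrum]


def first_derivative(spectrum, step=1.0):
    """Finite-difference first derivative along the wavelength axis.
    `step` is the wavelength spacing; the result has one fewer point.
    (Assumption: the study may have used a different derivative filter.)"""
    return [(b - a) / step for a, b in zip(spectrum, spectrum[1:])]


def class_metrics(y_true, y_pred):
    """One-vs-rest sensitivity and specificity averaged over the classes
    present in y_true, plus overall accuracy. Needs at least two classes."""
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    sens, spec = [], []
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        tn = n - tp - fn - fp
        sens.append(tp / (tp + fn))   # true positive rate for class c
        spec.append(tn / (tn + fp))   # true negative rate for class c
    return {
        "sensitivity": sum(sens) / len(sens),
        "specificity": sum(spec) / len(spec),
        "accuracy": sum(1 for t, p in pairs if t == p) / n,
    }


if __name__ == "__main__":
    # Toy absorbance values and labels, invented purely for illustration.
    spectrum = [0.10, 0.30, 0.70, 0.40]
    print(first_derivative(snv(spectrum)))
    print(class_metrics(["A", "A", "B", "B"], ["A", "B", "B", "B"]))
```

In a full workflow the preprocessed spectra would then be compressed into PLS latent variables before an LDA or SVM classifier is fitted. That step is omitted here because the abstract gives no details such as the number of latent variables or the kernel settings.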