Machine learning methods applied to the prediction of Pseudo-nitzschia spp. blooms in the Galician Rias Baixas (NW Spain)
DATE: 2021-03-25
UNIVERSAL IDENTIFIER: http://hdl.handle.net/11093/2096
EDITED VERSION: https://www.mdpi.com/2220-9964/10/4/199
UNESCO SUBJECT: 2510.01 Biological Oceanography ; 1209.14 Statistical Prediction Techniques ; 2417.05 Marine Biology
DOCUMENT TYPE: article
ABSTRACT
This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost predict blooms better than SVMs and NNs: they achieve improved performance metrics and a better balance between sensitivity and specificity. The classical machine learning approaches reach higher sensitivities, but at the cost of lower specificity and a higher percentage of false alarms (lower precision). These results suggest that the newer algorithms (RF and AdaBoost) are better suited to unbalanced datasets. Our models could be implemented operationally to establish a short-term prediction system.
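The comparison in the abstract rests on sensitivity, specificity, and precision. As a minimal sketch of how these three metrics follow from a binary confusion matrix, the Python snippet below computes them for bloom (1) versus no-bloom (0) labels. The function name `binary_metrics` and the example labels are hypothetical illustrations, not the authors' code or data.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    sensitivity: float  # TP / (TP + FN): share of real blooms that were predicted
    specificity: float  # TN / (TN + FP): share of non-bloom cases correctly rejected
    precision: float    # TP / (TP + FP): share of predicted blooms that were real


def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and precision for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed blooms

    def ratio(num, den):
        # Return NaN when a metric is undefined (empty denominator).
        return num / den if den else float("nan")

    return BinaryMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        precision=ratio(tp, tp + fp),
    )


# Hypothetical unbalanced sample: 2 bloom cases out of 10 observations.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))
```

On an unbalanced sample like this one, a single false alarm barely lowers specificity but halves precision, which is why the abstract reports both.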