Combining data from multiple analytical tests into a single classification model can give greater discrimination than a model based on a single analytical technique. There are three levels of data fusion: low-, mid- and high-level. Low-level fusion is the concatenation of analytical data obtained from several different sources. Mid-level strategies imply that only the most significant features are fused after conducting a feature extraction step. High-level fusion combines the classification or regression results after they have been extracted separately from each type of data source.
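The three fusion levels can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from the study: the toy numbers, the function names, the use of variance ranking as a stand-in for the feature extraction step, and majority voting as the way of combining model outputs.

```python
from collections import Counter
from statistics import pvariance


def low_level_fusion(block_a, block_b):
    """Concatenate the two measurement vectors of each sample into one vector."""
    return [a + b for a, b in zip(block_a, block_b)]


def top_variance_features(block, k):
    """Keep the k columns with the highest variance.

    This is only a stand-in for a feature extraction step; the study may
    have used a different method.
    """
    columns = list(zip(*block))
    ranked = sorted(range(len(columns)),
                    key=lambda j: pvariance(columns[j]), reverse=True)
    keep = sorted(ranked[:k])  # restore the original column order
    return [[row[j] for j in keep] for row in block]


def mid_level_fusion(block_a, block_b, k):
    """Reduce each block separately, then concatenate the reduced blocks."""
    return low_level_fusion(top_variance_features(block_a, k),
                            top_variance_features(block_b, k))


def high_level_fusion(*predictions):
    """Majority vote per sample over label lists from separately trained models.

    Ties go to the label that appears first among the votes.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy data: 3 samples, 3 Raman variables and 2 FT-IR variables (invented values).
raman = [[1.0, 5.0, 0.2], [1.1, 2.0, 0.2], [0.9, 8.0, 0.2]]
ftir = [[0.3, 0.4], [0.9, 0.4], [0.1, 0.4]]

low = low_level_fusion(raman, ftir)      # 3 samples x 5 variables
mid = mid_level_fusion(raman, ftir, 1)   # 3 samples x 2 variables
high = high_level_fusion(["acacia", "linden"],
                         ["acacia", "colza"],
                         ["linden", "colza"])
```

In the low- and mid-level cases a single classifier would then be trained on the fused matrix, whereas in the high-level case each data source has its own classifier and only the predicted labels are combined.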

This study (open access) aimed to build a model, testing different low- and mid-level fusion approaches of Raman and ATR FT-IR spectroscopy data, that can discriminate honeys of different botanical origins (acacia, linden, colza, and raspberry) and different harvest years (2020 and 2021). The models were trained using honeys collected exclusively from Romania but were then validated using honeys (of both the classes sought and of unrelated classes) from other countries of origin.

The authors tried different data fusion approaches. They concluded that, in most cases, data fusion provided more accurate classification results than a single input data type (i.e. Raman or IR). Nevertheless, simply increasing the number of input variables by concatenating the experimental data does not automatically improve the models' prediction rate. To obtain the best results from data fusion, it is essential to find the best way of reducing the input space to those variables that have the highest discrimination capacity.
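One common way of ranking variables by discrimination capacity is a Fisher-type ratio of between-class to within-class variance. The sketch below is an illustration under that assumption only; the summary does not say which selection method the authors used, and the function names and toy values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Score each variable by between-class variance over within-class variance."""
    classes = sorted(set(y))
    scores = []
    for column in zip(*X):
        overall = mean(column)
        groups = [[v for v, label in zip(column, y) if label == c]
                  for c in classes]
        between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups)
        within = sum(len(g) * pvariance(g) for g in groups)
        if within > 0:
            scores.append(between / within)
        else:
            # No spread inside the classes: perfectly separating or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def select_top(X, y, k):
    """Return the indices of the k highest-scoring variables, in column order."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy fused matrix: 4 samples, 3 variables; only the first separates the classes.
X = [[1.0, 0.5, 3.0],
     [1.2, 0.4, 3.1],
     [5.0, 0.6, 2.9],
     [5.2, 0.5, 3.0]]
y = ["acacia", "acacia", "linden", "linden"]

selected = select_top(X, y, 1)
reduced = [[row[j] for j in selected] for row in X]
```

Here the first variable differs strongly between the two classes while the other two barely change, so it is the one retained. With real spectra the ranking should be computed on the training set only, so that the validation honeys do not influence which variables are kept.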

They found that the best performances were obtained when the low-level data fusion approach was used for both the botanical and the harvest year recognition models. The differentiation potential of the classifiers was proven using an external validation set, leading to test accuracies between 84% and 100% for the two investigated classification criteria. The only difference was that for the botanical differentiation, the use of the fingerprint spectral regions proved to be more effective, while for the harvest year classification, the use of the entire spectral ranges led to a better performance.
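The choice between fingerprint regions and entire spectral ranges amounts to cropping each spectrum to a wavenumber window before concatenation. A minimal sketch follows; the window limits, axis values and intensities are placeholders chosen for illustration and are not the regions used in the study.

```python
def crop_region(wavenumbers, intensities, low, high):
    """Keep only the intensities whose wavenumber lies within [low, high] cm^-1."""
    return [y for w, y in zip(wavenumbers, intensities) if low <= w <= high]


# Toy spectra for one honey sample (invented axes and intensities).
raman_axis = [400, 800, 1200, 1600, 2400, 3200]
raman_spec = [0.1, 0.4, 0.9, 0.7, 0.2, 0.3]
ir_axis = [600, 1000, 1400, 1800, 2900, 3600]
ir_spec = [0.2, 0.8, 0.6, 0.3, 0.5, 0.1]

# Placeholder window; the study's actual fingerprint limits are not given here.
FINGERPRINT = (500, 1800)

# Low-level fusion of the cropped regions versus the entire ranges.
fingerprint_fused = (crop_region(raman_axis, raman_spec, *FINGERPRINT)
                     + crop_region(ir_axis, ir_spec, *FINGERPRINT))
full_fused = raman_spec + ir_spec
```

The two fused vectors would feed separate models, one per classification criterion, so that each criterion can use whichever input range works better for it.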

Mid-level data fusion provided similar differentiation accuracies in cross-validation, whether the fingerprint regions or the entire measured spectral ranges were used.

Photo by Roberta Sorge on Unsplash
