classification (5)

This peer-reviewed pre-print (open access) reports a classification model for different Greek olive oil cultivars using combined data from two analytical techniques: volatile component analysis (6 marker compounds) by solid phase microextraction-gas chromatography-mass spectrometry (SPME-GC-MS) and spectral analysis by attenuated total reflectance-Fourier transform infrared spectroscopy (ATR-FTIR).

The model was built to differentiate Greek oils from 3 cultivars: Koroneiki, Megaritiki and Amfissis. The reference database was constructed from samples collected over 3 harvest seasons. The authors report that the supervised methods of linear and quadratic discriminant analysis, applied with cross-validation to the volatile component data, gave correct classification rates of 97.4% and 100.0%, respectively. The corresponding analyses applied to the mid-infrared spectra classified 96.1% of samples correctly.

The authors conclude that ATR-FTIR and SPME-GC-MS, in conjunction with the appropriate feature selection algorithm and classification methods, are powerful tools for the authentication of Greek olive oil. They consider that the proposed methodology could be used in industrial settings for the determination of Greek olive oil botanical origin.

Read more…

This study (open-access author’s link available until February 14, with thanks to Michele Suman for sharing) reports the development and validation of a non-targeted classification method for the authenticity of dried oregano leaves by atmospheric pressure matrix-assisted laser desorption ionization mass spectrometry (AP-MALDI-MS).

The model was trained on 23 authentic oregano samples (sourced from a reputable company with full supply chain traceability, originating from Italy, France, Turkey, or Albania, and harvested between 2019 and 2022) along with five pure adulterants (dried leaves of savory (Satureja montana), myrtle (Lagerstroemia indica), sumac (Rhus coriaria), strawberry tree (Arbutus unedo), and olive tree (Olea europaea)), plus sixteen adulterated oregano samples, intentionally mixed with the above-mentioned adulterants at levels between 5 % and 60 %.

The most abundant signals were characterized by collision-induced dissociation and library search, and the spectral data were submitted to statistical analysis. A preliminary examination of the data by partial least squares discriminant analysis (PLS-DA) was carried out to assess the discrimination capability of the positive- and negative-ion AP-MALDI-MS signatures. The researchers then constructed two distinct random forest (RF) classifiers, one from the most informative positive ions and one from the most informative negative ions, selected by recursive feature elimination on the training sets. These most significant variables (m/z values) were also merged by mid-level data fusion and used to build a third RF classifier.
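As a rough illustration of the mid-level data fusion step, the sketch below concatenates the selected features from two data blocks, sample by sample. It shows the general idea only and is not the authors' code: the function name `fuse_mid_level`, the sample names, the m/z values and the intensities are all hypothetical.

```python
def fuse_mid_level(pos_block, neg_block, pos_selected, neg_selected):
    """Concatenate the selected features of two blocks, sample by sample."""
    if pos_block.keys() != neg_block.keys():
        raise ValueError("both blocks must describe the same samples")
    fused = {}
    for sample_id in pos_block:
        pos_part = [pos_block[sample_id][mz] for mz in pos_selected]
        neg_part = [neg_block[sample_id][mz] for mz in neg_selected]
        fused[sample_id] = pos_part + neg_part
    return fused


# Hypothetical normalised intensities keyed by m/z (not data from the study).
pos_block = {
    "sample_a": {151.1: 0.82, 301.2: 0.10, 449.3: 0.33},
    "sample_b": {151.1: 0.40, 301.2: 0.55, 449.3: 0.05},
}
neg_block = {
    "sample_a": {149.1: 0.61, 285.0: 0.27},
    "sample_b": {149.1: 0.15, 285.0: 0.72},
}

# Hypothetical outcome of a feature-selection step run on each block separately.
pos_selected = [151.1, 449.3]
neg_selected = [285.0]

fused = fuse_mid_level(pos_block, neg_block, pos_selected, neg_selected)
```

Each fused row could then be passed to a single classifier, which is the role the third RF classifier plays in the study.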

They report that cross-validation of the three RF classifiers gave overall accuracies of 84.9 %, 92.1 %, and 92.8 %, respectively, which they consider satisfactory. When tested on the hold-out data, the three classifiers achieved accuracies of 80.1 %, 87.0 %, and 85.4 %.

Photo by 360floralflaves on Unsplash

Read more…

A recent FAN blog described non-destructive impedance sensors as a tool to classify meat freshness.

In this paper (open access) the authors have used the same principle and developed a classification model for potato varieties based on the effect of their dry matter content on an electrical impedance sensor. The test is destructive as the potato must be sliced. The authors built a reference database using data from 9 cultivars (Actrice, Ambra, Constance, El Mundo, Fontane, Gaudi, Jelly, Monalisa and Universa) sourced directly from the grower. These cultivars were chosen because they cover a wide range of dry matter content. The authors collected multivariate analytical data from the impedance sensor: impedance magnitude and phase data, along with derived parameters such as the minimum phase point of each spectrum, the ratio between the low- and high-frequency values of the impedance magnitude, the dissipation factor, the distance between the zero and the maximum value of the Nyquist plot, and the Cole model equivalent circuit parameters.
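To make two of the derived parameters concrete, here is a minimal sketch that computes the minimum phase point and the low-to-high-frequency magnitude ratio from an impedance spectrum. The paper's exact definitions are not given above, so this assumes the ratio uses the lowest and highest measured frequencies. The function name `derived_features` and the spectrum values are hypothetical.

```python
import cmath
import math


def derived_features(spectrum):
    """spectrum: list of (frequency_hz, complex_impedance_ohm) pairs."""
    ordered = sorted(spectrum, key=lambda point: point[0])
    magnitudes = [abs(z) for _, z in ordered]
    phases_deg = [math.degrees(cmath.phase(z)) for _, z in ordered]
    # Index of the most negative phase angle in the spectrum.
    min_index = min(range(len(ordered)), key=lambda i: phases_deg[i])
    return {
        "min_phase_deg": phases_deg[min_index],
        "min_phase_freq_hz": ordered[min_index][0],
        # Assumption: ratio of |Z| at the lowest and highest frequency.
        "low_high_magnitude_ratio": magnitudes[0] / magnitudes[-1],
    }


# Hypothetical three-point spectrum (not data from the paper).
spectrum = [
    (1e2, complex(3000, -400)),
    (1e4, complex(1800, -1200)),
    (1e6, complex(600, -150)),
]
features = derived_features(spectrum)
```

Features of this kind, computed per potato slice, would form one row of the multivariate dataset fed to the classification model.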

They conclude that machine learning methods for predicting potato dry matter and varieties, based on impedance data, can achieve an equivalent (sub-optimal) performance to conventional methods and that they hold promise for future improvement to surpass conventional methods. A deeper analysis could aim to reduce the root-mean-squared error and increase the coefficient of determination, thereby enhancing the accuracy of dry matter predictions. To achieve this, various techniques such as feature engineering, hyperparameter tuning, and advanced modelling approaches (e.g. convolutional neural networks) could be explored. The authors consider that alternative chemometric methods like the Kennard-Stone algorithm, which selects representative samples based on distance criteria, could lead to more robust dataset partitioning. Additionally, incorporating data fusion with results obtained through infrared spectroscopy could further improve the model’s performance.
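For readers unfamiliar with the Kennard-Stone algorithm, the sketch below shows its usual form: start from the two samples that are furthest apart, then repeatedly add the sample whose nearest already-selected neighbour is furthest away. It uses Euclidean distance and made-up two-feature samples. The function name `kennard_stone` is our own and this is not code from the paper.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Seed with the two samples that are furthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Pick the candidate whose nearest selected sample is furthest away.
        best = max(remaining, key=lambda i: min(dist[i][j] for j in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical two-feature samples (not data from the paper).
samples = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0), (9.0, 0.0)]
calibration_idx = kennard_stone(samples, 3)
```

The selected indices would typically form the calibration set, with the remaining samples held back for testing.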

Photo by Rodrigo dos Reis on Unsplash

Read more…

In this study (purchase required) the researchers built a classification model to differentiate freshwater from seawater shrimp (prawns), Litopenaeus vannamei, based on fatty acid (FA) profiling in muscle and hepatopancreas.

They built an untargeted model, using k-nearest neighbor (KNN) and random forest (RF), to identify discriminatory variables.

They then identified, using orthogonal partial least squares-discriminant analysis (OPLS-DA), specific FAs to create their classification model: six (C22:6n3, C20:3n3, C17:0, C18:3n3, C20:5n3, and C20:2) from the muscle and seven (C22:6n3, C16:0, C18:3n3, C18:2n6, C20:2, C20:1, and C18:1n9) from the hepatopancreas.

They report that, using FA profiles from the two tissues, both KNN and RF had initial and cross-validated classification rates >93%, while the predictive classification rates of the models based on muscle FA profiles were higher than those of the models based on hepatopancreas FA profiles. They conclude, therefore, that FA profiles in muscle were more effective than hepatopancreas FAs for this promising classification method.
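A cross-validated classification rate of the kind reported above can be sketched with a small k-nearest-neighbour classifier and leave-one-out cross-validation. The cross-validation scheme used in the study is not stated here, so leave-one-out is an assumption. The two features are named after FAs mentioned above, but every number is invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train: list of (features, label); return the majority label of the k nearest."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_rate(data, k=3):
    """Fraction of samples classified correctly when each is held out in turn."""
    correct = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if knn_predict(train, features, k) == label:
            correct += 1
    return correct / len(data)


# Invented FA percentages (C22:6n3, C18:2n6); not measurements from the study.
data = [
    ((12.1, 14.0), "seawater"),
    ((11.5, 15.2), "seawater"),
    ((12.8, 13.5), "seawater"),
    ((11.9, 14.6), "seawater"),
    ((6.2, 21.0), "freshwater"),
    ((5.8, 22.4), "freshwater"),
    ((6.9, 20.1), "freshwater"),
    ((6.0, 21.7), "freshwater"),
]
rate = leave_one_out_rate(data, k=3)
```

Running the same procedure separately on muscle and hepatopancreas profiles would give the kind of tissue-by-tissue comparison the authors describe.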

Photo by Dan Dennis on Unsplash

Read more…

Abstract

Honey authentication is a complex process which traditionally requires costly and time-consuming analytical techniques not readily available to the producers.
 
This study aimed to develop non-invasive sensor methods coupled with a multivariate data analysis to detect the type and percentage of exogenous sugar adulteration in UK honeys. Through-container spatial offset Raman spectroscopy (SORS) was employed on 17 different types of natural honeys produced in the UK over a season. These samples were then spiked with rice and sugar beet syrups at the levels of 10%, 20%, 30%, and 50% w/w. The data acquired were used to construct prediction models for 14 types of honey with similar Raman fingerprints using different algorithms, namely PLS-DA, XGBoost, and Random Forest, with the aim to detect the level of adulteration per type of sugar syrup.
 
The best-performing algorithm for classification was Random Forest, with only 1% of the pure honeys misclassified as adulterated and <3.5% of adulterated honey samples misclassified as pure. Random Forest was further employed to create a classification model which successfully classified samples according to the type of adulterant (rice or sugar beet) and the adulteration level.
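The two error rates quoted above are per-class misclassification rates: the share of pure honeys flagged as adulterated, and the share of adulterated honeys passed as pure. A minimal sketch of that calculation follows. The labels and predictions are a made-up toy example, not results from the study.

```python
def misclassification_rates(true_labels, predicted_labels):
    """Return (share of pure called adulterated, share of adulterated called pure)."""
    pairs = list(zip(true_labels, predicted_labels))
    pure = [p for t, p in pairs if t == "pure"]
    adulterated = [p for t, p in pairs if t == "adulterated"]
    pure_as_adulterated = sum(p == "adulterated" for p in pure) / len(pure)
    adulterated_as_pure = sum(p == "pure" for p in adulterated) / len(adulterated)
    return pure_as_adulterated, adulterated_as_pure


# Toy example: 4 pure and 6 adulterated samples with one error in each class.
true_labels = ["pure"] * 4 + ["adulterated"] * 6
predicted_labels = [
    "pure", "pure", "pure", "adulterated",
    "adulterated", "adulterated", "pure",
    "adulterated", "adulterated", "adulterated",
]
false_alarm, missed = misclassification_rates(true_labels, predicted_labels)
```

Reporting the two rates separately matters for authentication work, because wrongly rejecting genuine honey and wrongly accepting adulterated honey carry different costs.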
 
In addition, SORS spectra were collected from 27 samples of heather honey (24 Calluna vulgaris and 3 Erica cinerea) produced in the UK and from corresponding subsamples spiked with high fructose sugar cane syrup. An exploratory data analysis with PCA and a classification with Random Forest were performed; both showed clear separation between the pure and adulterated samples at medium (40%) and high (60%) adulteration levels, and 90% success at the low adulteration level (20%).
 
The results of this study demonstrate the potential of SORS in combination with machine learning to be applied for the authentication of honey samples and the detection of exogenous sugars in the form of sugar syrups. A major advantage of the SORS technique is that it is a rapid, non-invasive method deployable in the field with potential application at all stages of the supply chain.
 
Read more…