machine learning (7)

There are a large number of digital tools, both commercial and in-house-written, that attempt to classify and predict food safety risks based upon historic records in the EU Rapid Alert System for Food and Feed (RASFF) database. With all such tools, it is important to remember that RASFF records are not a representative sample of either tests or results, and were never intended as a source of trends; the purpose of RASFF is rather to share specific individual alerts which may require regulatory action on a cross-border basis. Official tests are highly targeted, and often informed by previous RASFF alerts, so more alerts about a specific issue drive more official tests, which in turn drive more alerts (i.e. a feedback mechanism). Also, RASFF only records the "positive" results, so there is no denominator: no indication of the number of "negative" results or the percentage incidence of an issue. And finally, RASFF only records issues with a food safety concern, so most food authenticity test results are excluded.

Despite these caveats, RASFF is still one of the most extensive and systematic public databases of food safety incidents and is likely to form the basis of many AI risk-prediction systems for years to come.

This paper (purchase required) evaluated the effectiveness of the machine learning models that sit behind such systems. The authors report that transformer-based models significantly outperform traditional machine learning methods, with RoBERTa achieving the highest classification accuracy. SHAP analysis highlights key hazards (salmonella, aflatoxins, listeria and sulphites) as primary factors in serious-risk classification, while procedural attributes such as certification status and temperature control are less impactful.
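SHAP attribution itself requires the fitted transformer model, but the underlying idea (measuring how much predictions degrade when a feature's link to the outcome is broken) can be illustrated with a simpler permutation-importance sketch. Everything below, from the alert texts to the rule-based classifier, is hypothetical, not the paper's pipeline:

```python
import random

# Toy illustration: score hazard terms by how much randomly shuffling each
# feature column degrades a simple rule-based classifier's accuracy -- the
# intuition behind permutation importance, a cheaper cousin of SHAP.

HAZARDS = ["salmonella", "aflatoxins", "listeria", "sulphites", "certified"]

def featurise(text):
    """Binary bag-of-words over the small hazard vocabulary."""
    return [1 if h in text else 0 for h in HAZARDS]

def classify(x):
    """Hypothetical rule: any of the first four hazards => 'serious'."""
    return "serious" if any(x[:4]) else "not serious"

# Hypothetical alert notifications with known labels.
alerts = [
    ("salmonella in poultry", "serious"),
    ("aflatoxins in pistachios", "serious"),
    ("listeria in soft cheese", "serious"),
    ("undeclared sulphites in wine", "serious"),
    ("certified shipment, labelling error", "not serious"),
    ("certified batch, minor packaging defect", "not serious"),
]

def accuracy(X, y):
    return sum(classify(x) == t for x, t in zip(X, y)) / len(y)

random.seed(0)
X = [featurise(t) for t, _ in alerts]
y = [label for _, label in alerts]
base = accuracy(X, y)

importance = {}
for j, name in enumerate(HAZARDS):
    col = [x[j] for x in X]
    random.shuffle(col)                          # break the feature/label link
    Xp = [x[:j] + [col[i]] + x[j + 1:] for i, x in enumerate(X)]
    importance[name] = base - accuracy(Xp, y)    # accuracy drop = importance

print(importance)
```

Note how "certified" (a procedural attribute the classifier ignores) always gets zero importance, mirroring the paper's finding that procedural attributes contribute little.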

They conclude that despite improvements in accuracy, computational efficiency and scalability remain challenges for real-world deployment of AI risk-scoring and prediction systems.

Photo by Clarisse Croset on Unsplash

Read more…

Nontargeted analysis for food authenticity by liquid chromatography–mass spectrometry (LC-MS) can provide data on thousands of chemical features. However, most studies that train machine learning models for food authentication use sample sizes in the tens or hundreds. Such training sets are typically considered too small to be optimal, as so large a feature-to-sample ratio introduces the problem of overfitting.

This study (open access) aimed to mitigate this issue with a machine learning protocol designed for sub-optimal training sets, using honey as an example. A recursive feature elimination (RFE) pipeline was developed specifically to address the challenges of optimizing the honey chemical fingerprint for multiclass machine learning classifiers on a limited number of samples with imperfect labels. A support vector machine was used for both RFE and classification, reducing the 2028 nontargeted features to just 54 (a 97.3% reduction) without any loss of classification performance.
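Recursive feature elimination itself is simple to sketch: fit a model, drop the feature with the smallest weight magnitude, and repeat. A toy pure-Python version on synthetic data, using a crude centroid-difference scorer in place of the study's support vector machine:

```python
import random

# RFE sketch: synthetic two-class data where only features 0-2 carry signal.
random.seed(1)
N_FEATURES, N_SAMPLES = 20, 60

def sample(label):
    x = [random.gauss(0, 1) for _ in range(N_FEATURES)]
    for j in range(3):
        x[j] += 3.0 * label          # informative shift for class 1
    return x

y = [0] * (N_SAMPLES // 2) + [1] * (N_SAMPLES // 2)
X = [sample(lbl) for lbl in y]

def centroid_weights(X, y, features):
    """Per-feature weight = difference of class means (a crude linear fit)."""
    ones = [i for i, t in enumerate(y) if t == 1]
    zeros = [i for i, t in enumerate(y) if t == 0]
    w = {}
    for j in features:
        m1 = sum(X[i][j] for i in ones) / len(ones)
        m0 = sum(X[i][j] for i in zeros) / len(zeros)
        w[j] = m1 - m0
    return w

features = list(range(N_FEATURES))
while len(features) > 3:             # RFE loop: discard one feature per round
    w = centroid_weights(X, y, features)
    weakest = min(features, key=lambda j: abs(w[j]))
    features.remove(weakest)

print("surviving features:", sorted(features))
```

On this synthetic data the three informative features survive while the seventeen noise features are eliminated, which is the behaviour the study relies on at far larger scale (2028 features down to 54).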

The authors report that the resulting model was a 6-class classifier, capable of identifying monofloral blueberry, buckwheat, clover, goldenrod, linden, or other honey with a nested cross-validation Matthews correlation coefficient (MCC) of 0.803 ± 0.046. The development of a k-nearest neighbours filter and the decision to continue the RFE process beyond the iteration with the highest classification score were instrumental in achieving this outcome.
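For reference, the Matthews correlation coefficient quoted above is computed from confusion-matrix counts. A minimal binary-case sketch (the study reports the multiclass generalisation; the counts below are hypothetical):

```python
from math import sqrt

# Binary Matthews correlation coefficient from confusion-matrix counts.
# Ranges from -1 (total disagreement) through 0 (chance) to +1 (perfect).
def mcc(tp, tn, fp, fn):
    num = tp * tn - fp * fn
    den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Hypothetical counts: 80 true positives, 90 true negatives, 10 each wrong.
print(round(mcc(80, 90, 10, 10), 3))
```

Unlike plain accuracy, MCC stays honest on imbalanced class distributions, which is why it is a common choice for multiclass food-authentication models.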

They conclude that this work shows a complete pipeline that automates feature selection from nontargeted LC-MS spectra when working with a limited number of samples and imperfect labels. This process can also be expanded to other food groups and spectral data.

Photo by Andrea De Santis on Unsplash

Read more…

Authenticity testing of honey is the best-known example of the need for a weight-of-evidence approach. One analytical test is unlikely to give a definitive answer, but a panel of different tests, using different techniques and principles, can build an incremental body of suspicion.

The routine use of machine learning for constructing reference databases has enabled the rapid expansion of techniques that – in principle – can discriminate differences between “authentic” and “inauthentic” reference samples, and thus could be added to this weight-of-evidence armoury.  Two recent publications are a case in point.

In the first study (purchase required) the authors produced their own reference set by adulterating honey with syrups, then showed that the adulterated samples could be discriminated from authentic honey using Differential Scanning Calorimetry (DSC), with graph-based semi-supervised learning used to construct the classification model. DSC is based on melting curves, and is an indirect measure of water content. It is a cheap, routine test widely used in many industries and, as such, would be ideal as the first step in an analytical workflow to screen out the most crudely adulterated samples.
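Graph-based semi-supervised learning of the kind used here can be sketched in a few lines: labelled samples are clamped to their labels, and those labels spread to unlabelled samples over a similarity graph until the scores settle. A toy pure-Python version (graph, labels and node count are all hypothetical):

```python
# Label propagation on a toy similarity graph. Nodes 0-1 are labelled
# authentic (+1), nodes 4-5 adulterated (-1); nodes 2-3 are unlabelled
# and repeatedly take the mean score of their neighbours.
edges = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
labels = {0: 1.0, 1: 1.0, 4: -1.0, 5: -1.0}         # clamped labels
f = {n: labels.get(n, 0.0) for n in edges}          # current scores

for _ in range(100):                                # propagate to convergence
    for n in edges:
        if n not in labels:                         # only unlabelled update
            f[n] = sum(f[m] for m in edges[n]) / len(edges[n])

pred = {n: ("authentic" if f[n] > 0 else "adulterated") for n in (2, 3)}
print(pred)
```

Node 2, sitting closer to the authentic cluster, settles at a positive score, while node 3 is pulled negative by its adulterated neighbours; this is the mechanism by which a handful of confidently labelled DSC reference samples can classify unlabelled ones.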

This second study (purchase also required) is an example of building much more focussed and granular reference databases to address a highly specific authenticity question. The authors measured carbon-13 ratios in 196 authentic honeys (and in their constituent protein) sourced from 56 cities in Turkey. This analytical technique is usually used as a marker for exogenous sugar addition, but in this study the authors used the more subtle variations (driven by changes in flora, temperatures and humidities) to build a classification model for regional origin. They were able to cluster the honeys into one of 7 distinct geographical regions of Turkey based upon their carbon-13 ratios.
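Carbon-13 measurements of this kind are conventionally reported in delta notation relative to the VPDB standard; a minimal sketch of the conversion (the VPDB ratio below is a commonly quoted value, not taken from the paper):

```python
# delta13C (per mil) = (R_sample / R_standard - 1) * 1000,
# where R is the 13C/12C isotope ratio and the standard is VPDB.
R_VPDB = 0.0111802  # commonly quoted VPDB 13C/12C ratio

def delta13c(r_sample):
    return (r_sample / R_VPDB - 1.0) * 1000.0

# A C3-plant honey typically falls near -25 per mil (illustrative input).
print(round(delta13c(0.0109007), 1))
```

It is small shifts in this delta value, a few per mil, driven by regional flora and climate, that the classification model exploits.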

[thanks to FAN-member Peter Farnell for spotting that the original version of this blog referenced "Differential Scanning Colourimetry" - which would, indeed, have been a novel technique worthy of comment]

Read more…

A recent FAN blog described non-destructive impedance sensors as a tool to classify meat freshness.

In this paper (open access) the authors used the same principle to develop a classification model for potato varieties based on the effect of their dry matter content on an electrical impedance sensor. The test is destructive, as the potato must be sliced. The authors built a reference database from 9 cultivars (Actrice, Ambra, Constance, El Mundo, Fontane, Gaudi, Jelly, Monalisa and Universa) sourced directly from the grower; these cultivars were chosen because they cover a wide range of dry matter content. The authors collected multivariate analytical data from the impedance sensor: impedance magnitude and phase, along with derived parameters such as the minimum phase point of each spectrum, the ratio between the low- and high-frequency values of the impedance magnitude, the dissipation factor, the distance between the zero and the maximum value of the Nyquist plot, and the Cole model equivalent circuit parameters.

They conclude that machine learning methods for predicting potato dry matter and varieties, based on impedance data, can achieve an equivalent (if still sub-optimal) performance to conventional methods, and that they hold promise for future improvement to surpass conventional methods. A deeper analysis could aim to reduce the root-mean-squared error and increase the coefficient of determination, thereby enhancing the accuracy of dry matter predictions. To achieve this, various techniques such as feature engineering, hyperparameter tuning, and advanced modelling approaches (e.g. convolutional neural networks) could be explored. The authors consider that alternative chemometric methods such as the Kennard-Stone algorithm, which selects representative samples based on distance criteria, could lead to more robust dataset partitioning. Additionally, incorporating data fusion with results obtained through infrared spectroscopy could further improve the model's performance.
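The Kennard-Stone algorithm the authors mention is straightforward to sketch: start from the two most mutually distant samples, then repeatedly add the candidate whose nearest already-selected sample is farthest away. A minimal pure-Python illustration on hypothetical 2-D points:

```python
from itertools import combinations
from math import dist

def kennard_stone(points, k):
    """Select k representative sample indices by max-min distance."""
    # Seed with the two most distant samples.
    pair = max(combinations(range(len(points)), 2),
               key=lambda ij: dist(points[ij[0]], points[ij[1]]))
    selected = list(pair)
    # Greedily add the point farthest from everything selected so far.
    while len(selected) < k:
        rest = [i for i in range(len(points)) if i not in selected]
        nxt = max(rest, key=lambda i: min(dist(points[i], points[j])
                                          for j in selected))
        selected.append(nxt)
    return selected

pts = [(0, 0), (0.1, 0.2), (5, 5), (9, 1), (10, 10), (2, 8)]
print(kennard_stone(pts, 4))
```

Because the selection spreads samples evenly across the measured space, a training/test split built this way avoids leaving whole regions of the impedance feature space out of the training set.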

Photo by Rodrigo dos Reis on Unsplash

Read more…

A scientific paper entitled "Authenticity Assessment of Ground Black Pepper by Combining Headspace Gas-Chromatography Ion Mobility Spectrometry and Machine Learning" has now been published in Food Research International (an Elsevier journal).

The study assessed a broad variety of authentic samples originating from eight countries and three continents. The method uses headspace gas-chromatography ion mobility spectrometry (HS-GC-IMS), combined with machine learning. It requires no sample preparation and is rapid. In this proof-of-concept study, the method successfully classified samples with an accuracy of >90% at a 95% level of confidence.
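As an illustration of how a classification accuracy can be quoted "with a 95% level of confidence", a Wilson score interval on hypothetical counts (not the paper's data):

```python
from math import sqrt

# Wilson score interval for a proportion: given `correct` successes out of
# n trials, return a 95% confidence interval for the true accuracy.
def wilson_interval(correct, n, z=1.96):        # z = 1.96 for 95% confidence
    p = correct / n
    centre = (p + z * z / (2 * n)) / (1 + z * z / n)
    half = (z / (1 + z * z / n)) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

# Hypothetical: 96 of 100 test samples classified correctly.
lo, hi = wilson_interval(96, 100)
print(f"accuracy 96/100, 95% CI: ({lo:.3f}, {hi:.3f})")
```

With these illustrative counts, even the lower bound of the interval stays above 0.90, which is the kind of statement an ">90% accuracy at 95% confidence" claim rests on.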

Access the paper for free until the end of March 2024.

Photo by Anas Alhajj on Unsplash

Read more…

Researchers have developed a biomarker-free detection assay to aid the food safety of citrus juices, coupling the machine learning capability of their computational process, the algorithmically guided optical nanosensor selector (AGONS), with fluorescence data collected using an optical nanosensor array to construct a predictive model for citrus juice authenticity.

Over 707 measurements of pure and adulterated citrus juices were collected for prediction. Overall, the approach achieved above 90% accuracy on three data sets in discriminating three pure citrus fruit juices, artificially sweetened tangerine juice with various concentrations of corn syrup, and juice-to-juice dilution of orange juice using apple juice. 

Read abstract


Photo by ABHISHEK HAJARE on Unsplash

Read more…

A new environmentally friendly prototype sensor has been developed by CSIRO, Australia's national science agency, to help combat food fraud and protect the reputation of Australian produce.

The novel technology uses vibration energy harvesting and machine learning to accurately detect anomalies in the transportation of products such as meat. 

For example, if a refrigeration truck carrying exported meat stopped during its journey to the processing plant, the technology would be able to detect the stop, and whether any products had been moved or removed during that period.

This allows producers and logistics operators to pinpoint handling errors and identify when products are stolen or substituted.
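As a purely illustrative sketch (not CSIRO's algorithm), a stop in transit can be flagged by watching for windows in which the vibration energy collapses:

```python
from math import sqrt

# Flag windows of an accelerometer trace whose RMS vibration energy falls
# below a threshold -- a crude stand-in for anomaly detection in transport.
def rms(window):
    return sqrt(sum(v * v for v in window) / len(window))

def flag_stops(signal, window_size=4, threshold=0.1):
    """Return indices of windows whose RMS falls below the threshold."""
    flags = []
    for start in range(0, len(signal) - window_size + 1, window_size):
        if rms(signal[start:start + window_size]) < threshold:
            flags.append(start // window_size)
    return flags

# Synthetic accelerometer trace: normal road vibration, then a stop.
trace = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6, 0.5, -0.4, 0.01, -0.02, 0.01, 0.0]
print(flag_stops(trace))
```

A machine-learning version of this replaces the fixed threshold with a model trained on normal journeys, so that unusual vibration patterns (not just stops) are also flagged.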

Read full article.

Read more…