Abstract

Oenological tannins are commercial preparations extracted from various botanical sources, including oak, grape seeds, tea, quebracho and galls, and are widely used in winemaking to modulate oxidative processes, stabilise colour and manage astringency. These preparations are classified into two main categories: condensed tannins (proanthocyanidins) and hydrolysable tannins (ellagitannins and gallotannins). Despite their widespread use, the effect of adding oenological tannins to wine remains largely unpredictable due to substantial variability in botanical origin, extraction methods and chemical composition. This unpredictability poses challenges both for oenologists, who aim to achieve specific technological outcomes, and for regulatory authorities, which require reliable tools for botanical authentication and quality control. Current analytical methods for tannin characterisation are often time-consuming, expensive, or limited in their ability to predict functional properties relevant to oenological applications. This thesis developed an integrated analytical framework that combines Linear Sweep Voltammetry (LSV) with machine learning algorithms to simultaneously classify oenological tannins according to their botanical origin and predict their key functional properties in wine. Up to forty-five commercial tannins from three botanical sources (oak, grape seeds and galls) were characterised by voltammetric fingerprinting in model wine solutions and in red wine.
The functional properties investigated concern critical aspects of tannin behaviour: (i) antioxidant capacity, assessed through multiple mechanisms (electron transfer, hydrogen atom transfer and radical scavenging); (ii) oxygen consumption kinetics in red wine, both with and without added sulphur dioxide; (iii) impact on colour evolution, quantified through CIELAB colour coordinates; (iv) sulphur dioxide consumption; (v) acetaldehyde binding capacity, relevant for anthocyanin–tannin condensation reactions and colour stabilisation; and (vi) classification of the phenolic profile of red wine based on its voltammetric fingerprint. This strategy makes it possible to establish quantitative relationships between electrochemical fingerprints and multiple functional properties of the tannin types under investigation. A Support Vector Machine (SVM) was used to classify tannins according to their botanical origin (oak, grape seeds and galls). Initial models showed high internal accuracy but failed on external validation due to over-optimistic bias, and the implementation of data augmentation proved decisive. By increasing the number of voltammograms via Generative Adversarial Networks (GANs), it was possible to overcome the dimensionality imbalance and achieve a reliable test accuracy of 90% (ROC-AUC of 0.99). The optimised model also discriminated perfectly when evaluated on an independent external dataset, achieving 100% accuracy and providing strong preliminary evidence of its generalisation capability. SHAP analysis improved model interpretability, identifying key voltammetric regions driving the classification.
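The classification step described above can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy, and it uses synthetic toy voltammograms, since the thesis data are not reproduced here. The potential grid, the peak positions, the noise level and the SVM settings are all hypothetical choices made for illustration, not the configuration used in the thesis. GAN-based augmentation and SHAP analysis are deliberately left out.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
potentials_mV = np.linspace(0, 1200, 121)  # hypothetical sweep grid


def synthetic_voltammogram(peak_mV):
    """Toy current trace: one Gaussian oxidation wave plus noise (not real data)."""
    wave = np.exp(-((potentials_mV - peak_mV) / 120.0) ** 2)
    return wave + rng.normal(scale=0.05, size=potentials_mV.size)


# Hypothetical peak positions, chosen only so that the toy classes are separable.
class_peaks = {"oak": 400.0, "grape_seed": 600.0, "gall": 800.0}
X = np.array(
    [synthetic_voltammogram(p) for p in class_peaks.values() for _ in range(15)]
)
y = np.array([name for name in class_peaks for _ in range(15)])

# Stratified hold-out split so each botanical class appears in the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale each potential point, then fit an RBF-kernel SVM (assumed settings).
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on toy data: {test_accuracy:.2f}")
```

A held-out split of this kind only estimates internal performance. The abstract's point is that an independent external dataset is still needed to check generalisation.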
With regard to antioxidant capacity, correlation analysis linked specific potential ranges (225–285 mV, 570–602 mV, 908–1195 mV) to different antioxidant mechanisms. For some of the spectrophotometric assays employed, LSV captures comparable information in a faster and simpler manner, while also providing mechanistic insights via the contributions of different phenolic groups. Oxygen consumption tests in red wine revealed distinct tannin reactivities: oak ellagitannins > condensed tannins from grape seeds > gallotannins. Predictive models of oxygen reactivity achieved R² values of 0.90 and test-set errors below 10%, enabling the prediction of O₂ consumption from electrochemical data. Results on colour changes in CIELAB space following the addition of oenological tannins to red wine showed that tannin origin is a key factor, more so than SO₂ levels. Random Forest models predicted these changes with low errors, although R² values were moderate, reaching a maximum of 0.81. Predictive models of SO₂ consumption estimated it with a maximum error of 10% in the validation set. Regarding tannin reactivity towards acetaldehyde, different kinetic profiles were observed: oak tannins exhibited the highest initial reactivity, grape seed tannins showed a biphasic pattern, and gallotannins displayed lower reactivity, with efficiency on a phenolic content basis favouring oak tannins. No predictive model was developed in this case because the dataset did not contain enough data points. The classification of phenolic groups in red wine achieved excellent performance, distinguishing the different structural types. Overall, this thesis proposes a rapid and cost-effective characterisation of tannins for authentication, quality control, and prediction of their impact upon addition to red wine.
The integration of electrochemistry with machine learning could enable winemaking practices to be guided by analyses that are both rapid and easy to perform.
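One possible way to relate the potential windows named in the abstract (225–285 mV, 570–602 mV, 908–1195 mV) to an assay result is sketched below. The abstract does not say how the correlation analysis was carried out, so trapezoidal integration of the current within each window and the Pearson coefficient are assumptions. The voltammograms and the assay values are synthetic, and the wave centres, widths and amplitudes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
potentials_mV = np.arange(0, 1201, 5)                # hypothetical 5 mV grid
windows_mV = [(225, 285), (570, 602), (908, 1195)]   # ranges quoted in the abstract
centres_mV = [255.0, 586.0, 1050.0]                  # hypothetical wave centres

# Toy samples: three independent oxidation waves with random amplitudes.
n_samples = 20
amplitudes = rng.uniform(0.5, 1.5, size=(n_samples, 3))
currents = sum(
    amplitudes[:, [k]] * np.exp(-((potentials_mV - centre) / 30.0) ** 2)
    for k, centre in enumerate(centres_mV)
)

# Toy assay values, constructed to depend on the first wave only.
assay = 2.0 * amplitudes[:, 0] + rng.normal(scale=0.01, size=n_samples)


def window_area(traces, lo_mV, hi_mV):
    """Trapezoidal area under each trace between lo_mV and hi_mV (inclusive)."""
    mask = (potentials_mV >= lo_mV) & (potentials_mV <= hi_mV)
    x = potentials_mV[mask]
    y = traces[:, mask]
    return np.sum((y[:, 1:] + y[:, :-1]) / 2.0 * np.diff(x), axis=1)


# Pearson correlation between each window's integrated current and the assay.
correlations = {
    window: float(np.corrcoef(window_area(currents, *window), assay)[0, 1])
    for window in windows_mV
}
for window, r in correlations.items():
    print(f"{window[0]}-{window[1]} mV: r = {r:+.2f}")
```

Because the toy assay is built from the first wave alone, only the 225–285 mV window is expected to correlate strongly here. With real data, the pattern of coefficients across windows is what would point to the mechanism involved.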
Electrochemical and machine learning approaches for the classification and functional assessment of oenological tannins
Rosario Pascale
2026-01-01
| File | Description | Type | Licence | Availability | Size | Format |
|---|---|---|---|---|---|---|
| Electrochemical and machine learning approaches for the classification and functional assessment of oenological tannins .pdf | PhD Thesis | Doctoral thesis | Creative Commons | Under embargo until 30/05/2027 | 4.92 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.



