Dipeptidyl peptidase-IV (DPP-4) inhibitors are a promising class of diabetes medications. Bioactive peptides, particularly those derived from bovine milk proteins, can inhibit the DPP-4 enzyme. This study describes a comprehensive strategy for discovering and validating DPP-4 inhibitory peptides that combines machine learning with virtual proteolysis. Five machine learning models (GBDT, XGBoost, LightGBM, CatBoost, and RF) were trained. Of these, LightGBM performed best, with an AUC of 0.92 ± 0.01. LightGBM was then used to predict the DPP-4 inhibitory potential of peptides generated through virtual proteolysis of milk proteins. Through a series of in silico screening steps and in vitro experiments, GPVRGPF and HPHPHL were found to exhibit good DPP-4 inhibitory activity. Molecular docking and molecular dynamics simulations further confirmed the inhibitory mechanisms of these peptides. Retracing the virtual proteolysis steps showed that GPVRGPF can be obtained from β-casein by enzymatic hydrolysis with chymotrypsin, while HPHPHL can be obtained from κ-casein by enzymatic hydrolysis with stem bromelain or papain. In summary, integrating machine learning with virtual proteolysis can aid in the preliminary determination of key hydrolysis parameters and facilitate the efficient screening of bioactive peptides.
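As a rough illustration of what a virtual proteolysis step involves, the sketch below cuts a protein sequence in silico according to a simple cleavage rule. It is a minimal sketch under stated assumptions, not the pipeline used in the study: the rule (cleave C-terminal to F, Y, W, or L unless the next residue is P) is a simplified approximation of chymotrypsin specificity, the function names `cleavage_sites` and `digest` are hypothetical, and the input is a made-up toy sequence rather than a real casein.

```python
def cleavage_sites(seq, residues="FYWL", blocked_by="P"):
    """Return cut positions under a simplified, assumed chymotrypsin-like rule.

    A cut is placed after any residue in `residues`, unless the following
    residue is in `blocked_by`. Real enzyme specificity is more nuanced.
    """
    sites = []
    for i, aa in enumerate(seq[:-1]):  # no cut after the final residue
        if aa in residues and seq[i + 1] not in blocked_by:
            sites.append(i + 1)
    return sites


def digest(seq, **rule):
    """Split `seq` into fragments at every predicted cleavage site."""
    cuts = [0] + cleavage_sites(seq, **rule) + [len(seq)]
    return [seq[start:end] for start, end in zip(cuts, cuts[1:])]


# Toy sequence (hypothetical, not a real milk protein).
fragments = digest("AKFGPVRLPQYW")
print(fragments)  # ['AKF', 'GPVRLPQY', 'W']
```

In a screening workflow of the kind described, fragments like these would then be converted to numerical features and scored by the trained classifier; that step is omitted here because the abstract does not specify the features used.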
| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |