To establish a three-class classification model for cultivated, semi-wild, and wild Astragali Radix characterized by flavonoids, and to explore and evaluate the application of automated machine learning and data augmentation techniques in the field of drug analysis.
First, correlation analysis and principal component analysis were conducted on the flavonoid content data of Astragali Radix, and decision tree and logistic regression models were established to analyze the importance of the flavonoid components. Then, using the AutoGluon framework with num_bag_folds set to 5, two sets of 30 models were trained, one on 64 batches of real data and the other on 600 batches of virtual data generated from the real data with the TVAE tabular data generation algorithm, and these models were evaluated by accuracy.
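The 5-fold bagging and accuracy evaluation described above can be illustrated with a minimal sketch. It is not AutoGluon's implementation and does not use the study's data: the synthetic batches, the nearest-centroid classifier, and all function names are hypothetical stand-ins, chosen only to show how 64 batches are partitioned into 5 folds and how out-of-fold accuracy is computed.

```python
import random
from statistics import mean

CLASSES = ["cultivated", "semi-wild", "wild"]


def make_synthetic_batches(n=64, n_features=3, seed=0):
    """Hypothetical stand-in for flavonoid content data: one row per batch."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % len(CLASSES)
        # Class-dependent mean so that the toy problem is learnable.
        features = [rng.gauss(label * 2.0, 1.0) for _ in range(n_features)]
        rows.append((features, label))
    rng.shuffle(rows)
    return rows


def k_fold_indices(n, k, seed=0):
    """Partition range(n) into k disjoint folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(rows):
    """Toy model (assumption): per-class mean of each feature."""
    centroids = {}
    for label in {lbl for _, lbl in rows}:
        members = [f for f, lbl in rows if lbl == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, features):
    """Return the label whose centroid is closest in squared distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl]))


def out_of_fold_accuracy(rows, k=5):
    """Train on k-1 folds, predict the held-out fold, pool all predictions."""
    correct = 0
    for held_out in k_fold_indices(len(rows), k):
        held = set(held_out)
        train = [r for i, r in enumerate(rows) if i not in held]
        model = fit_centroids(train)
        correct += sum(predict(model, rows[i][0]) == rows[i][1] for i in held_out)
    return correct / len(rows)


rows = make_synthetic_batches()
folds = k_fold_indices(len(rows), 5)
accuracy = out_of_fold_accuracy(rows, k=5)
print([len(f) for f in folds], round(accuracy, 3))
```

With 64 batches and 5 folds, four folds hold 13 batches and one holds 12, so each model in a bag is trained on 51 or 52 batches and scored on the remainder.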
The machine learning models indicated that formononetin, campanulin and onospin played important roles in the quality control of Astragali Radix, especially in source grade control. In terms of prediction accuracy, the neural network and tree-based models consistently had the best classification performance for Astragali Radix. The virtual data generated by data augmentation were basically consistent with the real data in terms of the accuracy trend during model training.
Machine learning techniques have good application value in the classification of Astragali Radix characterized by flavonoids.
| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |