Using data from the US National Health and Nutrition Examination Survey (NHANES) from 2005 to 2021, an interpretable machine learning method was applied to identify depression in people over 65 years old.
Data from 2005-2018 and 2019-2020 were used as the training set and test set, respectively, and three machine learning models were fitted: Lasso logistic regression, random forest, and XGBoost. The model with the highest area under the curve (AUC) on the test set was selected and explained using SHAP, an interpretable machine learning method.
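The selection step described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the `auc` and `select_best_model` helpers are hypothetical names, the scores are invented toy numbers rather than NHANES results, and the AUC is computed with the rank-based (Mann-Whitney) formulation, which may differ from the software the authors used.

```python
from typing import Dict, Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney statistic.

    Equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_best_model(
    labels: Sequence[int], test_scores: Dict[str, Sequence[float]]
) -> Tuple[str, Dict[str, float]]:
    """Return the name of the model with the highest test-set AUC, plus all AUCs."""
    aucs = {name: auc(labels, s) for name, s in test_scores.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Invented toy test set (1 = depression, 0 = no depression); NOT study data.
y_test = [1, 0, 1, 0, 0, 1]
# Invented predicted probabilities from three already-fitted models.
toy_scores = {
    "lasso_logistic": [0.6, 0.4, 0.5, 0.55, 0.2, 0.7],
    "random_forest": [0.7, 0.3, 0.25, 0.5, 0.2, 0.8],
    "xgboost": [0.9, 0.2, 0.7, 0.4, 0.1, 0.8],
}
best_name, aucs = select_best_model(y_test, toy_scores)
print(best_name, aucs)
```

Only the held-out test scores enter the comparison, mirroring the training/test split by survey year described above.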
The XGBoost model had the highest AUC, 0.933 (0.912-0.954). Sleep problems, health problems, and eosinophil count were the three most important variables affecting depression in the elderly, with absolute SHAP values of 1.16, 0.83, and 0.55, respectively. SHAP also showed the main influencing factors for each individual.
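To make the two levels of explanation concrete, the sketch below assumes that the reported absolute SHAP values are per-feature means of |SHAP| across individuals, which is the usual global importance summary but is not stated explicitly here. The SHAP matrix is an invented toy example, and `global_importance` and `top_factor` are hypothetical helper names.

```python
from typing import Dict, List, Sequence


def global_importance(
    shap_rows: Sequence[Sequence[float]], feature_names: Sequence[str]
) -> Dict[str, float]:
    """Mean absolute SHAP value per feature (global level)."""
    n = len(shap_rows)
    return {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }


def top_factor(row: Sequence[float], feature_names: Sequence[str]) -> str:
    """Feature with the largest absolute SHAP value for one person (individual level)."""
    j = max(range(len(row)), key=lambda k: abs(row[k]))
    return feature_names[j]


features = ["sleep_problems", "health_problems", "eosinophil_count"]
# Invented SHAP values: one row per individual, one column per feature.
shap_rows: List[List[float]] = [
    [1.5, -0.5, 0.25],
    [-1.0, 1.25, 0.5],
    [0.5, 0.75, -1.0],
]

importance = global_importance(shap_rows, features)
ranking = sorted(importance, key=importance.get, reverse=True)
per_person = [top_factor(row, features) for row in shap_rows]
print(ranking)
print(per_person)
```

The global ranking averages over everyone, while the per-person view can single out a different leading factor for each individual, as the third toy row does.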
Machine learning models outperformed the logistic regression model in predicting depression in the elderly. Interpretable machine learning can explain a model's predictions at both the global and individual levels, opening the black box of machine learning models, and can serve as a supplement to them in practical applications.

| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |