Understanding the composition of the protein corona is essential for evaluating the potential biomedical applications of nanoparticles. Relative protein abundance (RPA), which expresses each protein's share of the total protein in the corona, is an important parameter for describing the protein corona. For the first time, we comprehensively predicted the RPA of multiple proteins in the protein corona. First, we used multiple machine learning algorithms to predict whether a protein adsorbs to a nanoparticle, which is a dichotomous (binary classification) prediction. Then, we selected the three best-performing algorithms from the dichotomous prediction to predict the specific value of RPA, which is a regression prediction. In addition, we analyzed the advantages and disadvantages of the different machine learning algorithms for RPA prediction through interpretability analysis. Finally, we identified the features most important for RPA prediction, which provide practical guidance for the preliminary design of protein coronas. The RPA prediction service is available at http://www.bioai-lab.com/PC_ML.
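To make the two prediction targets concrete, the following minimal sketch derives both from raw abundance measurements. It assumes that RPA is the percentage each protein contributes to the total measured protein in the corona, and that a protein counts as adsorbed when its RPA exceeds a threshold (0 by default). The function names, the threshold, and the example abundances are hypothetical and are not taken from the study.

```python
def relative_protein_abundance(abundances):
    """Return each protein's share of the total abundance, in percent.

    Assumption: RPA_i = 100 * abundance_i / sum of all abundances.
    """
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return {name: 100.0 * value / total for name, value in abundances.items()}


def adsorption_labels(rpa, threshold=0.0):
    """Binary target for the dichotomous step: 1 if RPA exceeds the threshold."""
    return {name: int(value > threshold) for name, value in rpa.items()}


# Hypothetical abundances for one nanoparticle (illustrative values only).
corona = {
    "albumin": 50.0,
    "apolipoprotein A-I": 30.0,
    "fibrinogen": 20.0,
    "transferrin": 0.0,
}

rpa = relative_protein_abundance(corona)   # regression target
labels = adsorption_labels(rpa)            # classification target
```

Under these assumptions, the classifier would be trained on `labels` and the regressor on the values in `rpa`, so the two steps share the same input features and differ only in their target.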
| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |