To advance the analysis and mining of traditional Chinese medicine (TCM) text data and to support intelligent knowledge extraction and processing, knowledge extraction was carried out using the BIO (begin, inside, outside) sequence labeling scheme, a BiLSTM-CRF model, and manually defined rules. A knowledge graph of spleen and stomach diseases was constructed in the Neo4j database using the Py2neo library in Python 3.6, and a named entity recognition system for TCM spleen and stomach diseases was developed with the Flask framework. The results show that the BiLSTM-CRF model performs well and generalizes well on the test set, with accuracy, precision, recall, and F1 scores of 96.19%, 86.64%, 88.82%, and 87.71%, respectively. The constructed knowledge graph contains eight types of node labels, including prescriptions or patent medicines, Chinese medicines, and clinical manifestations, and ten types of relationships. It supports querying and discovering nodes and relationships among Western medical diagnoses, TCM syndromes, and TCM treatment principles for spleen and stomach diseases. It is concluded that the BiLSTM-CRF model generalizes well in named entity recognition for TCM spleen and stomach diseases and handles complex text structures and domain-specific terminology effectively, providing strong support for research on knowledge extraction and knowledge graph construction for spleen and stomach diseases in TCM.
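As an illustration of the BIO scheme and the reported precision, recall, and F1 metrics, the following minimal Python sketch decodes BIO tag sequences into entity spans and scores predictions against gold labels. It is a sketch under stated assumptions, not the system described above: the abstract does not say whether the scores are token-level or entity-level, so exact-match entity-level scoring is assumed here, and the label names `HERB` and `SYMPTOM` as well as the helper names `bio_to_spans` and `prf1` are hypothetical.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence.

    Stray I- tags that do not continue an open entity of the same
    type are ignored (an assumption; other conventions exist).
    """
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def prf1(gold: Set[Span], pred: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical six-token example with illustrative labels.
gold_tags = ["B-HERB", "I-HERB", "O", "B-SYMPTOM", "I-SYMPTOM", "O"]
pred_tags = ["B-HERB", "I-HERB", "O", "B-SYMPTOM", "O", "O"]

gold_spans = bio_to_spans(gold_tags)
pred_spans = bio_to_spans(pred_tags)
precision, recall, f1 = prf1(gold_spans, pred_spans)
```

In this example the predicted `SYMPTOM` span is one token too short, so under exact-match scoring it counts as both a false positive and a false negative; only the `HERB` span is a true positive.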

| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
|  |  |  |  | Mycena | 11 | 5.26 |
|  |  |  |  | Pluteus | 5 | 2.39 |
|  |  |  |  | Russula | 17 | 8.13 |
|  |  |  |  | Trametes | 5 | 2.39 |