Data plays a crucial role in the successful deployment of urban rail transit large language models (LLMs). Retrieval-Augmented Generation (RAG) is a promising approach for developing industry-specific LLMs and mitigating hallucination. However, the lack of comprehensive industry knowledge bases limits its effectiveness. This study proposes a framework for constructing a knowledge graph-based RAG knowledge base for urban rail transit LLMs. The framework consists of four key dimensions: classification skeleton, semantic benchmark, feature rules, and logical relationships. These dimensions are implemented through entity classification systems, terminology dictionaries, attribute libraries, and entity relationship tables, respectively. By attaching industry-specific attributes to entities, the approach goes beyond the traditional subject-predicate-object triple structure of knowledge graphs and yields a more comprehensive, multifaceted representation of industry knowledge. The knowledge base serves as the core component of the RAG system, providing standardized, reliable, and traceable domain knowledge through a systematic pipeline of data collection, structuring, vectorization, and knowledge representation. This process enhances the reliability and domain expertise of the content generated by urban rail transit LLMs, supporting development driven by both data and knowledge.
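The four dimensions can be pictured as one small data model. The following is a minimal sketch under stated assumptions, not the study's implementation: the class names (`Entity`, `Relation`, `KnowledgeBase`), the method names, and the sample values (a pantograph, a supply voltage, a `part_of` relation) are all hypothetical and chosen only for illustration. It shows how a category, a set of aliases, and an attribute dictionary can sit alongside plain subject-predicate-object triples, and how an entity might be rendered as a text chunk before vectorization.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str                                        # canonical term
    category: str                                    # classification skeleton
    aliases: list = field(default_factory=list)      # semantic benchmark (terminology dictionary)
    attributes: dict = field(default_factory=dict)   # feature rules (attribute library)


@dataclass(frozen=True)
class Relation:
    subject: str    # logical relationships (entity relationship table)
    predicate: str
    obj: str


class KnowledgeBase:
    """Hypothetical container; all names here are illustrative assumptions."""

    def __init__(self):
        self.entities = {}      # canonical name -> Entity
        self.alias_index = {}   # lower-cased term -> canonical name
        self.relations = []     # list of Relation triples

    def add_entity(self, entity):
        self.entities[entity.name] = entity
        for term in [entity.name, *entity.aliases]:
            self.alias_index[term.lower()] = entity.name

    def resolve(self, term):
        # Map any known synonym to its canonical name (KeyError if unknown).
        return self.alias_index[term.lower()]

    def add_relation(self, subject, predicate, obj):
        # Both ends are resolved first, so every triple points at known entities.
        self.relations.append(
            Relation(self.resolve(subject), predicate, self.resolve(obj))
        )

    def describe(self, term):
        # Render one entity as a text chunk that could later be embedded.
        e = self.entities[self.resolve(term)]
        lines = [f"{e.name} (category: {e.category})"]
        lines += [f"  {k}: {v}" for k, v in sorted(e.attributes.items())]
        lines += [
            f"  {r.predicate} -> {r.obj}"
            for r in self.relations
            if r.subject == e.name
        ]
        return "\n".join(lines)


# Invented sample data, for illustration only.
kb = KnowledgeBase()
kb.add_entity(Entity("Metro train", "Rolling stock"))
kb.add_entity(
    Entity(
        "Pantograph",
        "Vehicle equipment",
        aliases=["current collector"],
        attributes={"supply_voltage": "DC 1500 V"},
    )
)
kb.add_relation("current collector", "part_of", "Metro train")
chunk = kb.describe("Current Collector")
print(chunk)
```

Resolving aliases before storing a triple is one way to keep retrieved chunks traceable to a single canonical entity; a real pipeline would add provenance fields and an embedding step, both omitted here.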
| Family | Number of genera | Number of species | Percentage of total species (%) | Genus | Number of species | Percentage of total species (%) |
|---|---|---|---|---|---|---|
| Amanitaceae | 2 | 11 | 5.26 | Amanita | 10 | 4.78 |
| Mycenaceae | 2 | 12 | 5.74 | Inocybe | 5 | 2.39 |
| Polyporaceae | 8 | 14 | 6.70 | Laccaria | 5 | 2.39 |
| Russulaceae | 3 | 23 | 11.00 | Marasmius | 6 | 2.87 |
| | | | | Mycena | 11 | 5.26 |
| | | | | Pluteus | 5 | 2.39 |
| | | | | Russula | 17 | 8.13 |
| | | | | Trametes | 5 | 2.39 |