Abstract
The explicit knowledge contained in quality acceptance specifications is crucial for the quality management of infrastructure projects. As the digitalization of knowledge in the architecture, engineering, and construction industry accelerates, so does the need for automated knowledge mining that can process large volumes of regulatory documents. However, complex relationships, overlapping entities, and entities separated by long distances in the text make knowledge extraction from these specifications difficult. This research proposes an autonomous knowledge mining framework based on the cascade binary tagging framework (CasRel) and the graph attention network (GAT), designed to transform complex textual knowledge from quality acceptance specifications into graph-based knowledge representations. The framework uses a RoBERTa-wwm-ext layer to encode the input text, employs a subject tagger to extract entity start and end positions, and constructs a graph structure. The graph attention mechanism in GAT then aggregates the features of neighboring nodes to generate context-enhanced entity feature representations, and a relation-specific object tagger outputs complete knowledge triples. The extracted triples are stored and visualized in a Neo4j database, yielding a knowledge graph with 3,762 nodes. In a test case using the Chinese railway construction quality acceptance specifications, the proposed framework achieves strong performance in complex knowledge extraction and knowledge graph modeling. The framework transforms textual regulatory documents into structured knowledge bases, supporting automated knowledge acquisition and the digital management of compliance information.
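To make the subject-tagging step more concrete, the following is a minimal sketch of how start and end position scores could be decoded into entity spans. It assumes the nearest-end pairing heuristic commonly associated with CasRel-style binary tagging; the function name `decode_spans`, the 0.5 threshold, and the English example tokens and scores are hypothetical illustrations, not values or code from the study.

```python
from typing import List, Tuple


def decode_spans(
    start_probs: List[float],
    end_probs: List[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[int, int]]:
    """Pair each predicted start position with the nearest end position
    at or after it, returning inclusive (start, end) token index pairs."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        # Candidate ends are those at or to the right of the start.
        following = [e for e in ends if e >= s]
        if following:
            spans.append((s, following[0]))
    return spans


# Hypothetical tokens and tagger scores, invented for illustration only.
tokens = ["concrete", "strength", "shall", "meet", "design", "requirements"]
start_probs = [0.9, 0.1, 0.0, 0.1, 0.8, 0.2]
end_probs = [0.1, 0.95, 0.0, 0.0, 0.1, 0.7]

spans = decode_spans(start_probs, end_probs)
entities = [" ".join(tokens[s : e + 1]) for s, e in spans]
print(spans)
print(entities)
```

In the framework described above, such spans would serve as the candidate subjects whose representations are enriched by GAT before the relation-specific object tagger is applied.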
