Introduction
The continuous generation of enormous amounts of health data poses several challenges to its management and governance. Currently, healthcare stakeholders have access to raw health data that must be processed to enable value creation (Arul et al., 2024; Genevieve et al., 2019). Healthcare systems adopt informatics systems that allow the analysis of health data retrieved and provided by different stakeholders (Genevieve et al., 2019). However, there is a lack of linkage and interoperability between these systems (Bates, 2005; Edwards et al., 2010; Genevieve et al., 2019; van Olmen et al., 2020).
As different and complex health informatics systems emerge, so does the need to integrate and link data from various datasets. Data linkage can be defined as “a process of pairing records from two files and trying to select the pairs that belong to the same entity” (Bohensky et al., 2010; Winglee et al., 2005). The Organisation for Economic Co-operation and Development also defined data linkage as “a merging that brings together from two or more sources of data with the object of consolidating facts concerning an individual or an event that are not available in any separate record” (Harron, 2016). In the healthcare sector, data linkage has been applied, for example, to the integration of patient health records and death certificates (Christen, 2019). Other general health-related applications include epidemiological, managerial and service production studies (Holman et al., 2008). The purpose of data linkage is to integrate different datasets by identifying and interconnecting records within an organisation, whether in a single dataset or across multiple datasets (Christen, 2019; Green et al., 2015). The multitude of settings and operations within healthcare information systems can hinder optimal linkage between diverse data libraries. For this reason, stakeholders need to continuously address the problem of interconnectivity between healthcare information systems to find facilitators and solutions (Hopf et al., 2014, 2016; March et al., 2020).
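To make the definition of data linkage quoted above more concrete, the following minimal Python sketch pairs records from two files and selects the pairs that appear to belong to the same entity. The field names, the sample records and the deterministic match key (normalised name plus date of birth) are illustrative assumptions and are not taken from the cited sources; real-world linkage frequently relies on probabilistic matching instead.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def match_key(record: Record) -> Tuple[str, str]:
    """Build a deterministic match key (hypothetical choice of fields)."""
    # Lower-case the name and collapse repeated whitespace before comparing.
    name = " ".join(record["name"].lower().split())
    return (name, record["birth_date"])


def link_records(file_a: List[Record], file_b: List[Record]) -> List[Tuple[Record, Record]]:
    """Pair records from two files that share the same match key."""
    index: Dict[Tuple[str, str], List[Record]] = {}
    for rec in file_b:
        index.setdefault(match_key(rec), []).append(rec)
    pairs = []
    for rec in file_a:
        for candidate in index.get(match_key(rec), []):
            pairs.append((rec, candidate))
    return pairs


# Hypothetical sample data: a health record file and a death certificate file.
health_records = [
    {"id": "H1", "name": "Ana  Silva", "birth_date": "1950-03-02"},
    {"id": "H2", "name": "John Doe", "birth_date": "1948-11-20"},
]
death_certificates = [
    {"id": "D9", "name": "ana silva", "birth_date": "1950-03-02"},
]

linked = link_records(health_records, death_certificates)
linked_ids = [(a["id"], b["id"]) for a, b in linked]
print(linked_ids)
```

In this sketch only the first health record is paired with the death certificate, because the two share the same normalised name and date of birth.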
With the need for data linkage also comes the need for interoperability between systems. Interoperability is currently one of the top targets of researchers in the field of information technology (Torab-Miandoab et al., 2023). According to the IEEE Standard Computer Dictionary, interoperability is “the ability of two or more systems or components to exchange information and to use the information that has been exchanged” (Lehne et al., 2019). As hospitals and physicians increase their adoption of digital health data records, the lack of interoperability between healthcare informatics systems may hinder communication processes (Lehne et al., 2019; Reisman, 2017). Therefore, healthcare stakeholders must promote research in this field to identify and understand the core variables, across different social, political and clinical challenges and contexts, that can be crucial for establishing an interoperable system. Moreover, there is an awareness of the need for a universal interoperability strategy between all healthcare stakeholders (Tiago et al., 2016). Other aspects, such as technology architecture, system governance and core dataset definition, may be crucial for successfully implementing an interoperable system (Azarm et al., 2017). Interoperability standards are also key in any functioning interoperable system (Gowda et al., 2022). Some benefits of interoperable systems include facilitated access to patient health data, better understanding of medical terms, medical bias minimisation, improved health cost management and integration of diversified types of health data (Iroju et al., 2013). However, barriers to interoperability implementation are still present and include the complexity of the healthcare environment, the lack of standardisation, the existence of legacy systems (e.g. an outdated electronic health record system that does not comply with current standards), semantic compatibility issues (e.g. two systems that cannot recognise and interpret each other’s information) and resistance to digitalisation processes (Iroju et al., 2013).
Current study
System interoperability and data linkage issues span different types of healthcare environments, leading to difficulties in accessing patient health data generated in different settings. Research on these concerns needs to be reviewed to establish the state of the art and to identify relevant needs and trends. The aim of the current research was to provide an overview of the current and future role of data linkage and system interoperability within the domain of health information management and governance by conducting a bibliometric analysis of relevant published literature.
Method
Search strategy
This bibliometric analysis was based on a search of research articles across three platforms: Google Scholar, PubMed and Web of Science, considered to be the most relevant databases for this analysis. Search terms used for each platform were: “health data management,” “health information governance,” “health information management,” “healthcare data governance,” “healthcare data management,” “healthcare information governance,” “healthcare information management,” associated with the terms “linkage” and “interoperability” (Supplemental material Table S1, online supplement). No year limitations were applied as the objective was to maximise the analysis of the evolution of publishing research articles over time.
Data acquisition
Data were collected from the three search platforms between December 2021 and February 2022. Search results from Google Scholar were first added to the “My Library” feature and then retrieved to an EndNote file using the export function. From the Web of Science search platform (Web of Science Core Collection), data were retrieved directly to an EndNote file. For PubMed, data were obtained using its citation manager, which created a compatible EndNote file. All data were extracted during February 2022. For the citation analysis, the number of citations per article was retrieved manually from Google Scholar after completion of the search, with all counts collected on the same day.
Data selection process and analysis
All reference files obtained through each search platform were imported to EndNote, followed by a reference update using EndNote’s “Find Reference Updates” feature. With up-to-date references, another automatic EndNote function, “Find Duplicates,” was used to eliminate duplicated references. The refined data obtained in EndNote were transferred into Microsoft Excel (Microsoft Corporation, Washington, USA) for further analyses (Figure 1). Using this software, an analysis of the articles’ abstracts was performed. The first stage was to verify which results had an abstract and to eliminate the articles without abstracts. At this stage, only articles published by the end of 2021 were considered, as 2022 was an incomplete year and could have distorted subsequent citation calculations and results. In the second stage, selected abstracts were analysed for the inclusion of the term “interoperability” or “linkage.” The results (articles with abstracts that included one or both of these terms) were then analysed according to their average yearly citations. The baseline value for this metric was calculated from the average number of citations and the article years of the 212 selected articles, to narrow the sample to the most meaningful and impactful results. Research articles with a citation average of less than 4.94 per year were excluded, as this value represented the minimum threshold for articles considered impactful within the scope of the analysis (Emmer et al., 2022). Finally, the selected 72 references were screened to verify that they met the criteria for the scope of analysis (through screening of the articles’ PDF files). Following this selection process, data were evaluated using mainly Microsoft Excel, WORDSTAT v. 9.0.11 and VOSviewer v. 1.6.18 software, as schematised in Figure 1.
VOSviewer tools (Leiden University, Leiden, Netherlands) were deployed to verify potential relationships between research terms and the strength of these relationships within the abstracts and article titles, and to assess their chronological order.
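The selection stages described above can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually run in EndNote and Excel: the `Article` structure and the sample entries are hypothetical, and the way article age is counted (2021 minus publication year, plus one) is an assumption, since the text does not state the exact formula. Only the 4.94 threshold and the 2021 cut-off come from the text.

```python
from dataclasses import dataclass
from typing import List

CITATION_THRESHOLD = 4.94  # minimum average citations per year, from the text
CUTOFF_YEAR = 2021         # last complete publication year considered in the text


@dataclass
class Article:
    title: str
    year: int
    citations: int
    abstract: str


def citations_per_year(article: Article, reference_year: int = CUTOFF_YEAR) -> float:
    # Assumption: the publication year counts as the first year of age,
    # so an article published in 2021 has an age of 1.
    age = reference_year - article.year + 1
    return article.citations / age


def select(articles: List[Article]) -> List[Article]:
    selected = []
    for art in articles:
        # Stage 1: keep only articles with an abstract, published by the end of 2021.
        if not art.abstract or art.year > CUTOFF_YEAR:
            continue
        # Stage 2: the abstract must mention at least one of the two terms.
        text = art.abstract.lower()
        if "interoperability" not in text and "linkage" not in text:
            continue
        # Stage 3: apply the average yearly citation threshold.
        if citations_per_year(art) < CITATION_THRESHOLD:
            continue
        selected.append(art)
    return selected


# Hypothetical sample entries, for illustration only.
sample = [
    Article("A", 2018, 40, "Semantic interoperability of health records."),
    Article("B", 2015, 14, "Record linkage quality."),
    Article("C", 2019, 300, ""),
    Article("D", 2020, 50, "Blockchain in hospitals."),
    Article("E", 2022, 100, "Data linkage across registries."),
]
selected_titles = [a.title for a in select(sample)]
print(selected_titles)
```

Here only article A passes all three stages: B falls below the citation threshold, C has no abstract, D mentions neither term and E was published after the cut-off year.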

Figure 1. Data selection process and analysis.
Results
Overview of the data acquisition and selection process
Search results were refined through several processes, which led to the final sample. One of the first meaningful levels of data treatment was the abstract inclusion criterion, which revealed that of the 3977 unique references obtained in EndNote, only 685 (17.22%) had an abstract included in the exported results. After identifying the abstracts that included the terms “interoperability” and/or “linkage,” the number dropped to 212 references (30.95% of those with abstracts; 5.33% of the 3977 unique references). When the last criterion was applied (average citations per year ⩾4.94), only 72 references (33.96% of the 212; 1.81% of the 3977 unique references) remained. As to the type of publications identified during the sample selection process, most were journal articles (Supplemental material Table S2, online supplement). In terms of chronological evolution (Figure 2), there were few publications on this research topic before 2005. From 2005, the number of publications continued to rise, reaching its peak between 2018 and 2021. In 2019, the number of refined publications (those with an abstract; and those with an abstract that included the selected terms) increased significantly. However, from 2020, these two publication categories began to diverge, as the number of publications with abstracts that included the selected terms continued to drop until the end of 2021.
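The percentages reported for each selection stage can be reproduced with a short calculation. The sketch below uses only the counts given in the text; the stage labels and variable names are illustrative.

```python
# Counts at each selection stage, as reported in the text.
funnel = [
    ("unique references", 3977),
    ("with abstract", 685),
    ("abstract mentions a selected term", 212),
    ("average citations per year >= 4.94", 72),
]

total = funnel[0][1]
rows = []
# Compare each stage with the stage before it and with the initial total.
for (_, previous), (label, count) in zip(funnel, funnel[1:]):
    share_of_previous = round(100 * count / previous, 2)
    share_of_total = round(100 * count / total, 2)
    rows.append((label, count, share_of_previous, share_of_total))
    print(f"{label}: {count} ({share_of_previous}% of previous stage; "
          f"{share_of_total}% of {total})")
```

Each percentage is expressed both relative to the preceding stage and relative to the 3977 unique references, which matches the pairs of values quoted above.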

Figure 2. Chronological evolution of publications by level of data selection.
Sample overview
According to our research methodology, three search platforms were chosen to provide results on our topic of analysis. The selected articles were not equally retrievable across the platforms: most were available on Google Scholar (63; 100%) and PubMed (58; 92%), while Web of Science results were scarce (14; 22%). As previously noted, 72 articles were selected for the preliminary sample. A detailed analysis of these research articles led to the exclusion of a further nine articles, which did not fit the scope of the present study. The remaining 63 selected articles are described and summarised in Supplemental material Table S3 (online supplement). Of these 63 articles, 51 (80.95%) contained the word “interoperability” in the abstract, 11 (17.46%) contained the word “linkage,” and only 1 (1.59%) had an abstract containing both “interoperability” and “linkage.” The average number of authors per article was 5, with a minimum of 1 and a maximum of 15. Most articles had between 2 and 5 authors (48 articles; 76.19%). As shown in Figure 3, 35 countries were associated with at least 1 article. The country with the highest representation was the United States, with 26 associated articles (41.27%), followed by the United Kingdom with 11 articles (17.46%) and Australia with 5 articles (7.94%). Iran and Switzerland were each represented in four articles, and Saudi Arabia in three articles. Six countries (Canada, China, Germany, India, South Korea and Spain) were each represented in 2 research articles, while 23 countries were each represented in only 1 article.

Figure 3. Geographical distribution of articles according to their authors’ origin.
Article citation analysis
According to the data retrieved and exported to Table 1, the highest number of citations was achieved by Koppel and Lehmann (2015) and the lowest by Ammar et al. (2021). As for the average number of citations per year, Gordon and Catalini (2018) had the highest average and Ammar et al. (2021) the lowest. The average number of citations of all 63 articles was 101.38, while the mean value of the average number of citations per year was 17.19. A reference global impact ranking was also established, which resulted from the multiplication of the average number of citations by the average number of citations per year. The top five publications of this ranking were: (1st) Gordon and Catalini (2018); (2nd) Mandel et al. (2016); (3rd) Koppel and Lehmann (2015); (4th) Kaplan and Harris-Salamone (2009) and (5th) Detmer et al. (2008). The lowest five included the references Genevieve et al. (2019), Ammar et al. (2020), Samra et al. (2020), Arul et al. (2024) and Ammar et al. (2021).
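As a sketch of how such a ranking could be computed, the Python example below multiplies each reference's citation count by its average citations per year and sorts the references by the resulting score. The reference labels and values are hypothetical and do not correspond to the articles in Table 1; reading the first factor as each reference's total citation count is also an assumption about the description above.

```python
from typing import List, NamedTuple


class Reference(NamedTuple):
    label: str
    citations: int
    citations_per_year: float


def global_impact(ref: Reference) -> float:
    """Score = citations multiplied by average citations per year."""
    return ref.citations * ref.citations_per_year


def rank(refs: List[Reference]) -> List[str]:
    """Return reference labels ordered from highest to lowest score."""
    ordered = sorted(refs, key=global_impact, reverse=True)
    return [r.label for r in ordered]


# Hypothetical values chosen to show how the ranking can differ from raw counts.
refs = [
    Reference("Older article", citations=300, citations_per_year=20.0),     # score 6000
    Reference("Recent article", citations=150, citations_per_year=50.0),    # score 7500
    Reference("Low-impact article", citations=10, citations_per_year=5.0),  # score 50
]
ranking = rank(refs)
print(ranking)
```

In this illustration the more recent article ranks first despite having half the citations of the older one, because its yearly citation rate is higher.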
Table 1. Reference characterisation and global impact ranking.
Publication analysis
According to Table 2, the 63 references selected for this study were represented across 45 different publications. Almost all of these publications were journals, with only two related to book series. The average number of references in these publications was 1.40. Regarding the citation analysis,
Table 2. Publication characteristics and global impact ranking.
This part of the formula is not applied if the metric is n.a.
VOSviewer software title and abstract analysis
The database generated in EndNote was exported and evaluated using the VOSviewer software. The analysis was based on the information within each article’s title and abstract. Both the title and the abstract carry an article’s core information and are crucial for article search optimisation within databases. Three dimensions were targeted for evaluation as they allowed an integrated analysis: (1) network visualisation, (2) overlay visualisation and (3) density visualisation. The first dimension showed the level of relationship between words, providing information about potential word clusters and allowing them to be named (via word analysis) for better understanding. The overlay visualisation showed which words and clusters were trending. The final dimension complemented the network visualisation by presenting simplified information about each word’s relevance.
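The idea behind a network of word relationships can be illustrated with a simple co-occurrence count. The sketch below is not VOSviewer's algorithm, whose term extraction and clustering are more elaborate; it assumes that the strength of a link between two words is simply the number of titles in which both appear. The sample titles and the minimum word length are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations
from typing import List


def cooccurrence_links(titles: List[str], min_length: int = 4) -> Counter:
    """Count, for every pair of words, how many titles contain both."""
    links: Counter = Counter()
    for title in titles:
        # Unique lower-case words of at least min_length letters, sorted so
        # that each pair is always stored in the same order.
        words = re.findall(r"[a-z]+", title.lower())
        terms = sorted({w for w in words if len(w) >= min_length})
        for pair in combinations(terms, 2):
            links[pair] += 1
    return links


# Hypothetical titles, for illustration only.
titles = [
    "Health data interoperability standards",
    "Interoperability standards for hospitals",
    "Data linkage governance",
]
links = cooccurrence_links(titles)
strongest = links.most_common(1)[0]
print(strongest)
```

Pairs with higher counts would be drawn as stronger links in a network view, and groups of words that are densely linked to each other would form candidate clusters.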
The title network visualisation analysis found the existence of three clusters (Supplemental material, Figure S1, online supplement). The first cluster (
When observing the same data display through the overlay visualisation (Supplemental material Figure S2, online supplement), the three clusters appear to have an increasing chronological timeline, as the
The abstract analysis (displayed in Figure S4, online supplement) revealed an increased relationship between words when compared to the title analysis. In accordance with the title evaluation, the network visualisation analysis also presented three clusters. The first cluster (
In terms of timeline analysis (Supplemental material Figure S5, online supplement), the e
An integrated view of the cluster information retrieved through the VOSviewer software was also established (Figure 4). This original view of the title and abstract clusters demonstrated their interconnectivity pattern and relationship with system interoperability and data linkage.

Figure 4. Theoretical framework to achieve high quality in interoperability and data linkage processes.
WORDSTAT software proximity of keywords analysis: interoperability versus linkage
A proximity analysis of the two selected keywords was performed using WORDSTAT software (Provalis Research, Montreal, Canada), and the data retrieved were analysed in Microsoft Excel (Figure 5). According to the data retrieved, the terms most associated (association above 0.013) with “interoperability” were (1) semantic, (2) systems, (3) standards, (4) data, (5) healthcare, (6) health, (7) information, (8) security, (9) exchange, (10) patient, (11) privacy, (12) integration and (13) lack. As for the word “linkage,” the following results emerged: (1) data, (2) matching, (3) linked, (4) dataset, (5) governance, (6) sail, (7) projects, (8) record, (9) Australia, (10) across, (11) quality and (12) research. The only word with a similar proximity to the two words in the analysis was “data.”
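A keyword association of this kind can be approximated with a co-occurrence coefficient. The text does not describe how WORDSTAT computes its association values, so the sketch below substitutes a Jaccard coefficient over abstracts (the number of abstracts containing both words divided by the number containing either) purely as an illustration. The sample abstracts are hypothetical.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Return the set of lower-case words found in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard_association(abstracts: List[str], keyword: str) -> Dict[str, float]:
    """Jaccard coefficient between a keyword and every other word."""
    docs = [tokenize(a) for a in abstracts]
    with_keyword = {i for i, d in enumerate(docs) if keyword in d}
    vocabulary = set().union(*docs) - {keyword}
    scores: Dict[str, float] = {}
    for term in vocabulary:
        with_term = {i for i, d in enumerate(docs) if term in d}
        either = with_keyword | with_term
        if either:
            scores[term] = len(with_keyword & with_term) / len(either)
    return scores


# Hypothetical abstracts, for illustration only.
abstracts = [
    "semantic interoperability standards for health data",
    "interoperability of systems and standards",
    "record linkage and data matching",
]
interoperability_scores = jaccard_association(abstracts, "interoperability")
linkage_scores = jaccard_association(abstracts, "linkage")
print(interoperability_scores["standards"], linkage_scores["matching"])
```

In this toy corpus, “standards” is fully associated with “interoperability” and “matching” with “linkage,” while “data” has a partial association with both keywords.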

Figure 5. Proximity of keywords analysis by WORDSTAT: interoperability vs. linkage.
Discussion
Sample timeline results
The sample’s timeline provided mixed results. While article numbers increased from years 2016 to 2020, they decreased abruptly in 2021. Conversely, global results containing abstracts continued to grow in 2021, suggesting that topics related to the scope of this research may have been redirected into other core subject areas, or that the COVID-19 pandemic may have resulted in quick solutions being incorporated into the marketplace with limited concern for data linkage and systems interoperability.
Journal and article metrics
The reference global impact ranking metric attempts to eliminate the chronological bias that exists when only the number of citations of an article is taken into account. Articles published earlier usually have a higher probability of accumulating citations, which diminishes the apparent importance of more recent articles that have had less time to be cited. As the results of our study demonstrated, the article citation metric did not correlate with the global impact ranking position, and the ranking provides a more accurate method of evaluating the impact of research within this domain (data linkage and systems interoperability within health information management). The publication global impact ranking in our study produced similar results to the article global impact ranking, meaning the cross-combination of impact metrics and citation metrics did not correlate with standardised impact metrics. These findings suggest that globally accepted metrics are not the best method of evaluating the influence of journals within specific disciplinary domains.
Interoperability versus Linkage
In this study, interoperability and linkage concepts appeared distinct. The abstract analysis performed in the sample selection procedure showed that most abstracts contained the word “interoperability” while “linkage” was present in only a few. Only one abstract had both words. This simple analysis showed that researchers were giving more importance to interoperability than to linkage, and that they either did not value or overlooked the potential synergy between these two disciplines. A further analysis that points to the same finding is the proximity of keywords analysis performed in WORDSTAT software. The collected data showed that the term “interoperability” had a stronger association with a greater number of words when compared with the word “linkage.” The lack of association between these two words was also apparent, as only a few words achieved a strong association.
VOSviewer cluster findings
The VOSviewer analysis produced three title clusters and three abstract clusters. As these clusters are interconnected, a theoretical framework was established, as shown in Figure 4. The framework defines four levels, representing different degrees of broadness. The outer layer is the most wide-ranging, and this breadth diminishes gradually until reaching the core. Interoperability and linkage issues in electronic health information management remain uncertain as new trends emerge. To mitigate this uncertainty, governance models and stakeholders’ needs, such as those explored by Witry et al. (2010), Alkraiji et al. (2013), Lavin et al. (2015), Abdekhoda et al. (2016) and Ammar et al. (2021), must be assessed so that electronic health information attains the highest degree of quality, allowing system interoperability and data linkage to achieve their purpose effectively and efficiently.
Limitations and advantages
The current study had some limitations. First, in the data-gathering process, results exported from the three search platforms did not generate the same quality of information when imported into EndNote, meaning some crucial data were missing (e.g. some Google Scholar results were without year, abstract and publication information). Second, the criteria applied in the sample selection process, such as those associated with article citations and with the selected keywords included in abstracts, may have led to the exclusion of important articles. Third, sample data extracted from EndNote presented limitations when imported into software such as VOSviewer. However, this study has at least three advantages. First, to the best of our knowledge, this is the first bibliometric research in which electronic health information has intersected with both interoperability and linkage domains. Second, even with the above limitations, it was still possible to assess the quality and evolution of research about the subject of analysis. Finally, the current research provides insights into the main topics and concerns of the role of interoperability and linkage in health information systems.
Conclusion
Results of this study have outlined theoretical and managerial implications of interoperability and linkage in health information management. One theoretical contribution is based on the need for more literature research about the combined role of interoperability and linkage in health information management, as the existing articles may suggest a lack of interest in the topic area. Also, new metrics and rankings were created to measure the real impact of articles and of journals within the scope of this research, minimising the biases introduced by general impact factors and metrics that do not consider the specifics of the research subject. In terms of managerial contributions, this research points to the necessity for the healthcare and information technology sectors to co-develop their solutions, to always consider linkage and interoperability concerns and to treat the final consumer as a key player in their discussions. Moreover, these sectors should verify and evaluate stakeholders’ real-world needs so they can introduce their contributions to optimise the architecture of information technology solutions.
Supplemental Material
Supplemental material for System interoperability and data linkage in the era of health information management: A bibliometric analysis by Tiago Costa, Teresa Borges-Tiago, Francisco Martins and Flávio Tiago in Health Information Management Journal:
sj-docx-1-him-10.1177_18333583241277952
sj-docx-2-him-10.1177_18333583241277952
sj-docx-3-him-10.1177_18333583241277952
sj-docx-4-him-10.1177_18333583241277952
sj-jpg-5-him-10.1177_18333583241277952
sj-jpg-6-him-10.1177_18333583241277952
sj-jpg-7-him-10.1177_18333583241277952
sj-jpg-8-him-10.1177_18333583241277952
sj-jpg-9-him-10.1177_18333583241277952
sj-jpg-10-him-10.1177_18333583241277952
