Abstract
Keywords
Introduction
E-Health is one of the key goals of the countries of the European Union (EU), 1 which have released successive Health and Information Technology (IT) policy documents. These documents are designed to foster a harmonious and complementary approach to different e-Health implementations. Recent recommendations of the European Commission (EC) 1 contained in the e-Health Action Plan 2012–2020 indicate that the interoperability of electronic health record (EHR) systems is a serious challenge in e-Health solutions. Interoperability in healthcare systems is important for delivering quality healthcare and reducing healthcare costs, especially between health systems in different countries. An important use case is coordinating care between healthcare professionals who use different e-Health systems in different EU countries. Interoperability has a wide range of meanings to different people and organizations. For example, according to the healthcare interoperability standards body Health Level Seven (HL7), 2 interoperability is defined as ‘the ability of two or more systems or components to exchange information and to use the information that has been exchanged’.
Standards 3 have been developed to address the various layers of interoperability, including a number of standards addressing the interoperability challenges in healthcare systems. However, the variety of competing standards that each participant can use makes the interoperability problem even more complicated. The focus on interoperability in European countries, 4 which was part of the national health strategies, was initially placed on technical interoperability. Today, semantic, organizational and legal interoperability are recognized as being as important as technical interoperability.
Service-oriented architecture (SOA) is a software architecture that defines an interface and builds the entire application topology as a topology of interfaces, interface implementations and interface calls. SOA represents a relationship between services and service consumers. Both kinds of software modules 5 (services and service consumers) are large enough to represent a complete business function. SOA provides methods for system development and integration in which system functionalities are grouped around business processes and packaged as interoperable services. A service consumer is a piece of software that embeds a service interface proxy (the client representation of the interface). A Web service, which can be used to implement an SOA, is defined as a combination of three technologies: the interface description (using Web Services Description Language (WSDL) documents), the communication protocol Simple Object Access Protocol (SOAP) (using Extensible Markup Language (XML)) and Universal Description, Discovery and Integration (UDDI) repositories (allowing users to find services).
One of the most popular approaches to integrating data from different sources is Extract–Transform–Load (ETL) technology. Many research papers and web pages that describe the framework and the sequence of activities in ETL use a workflow approach to design ETL activities. Some groups use a database programming language called LDL (logic-programming, declarative language), 6 and other groups use approaches such as unified modeling language (UML) and data mapping diagrams 7 to define ETL activities. A traditional ETL process cannot support real-time data updating. With the appearance of SOA, however, real-time data acquisition based on SOA and Web services has become possible. 5
The main goal of this article is to present a conceptual model of EHRs, with a focus on data warehouse (DW) component for servicing the cross-border interoperability.
The remainder of the article is organized as follows. In the next section, we provide a brief description of the efforts of the EU to build a cross-border interoperability framework. The section ‘Standards and frameworks for EHR’ gives a description of our conceptual model of EHR with a DW component for servicing the cross-border interoperability. The section ‘ETL processes – designing ETL tools based on the EHR’ gives a detailed description of the DW architecture and ETL processes. In the section ‘North Macedonian use case’, we present the usability of our model for cross-border data exchange. After that, in the section ‘Discussion’, we present the advantages and potential challenges of our model. In the last section, we conclude the paper with a summary and plans for future work.
Background
EU efforts of cross-border interoperability
The use of electronic health data has been marked as an important strategic activity and policy to improve healthcare in European countries. The aim of the EU countries is to have a cross-border electronic healthcare system which will enable EU citizens to obtain the same healthcare data anywhere in Europe. The EU is committed to providing high-quality healthcare by improving cross-border cooperation between the EU’s member states. The EU has enacted the Cross-Border Health Directive, which enables EU residents to receive health services within the EU borders. Directive 2011/24/EU 8 pertains to the rights of European citizens to obtain healthcare in any EU member state. The Directive explicitly highlights the importance of ensuring patients’ health data flows from one EU member state to another, while safeguarding the data privacy. 8
Electronic identities (eIDs) are one of the key technologies that can deliver not only e-government services but also better e-Health services. eIDs allow electronic transactions with secure, proven and legally valid identity information. They enable seamless digital processes, eliminating paper-based signatures and personal appearance for an identity check. Security, in terms of identification and authentication, is key to many services such as e-government, e-commerce and especially e-Health. Several countries have issued national eID infrastructure to support such services. These initiatives, however, have often emerged as national islands.
The EC, in the Digital Agenda 2020, has set the goal for the creation of a single digital market. The approach promoted by the single digital market is to deal with the current heterogeneity of eID systems by fostering interoperability of different national systems. To achieve this, it is necessary to raise the interoperability layer, which will allow user-controlled transactions of identity credentials between a user from one country and a service provider from another country.
The new ‘Regulation on Electronic Identity and Signature (eIDAS)’, 9 adopted in July 2014, provided the legal framework for cross-national identity federation. In the context of eIDAS, the European Service Directive requires member states to set up an e-government single point of contact for businesses or services from other European states. Furthermore, in order to develop the cross-border interoperability infrastructure, the EC has set up several so-called Large Scale Pilots (LSPs) like ‘Secure idenTity acrOss boRders linKed’ (STORK and STORK 2.0). 10 STORK was launched by EU member states and associated countries (EU and European Economic Area (EEA) Member States) 10 and demonstrates interoperable services in online settings based on pilots 9 (e-learning and academic qualifications, e-banking, public services for business, e-Health). The main challenge lies in raising acceptance for the EU cross-border use of eID. STORK 2.0 deals with this issue by exploring requirements for sustainability, which include packaging cross-national authentication as a service for governments and businesses, developing a cost model and promoting the service.
The free movement of European citizens across EU member states adds an important level of complexity to the strategic efforts of health interoperability. The EC has recognized the provision of health interoperability and prepared the e-Health Action Plan 2012–2020 to promote the widespread adoption of information and communication technologies (ICTs) to ‘increase efficiency, improve quality of life and unlock innovation in healthcare’. 1
The EC has developed the e-Health Interoperability Framework for guidance, support and coordination among member states. It has created a foundation for the development of several projects like Call for Interoperability (CALLIOPE), Healthcare Interoperability Testing and Conformance Harmonisation (HITCH), Thematic Network on Quality and Certification of electronic health record systems (eHRQTN), NetC@rds, Smart Personal Health and the Network of Excellence in Semantic Interoperability. The SALUS Project 11 (Scalable, Standard based Interoperability Framework for Sustainable Proactive Post Market Safety Studies), co-financed by the EC’s 7th Framework Programme (FP7), aims to create the necessary semantic and functional interoperability infrastructure 12 in order to enable secondary use of EHR data in an efficient and effective way. Also, one of the most relevant efforts for building the Interoperability Framework is the European Patients Smart Open Services (epSOS) Project. 13 The epSOS project aims to design, build and evaluate an e-Health framework and ICT infrastructure that allows patient data to be securely exchanged among different European healthcare systems. Within the epSOS project, there is an OpenNCP (Open National Contact Points) framework that offers a comprehensive set of interoperability services. These services enable national and regional e-Health platforms to set up cross-border health information networks compliant with epSOS.
The aim of these projects is to provide high-quality healthcare by improving cross-border cooperation between the EU’s member states and by proposing the use of cross-border eID for the secure exchange of patients’ data. The focus on interoperability in the presented projects is on the practical implications of national regulations and needs.
Standards and frameworks for EHR
The development of ICT allows paper-based health record systems to move towards electronic formats (EHR). EHR systems provide efficient, real-time online access to patients’ data. At the same time, they bring improvements in quality, flexibility and patient safety. After years of EHR development, the integration of health records from different EHR systems has become one of the main priorities for policymakers, making interoperability standards a necessity for assuring the security, privacy and interoperability of EHRs. Different countries have adopted various standards. 14 In the United States, HL7, the Health Insurance Portability and Accountability Act (HIPAA) and the Health Information Technology for Economic and Clinical Health Act (HITECH) are used. Canada Health Infoway is used in Canada, Healthcare Information Secure Network Consortium (HEASNET) is used in Japan, and the standards of the International Organization for Standardization Technical Committee 215 (ISO/TC 215) and the European Committee for Standardization Technical Committee (CEN/TC) are used in Europe. The development of standards for EHR data models is an attempt to tackle the storage and exchange of clinical data. Standards like openEHR, 15 ISO 13606-1, 16 HL7 2 and EuroRec 17 have been developed for storing and exchanging patients’ data in structured formats.
HL7 2 is a set of standards internationally accepted by most healthcare organizations for transferring medical and administrative data. These standards focus on the application layer (layer 7) of the OSI (Open Systems Interconnection) model. HL7 makes possible the exchange of demographic, medical, administrative and other textual information. One of the HL7 standards is the clinical document architecture (CDA). 18 This standard defines an exchange model for clinical documents, specifying the encoding, structure and semantics of clinical documents in XML 19 format. The XML format organizes patient information in a hierarchy of elements, attributes and values. A CDA document can contain any type of clinical content, such as EHRs, in which medical information of patients is stored and transmitted between hospital systems.
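The hierarchy of elements, attributes and values can be made concrete with a minimal Python sketch that parses a heavily simplified CDA-like fragment. The fragment is illustrative only: it uses the HL7 v3 namespace `urn:hl7-org:v3` but omits most of the header elements a conformant CDA document requires, and the identifier, names and the `summarize` helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Heavily simplified, hypothetical CDA-like fragment (not a conformant document).
CDA_SAMPLE = """
<ClinicalDocument xmlns="urn:hl7-org:v3">
  <title>Patient Summary</title>
  <recordTarget>
    <patientRole>
      <id root="2.999.1" extension="12345"/>
      <patient>
        <name><given>Ana</given><family>Example</family></name>
        <birthTime value="19800101"/>
      </patient>
    </patientRole>
  </recordTarget>
</ClinicalDocument>
"""

NS = {"hl7": "urn:hl7-org:v3"}


def summarize(cda_xml):
    """Pull a few demographic values out of the element hierarchy."""
    root = ET.fromstring(cda_xml.strip())
    role = root.find("hl7:recordTarget/hl7:patientRole", NS)
    patient = role.find("hl7:patient", NS)
    return {
        # Values held as element text
        "title": root.findtext("hl7:title", namespaces=NS),
        "given": patient.findtext("hl7:name/hl7:given", namespaces=NS),
        "family": patient.findtext("hl7:name/hl7:family", namespaces=NS),
        # Values held as attributes
        "patient_id": role.find("hl7:id", NS).get("extension"),
        "birth_time": patient.find("hl7:birthTime", NS).get("value"),
    }


if __name__ == "__main__":
    print(summarize(CDA_SAMPLE))
```

The sketch shows why namespace-aware paths matter: every CDA element lives in the HL7 v3 namespace, so unqualified lookups such as `root.find("title")` would return nothing.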
In HL7 strategy description, the interoperability is defined in three different contexts that affect how the software is designed, and how the data are stored and used: 2
Technical interoperability is based on information theory and is concerned about neutralizing the effect of distance.
Semantic interoperability is domain-specific and is needed to understand, interpret and use data.
Process interoperability allows human understanding to be shared and enables work process coordination.
HL7 standards are grouped into categories depending on their usage: 2 primary standards, foundational standards, clinical and administrative domains, EHR profiles, implementation guides, rules and references, and education and awareness.
The OpenNCP is a novel framework to build ‘National gateways’ 20 that foster an e-Health ecosystem across Europe. The framework can be used to enable cross-border e-Health data exchange by allowing services that provide security and data alignment requirements. The components developed within the OpenNCP project must adhere to the standards, protocols and other technical profiles defined by epSOS 21 specifications. The OpenNCP architecture does not imply changes in the National Infrastructure operations. So, the interfaces of national infrastructure for OpenNCP should be defined to operate in conjunction with the components developed in the project. The NCP follows the general SOA paradigm where each component must operate as a service. Thus, each component is externally scoped by the NCP but internally is available to the other components.
In order to support the OpenNCP architecture, we use the previously defined collaborative model of EHR. 22 The model is one module of a more complex architecture consisting of an Interface Access Layer, an Application Layer and a Data Management Layer. The Interface Access Layer represents the entry point to the EHR. This layer controls access to the health data contained within the EHR. The Interoperability module integrates the proposed EHR system with all external EHR subsystems, electronic medical record (EMR) and electronic patient record (EPR) systems and, in this case, OpenNCP. A detailed description of the model is given in our earlier work. 22 For the purpose of this article, we focus on the DW submodule as a part of the Data Storage module.
The design of a DW for healthcare data storage and analysis is an ambitious undertaking due to the heterogeneity and complexity of the domains that supply the relevant data and the corresponding data sources. For the DW to be implemented successfully, its design must be flexible and readily extensible. It should be easy to modify in order to accommodate data from new domains and new data structures. The ETL process, which brings data from different sources into the integrated staging area, is a fundamental component of DW design. The staging area of the DW serves as the primary entry point into the DW, where data are cleansed of nonessential, incorrect, inconsistent and redundant entries. In our research on the design of a healthcare DW, we focus on ETL design and the design of the staging area.
In this article, we define a framework that interconnects HL7, as the communication and structure standard for conveying medical information, and OpenNCP in order to provide a modular, scalable and interoperable architecture. Besides technical interoperability, the presented approach incorporates organizational interoperability as well.
ETL processes – designing ETL tools based on the EHR
Traditional ETL processes
ETL processes are pieces of software responsible for the extraction and integration of data from multiple sources or applications. The term refers to a process in data warehousing that extracts data from several sources and transforms them to fit operational needs, which can include quality checks, cleansing, customization, reformatting and integration, before loading them into the end target database, the DW. ETL activities are among the most important processes in the DW. The three phases of the ETL process are extract, transform and load.
The concepts behind these functionalities and the relationships between specific ETL phases can be summarized in one framework diagram, as shown in Figure 1. Data stores (data sources) that are involved in the overall process are depicted in the Data Layer. The original data providers (typically relational databases and files) are shown at the top in Figure 1. The data from these sources are extracted (as shown in the middle part of Figure 1) by extraction routines. Then, these data are propagated to the Data Staging Area (DSA), where they are transformed and cleaned before being loaded into the DW. The DW repositories are depicted in the right part of Figure 1 and comprise the target data stores. The data loading to the central warehouse is performed through the loading routines, as shown in the middle right part of Figure 1.

Figure 1. Traditional ETL processes.
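The three phases can be illustrated with a small, self-contained Python sketch that uses in-memory SQLite databases as the source and the warehouse. The table names (`visits_raw`, `dim_patient`), columns and cleansing rules are assumptions chosen for illustration and do not come from any specific healthcare system.

```python
import sqlite3


def extract(source):
    """Extract: read raw rows from the operational source."""
    return source.execute(
        "SELECT patient_id, name, birth_date FROM visits_raw"
    ).fetchall()


def transform(rows):
    """Transform: cleanse rows (trim names, drop incomplete and duplicate entries)."""
    seen, cleaned = set(), []
    for patient_id, name, birth_date in rows:
        if patient_id is None or not name or not name.strip():
            continue  # incomplete entry
        record = (patient_id, name.strip().title(), birth_date)
        if record in seen:
            continue  # redundant entry
        seen.add(record)
        cleaned.append(record)
    return cleaned


def load(warehouse, rows):
    """Load: write the cleansed rows into the target table."""
    warehouse.executemany("INSERT INTO dim_patient VALUES (?, ?, ?)", rows)
    warehouse.commit()


def run_etl():
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE visits_raw (patient_id INTEGER, name TEXT, birth_date TEXT)"
    )
    source.executemany(
        "INSERT INTO visits_raw VALUES (?, ?, ?)",
        [
            (1, " ana example ", "1980-01-01"),
            (1, "Ana Example", "1980-01-01"),  # duplicate after cleansing
            (None, "ghost", None),             # missing identifier
            (2, "", "1975-05-05"),             # missing name
            (3, "boris test", "1990-02-02"),
        ],
    )
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE dim_patient (patient_id INTEGER, name TEXT, birth_date TEXT)"
    )
    load(warehouse, transform(extract(source)))
    return warehouse.execute(
        "SELECT * FROM dim_patient ORDER BY patient_id"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl())
```

In this sketch the `transform` function plays the role of the staging area: nothing reaches the warehouse table until it has passed the cleansing rules.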
There are a number of tools that facilitate the ETL process. Some of them are Ab Initio, IBM InfoSphere, SAP Data Integrator, Oracle Warehouse Builder, Microsoft SQL Server Integration Services and Informatica PowerCenter for Enterprise Data Integration. 23 Besides these, there are some open source ETL products 23 like Apatar, Talend Open Studio, Pentaho Kettle and CloverETL. Most of these ETL tools provide a graphical user interface to create a workflow of ETL activities and automate their execution.
ETL tools based on the EHR
One of the main challenges in the construction of the DW system is to design and develop ETL tools. These tools are responsible for the integration of data provided by multiple information systems in a common target schema. In the healthcare area, the EHR presents an important infrastructure that provides information for the healthcare status of a patient in order to support physicians and other professionals in the delivery of care services. In our case, the data from EHR (data from a central repository of EHR, EPR, EMR or some parts of EHR) can represent one of the main sources of information to fill the DW.
Designing methodology for an ETL tool
In this section, we present a methodology for designing an ETL tool for a DW architecture based on the different documents and information structures stored in the healthcare information system. The information contained in the DW will be used by OpenNCP 20 for cross-border e-Health data exchange. We present the conceptual model by describing an ETL process that uses SOA. In the proposed ETL model, we build on the results and concepts of a novel design for the staging area of a scientific DW. 24
Our proposed model of ETL is based on open standard Web services and XML technologies. A Web service is primarily an application made available on the Internet by a service provider and accessible to customers through standard Internet protocols. It enables applications to communicate remotely via the Internet, independently of the platforms and programming languages used. The Web service is an autonomous software component, and it uses XML-based message exchange standards (SOAP and WSDL). SOAP 25 is the open standard messaging protocol used by applications to define a mechanism for the exchange of structured and typed information. A Web service exposes a set of features on the Internet or on an intranet, either by or for applications, in real time and without human intervention.
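As a rough illustration of what XML-based message exchange looks like, the following Python sketch builds and reads a SOAP 1.1 envelope using only the standard library. The service namespace `http://example.org/ehr`, the operation name `GetPatientUpdates` and its `since` parameter are hypothetical; in practice they would be dictated by the WSDL document of the service being called.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/ehr"  # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def read_request(message):
    """Return (operation name, parameters) from a SOAP envelope."""
    root = ET.fromstring(message)
    op = root.find(f"{{{SOAP_NS}}}Body")[0]  # first child of Body is the operation
    local = lambda tag: tag.split("}", 1)[1]  # strip the '{namespace}' prefix
    return local(op.tag), {local(child.tag): child.text for child in op}


if __name__ == "__main__":
    msg = build_request("GetPatientUpdates", {"since": "2020-01-01"})
    print(msg.decode("utf-8"))
    print(read_request(msg))
```

Because both sides agree only on the XML message format, the sender and the receiver can be implemented on different platforms and in different programming languages.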
Data in the DW are imported from several sources and transformed in a staging area before they are integrated and stored in the production DW for further usage. In our case, the data sources are the IT subsystems of the Health Insurance Fund, the Central National EHR system, the Ministry of Health, general practitioners’ (GPs’) IT systems, hospitals’ IT systems and the IT systems of other participants in healthcare. The architecture of a healthcare DW based on a Web services architecture and utilizing XML is presented in Figure 2. The architecture defines the health data exchange format of all XML documents that are pulled from online transaction processing (OLTP) systems into the staging database. The Web services (WS) loader (as shown in Figure 2) is a Web service application that takes the WSDL and issues SOAP messages to the Web services. The extraction of data from the operational database management system (DBMS) and their transformation into XML format are implemented by Web services. A Web service can comprise several functional blocks: extraction, partial transformation, full transformation, validation and an RPC (remote procedure call) interface. The SQL-to-XML transformation performs the extraction from the operational DBMS or the partial transformation of the data into an intermediate XML data format. The intermediate XML document is then transformed, using XSLT (eXtensible Stylesheet Language Transformations), into an XML document that conforms to the exchange schema. To verify conformance to the data exchange schema, the XML document is checked using an XML validation engine.

Figure 2. Architecture of a healthcare data warehouse utilizing XML Schema and a Web services architecture in the staging area.
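The functional blocks of such a Web service (extraction into intermediate XML, transformation to the exchange format and validation) can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the Python standard library has neither an XSLT engine nor an XML Schema validator, so `to_exchange_format` is a plain function standing in for the XSLT step and `conforms` is a hand-written structural check standing in for schema validation. The `patient` table and the `PatientList` exchange format are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def sql_to_xml(conn):
    """Extraction: dump rows of the operational table into intermediate XML."""
    root = ET.Element("rows")
    cur = conn.execute("SELECT id, full_name, dob FROM patient")
    columns = [d[0] for d in cur.description]
    for row in cur:
        row_el = ET.SubElement(root, "row")
        for col, value in zip(columns, row):
            ET.SubElement(row_el, col).text = "" if value is None else str(value)
    return root


def to_exchange_format(intermediate):
    """Transformation: stand-in for the XSLT step (intermediate -> exchange XML)."""
    out = ET.Element("PatientList")
    for row in intermediate.findall("row"):
        patient = ET.SubElement(out, "Patient", id=row.findtext("id"))
        ET.SubElement(patient, "Name").text = row.findtext("full_name")
        ET.SubElement(patient, "BirthDate").text = row.findtext("dob")
    return out


def conforms(doc):
    """Validation: stand-in for checking the document against the exchange schema."""
    if doc.tag != "PatientList":
        return False
    for patient in doc:
        if patient.tag != "Patient" or not patient.get("id"):
            return False
        if [child.tag for child in patient] != ["Name", "BirthDate"]:
            return False
        if not patient.findtext("Name"):
            return False
    return True


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patient (id INTEGER, full_name TEXT, dob TEXT)")
    conn.execute("INSERT INTO patient VALUES (1, 'Ana Example', '1980-01-01')")
    doc = to_exchange_format(sql_to_xml(conn))
    return ET.tostring(doc, encoding="unicode"), conforms(doc)


if __name__ == "__main__":
    print(demo())
```

Keeping the intermediate XML separate from the exchange format means a source system only needs a generic row dump, while the mapping to the agreed schema lives in a single transformation step.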
The WS loader is an application that sits next to the staging database. It asks the Web services to extract and send new data. After the XML instances with the new data are received, they need to be stored in the staging database. The loader then triggers the activities necessary for integrating the newly arrived data with the data in the DW repository. To achieve this, the healthcare WS loader sends a SOAP message (requesting updates) to a Web service. When the data have been extracted and the corresponding XML document has been prepared, it is sent back to the WS loader through another SOAP message. The WS loader takes the XML file and moves it into the staging area by using the programmatic interface of the XML database. The XML file is thus physically stored in the object-relational database, whose data model is defined by the exchange schema. In order to move the data from the object-relational structure into the data marts (see Figure 2), SQL with object references is used. This last step is the bridge between the Web services architecture and the movement of data into more conventional data warehousing structures.
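The loader cycle described here (request updates, receive XML, store it in the staging area, trigger integration) can be sketched in Python as follows. This is a minimal sketch under several assumptions: the remote Web service is replaced by a local stub function so that the example runs without a network, SOAP wrapping is omitted, and the staging area is a plain SQLite table holding XML text rather than an object-relational XML database. The names `WSLoader`, `stub_service`, `staging_xml` and `dw_patient` are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def stub_service(since_id):
    """Stand-in for a remote Web service: returns XML with records newer than since_id."""
    records = {1: "Ana Example", 2: "Boris Test", 3: "Vera Sample"}
    root = ET.Element("Updates")
    for rec_id, name in sorted(records.items()):
        if rec_id > since_id:
            ET.SubElement(root, "Patient", id=str(rec_id)).text = name
    return ET.tostring(root, encoding="unicode")


class WSLoader:
    """Polls a service for new data and moves it through staging into the DW."""

    def __init__(self, service, db):
        self.service = service
        self.db = db
        self.last_id = 0  # highest record id already integrated
        db.execute(
            "CREATE TABLE IF NOT EXISTS staging_xml (id INTEGER PRIMARY KEY, doc TEXT)"
        )
        db.execute(
            "CREATE TABLE IF NOT EXISTS dw_patient (patient_id INTEGER PRIMARY KEY, name TEXT)"
        )

    def poll(self):
        """One cycle: request updates, stage the XML, integrate; returns rows integrated."""
        xml_doc = self.service(self.last_id)  # request and response
        self.db.execute("INSERT INTO staging_xml (doc) VALUES (?)", (xml_doc,))
        loaded = self._integrate(xml_doc)
        self.db.commit()
        return loaded

    def _integrate(self, xml_doc):
        """Move staged records into the warehouse table."""
        count = 0
        for patient in ET.fromstring(xml_doc):
            pid = int(patient.get("id"))
            self.db.execute(
                "INSERT OR REPLACE INTO dw_patient VALUES (?, ?)", (pid, patient.text)
            )
            self.last_id = max(self.last_id, pid)
            count += 1
        return count


if __name__ == "__main__":
    loader = WSLoader(stub_service, sqlite3.connect(":memory:"))
    print(loader.poll(), loader.poll())
```

Every received document is kept in the staging table even when it contains no new records, which leaves an audit trail of what each service returned and when.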
North Macedonian use case
The number of North Macedonian citizens crossing the borders of EU countries has increased markedly over the past two decades. Many of these citizens need healthcare in the countries where they stay or to which they travel. In addition, many citizens of EU countries, as well as of other European countries, travel to or stay in North Macedonia. These citizens also need healthcare protection. North Macedonia, as a candidate for membership in the EU, should aim to join these initiatives and activities for obtaining healthcare data anywhere in Europe. North Macedonia has signed agreements with several countries for providing social and health protection to its citizens in these countries as well as to citizens of these countries on the territory of North Macedonia.
Figure 3 provides a representative example of the use of cross-border services (like the epSOS services) and of healthcare services while the patient is abroad. For example, a North Macedonian patient with a broken leg receives care in a foreign country (Country B) from a Health Care Professional (HCP). The HCP uses their usual point-of-care system and requests access to the patient’s health record stored in the North Macedonian e-Health infrastructure. At this point, the epSOS services ensure that the HCP (authenticated by the Country B infrastructure) gets the patient summary data (administrative and medical data) in a language they can understand. It is important to mention that the HCP must have the patient’s consent. New information, produced during the medical examination or treatment, might be inserted back into the patient’s record (stored in the North Macedonian e-Health infrastructure). Depending on whether the data are requested or recorded, various parts of the EHR are accessed (see Figure 3). The cross-border services are handled by clinical gateways called National Contact Points (NCPs).

Figure 3. Sequence diagram illustrating the use of epSOS services for care provision abroad.
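The control flow of this scenario (authentication of the HCP in Country B, the consent check, retrieval of the patient summary from the home country and its presentation in a language the HCP understands) can be sketched in Python as follows. The sketch is purely illustrative: the classes `HomeNCP` and `ForeignNCP` and their methods are hypothetical and do not correspond to the epSOS or OpenNCP interfaces, and translation is reduced to a lookup table that maps a code to a display text.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    """Raised when authentication or consent is missing."""


@dataclass
class HomeNCP:
    """Stand-in for the NCP of the patient's home country."""
    summaries: dict
    consents: set = field(default_factory=set)  # ids of patients who gave consent

    def patient_summary(self, patient_id):
        if patient_id not in self.consents:
            raise AccessDenied("no patient consent")
        return self.summaries[patient_id]


@dataclass
class ForeignNCP:
    """Stand-in for the NCP of the country of treatment (Country B)."""
    home: HomeNCP
    authenticated_hcps: set
    dictionary: dict  # toy translation table: code -> text in the HCP's language

    def request_summary(self, hcp_id, patient_id):
        if hcp_id not in self.authenticated_hcps:
            raise AccessDenied("HCP not authenticated")
        summary = self.home.patient_summary(patient_id)
        # Replace coded values with text the HCP can read; keep unknown values as-is.
        return {key: self.dictionary.get(value, value) for key, value in summary.items()}


if __name__ == "__main__":
    home = HomeNCP({"MK-1": {"diagnosis": "S82"}}, consents={"MK-1"})
    ncp_b = ForeignNCP(home, {"hcp-7"}, {"S82": "Fracture of lower leg, including ankle"})
    print(ncp_b.request_summary("hcp-7", "MK-1"))
```

Both checks fail closed: a request from an unauthenticated HCP never reaches the home NCP, and the home NCP releases nothing without recorded consent.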
In the context of the epSOS framework, OpenNCP solves the problem of secure data exchange for healthcare provision abroad by using the health professional’s language and maintaining the clinical and legal value of the original documents. The legal grounds for the exchange of patient data are provided by a formal agreement between countries (like the EU’s Member States), which sets out the responsibilities of the participants in a peer-to-peer model, forming the concept of a Circle of Trust. This Circle of Trust is, in fact, the cornerstone of the OpenNCP Interoperability Architecture. 18 A modified and customized version of the OpenNCP Interoperability Architecture for our case is shown in Figure 4.

Figure 4. OpenNCP architecture.
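The main components of the presented architecture (Figure 4) are Data Discovery Exchange Services, Trust Services, Transformation Services, Audit Services and Support Services.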
The described use case for providing a healthcare service can be used for any case of providing healthcare services to North Macedonian citizens outside of North Macedonia.
Discussion
In this article, we use the concept and design of the staging area of a scientific DW based on a Web services architecture. In a conventional DW such as a healthcare DW, data arrive at the staging area in independent formats. After that, data are interpreted, transformed and loaded into the warehouse. The integration of the data from the operational sources to the staging area is not completed until the transformed data are actually loaded into the staging area. By restructuring the ETL framework to include extra components, such as the Classified-Fragmentation service 26 and additional Web services, without adding complexity, greater flexibility is achieved. In addition, by using standards for the structure of health and medical data and standards for data exchange, we facilitate the ETL design process. Building on this, we developed the restructured ETL framework based on SOA. This framework can be further extended by adding extra components to suit new business needs of enterprises.
The described use case can also be used as a reference case for providing healthcare. With an approach similar to the presented use case, other cases of healthcare treatment of North Macedonian citizens staying abroad can be handled. In addition, by providing access to DW data (instead of data from the production database) through OpenNCP, the presented solution offers a new approach to creating an e-Health ecosystem across Europe.
Conclusion
In this article, we have presented a novel design for the staging area of a healthcare DW that supports servicing national OpenNCP in cross-border data exchange. The proposed design expands the ongoing efforts in the standardization of healthcare data exchange formats by using a Web services architecture to organize ETL activities and XML Schema as an integration schema.
The use of an SOA and a standardized XML data exchange schema, as well as classification components and indexing, allows this architecture to be used in developing applications other than DWs.
Based on the presented model, a use case using a specific set of services was presented. The use case includes the main components of the architecture, such as Data Discovery Exchange Services, Trust Services, Transformation Services, Audit Services and Support Services. Although the described use case is based on North Macedonia’s healthcare system, it can be easily adapted to other healthcare systems as well.
