Abstract
Introduction
Research performance is a central concern for higher education institutions, including universities, faculties, institutes, and research centers. Many countries base a large proportion of research funding for higher education institutions on performance evaluation (Bazeley, 2010). As higher education institutions face financial constraints, the demand for productivity and accountability from academics and institutions has increased (Law & Chon, 2007). Government grants to institutions with high research performance have also increased (Geuna & Martin, 2003).
There are organizations such as THE, ARWU, and QS that measure and rank the performance of universities in various fields. Studies have shown that the results of these rankings attract prospective students to elite universities (Bowman & Bastedo, 2009), affect activities such as organizational mission, strategy, decision-making and recruitment (Hazelkorn, 2007, 2008), and affect the reputation and prestige of institutions (Bastedo & Bowman, 2010). A prestigious university has social and economic benefits for the country and its region (Fritsch & Slavtchev, 2007; Pressman et al., 1995).
For these reasons, universities and academics have begun to devote their time to achieving higher productivity (Law & Chon, 2007), and a strong motivation to improve the research performance of universities has emerged in recent years (Åkerlind, 2008). This motivation has led to a growing interest in examining the research output, and hence the performance, of universities and academics (Abdul-Majeed et al., 2021; Barney et al., 2022; Preut et al., 2022).
This growing interest is reflected in studies on the performance of different units of universities and of researchers. The literature contains studies on the performance of universities (Abdul-Majeed et al., 2021; Abramo et al., 2008, 2012; Maral, 2023, 2024a, 2024b; Maral & Çetin, 2024; Buela-Casal et al., 2009; Johnes & Yu, 2008; Kao & Pao, 2009), faculties (Dean et al., 2011; Ence et al., 2016; Holliday et al., 2014; Lowe & Gonzalez-Brambila, 2007), academic departments (Brocato & Mavis, 2005; Fox & Mohapatra, 2007; Khan et al., 2014, 2019; Law & Chon, 2007; Saxena et al., 2023), and researchers (Abramo et al., 2020; Khan et al., 2019; Tanya et al., 2022).
When it comes to the research performance of universities, researchers working in this field are interested in how that performance is measured. The literature generally examines research performance in two dimensions. The first is productivity, which depends on the number of publications; the other is impact, which reflects the quality of publications. As measures of productivity, studies have used the number of publications (Bansal et al., 2023; Chang et al., 2020; Holliday et al., 2014), the number of publications per academic staff member (Johnes & Yu, 2008), the h-index (Lai et al., 2022; Praus, 2018; Sadeghi-Bazargani et al., 2019), the proportion of publications among the best publications (Craig et al., 2021), the number of publications in Scientific Journal Ranking (SJR) journals (Saxena et al., 2023), the g-index (Ke et al., 2016), the ratio of publications to total publications (Abramo et al., 2008), and the i10 index (Cvetanovich et al., 2016; Susarla et al., 2015). To determine publication impact, an indicator of the quality of publications, measures such as the number of citations (Aldieri et al., 2020; Sanmorino et al., 2022), the number of citations per publication, the h-index and g-index (Ding et al., 2020; Tahira et al., 2018), and the i10 index (Susarla et al., 2015) are frequently used. In addition to these measures, some studies consider research grants in the evaluation of research performance (Goldstein, 2011; Marisha et al., 2017; Valadkhani & Ville, 2010). Apart from these objective measures, performance is also evaluated based on the opinions of researchers and academic administrators (Izuagbe, 2021; Law & Chon, 2007; Martin-Sardesai et al., 2017; Nguyen Quoc et al., 2021; Ryan, 2014).
One of the important issues in measuring research performance is the method by which performance is measured and analyzed. Descriptive statistical methods based on complete counts are generally used in research (Brocato & Mavis, 2005; Holliday et al., 2014; Khan et al., 2014; Saxena et al., 2023). However, other methods, such as data envelopment analysis (Abramo et al., 2008; Hung & Chou, 2013; Johnes & Yu, 2008; Lee et al., 2012; Mutz et al., 2017) and bibliometric methods (Cardoso et al., 2020; Lee et al., 2012), have also been used.
These methods used to date to evaluate the research performance of universities have some limitations, and multi-criteria decision making (MCDM) methods are important for overcoming them. First, research performance is a multidimensional and complex construct that needs to be measured with more than one criterion; MCDM can evaluate multiple alternatives against multiple criteria simultaneously. Second, previous methods assumed the performance criteria to be of equal importance. This assumption biases research results, since the performance criteria of universities may not be equally important. In this case, the importance level of each criterion should be determined statistically. Arbitrary weight assignments are possible, but such an approach reduces the reliability of the results; MCDM methods, by contrast, can weight performance criteria on a sound statistical and theoretical basis. Therefore, in this study, the weighting of the criteria was carried out using methods with such a background. Third, previous studies have evaluated research performance with a single method, which raises concerns about the reliability of the results. Integrating several MCDM methods in a hybrid approach has the advantage of overcoming this problem. In this research, more than one MCDM method was used both to weight the research performance criteria and to rank the universities. Such an approach exploits the advantages, and reduces the limitations, of MCDM methods with different theoretical backgrounds, and the resulting methodological diversity leads to more reliable results.
The aim of this study is to propose a new method for examining the research performance of universities with multi-criteria decision making (MCDM) methods. To our knowledge, no study in the literature has measured the research performance of universities with MCDM methods. In this study, objective MCDM methods were used both to weight the research performance measures and to construct the rankings. In addition, more than one MCDM method was used for both tasks; the main rationale is that each MCDM method has its own statistical background, so using several methods ensures methodological richness and reduces bias. Another contribution of this research is to show that the criteria used to evaluate the research performance of academic units may carry different levels of importance.
As a result, this study seeks to answer two main questions:
What are the weights of research performance measures (criteria)?
What is the ranking of universities in terms of research performance?
Method
In this study, MCDM methods were used to evaluate the research performance of universities. MCDM emerged as a part of operations research, which is concerned with using computational and mathematical tools to help decision makers evaluate performance criteria (Zavadskas et al., 2014). MCDM methods are used to identify alternatives, categorize and group them into fewer categories, and rank alternatives. MCDM is a concept that encompasses all methods available to assist the decision maker in situations where there are multiple and conflicting criteria (Ho, 2008).
The method model used in the research is shown in Figure 1. First, the criteria for research performance were weighted by three different objective MCDM methods: (1) Criteria Importance Through Intercriteria Correlation (CRITIC), (2) Method Based on the Removal Effects of Criteria (MEREC), and (3) Entropy. Then, the alternatives (universities) were ranked with three different ranking methods using each set of weights. These methods are (1) Additive Ratio Assessment (ARAS), (2) Multi-Attributive Border Approximation Area Comparison (MABAC), and (3) the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). Applying each of the three weight vectors in each of the three ranking methods produced nine different rankings for every alternative. These rankings were combined with the Borda function to obtain a single ranking value for each alternative.

Figure 1. Research model.
CRITIC Method
The first method used for weighting the criteria is the CRITIC method. This method was developed by Diakoulaki et al. (1995) and is used for the objective weighting of criteria in MCDM problems. The CRITIC method uses the correlation coefficient between criteria to determine the relationships among them. The steps of the method are as follows (Alinezhad & Khalili, 2019):
The decision matrix in the CRITIC method is as follows:

X = [x_ij]_(m×n),  i = 1, …, m;  j = 1, …, n

where x_ij is the value of alternative i on criterion j. The following equations are used to normalize the positive (benefit) and negative (cost) values of the decision matrix, respectively:

x̄_ij = (x_ij − x_j^min) / (x_j^max − x_j^min)
x̄_ij = (x_j^max − x_ij) / (x_j^max − x_j^min)

where x_j^max and x_j^min denote the best and worst values of criterion j. The correlation coefficient r_jk between each pair of criteria j and k is then computed from the normalized values. The standard deviation σ_j of each criterion is estimated, and the amount of information C_j contained in criterion j is obtained by combining the contrast intensity (σ_j) with the conflict between criteria:

C_j = σ_j Σ_{k=1}^{n} (1 − r_jk)

The weights of the criteria are calculated with the following equation:

w_j = C_j / Σ_{k=1}^{n} C_k

Finally, the criteria are ordered by sorting the weights in descending order.
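The steps above can be sketched in a few lines of numpy; the function name and the toy decision matrix below are illustrative, not data from the study:

```python
import numpy as np

def critic_weights(X, benefit):
    """CRITIC: contrast intensity (std. dev.) times conflict (1 - correlation)."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # min-max normalization; cost (negative) criteria are reversed
    N = np.where(benefit, (X - lo) / (hi - lo), (hi - X) / (hi - lo))
    sigma = N.std(axis=0, ddof=1)            # standard deviation of each criterion
    R = np.corrcoef(N, rowvar=False)         # correlation between criteria
    C = sigma * (1.0 - R).sum(axis=0)        # information content C_j
    return C / C.sum()

# toy decision matrix: 4 alternatives (rows) x 3 benefit criteria (columns)
X = [[0.6, 120, 3.1],
     [0.9,  80, 2.5],
     [0.4, 150, 4.0],
     [0.7, 100, 3.6]]
w = critic_weights(X, benefit=np.array([True, True, True]))
```

The returned vector sums to one, so the weights can be passed directly to any of the ranking methods described below.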
MEREC Method
The second method used to weight the criteria is the MEREC method, developed by Keshavarz-Ghorabaee et al. (2021) for the objective weighting of criteria in MCDM problems. In this method, the removal effect of each criterion on the overall performance of the alternatives is used to determine the weights of the criteria; the method is thus based on a causal logic. The following steps are used in the application of the MEREC method (Keshavarz-Ghorabaee et al., 2021):
A decision matrix X = [x_ij]_(m×n) is created that shows the value of each alternative on each criterion.

A simple linear normalization is used to scale the elements of the decision matrix:

n_ij = x_j^min / x_ij  (benefit criteria),  n_ij = x_ij / x_j^max  (cost criteria)

A logarithmic measure with equal criterion weights is applied to calculate the overall performance of the alternatives:

S_i = ln(1 + (1/n) Σ_j |ln n_ij|)

The performance of the alternatives is recalculated by removing each criterion in turn. The performance of alternative i with criterion j removed is:

S′_ij = ln(1 + (1/n) Σ_{k≠j} |ln n_ik|)

In this step, the removal effect E_j of the j-th criterion is computed as the sum of absolute deviations:

E_j = Σ_i |S′_ij − S_i|

The objective weight of each criterion is calculated using the removal effects from the previous step:

w_j = E_j / Σ_k E_k
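A minimal numpy sketch of these steps follows; the function name and toy matrix are illustrative, not from the study:

```python
import numpy as np

def merec_weights(X, benefit):
    """MEREC: weights from the removal effect of each criterion."""
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    # simple linear normalization: smaller normalized values mean better performance
    N = np.where(benefit, X.min(axis=0) / X, X / X.max(axis=0))
    # overall performance of each alternative (logarithmic measure, equal weights)
    S = np.log(1.0 + np.abs(np.log(N)).sum(axis=1) / n)
    E = np.empty(n)
    for j in range(n):
        keep = np.ones(n, dtype=bool)
        keep[j] = False
        # performance of the alternatives with criterion j removed
        Sj = np.log(1.0 + np.abs(np.log(N[:, keep])).sum(axis=1) / n)
        E[j] = np.abs(Sj - S).sum()          # removal effect of criterion j
    return E / E.sum()

# toy matrix: 4 alternatives x 3 criteria (the second criterion is a cost)
X = [[450, 8000, 54],
     [ 10, 9100,  2],
     [100, 8200, 31],
     [220, 9300, 31]]
w = merec_weights(X, benefit=np.array([True, False, True]))
```

A criterion whose removal barely changes the overall performance scores receives a small weight, which is the intuition behind the method.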
Entropy Method
The concept of entropy was first introduced by the German physicist R. Clausius in 1865 to describe the disorder of a thermodynamic system. The entropy method borrows the entropy value from information theory: it calculates how much decision information each evaluation criterion conveys and thereby reveals the relative weights of the criteria. The greater the dispersion of a criterion's values, the more effective information it carries and the larger its weight. The application steps of the entropy method are as follows (C. H. Chen, 2020):
An initial matrix with m alternatives and n criteria is organized for the entropy weight evaluation.

Due to the differences between units, each criterion needs to be standardized to eliminate the influence of different units on the evaluation results. At this stage, the conversion is applied with the following formula:

p_ij = x_ij / Σ_{i=1}^{m} x_ij

The entropy value e_j of criterion j is calculated with the following equation:

e_j = −(1/ln m) Σ_{i=1}^{m} p_ij ln p_ij

The information utility (degree of divergence) of the j-th criterion is

d_j = 1 − e_j

The weights of the criteria are calculated with the following equation:

w_j = d_j / Σ_{k=1}^{n} d_k

The overall measurement value of each alternative can then be obtained as the weighted sum of its standardized values, s_i = Σ_j w_j p_ij.
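The entropy weighting steps reduce to a short numpy sketch (function name and toy matrix are illustrative; all criterion values are assumed positive):

```python
import numpy as np

def entropy_weights(X):
    """Entropy weights: criteria whose values are more dispersed carry more information."""
    X = np.asarray(X, dtype=float)
    m = X.shape[0]
    P = X / X.sum(axis=0)                    # share of each alternative in criterion j
    # 0 * log(0) is treated as 0
    plogp = np.where(P > 0, P * np.log(np.where(P > 0, P, 1.0)), 0.0)
    e = -plogp.sum(axis=0) / np.log(m)       # entropy value e_j of criterion j
    d = 1.0 - e                              # information utility d_j = 1 - e_j
    return d / d.sum()

# toy matrix: 4 alternatives x 3 criteria (all values positive)
X = [[0.6, 120, 3.1],
     [0.9,  80, 2.5],
     [0.4, 150, 4.0],
     [0.7, 100, 3.6]]
w = entropy_weights(X)
```

A criterion on which all alternatives score identically has maximum entropy and therefore zero weight, which illustrates why the method may need to be combined with others, as noted below.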
ARAS Method
The ARAS method is an MCDM method developed by Zavadskas and Turskis (2010) for selecting the best alternative with respect to different attributes. In this method, the ranking of alternatives is done by determining the degree of contribution (utility) of each alternative (Alinezhad & Khalili, 2019). The following steps are followed in the application of the ARAS method (Zavadskas & Turskis, 2010):
A decision matrix is created. In addition to the m alternatives, it contains an optimal alternative A_0 whose values x_0j are the best available value of each criterion:

X = [x_ij],  i = 0, 1, …, m;  j = 1, …, n

The initial values of all criteria are normalized and a normalized decision matrix is obtained. The criteria to be maximized are normalized as follows:

x̄_ij = x_ij / Σ_{i=0}^{m} x_ij

The criteria to be minimized are first inverted, x*_ij = 1 / x_ij, and then normalized in the same way:

x̄_ij = x*_ij / Σ_{i=0}^{m} x*_ij

In this step, the normalized weighted matrix is defined as X̂ = [x̂_ij], and the normalized weighted values of all criteria are calculated with the following equation:

x̂_ij = x̄_ij w_j

The values of the optimality function are given by

S_i = Σ_{j=1}^{n} x̂_ij

where S_i denotes the value of the optimality function of alternative i. The utility degree of each alternative is K_i = S_i / S_0, and the alternatives are ranked in descending order of K_i.
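A compact numpy sketch of the ARAS steps (function name, weights, and toy matrix are illustrative):

```python
import numpy as np

def aras_utilities(X, w, benefit):
    """ARAS: utility degree K_i of each alternative relative to the optimal A0."""
    X = np.asarray(X, dtype=float)
    # row 0 is the optimal alternative (best value of every criterion)
    x0 = np.where(benefit, X.max(axis=0), X.min(axis=0))
    D = np.vstack([x0, X])
    D = np.where(benefit, D, 1.0 / D)        # invert cost criteria
    Nrm = D / D.sum(axis=0)                  # sum normalization per criterion
    S = (Nrm * w).sum(axis=1)                # optimality function S_i
    return S[1:] / S[0]                      # utility degree K_i = S_i / S_0

# toy matrix: 4 alternatives x 2 benefit criteria, equal weights
X = [[0.6, 120], [0.9, 80], [0.4, 150], [0.7, 100]]
w = np.array([0.5, 0.5])
K = aras_utilities(X, w, benefit=np.array([True, True]))
```

Because S_0 belongs to the optimal alternative, every utility degree K_i lies between 0 and 1.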
MABAC Method
The MABAC method is an MCDM method developed by Pamučar and Ćirović (2015). This method is based on determining the distance of each alternative to the border approximation area of the criterion function. The application steps of the MABAC method are as follows (Pamučar & Ćirović, 2015):
The decision matrix is X = [x_ij]_(m×n), where m is the number of alternatives and n is the number of criteria.

The normalized version of the initial decision matrix is N = [n_ij]. Its elements are determined by the following equations:

(a) For the utility (benefit) criteria:

n_ij = (x_ij − x_j^−) / (x_j^+ − x_j^−)

(b) For the cost criteria:

n_ij = (x_ij − x_j^+) / (x_j^− − x_j^+)

where x_j^+ = max_i x_ij and x_j^− = min_i x_ij.

The elements of the weighted matrix V = [v_ij] are calculated with the help of the following equation:

v_ij = w_j (n_ij + 1)

In this equation, w_j is the weight of criterion j.

The border approximation area matrix G = [g_j] is determined by the following equation (the geometric mean of the weighted values):

g_j = (Π_{i=1}^{m} v_ij)^{1/m}

After calculating G, the distance of the alternatives to the border approximation area is obtained as the matrix Q = V − G, with elements q_ij = v_ij − g_j.

The belonging of alternative A_i to the approximation areas is determined by the sign of q_ij: the alternative belongs to the upper approximation area G^+ for criterion j if q_ij > 0, and to the lower approximation area G^− if q_ij < 0; an alternative close to the ideal solution has as many of its criteria as possible in G^+.

The values of the criterion functions for the alternatives are calculated as the sum of the distances of the alternatives from the border approximation areas:

S_i = Σ_{j=1}^{n} q_ij
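The MABAC steps can be sketched as follows in numpy (function name, weights, and toy matrix are illustrative):

```python
import numpy as np

def mabac_scores(X, w, benefit):
    """MABAC: sum of distances of each alternative from the border approximation area."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    # min-max normalization (cost criteria reversed)
    N = np.where(benefit, (X - lo) / (hi - lo), (X - hi) / (lo - hi))
    V = w * (N + 1.0)                        # weighted matrix v_ij = w_j (n_ij + 1)
    G = V.prod(axis=0) ** (1.0 / X.shape[0]) # border approximation area (geometric mean)
    Q = V - G                                # distance from the border area
    return Q.sum(axis=1)                     # criterion function value S_i

# toy example: alternative 0 dominates on both benefit criteria, so it must score highest
X = [[9, 90], [5, 50], [7, 40], [8, 10]]
w = np.array([0.5, 0.5])
S = mabac_scores(X, w, benefit=np.array([True, True]))
```

Positive entries of Q place an alternative in the upper approximation area G^+ for that criterion; the row sums give the final scores.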
TOPSIS Method
The TOPSIS method is one of the earliest MCDM methods, developed by Yoon (1980) and Hwang and Yoon (1981). It is used to rank and select alternatives by measuring Euclidean distances, based on the idea that the chosen alternative should be closest to the positive ideal solution and farthest from the negative ideal solution (Zhang et al., 2011). The TOPSIS method consists of the following steps (Tzeng & Huang, 2011):
The decision matrix X = [x_ij] is created with m alternatives and n criteria.

At this stage, the decision matrix is normalized. For this, the following equations apply:

For the utility criteria:

r_ij = x_ij / x_j^max

For the cost criteria:

r_ij = x_j^min / x_ij

The weighted normalized decision matrix is obtained by the following equation:

v_ij = w_j r_ij

The positive ideal point (PIS) is determined as A^+ = {v_1^+, …, v_n^+} with v_j^+ = max_i v_ij, and the negative ideal point (NIS) as A^− = {v_1^−, …, v_n^−} with v_j^− = min_i v_ij.

The discrimination of the alternatives from the PIS and the NIS is measured using the Euclidean distance:

D_i^+ = √(Σ_j (v_ij − v_j^+)²),  D_i^− = √(Σ_j (v_ij − v_j^−)²)

The similarity to the PIS (relative closeness) is obtained with the following equation:

C_i* = D_i^− / (D_i^+ + D_i^−)

The value of C_i* lies between 0 and 1; the alternatives are ranked in descending order of C_i*.
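A minimal numpy sketch of TOPSIS under these assumptions (linear max/min normalization; function name, weights, and toy matrix are illustrative):

```python
import numpy as np

def topsis_closeness(X, w, benefit):
    """TOPSIS: relative closeness of each alternative to the positive ideal solution."""
    X = np.asarray(X, dtype=float)
    # linear normalization: x/x_max for benefit criteria, x_min/x for cost criteria
    R = np.where(benefit, X / X.max(axis=0), X.min(axis=0) / X)
    V = R * w                                # weighted normalized matrix
    pis = V.max(axis=0)                      # positive ideal point A+
    nis = V.min(axis=0)                      # negative ideal point A-
    d_pos = np.sqrt(((V - pis) ** 2).sum(axis=1))
    d_neg = np.sqrt(((V - nis) ** 2).sum(axis=1))
    return d_neg / (d_pos + d_neg)           # closeness C_i* in [0, 1]

# toy example: alternative 0 has the best value on both criteria, so its closeness is 1
X = [[9, 90], [5, 50], [7, 40], [8, 10]]
w = np.array([0.5, 0.5])
C = topsis_closeness(X, w, benefit=np.array([True, True]))
```

An alternative that coincides with the positive ideal point has D^+ = 0 and therefore C* = 1, the top of the ranking.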
Advantages and Disadvantages of MCDM Methods
In this study, the CRITIC, MEREC and Entropy methods were used to weight the research performance criteria and the ARAS, MABAC, and TOPSIS methods were used to rank the universities. Each of these methods has its advantages and disadvantages. The reason for using more than one MCDM method in this research is to combine the advantages of different methods and to minimize the disadvantages of these methods.
The advantage of the CRITIC method is that it evaluates the relationships between the criteria and thus provides a different perspective on the problem; it weights the criteria objectively and removes the influence of subjectivity. Its disadvantage is that it requires data on the alternatives for every criterion and involves relatively complex calculations. The MEREC method removes each criterion in turn and focuses on its impact on the overall performance of the alternatives; it likewise weights the criteria objectively and removes the effect of subjectivity, but shares the same disadvantages of data requirements and relatively complex calculations. In the entropy method, the weight of a criterion is determined according to the degree of dispersion of its values (Qu et al., 2022); it also weights criteria objectively and removes the effect of subjectivity, but it may not be sufficient on its own to determine model weights (Qu et al., 2022) and it needs data on the alternatives for each criterion. In the ARAS method, the priority of the alternatives is determined according to the value of the utility function, using the relationship to an optimal alternative when ranking (Hatefi et al., 2021); this method has limitations in problems where the degrees of belongingness and non-belongingness of an alternative can sum to more than 1 (Mishra et al., 2022). The MABAC method determines the values of the criterion functions of the alternatives and defines the distance of each criterion function value to the border approximation area (Pamučar & Ćirović, 2015); it provides a different perspective on the problem and gives consistent results when the units of measurement of the criteria change (Torkayesh et al., 2023).
The TOPSIS method proposes to rank alternatives based on the shortest distance to the positive ideal solution (PIS) and the negative ideal solution (NIS) (Hwang & Yoon, 1981). This method represents the preferences of decision makers in a logical way. It can identify the best and worst alternatives simultaneously. It is easy to program and compute. Polyhedral graphs can be used to visualize the performance of all alternatives (Kim et al., 1997; Shih et al., 2007). However, when an alternative is added or removed from the decision problem, the order of preference of the alternatives changes (García-Cascales & Lamata, 2012).
Integration of Rankings: Borda Function
The weights of the criteria for the research performance of universities were analyzed by CRITIC, MEREC, and Entropy methods. The weight values obtained from these three methods were ranked separately by ARAS, MABAC, and TOPSIS methods. As a result of these analyses, nine different ranking values were obtained for each university. Borda function was used to obtain a single rank value from these rankings.
The Borda function is regarded as the fairest method of reaching a common outcome when different opinions need to be combined (Dummett, 1998). The Borda score obtained from the Borda function expresses the superiority of each alternative over the other alternatives in the preference ranking. The Borda score is calculated with the following formula (Y. K. Chen et al., 2014):

B_i = Σ_{k=1}^{K} (m − r_ik)

Here m is the number of alternatives, K is the number of rankings, and r_ik is the rank of alternative i in ranking k: the alternative ranked first in a ranking receives m − 1 points, the second m − 2 points, and the last receives none. The alternative with the highest total Borda score is placed first in the combined ranking.
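The combination of several rank lists into one Borda ranking can be sketched as follows (function name and toy rankings are illustrative):

```python
import numpy as np

def borda_combine(rankings):
    """Combine several rank lists: an alternative ranked r-th gets (m - r) points."""
    rankings = np.asarray(rankings)          # shape (k, m): k methods, ranks 1..m
    k, m = rankings.shape
    scores = (m - rankings).sum(axis=0)      # total Borda score per alternative
    # final rank: highest total Borda score first
    order = np.argsort(-scores)
    final_rank = np.empty(m, dtype=int)
    final_rank[order] = np.arange(1, m + 1)
    return scores, final_rank

# e.g., three methods ranking four universities (ranks 1..4)
scores, final = borda_combine([[1, 2, 3, 4],
                               [2, 1, 3, 4],
                               [1, 3, 2, 4]])
```

In this toy case the first university collects 3 + 2 + 3 = 8 points and finishes first overall.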
Validation and Sensitivity Analysis
In studies where a model is proposed using MCDM methods, it is important to test the validity and sensitivity of the model. The purpose of these tests is to determine the reliability of the results of the proposed model and to examine the sensitivity and stability of the findings to changes (Mukhametzyanov & Pamučar, 2018). There are various methods for testing the validity and sensitivity of a model. This research adopts the approach of previous studies, which systematically changes the criteria weights and examines the extent to which the results change (Biswas, Pamučar et al., 2022; Hezam et al., 2023; Pamucar et al., 2017; Salimian et al., 2023). Accordingly, scenarios were generated by assigning the weight of the most important criterion to each of the other criteria in turn, and all analyses were repeated. A large difference between the ranking results of the scenarios would mean that the model is highly sensitive to change. Kendall’s W test was applied to test whether there is a significant difference between the scenarios.
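Kendall's W itself is straightforward to compute for untied rank lists; a minimal numpy sketch (function name and toy rankings are illustrative):

```python
import numpy as np

def kendalls_w(rankings):
    """Kendall's W: agreement among k rank lists of m alternatives (no ties)."""
    R = np.asarray(rankings, dtype=float)    # shape (k, m)
    k, m = R.shape
    totals = R.sum(axis=0)                   # rank sum of each alternative
    S = ((totals - totals.mean()) ** 2).sum()
    # W = 12 S / (k^2 (m^3 - m)); W = 1 means perfect agreement, 0 none
    return 12.0 * S / (k ** 2 * (m ** 3 - m))

# identical rankings agree perfectly (W = 1)
w_same = kendalls_w([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
```

Values of W close to 1 across the scenarios correspond to the stable rankings reported in the sensitivity analysis below.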
To test the validity of the model's results, the method of previous studies was followed (Biswas, Majumder, & Dawn, 2022; Pamucar et al., 2022). For this purpose, a different ranking method, EDAS (Keshavarz Ghorabaee et al., 2015), was used to rank the universities, and these results were compared with the existing results using the Spearman rank correlation test.
Research Performance Criteria
Measuring research output is challenging. Research has certain characteristics such as productivity, quality, impact, progress and reputation. Different measures reveal different aspects of performance (Auranen & Nieminen, 2010). Research performance is traditionally based on publication output, citations as a measure of impact, and sometimes work quality assessments and reputation indicators of researchers (Bazeley, 2010).
In this research, research performance will be considered in two dimensions. The first one is productivity and the other is impact. Productivity refers to the amount of research output. A high number of research outputs indicates that the researcher or institution is productive. Impact is related to how effective these outputs are. It is therefore often associated with the quality of outputs.
Quantity has an important place in the evaluation of research performance. However, it should be evaluated in terms of quality as well as quantity (Kao & Pao, 2009). The number of publications cannot be used for research performance without knowing the scientific impact of publications. In addition, the number of citations alone is not a reliable performance indicator without considering the number of publications and researchers. Similarly, the citations per article indicator is an acceptable indicator, but it is not recommended for performance measurement as it neglects the size of the university and shows a bias toward large universities (Abdul-Majeed et al., 2021).
In this study, three different measures were used for productivity: Number of documents, number of documents in the top 10%, and number of documents in Q1. However, these measures yield results in favor of large universities when comparing universities with each other. This is because universities with more academic staff are able to publish more. For this reason, these indicators have been adjusted for the number of staff in order to accurately compare universities. As a result, the following three criteria were used as measures of productivity:
(1) Number of documents per academic staff (DPA): Total number of publications divided by the number of academic staff.
(2) Number of documents in the top 10% per academic staff (TPA): The number of documents among the top 10% most cited documents of a given subject, year, and publication type (Incites, 2022), divided by the number of academic staff.
(3) Number of documents in Q1 per academic staff (Q1PA): The number of publications in journals ranked in the top 25% by Journal Impact Factor (JIF) in a given year (Incites, 2022), divided by the number of academic staff.
As impact measures, h-index excluding self-citations, Category Normalized Citation Impact, and Impact Relative to the World were used. However, since the h-index excluding self-citations measure produces a result in favor of large universities, this measure was adjusted for the number of staff. As a result, three impact measures were used:
h-index excluding self-citations per academic staff (HPA): The h-index based on the number of citations after subtracting self-citations (Incites, 2022), divided by the number of academic staff.
Category Normalized Citation Impact (CNCI): The actual citation value of documents with the same publication type, year, and subject area divided by the expected citation rate (Incites, 2022).
Impact Relative to the World (IRW): It is obtained by dividing the citation impact by the world citation impact. It shows the impact of research on the basis of global research impact. The world average is equal to one. If the IRW value is above one, it means that the unit under study performs above the world average (Incites, 2022).
Universities Included in the Analysis
In certain academic disciplines, more articles are published and these articles receive much higher citations. Therefore, publication impact, which is associated with citation frequency, varies according to the discipline to which the publication belongs (Radicchi & Castellano, 2012). Engineering disciplines have been found to have lower citations than Physics and Chemistry. Therefore, it is stated that researchers in the field of engineering should publish more articles in order to catch up with the h-index of researchers in Physics and Chemistry (Czarnecki et al., 2013).
Aside from these differences between fields of science, the same holds for sub-disciplines or fields within the same discipline. Since the number of citations per article is higher in fields with more publications, normalization is required (Czarnecki et al., 2013; Simko, 2015). As a result, comparing research performance without taking disciplinary differences into account does not allow for a sound comparison.
This study focuses on the educational research of universities in order to provide a more robust basis for comparing them. An important point here is that some of the research performance criteria must be freed from the influence of staff numbers; determining a measure on the basis of all the staff of a university would therefore lead to wrong conclusions. As a result, only the 88 universities with faculties of education were included in this study, and the number of faculty of education staff was taken as the basis for the calculations. Universities whose faculties of education were established in or after 2018, the starting year of the data, were not included.
Data Collection
The data on research performance used in this study come from InCites Benchmarking and Analytics by Clarivate (December 12, 2022). The data cover the last 5 years (2018–2022) of the universities' output. Only articles and review articles are included; books, book chapters, proceedings, and other publications are excluded. The data cover four categories in the field of education: (1) Education & Educational Research, (2) Education, Scientific Disciplines, (3) Education, Special, (4) Psychology, Educational. Data on the number of staff were taken from the Statistics database of the Council of Higher Education, the supreme organization of higher education institutions in Türkiye (YÖK, 2022).
A decision matrix was created to evaluate the performance criteria of 88 universities based on the data collected. The matrix combines the alternatives used in MCDM with the criteria values for the alternatives. Alternatives are listed in rows and criteria in columns, with each cell displaying the performance of a particular alternative in specific criteria. The decision matrix compares and evaluates the performance of each alternative on each criterion. In this study, 88 universities are listed in rows, and their performance values in each research performance criterion are listed in columns to form a decision matrix. This decision matrix serves as the initial step of the MCDM methods employed in this research.
Results
Findings on Criteria Weights
In order to answer the first research question, the productivity criteria were weighted with the CRITIC, MEREC, and Entropy methods. The weight results for the productivity criteria are shown in Figure 2. When the weight values of the three MCDM methods are analyzed, each criterion reaches its maximum in a different method: DPA has the maximum weight in the CRITIC method, Q1PA in the MEREC method, and TPA in the Entropy method. While the CRITIC and Entropy methods calculated the TPA and Q1PA weights closer to each other, the MEREC method calculated the DPA and TPA values closer to each other. When the weight values produced by the three MCDM methods are evaluated together, there are no very large differences between the criteria. According to the average weight values, the most important criterion is Q1PA, followed by TPA and DPA, respectively.

Figure 2. Weight values of productivity criteria.
The weight values of the impact criteria, weighted by the CRITIC, MEREC, and Entropy methods, are shown in Figure 3. The CRITIC and MEREC methods produced close values for these criteria. In the Entropy method, however, while CNCI and IRW have close values, the HPA criterion has a higher weight. The CRITIC and MEREC methods gave the maximum weight to CNCI, whereas the Entropy method gave it to HPA. The HPA and IRW criteria take their minimum values in the MEREC method, and the CNCI criterion takes its minimum value in the Entropy method. According to the average weight values, HPA is the most important criterion, followed by CNCI and IRW, respectively.

Figure 3. Weight values of impact criteria.
Findings Regarding the Ranking of Universities
Table 1 shows the ranking results for the research performance of the universities. Each of the three weighting methods was combined with each of the three ranking methods and analyzed separately, yielding a total of nine rankings per university. The Borda score of each ranking value was calculated, and these scores were summed to obtain a total score for each university. All universities were then ranked according to their total scores.
Table 1. Research Performance Ranking of Universities.
AC = ARAS-CRITIC; AM = ARAS-MEREC; AE = ARAS-Entropy; MC = MABAC-CRITIC; MM = MABAC-MEREC; ME = MABAC-Entropy; TC = TOPSIS-CRITIC; TM = TOPSIS-MEREC; TE = TOPSIS-Entropy; BS = Borda score; TBS = total Borda score; FR = final rank.
Validation and Sensitivity Findings
The ranking results obtained from six different scenarios for the sensitivity analysis of the findings of the proposed model for evaluating the research performance of universities are given in Figure 4.

Figure 4. Ranking results of different scenarios.
Figure 4 shows the ranking results of the 88 universities obtained from the six scenarios. When Figure 4 is analyzed, it is seen that the scenarios give results very close to one another. Kendall’s W test was nevertheless performed to test this statistically. The Kendall’s W test results in Table 2 show that there is no significant difference between the ranking values of the six scenarios.
Table 2. Kendall’s W Test Results.
In order to test the validity of the results, the correlation between the original ranking values and the ranking results obtained from the EDAS method was analyzed. Table 3 shows the results of Spearman’s rank correlation test. The findings revealed high correlations between the different methods, showing that the results obtained by the different methods are consistent with one another.
Table 3. Spearman’s Rank Correlation Test.
Correlation is significant at the .01 level (two-tailed).
Discussion and Conclusions
Higher education institutions face financial constraints and government grants are adjusted according to the research performance of universities. The research performance of universities affects their reputation and prestige. High reputation and prestige of universities generate social and economic benefits for their countries and regions. For these reasons, universities and academics strive for higher research performance. International ranking organizations, which play an important role in university rankings, also use research performance as a criterion. All these reasons have resulted in a growing interest in recent years for universities to improve their research performance.
In this study, a new method for evaluating the research performance of universities is proposed by utilizing MCDM methods. The study included 88 universities in Türkiye with faculties of education that began operating before 2018, and their performance in global education research was analyzed. Data on research performance were obtained from InCites Benchmarking and Analytics, and data on the number of academic staff were obtained from the statistical database of the Council of Higher Education, the governing body of universities in Türkiye. Because the research focuses on the recent, actual performance of universities, the data cover the period 2018 to 2022. The research performance criteria were first weighted with the CRITIC, MEREC, and Entropy methods, and the universities were then ranked with the ARAS, MABAC, and TOPSIS methods using these weights. Each of the three weight sets was combined with each of the three ranking methods, so nine ranking values were obtained for every university. These rankings were then fused with the Borda function.
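The weighting–ranking–fusion pipeline can be sketched as follows. This is a minimal illustration of one of the nine weighting/ranking combinations (Entropy weights with TOPSIS ranking) plus the Borda fusion step, using a hypothetical decision matrix and assuming all criteria are benefit criteria; it is not the authors’ implementation:

```python
import numpy as np

def entropy_weights(X):
    """Objective criterion weights via the entropy method (benefit criteria)."""
    m = X.shape[0]
    P = X / X.sum(axis=0)                       # column-wise proportions
    logP = np.where(P > 0, np.log(np.where(P > 0, P, 1.0)), 0.0)
    e = -(P * logP).sum(axis=0) / np.log(m)     # entropy per criterion
    d = 1 - e                                   # degree of divergence
    return d / d.sum()

def topsis_ranks(X, w):
    """Rank alternatives (1 = best) with TOPSIS, all criteria as benefit."""
    R = X / np.sqrt((X ** 2).sum(axis=0))       # vector normalisation
    V = R * w                                   # weighted normalised matrix
    best, worst = V.max(axis=0), V.min(axis=0)
    s_plus = np.sqrt(((V - best) ** 2).sum(axis=1))
    s_minus = np.sqrt(((V - worst) ** 2).sum(axis=1))
    closeness = s_minus / (s_plus + s_minus)
    order = (-closeness).argsort()
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks

def borda_combine(rank_lists):
    """Fuse several rank vectors: a rank r earns n - r Borda points."""
    n = len(rank_lists[0])
    scores = sum(n - np.asarray(r) for r in rank_lists)
    order = (-scores).argsort()
    final = np.empty(n, dtype=int)
    final[order] = np.arange(1, n + 1)
    return final

# Toy decision matrix: 3 universities (rows) x 3 benefit criteria (columns)
X = np.array([[0.9, 0.8, 0.7],
              [0.5, 0.6, 0.4],
              [0.2, 0.3, 0.1]])
w = entropy_weights(X)
ranking = topsis_ranks(X, w)               # university 1 dominates, so rank 1
final = borda_combine([ranking, ranking])  # fusing identical rankings preserves them
```

In the study itself, nine such rank vectors (three weighting methods times three ranking methods) would be passed to the Borda step per university.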
This research proposes a new method to measure the research performance of universities. The method is based on multi-criteria decision making and is completely objective: in both the weighting and the ranking of the criteria, the available data were used directly rather than subjective evaluations such as expert opinion. In the literature, descriptive statistical methods are generally used to analyze research performance data. However, since MCDM methods are suited to comparing conflicting criteria across multiple alternatives, they provide an advantage in measuring the research performance of universities. In addition, using more than one MCDM method makes it possible to draw on methods with different theoretical backgrounds and to reduce bias. The use of different methods in both criterion weighting and ranking is therefore an important advantage. Indeed, the fact that the criterion weights and final rankings differed across methods justified the authors’ decision to use several MCDM methods together.
Treating every criterion as equally important when measuring the research performance of universities is merely an assumption. This research instead puts forward a scientific procedure for weighting the criteria using MCDM methods. The criterion weights reported here may change for different countries or universities, or when measuring the performance of individual researchers, but any such change will rest on a scientific method rather than on an arbitrarily assigned weight.
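As an illustration of how such data-driven weights arise, here is a minimal sketch of the CRITIC method, one of the three objective weighting methods used in the study; the decision matrix below is hypothetical:

```python
import numpy as np

def critic_weights(X):
    """CRITIC weights: each criterion's contrast (standard deviation) times
    its conflict with the other criteria (sum of 1 - correlation), normalised."""
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))  # min-max, benefit criteria
    sigma = Z.std(axis=0, ddof=1)               # contrast intensity per criterion
    R = np.corrcoef(Z, rowvar=False)            # criterion-criterion correlations
    C = sigma * (1.0 - R).sum(axis=1)           # information carried by each criterion
    return C / C.sum()

# Hypothetical decision matrix: 4 universities x 3 criteria
X = np.array([[1.0, 9.0, 3.0],
              [4.0, 2.0, 8.0],
              [7.0, 5.0, 1.0],
              [2.0, 7.0, 6.0]])
w = critic_weights(X)   # weights sum to 1; more contrast/conflict => higher weight
```

Criteria that vary strongly and correlate weakly with the others carry more information and therefore receive higher weights, without any expert judgment entering the calculation.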
In this study, the weighting results showed that the most important criterion in the productivity dimension is the number of publications in Q1 journals per academic staff member. This criterion emphasizes the excellence of a university’s research output. Given today’s interest in global university rankings and increasing competition, it is a distinctive indicator that university management should consider when tracking performance. In the impact dimension, the most important criteria are the h-index without self-citations per academic staff member and the category normalized citation impact. Although the h-index has been criticized in some respects, it is frequently used as an indicator of research performance, and adjusting it for university size may eliminate the bias in favor of large universities; in this study, the h-index was normalized by the number of academic staff. Similarly, Category Normalized Citation Impact is an impact indicator normalized by publication year, publication type, and subject area, and this research showed that it also plays an important role in university performance. Taken together, these findings suggest that both university managers and researchers should take these criteria into account and assign them higher weight in performance evaluations of universities.
Evaluated as a whole, the weighting findings fill gaps left by previous studies. This study evaluated the research performance of universities on multiple criteria, which highlights the advantage of MCDM over other methods: MCDM methods can select, sort, and rank multiple alternatives while considering multiple criteria. Performance evaluation based on multiple criteria also reflects the complex structure of universities, since research performance must be measured against several criteria at once. In this study, the criteria were weighted on a theoretical and statistical basis, and the research performance of universities was evaluated with these weights. This approach complements previous studies and helps to address the research performance of universities more comprehensively. In addition, the methodology presented here removes the limitation of relying on a single method, because using more than one MCDM method reduces concerns about the reliability of the findings.
The MCDM methods used in this study are not the only methods that can measure research performance; different MCDM methods could also be applied. Although subjective evaluations may attract criticism from universities or researchers, rankings can also be produced by determining criterion weights with expert-opinion-based MCDM methods, or even with hybrid MCDM approaches that combine subjective and objective methods. Beyond varying the methods, the research performance of universities can also be analyzed with different performance indicators.
In conclusion, this study proposes a model based on multi-criteria decision-making methods for evaluating the research performance of universities. The study contributes to the literature in two ways. First, it proposes a new model that goes beyond traditional approaches to evaluating university performance. This model can be used to evaluate the research performance of universities, faculties, and academics, and it has a significant advantage in evaluations involving multiple criteria and alternatives, where traditional performance evaluation methods are limited. Second, the analyses conducted in this study showed that the proposed model is robust, consistent, stable, and reliable. Such a model can support university administrators and policy makers in their strategic decisions, especially now that policy making based on research performance has become widespread. The model can comprehensively evaluate the research performance of academics, departments, faculties, and universities to inform the decisions of university administrations and policy makers, and its reliable, robust, and consistent structure makes it a useful source for performance evaluation. In future research, the research performance of academics, departments, institutes, and countries, beyond the universities covered here, can be compared using MCDM methods, and the performance of universities can be analyzed with different research performance criteria and compared with the results of this model. In this study, objective methods were used to weight the criteria; future research could include subjective, expert-opinion-based methods to redetermine the importance of the criteria.
After the criteria are weighted by different MCDM methods, universities can be ranked by different MCDM methods and the results obtained can be compared with those of this research.
This study has some limitations. Only data obtained from the InCites database, which is based on Web of Science, were used; other databases providing research performance data exist. However, Web of Science is recognized as a comprehensive database for reflecting the international performance of a university system (Auranen & Nieminen, 2010).
