1. Introduction
As publishing becomes more competitive, researchers and editors are looking for new ways to ensure the quality of published research (Aguinis and Vandenberg, 2014; Ashkanasy, 2010) and to provide new insights into phenomena, allowing a better connection to industry. One solution offered to increase the relevance of research, particularly positivist research (theory testing using quantifiable measurement) in the social sciences, is the inclusion of objective measures. In this article, we define objective measurement as the collection of evidence that is not affected by respondent or researcher bias during data collection (see Andrews et al., 2006; Cai et al., 2021; Jones, 2019). This has been promoted in a range of work-related studies at the individual level drawing on disciplines such as neuroscience, using quantitative electroencephalography (QEEG) or functional magnetic resonance imaging (fMRI) to determine the charismatic leadership potential of individuals (Waldman et al., 2011); ergonomics, using measures of arm strength over time to assess worker fatigue (Zhang et al., 2014); safety science, using eye movement detection to measure fatigue (Li et al., 2019); and stress research, using an individual's blood pressure (Bostock et al., 2019) or cortisol patterns (Gonzalez-Mulé and Yuan, 2022). The idea of including objective data is not new. The original Hawthorne experiments (Mayo, 1933) examined how changes to the work environment (measured by a reduction in lighting intensity) influenced employee performance (measured by the number of tasks completed), thus incorporating objective measures for both the independent and dependent variables being examined.
Indeed, in some areas of business research, such as the commerce disciplines of accounting, finance, and economics, which mostly operate at a more macro level (phenomena that are large in scale or scope), the use of objective measurement is well established, with company success operationalized through changes in stock market share prices (Reiss, 2017), company profitability (Davis, 2023), and company financial outcomes (Tehrani and Noubary, 2005) being common examples.
However, in some business-related disciplines such as organizational behavior (OB), human resource management (HRM), and marketing, there has historically been a greater reliance on subjective measurement (data collected that are open to respondent or researcher bias), which is primarily based on qualitative or quantitative self-report or other-report data, for example, in the form of interviews, focus groups, and surveys (Spector, 1994) or observation studies by researchers (including systematic literature reviews intended to reduce the potential for researcher sample selection bias; see Linnenluecke et al., 2020). While the methods for collecting subjective data in these three disciplines have become more rigorous (Harrison et al., 2020), there has also been an uptick in objective data collection, with attempts to “objectify” measurement in these fields (Angrave et al., 2016; Lamers et al., 2024; Melin et al., 1999) over the last couple of decades. This is also reflected in scope statements and recent calls for articles in top tier journals. For example, the scope statement for the
Despite some researchers noting that using objective measures over subjective measures adds to the methodological rigor of research (Aguinis et al., 2020), we caution against researchers assuming that the inclusion of objective data in their research design is superior (Baruah and Panda, 2020) and automatically improves validity (the accuracy of measurement). Indeed, despite the increased frequency with which researchers are including objective measures in research (Evans and Steptoe, 2001; Gonzalez-Mulé and Yuan, 2022), a search of the literature reveals no applicable guidelines (at least none that we could find) for assessing the appropriateness of objective measurement, particularly in practical business settings. This is an important issue when conducting research in field settings, where objective measurement methods may not be as easy to implement as they are in the laboratory or other more controlled experimental settings.
We also note the increasing availability of measures at the macro (societal, organizational, industry, sector) and micro (individual, team) levels. At the macro level, the availability of big data is increasing, with collections ranging from social media and email usage to traffic data and medical records (George et al., 2014). At the micro level, objective measurement in applied settings is becoming easier to implement, with access to wearable devices and biometric vests (see MacQuarrie et al., 2023) increasing data collection in work settings (Hamilton and Sodeman, 2020).
Our aim in this article is (1) to provide a framework for clearly differentiating between objective and subjective measurement in the management sciences; (2) to outline potential types of objective data in applied settings for business research; (3) to emphasize the strengths and weaknesses of subjective and objective data using criteria of (a) bias (i.e. cognitive distortion producing systematic deviations based on respondent prejudices), (b) validity (i.e. the extent to which the measure captures the construct it is meant to be assessing) and reliability (the consistency of the measure both internally and across time or raters), (c) sensitivity (the degree to which a measure is able to detect phenomena), and (d) feasibility (the potential to achieve a successful data collection); and finally (4) to provide a guiding framework or decision tree to assist researchers in selecting the most suitable type of measure for their research question (see Figure 1). We focus particularly on the broad applied management sciences as covered by the

Figure 1. Decision criteria for selecting objective and/or subjective measures.
2. Objective and subjective measurement
2.1. What is objective measurement in applied research?
A good place to begin considering objective measurement is with a definition of objectivity. Starting at the micro level, objectivity is defined as “freedom from personal bias” (Reiss and Sprenger, 2017: 1). Definitions of objective measurement in the business disciplines are not common, with researchers often making the broad assumption that objective measures are simply data collected independently from the observer (Guterman et al., 2012). While this may describe the method of collecting information, it is important to emphasize that it does not mean these measures are free from potential bias (e.g. participant social desirability bias, memory bias) and, thus, they are not strictly objective. By contrast, objective measurement is more clearly defined in the science and technology disciplines, with a clear distinction made between objective and subjective data. In the medical sciences, Dogan et al. (2017) note that objective data involve measures that gather information via behavioral or physiological monitoring. In their systematic review, “these objective parameters included physiological data (heart rate variability), behavioral data (phone usage, physical activity, voice features), and context/environmental information (light exposure and location)” (Dogan et al., 2017: 1). Essentially, by using objective data at the micro level, medical researchers typically seek to analyze variation either within or between subjects (which can also occur over time) that can be assessed against broadly accepted parameters established by benchmarks within the discipline for that measure (e.g. baseline heart rate variability for the participant, average heart rate or blood pressure by age, core body temperature). Objective data at the macro level in business disciplines include measures such as financial returns (Gallagher and Nadarajah, 2004) and stock market share prices (Reiss, 2017).
This results in judgments about whether the recorded measures are high, low, or average, depending on the context and/or market or economic conditions. In this article, we argue that objective measurement for the applied management sciences should describe the collection of evidence that is not affected by respondent or researcher bias during data collection. Indeed, a better understanding of what makes data objective may be achieved by juxtaposing it against subjective data.
2.2. What are subjective measures in applied research?
Subjectivity seeks to provide an understanding of the inner life of entities (which could be individuals or groups) by recording the experiences, feelings, and responses of those entities (Luhrmann, 2006). It is important to note that this can be assessed qualitatively through interviewing, focus groups, and behavioral observation, or quantitatively through surveys and questionnaires. For quantitative measures, a major way to collect subjective data is through the self-report of the subject, with a focus typically on the respondent's personal experiences or attitudes. Clark and Watson (2019) note that quantitative self-report data are typically collected using either a dichotomous (yes/no, true/false) or a continuous format (Likert-type scales) designed to answer research questions that seek to capture conscious thoughts and behavior. Hazlett and Hazlett (1999) remind us that these subjective formats are generally not designed to capture unconscious emotional reactions. Alternative subjective formats include checklists and forced-choice formats (where the participant selects from a range of alternatives), which have been shown to have difficulty in establishing reliability (Clark and Watson, 2019). Like objective data, subjective data can be analyzed using inferential statistics that draw general inferences about a population based on sample data. Furthermore, based on the research question and underlying theory, subjective data (depending on the measure) can be valid, reliable, and sensitive to variations within and between individuals. The key issue, however, is that subjective data are generally prone to participant and/or researcher bias.
While not the focus of our article, it is important to acknowledge that subjective qualitative data (similar to subjective quantitative data) can be collected at the micro and macro levels to increase our detailed understanding of business phenomena. Data collected through research methods such as focus groups, ethnography, and interviews offer invaluable insights, particularly when scholars aim to engage in exploratory research, understand phenomena within specific contexts, appreciate people’s rich lived experiences, and be culturally and contextually sensitive. These data can provide a depth of understanding to phenomena that may not be available through objective data, which typically focusses on more tangible artifacts (Hentzen et al., 2023). For instance, attitudes, perceptions, and beliefs are not easily addressed by objective measurement of data.
2.3. Objective and subjective measurement (what it is and is not?)
Although our earlier definition suggests the difference between objective and subjective data is clear (i.e. focussed on respondent and researcher bias), published research suggests this has not always been the case. For instance, one approach to collecting more objective data has been to develop measures based on a common understanding of what constitutes the correct answer to a questionnaire item. This is typically based on expert opinion or population norms. A good example of this is intelligence testing. There are two broad types of intelligence: fluid intelligence (completing problem-solving tasks such as spatial and numerical problems) and crystallized intelligence (based on social knowledge and verbal knowledge; Cattell, 1963). While fluid intelligence may be considered to provide clear objective data (based on right and wrong solutions to puzzles), crystallized intelligence is based on culturally focussed assumptions about important information that may change as a result of socio-economic status or education, or across cultures and over time (Horn and Cattell, 1967).
Some factors of crystallized intelligence, such as social intelligence and emotional intelligence, which encapsulate social knowledge (a broad understanding of social norms and cues), are open to conscious bias or faking (Christiansen et al., 2010), while other factors such as spatial and verbal skills may be less susceptible. Based on the definition we have provided, fluid intelligence tests could be considered as providing objective data, while crystallized intelligence tests do not. Confounding this is the Flynn Effect (Flynn, 2020), which identified a substantial historical increase (approximately 30 points) in assessed levels of intelligence, with scores recalibrated (to a mean of 100) every 10 years or so to suggest there is a standard level of intelligence (Pietschnig and Voracek, 2015). The explanation for the change over time is individuals' exposure to improving levels of education, increasingly common testing regimes, and improvements in general health (e.g. better nutrition). We also note criticisms of cultural and gender bias in ability formats for measurement (Cheryan and Markus, 2020), such as ability-based emotional intelligence tests. These biases suggest that ability testing of crystallized intelligence needs to be assessed carefully to determine whether it provides objective or subjective data.
Within the business disciplines, a classic example of confusion in claiming objectivity emerges in studies using an ability measure of emotional intelligence, the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT; Mayer et al., 2012), compared against self-report measures (e.g. the Workplace Emotional Intelligence Profile (WEIP); Jordan and Lawrence, 2009). The MSCEIT uses a situational judgment test format in which the correct answers are determined by a panel of experts on emotions and psychology. While this adheres to the standards of crystallized intelligence (social intelligence; see Schipolowski et al., 2014) and may remove some aspects of self-report bias, we argue, on the basis of our definition of objective measures, that the potential for faking (a positive bias) and respondent bias (answers to the questions may vary from actual behavior, or reflect social desirability bias) results in it being a subjective measure.
In other research, we note that researchers have described using an “objective response format” (see Kahn-Greene et al., 2006) in their research when referring to self-reported behaviors (in this case sleep behaviors) instead of attitudes or preferences. However, we contend that asking individuals about their actual behavior does not mean the measure is objective. These data are still self-reported and thus subject to memory issues and individual biases; and it is evident that there remains a subjective element to this type of measure. At the macro level, this would be the equivalent of a finance researcher drawing on a chief executive officer’s (CEO’s) assessment of the success of their company as a measure of company performance.
As shown in Figure 1, we propose that the measure collected (either objective or subjective) and the methodological approach taken to gather it need to be assessed initially for the potential for bias. We then argue that measures may best be considered on a multidimensional continuum from high to low in relation to validity, reliability, sensitivity, and feasibility. Bias is the amount of distortion respondents and researchers introduce into measurement, which potentially influences our understanding of a study variable and thus our interpretation of relationships between variables. Validity can be broadly defined as how accurately a measure represents the underlying construct being researched (for a full discussion of validity, see Newton, 2012). Reliability is the amount of internal and external consistency of the measure, while sensitivity describes the strength with which the measure can detect changes in stimuli (Gold and Ding, 2013). Finally, feasibility refers to the potential for a measure to be used in a successful data collection. For example, in seeking to understand a respondent's stress, while self-reports (subjective data) are easily collected through a survey, cortisol testing (objective data) may be more complex, requiring specialized expertise, time, and money for both data collection and analysis, and may introduce a level of variance (sensitivity) that makes the interpretation of data difficult. Based on these five criteria, Figure 1 provides a decision tree framework that systematically guides researchers in their choice of subjective and/or objective measurement. Notably, it highlights the importance of the research question as the overarching focus and starting point.
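The sequence of decision points just described can be sketched as a simple screening routine. This is a minimal, hypothetical illustration of the logic only, not the published framework itself: the class, field names, and low/high ratings are our own illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class MeasureProfile:
    """Researcher ratings of one candidate measure (illustrative fields)."""
    answers_question: bool   # does the measure address the research question?
    bias_risk: str           # "low" or "high" respondent/researcher bias
    validity: str            # "low" or "high"
    reliability: str         # "low" or "high"
    sensitivity: str         # "low" or "high"
    feasible: bool           # can a successful data collection be achieved?

def screen_measure(m: MeasureProfile) -> str:
    # Step 0: the research question is the starting point (Section 3.1).
    if not m.answers_question:
        return "reject: does not address the research question"
    # Criterion 1: bias of measurement (Section 3.2).
    if m.bias_risk == "high":
        return "treat as subjective; consider triangulation or an objective alternative"
    # Criteria 2a onward: validity, reliability, and sensitivity in turn.
    for criterion in ("validity", "reliability", "sensitivity"):
        if getattr(m, criterion) == "low":
            return f"caution: weak {criterion}; reconsider or supplement the measure"
    # Final criterion: feasibility of the data collection.
    if not m.feasible:
        return "reject: data collection not feasible in this setting"
    return "accept: suitable candidate measure"
```

In practice, a researcher would apply such a screen to each candidate measure (subjective and objective alike) and compare the outcomes, rather than treating any single criterion as decisive.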
3. Measurement decision criteria
3.1. The research question as the guiding light or avoiding the tail wagging the dog
It is important to highlight that the research question, likely underpinned by theory, should always be the guiding light for scholars, not the measure. This tenet forms part of most doctoral research training, yet in the excitement of being presented with new forms of measurement, it is sometimes overlooked by researchers who become overly focussed on the measure and do not take a step back to consider the theoretical foundations. Instead, what we see (and we see this a lot!) is 'the tail wagging the dog': a new and/or innovative way of measuring phenomena, or of collecting big data, that is retrospectively fitted back to a research question and theory. Based on this observation, we thought it would be useful to develop a guiding decision tree (see Figure 1) that requires researchers to consider a theory/phenomena-driven research question as a first step, followed by a series of decision points that lead to a more appropriate selection of objective or subjective measures. In expanding on this framework, we now move to consider bias, as it has consequences for decisions about what type of data to collect (Donaldson and Grant-Vallone, 2002).
3.2. Decision criteria 1—bias of measurement
As noted earlier, response bias refers to the ways in which respondents and researchers can distort measurement responses, both intentionally and unintentionally. As shown in our decision tree (see Figure 1), bias should be an important initial consideration for all applied business researchers when deciding whether to use objective or subjective measures. As we overview in this section, there are several main forms of bias that an applied business researcher needs to consider when selecting a measure: implicit and explicit (self-report/other-report) bias (including researcher bias), common method bias (CMB), and other measurement bias (e.g. positively or negatively skewed questions and memory bias).
3.2.1 Self/other-report biases
Survey measures can include self- and other-administered questionnaires, which seek to examine a subject's (e.g. a shareholder's, employee's, or leader's) behaviors, attitudes, beliefs, intentions, or decision-making responses. As a form of subjective measurement, self-report formats tap into a range of personal biases (Paulhus, 2017) and are affected by memory shortcomings (Kahneman et al., 2004). Furnham and Henderson (1982) list intentional (explicit) biases, which include faking good (positive bias), also known as social desirability bias; faking bad (negative bias); and faking mad (random answering).
Of similar concern, are unconscious (implicit) biases in self-report measures. Oberai and Anand (2018) identify a range of implicit biases including the halo effect (a broad generalization from one positive characteristic); cloven hoof effect (a broad generalization from one negative characteristic); affinity bias (favoring those who are “like us”); conformity bias (agreeing with the dominant view); attribution bias (positive results are linked to self, whereas negative results are linked to others); beauty bias (attractive physical attributes are linked to positive assessments); and confirmation bias (subconsciously looking for evidence that supports a subjective assessment).
Another category of implicit bias related to self-report data is the impact of errors in memory. Sato and Kawahara (2011) found memory biases in retrospective self-reports of negative mood, with many respondents overestimating it. This has been recognized widely in the literature, and formats such as the Day Reconstruction Method (Kahneman et al., 2004) have been developed in an effort to overcome memory biases in self-report surveys. There is potential for these biases to be minimized with experience sampling methodology (Fisher and To, 2012) and momentary objective measurement using real-time measurement (e.g. wearable devices; Barnes et al., 2023). Equally, in self-report, there is a tendency (linked to memory and recall) for respondents to generalize experiences. For instance, research suggests individuals are unable to clearly differentiate between affect measured as a general experienced state and affect currently experienced or experienced in the last hour, day, week, month, or year (Watson and Clark, 1997). We all experience significant variation in emotion over the course of a day, but to make sense of this we apply an implicit bias and generalize across time to provide an average experience, even when researchers specify a particular time frame (e.g. now, weekly, in the last month) for the instrument (Watson and Clark, 1997).
In sum, while we note that Furnham and Henderson (1982) identify a number of these self-report biases as limitations in practical applications of management, such as recruitment and selection, these issues are also potential limitations in qualitative or quantitative self-report (i.e. subjective) measurement. This points to the need to triangulate subjective data where possible. For instance, the multitrait-multimethod matrix of Campbell and Fiske (1959) was offered as a solution for achieving this triangulation in quantitative research. More recently, this has also been the focus of mixed methods studies (Campbell et al., 2020).
3.2.2 Other reports
It is understandable why confusion exists among business researchers about what constitutes objective data. There are several competing viewpoints and definitions, particularly regarding the role of other-report ratings. The use of other-report data (e.g. supervisor or peer ratings of a respondent) is common in OB and HRM research. For example, in research examining organizational citizenship behaviors (OCBs), other-report ratings are routinely used to address some of the limitations of self-ratings (Carpenter et al., 2014). Furthermore, in applied business research, the focus for identifying objective data has historically been centered on the independence of the source (distinguishing it from self-ratings). Cattell (1958) juxtaposed objective data (broadly interpreted as including other ratings) against self-appraisement, and Taris (2006) and Frese and Zapf (1988) have since proposed a continuum between objective and subjective data that is determined by the independence of the data source. We argue that this has led to confusion about what is (and is not) considered objective data and that issues of independence are confounded with objectivity (i.e. measures that remove respondent/rater bias). Voukelatou et al. (2021) agree, noting there has been a tendency in research to avoid defining objectivity and to focus instead on the independence of sources (e.g. other report) as a way of overcoming reliance on self-report and its inherent biases. Another example is Kompier (2005), who, while proposing that objective measures are those free from cognitive and emotive processing, describes objective measurement as ranging from archival data (e.g. sick leave, worker's compensation, and resignation records, which we agree are objective) to observation by non-incumbent others (e.g. supervisors). However, in line with Spector (1987), we underscore that other-report is affected by respondent bias and, on that basis, we consider these subjective measures.
Although other-report measures (e.g. supervisor report rather than self-report) may eliminate self-assessment bias, these types of assessment are still open to the human biases and limitations of the rater (Allen et al., 2000). For instance, the potential for attributional biases about the ratee (i.e. cognitive biases that influence how individuals assess others and events) still exists, which introduces subjectivity to the data (see the Ladder of Inference, which explains the limited data upon which most decisions or judgments are made; Argyris, 1982). Furthermore, other-ratings may be limited in reflecting the full extent to which an employee engages in behaviors, given other raters' limited opportunity to observe all of a subject's behaviors (Allen et al., 2000; Carpenter et al., 2014). We acknowledge that in some cases other-report and supervisor-report ratings may help overcome issues around CMB (see Jordan and Troth, 2020) and provide a way to establish convergent validity for variables under study. These data, however, are still subject to a range of biases (e.g. limited information, attributional bias) and, based on our definition of objective data, we consider these subjective data. By contrast, recording actual events such as sick leave and resignations from archival data involves actual counts and is, therefore, not subject to respondent bias. We acknowledge that, despite these data providing actual counts, they can be used by researchers in a way that reduces their validity, particularly if the measure is used as a proxy variable (e.g. interpreting sick leave and resignations as a proxy for job satisfaction).
3.2.3 Common method bias
Beyond aspects of over- and under-estimation due to personal biases (implicit or explicit), we also note that, depending on the research design, self-report data may be susceptible to CMB (Spector, 1987), which relates to variance attributable to the measurement method rather than to the identified construct (Podsakoff et al., 2003). Indeed, Jordan and Troth (2020) argue that the inclusion of objective measures may help to overcome common method issues, as including an objective measure in a research design may mitigate this type of consistency bias.
Having identified the core element of respondent/rater bias in determining whether data are objective or subjective, it is now reasonable to move to consider factors that may assist researchers in deciding which measure to use as shown in Figure 1.
3.3. Decision criteria 2a—validity of measurement
Validity refers to the accuracy of a measure in relation to the construct to which it pertains (Newton, 2012). As shown in our decision tree (see Figure 1), the validity of a measure is an important consideration and both objective and subjective measures should be subject to validity considerations.
3.3.1 Objective measurement
Some well-known examples of objective and valid measures of health and stress at the micro level include the evaluation of heart rate variability (HRV), galvanic skin response (GSR), and cortisol levels. These are generally accepted as objective and valid indicators of physical health (i.e. HRV) and stress (i.e. cortisol, GSR). Research has also linked these variables to self-reported measures. For instance, a recent meta-analysis of the medical literature has shown a clear relationship between low HRV and higher self-reported psychological stress (Kim et al., 2018). Another meta-analysis has demonstrated clear increases in GSR and cortisol levels during stressful simulations (Van Dammen et al., 2022). This means that using objective measures appropriately in research, with knowledge of the associated limitations (e.g. the invasive nature of collection; effects of time of day or of individual characteristics such as weight, fitness level, gender, and age), can improve the veracity of claims about the relationships between variables, as these types of measures may be less prone to respondent bias. From our review of the OB literature, it seems that work and stress researchers have made some of the greatest strides in using this type of objective measurement to advance their work. That said, there are limitations, including the use of proxy variables (measures that do not directly capture the underlying construct being observed, for example, using hours of sleep to measure fatigue; Darwent et al., 2015) and determining the sensitivity of measures, which we discuss later when considering how researchers use objective measures.
At the macro level, our colleagues in the fields of accounting, finance, and economics have long drawn on macro level objective data that are valid: for example, finance research examining the financial returns of superannuation funds (Ainsworth et al., 2022), economic research examining the fluctuating price of water (Easton and Pinder, 2022), and accounting research on the relationship between consultant fees and CEO pay (Grosse et al., 2020). While the arguments linking the variables to phenomena may be open to debate, the measures themselves are accepted as valid measures of profitability (e.g. financial returns) and costs (e.g. water costs), and as a result comparisons can be drawn across industries and sectors. The area that has lagged in these disciplines is the burgeoning field of behavioral economics. From a strong start with the development and testing of prospect theory by Kahneman and Tversky (1979), which employed an experimental design to test risk-taking using an objective measure of risk (based on the amount of money potentially lost or won), the field acknowledges that it is still in the process of determining the best way to collect objective data (Chetty, 2015).
3.3.2 Subjective measurement
In terms of subjective measures, there is a rich history of quality research using validated self-report measures administered through surveys and questionnaires to examine aspects of motivation (Latham and Budworth, 2007), emotion (Jordan et al., 2006), cognitive and attitudinal variables (De Clercq and Pereira, 2021), and personality (Wright and Constantin, 2021). These types of quantitative collections allow for establishing generalizability. There are also clearly defined protocols in place for validating self-report measures (for a discussion, see Strauss and Smith, 2009), which can be assessed for face validity, construct validity, and concurrent and discriminant validity. Similarly, Payne et al. (2003) outline methods for validating archival data. In the discipline of economics, Drerup et al. (2017) note the efficacy of subjective beliefs as predictors in economic models. In the field of accounting, Woods (2012) outlines the use of subjective data in performance appraisals.
We also note that Paunonen and O'Neill (2010), in a study examining other- and self-ratings of personality, found extremely low correlations across a range of personality variables, and they note that the selection of the other-rater, and that rater's intimate knowledge of the individual being rated, are important for increasing the validity of this type of measure. The main advantage they reported for other-report ratings was in cases where social desirability may be a factor. Again, the lack of agreement between self- and other-report in many circumstances raises questions about validity (potentially as a result of bias).
At the macro level, subjective measures have been found to be problematic in determining bonuses and rewards in an accounting study (Ittner et al., 2003), while valid subjective measures have been promoted for examining stock market movements (De La and Myers, 2021). The availability of valid (broadly accepted in the discipline) objective measures at the macro level appears to minimize the use of subjective measures. Beyond validity (the accuracy of the measurement), researchers also need to consider the reliability of measurement.
3.4. Decision criteria 2b—reliability of measurement
Reliability refers to the consistency of measurement over time (Hill, 1965). Hill (1965) expands on this, noting that consistency refers to the predictable performance of measures across individuals, across contexts, and across time. Reliability can also refer to the internal consistency of specific measures, often assessed through the multiple items comprising that measure (Cronbach, 1951).
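The internal-consistency idea attributed to Cronbach (1951) can be illustrated with a short computation. The sketch below is illustrative only (the function name and example data are our own); it applies the standard alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), to a small set of survey items:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of respondent scores:
    k/(k-1) * (1 - sum(item variances) / variance of respondents' total scores)."""
    k = len(items)
    item_var_sum = sum(pvariance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's total score
    return (k / (k - 1)) * (1 - item_var_sum / pvariance(totals))

# Three items from a (hypothetical) scale, answered by five respondents on 1-5.
items = [
    [4, 3, 5, 2, 4],
    [4, 2, 5, 3, 4],
    [5, 3, 4, 2, 3],
]
alpha = cronbach_alpha(items)  # values above roughly 0.7 are conventionally deemed acceptable
```

Respondents who answer the items consistently push alpha toward 1; random responding pushes it toward 0.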
3.4.1 Objective measures
In terms of reliability and objective measurement, Van der Vleuten et al. (1991) note that objective data cannot always be assessed for reliability in the ways available to subjective researchers. While machines can be calibrated, the methods the researcher uses to collect objective data (particularly at the micro level) may affect the reliability of those data. For instance, the very act of asking a respondent to provide a mouth swab to assess cortisol levels may affect the consistency of that measure, particularly if the individual is distracted from completing a work task. In a meta-analysis, Elliott et al. (2020) note that task-fMRI studies have significant problems with reliability, due to substantial variance introduced by individual and contextual differences in testing, and on that basis are not suitable for biomarker research or individual differences research, as test–retest reliability is low. Other studies, however, report reliable objective measurement of the sleepiness of long-distance truckers using polysomnography (Sunwoo et al., 2012).
Moving to more applied work settings, in a sales context, an objective measure of individual sales performance could be commission earned or bonuses earned, and in a manufacturing business the data may include number of items produced or products repaired or serviced. In terms of reliability, however, we accept that these measures may be subject to contextual variations (e.g. time of year or performance of the economy). Similarly, another study focussing on the performance of academics (Gary et al., 2023) used publication output as an objective measure of performance. Again, this measure is subject to significant differences depending on the context (respective workload policies at the universities within which the academics work). If reliability is about consistency, then context matters and needs to be a consideration in any conclusions drawn from objective data.
3.4.2 Subjective measurement
Subjective methods have a long history of testing the reliability of a measure by asking multiple questions about a single construct (internal reliability), assessing the stability of the measure by seeking consistency over time (test–retest reliability), or seeking consensus across independent raters (inter-rater reliability for qualitative data). Assessing reliability is important given that the nature of subjective data means that individual interpretation is a part of the process of answering questions. In positivist research, asking about the construct in different ways allows the researcher to test the consistency with which the respondent has answered (Cronbach, 1951). Reliability, however, does not of itself ensure the validity of conclusions. Indeed, while reliability is essential in determining the value of a subjective measure, as with objective data collection, any measure can only be fully assessed by considering validity and sensitivity as well.
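Inter-rater reliability for categorical codings is commonly summarized with Cohen’s kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is our own illustration (the function name and example codings are invented) of how two coders’ theme assignments might be compared:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category independently.
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Two coders assigning six interview excerpts to themes.
coder_1 = ["stress", "support", "stress", "workload", "support", "stress"]
coder_2 = ["stress", "support", "workload", "workload", "support", "stress"]
kappa = cohens_kappa(coder_1, coder_2)
```

Kappa of 1 indicates perfect agreement; 0 indicates agreement no better than chance.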
3.5. Decision criteria 3—sensitivity of measurement
Sensitivity refers to the strength with which any measure can detect changes in stimuli (Gold and Ding, 2013). Measures can be oversensitive, introducing too much variance and making it difficult to find statistically significant or meaningful results. They can also be too blunt, revealing no variance between variables and resulting in multicollinearity. In either circumstance, it is difficult to draw conclusions from such measures.
3.5.1 Objective measurement
While objective measures can improve our understanding of phenomena, their use is not without problems, and the appropriateness of a measure can be influenced by its sensitivity. For instance, at the micro level, Souza et al. (2021) note that baseline recordings for HRV can be affected by cardiovascular fitness and by gender and age, which would need to be considered in drawing meaningful conclusions from the data. Similarly, at the macro level, broadly accepted measures such as profitability or stock returns may be influenced by broader economic fluctuations or require cross-company or within-sector comparisons (Ferrat et al., 2022) to achieve an acceptable level of sensitivity.
A good example of differences in the sensitivity of measures of the same concept when using objective and subjective measures is provided by Prince et al. (2020), who examined the relationship between self-reported sedentary behavior (sitting) and the same variable measured by a wearable device. They found a consistently low correlation between self-reported sedentary behavior and the same behavior measured by instruments providing objective data, with a much lower incidence of sedentary behavior measured by instruments. While this may imply problems with the accuracy of self-report, we also note that if the tolerances for the objective measurement instrument (e.g. accelerometers or inclinometers) were too sensitive, this could contribute to an over-reporting of movement. Understanding these issues is important in determining the veracity of the measure and any conclusions drawn from these data. The central issue here is whether the method of measurement is appropriate to answer the research question. For instance, a research question that seeks to determine the extent to which sedentary behaviors create injury may require carefully calibrated objective measurement, while a research question that seeks to understand the impact of sedentary behavior on employees’ networking and collaboration (as opposed to staying at a workstation) may better suit a subjective measure.
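Method-agreement findings like those reported by Prince et al. (2020) rest on correlating the two measurement modes. As a purely hypothetical sketch (the function and data below are our own invention, not Prince et al.’s), the following computes Pearson’s r between self-reported and device-recorded daily sitting hours:

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

# Hypothetical daily sitting hours for six workers, by measurement mode.
self_report = [6.0, 5.5, 8.0, 4.0, 7.0, 5.0]
device = [5.0, 4.0, 5.5, 5.0, 4.5, 6.0]

r = pearson_r(self_report, device)  # low agreement between the two modes
```

An r near zero, as in this invented example, would indicate that the two modes are capturing the construct very differently.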
3.5.2 Subjective measurement
The nature of subjective measures means that sensitivity is sometimes difficult to achieve. In quantitative research, the aim is to use the data to make generalizable assertions about the phenomena. In attempting to introduce sensitivity into measurement (seeking finer-tuned measures for constructs), researchers have been warned about construct proliferation. Brown (2015) notes that construct proliferation (and associated measurement proliferation) is often justified by researchers as increasing the sensitivity of measures, but has instead often resulted in redundancy. Shaffer et al. (2016) note that creating fine distinctions between constructs and measures is not easy, and standards for establishing discriminant validity mean that the sensitivity of some measures may be lost as they cannot be statistically distinguished from other constructs. This can also happen within measures. As noted earlier, research suggests individuals are unable to clearly differentiate between affect measured as a general experienced state and affect measured across time (Watson and Clark, 1997). This could also be a sensitivity issue, as respondents may not interpret the fine distinctions between affective states in the same way (for instance, distinguishing between feeling distressed and upset, or strong and determined, in a single survey; Watson and Clark, 1997).
One technique in subjective research that seeks to achieve sensitivity is experience sampling method (for a discussion, see Fisher and To, 2012) in which respondents are asked subjective survey questions over time (perhaps reminded five times a day to respond to a short questionnaire) using mobile devices such as a cell phone. Recent studies such as that by Troth et al. (2023) examined front-line managers’ attempts at emotion regulation and their stress levels at the within-person level. This type of technique can introduce sensitivity into quantitative data collections where individual experiences of their daily lives are captured as they happen.
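Analyses of experience-sampling data typically separate within-person fluctuation from stable between-person differences by centering each response on the respondent’s own mean. A minimal preprocessing sketch (the record layout and names are our own, not drawn from the studies cited above):

```python
from collections import defaultdict
from statistics import mean

def person_mean_center(records):
    """records: (person_id, score) pairs from repeated daily surveys.
    Returns (person_id, deviation) pairs, where each deviation is the score
    minus that person's own mean, isolating within-person fluctuation."""
    by_person = defaultdict(list)
    for pid, score in records:
        by_person[pid].append(score)
    person_means = {pid: mean(scores) for pid, scores in by_person.items()}
    return [(pid, score - person_means[pid]) for pid, score in records]

# Hypothetical stress ratings from two managers, sampled across one workday.
records = [("mgr1", 3), ("mgr1", 5), ("mgr1", 4),
           ("mgr2", 1), ("mgr2", 2), ("mgr2", 3)]
centered = person_mean_center(records)
```

After centering, a rating of 5 from a typically calm manager and a rating of 5 from a chronically stressed one are no longer treated as equivalent, which is the point of within-person designs.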
3.6. Decision criteria 4—feasibility of measurement
Feasibility refers to an assessment of the potential for a successful data collection. Feasibility is not just about the ease of data collection for researchers; it includes consideration of the cost of collecting the measure, the researcher’s expertise with a particular measurement instrument, the impact on respondents, and the return on investment relative to the effort required to implement measures.
3.6.1 Objective measurement
Van der Vleuten et al. (1991) note that when deciding between using subjective and objective measures, researchers must consider factors such as the effort and cost required to collect data and the restricted range of behaviors that may be available for objective measurement. Many of these forms of data collection also require considerable expert training and/or analytical skills to interpret the data that are yielded (e.g. the volume of data from sensory vests used to capture movement) or, alternatively, might only allow small sample sizes to be tested (e.g. fMRIs), which limits the power for statistical analysis and generalizability. Furthermore, as Dickerson and Kemeny (2004) note when looking at studies of stress and cortisol levels, researchers need to consider the impact of the method of data collection and the impact of the researcher on the responses of the individual. fMRIs provide extensive information from which to draw conclusions, but the technique may not be feasible for business researchers studying employees due to both the cost and the level of expertise required to take the measurement and to analyze the results. The other issue that may impact the feasibility of a study is an organization’s willingness to participate in such a time-intensive study, which may remove employees from their work responsibilities. There are also concerns beyond the technical issues about individual privacy and the potential for unwanted surveillance (Weston, 2015), which warrant attention as a part of any feasibility assessment.
3.6.2 Subjective measurement
There is a rich history of focussed qualitative questioning providing a direct understanding of phenomena that could not be collected in any other way (Tucker et al., 1995). As researchers, we have published numerous studies using subjective measures (e.g. self-report, qualitative methods, and other-report), and clearly, for many researchers, collecting subjective data is feasible. We consider that subjective measures provide useful insights, but acknowledge they can be prone to significant limitations, particularly in relation to bias and sensitivity (some of which are rarely acknowledged).
3.7. Summary of validity, reliability, sensitivity, and feasibility
After considering the research question, we recommend that the decision to use either objective or subjective measure(s) be determined against the criteria of validity, reliability, sensitivity, and feasibility. Our preceding discussion suggests that research questions relating to attitudes, preferences, or personality are more easily answered using subjective data, which may be more valid for capturing an individual’s thoughts. Objective data, by contrast, may be more appropriate for research questions where researchers are keen to eliminate respondent bias or are examining phenomena with very small variance (requiring high sensitivity). A summary of the strengths and weaknesses of both objective and subjective measures in relation to validity, reliability, sensitivity, and feasibility is contained in Table 1.
Comparison of objective and subjective measures.
In determining the right measure for a study, the challenges broadly include (1) identifying the right measure that accurately reflects the variable of interest (validity and reliability); (2) deciding the tolerances (upper and lower values based on previous research or benchmarks) for that measure, which may include being aware of the impact of individual natural differences and timing of the data collection (sensitivity); and (3) finally being aware of the time and expertise required of the researcher, their impact on the participant, and the potential to achieve a successful data collection (feasibility). Often this can be an iterative process where the choice of the best quality instrument (e.g. using polysomnography to record sleep, which requires a technician to set it up versus actigraphy that involves a smart watch recording sleep or seeking a subjective self-report of sleep patterns) is a balancing act. Acknowledging these challenges, researchers need to be careful about the conclusions they draw from these data. In the next section, we drill down more deeply to examine the different ways in which objective data can be collected in business settings.
4. Objective measurement in applied settings
In terms of research in applied settings, Rousseau (1985) argued that phenomena can be examined at multiple levels. On this basis, we consider that objective measures can be collected at the individual (micro level), the team (meso level), and the organization (macro level). In Table 2, we outline broad ways in which data could be collected at these levels in applied business settings. We do not claim that Table 2 is comprehensive, but given the rapid expansion of new objective data sources, we consider it provides enough information to guide researchers interested in including objective data in their research designs. For instance, Liao et al. (2019) and Mardonova and Choi (2018) provide an excellent summary of wearable technologies for researchers. Gupta et al. (2022) also note that Apple has an open-source framework called ResearchKit, which allows researchers to develop applications to collect real-time dynamic activities. Objective data at the macro level include standard economic, finance, and accounting measures, but researchers may need to examine the contributions of big data and artificial intelligence (AI) to these traditional measures.
Examples of objective measurement in applied settings.
HRV: heart rate variability; GPS: global positioning system.
Following these practical insights, in the next section, we finalize our article by examining the issues researchers may encounter or need to consider when using objective data.
4.1. Challenges when using objective measures
As we noted in the introduction, the availability of objective measures is increasing due to new technologies, new forms of data collection, and artificial intelligence, resulting in broader access to big data. In this section, we outline four challenges researchers need to address to ensure they make the most of these new measures. Specifically, we focus on clarifying research questions, understanding measures, exercising caution in the use of proxy measures, and considering the ethical issues that may emerge when using objective measures.
4.1.1 The primacy of the research question
As noted, the first issue researchers need to consider is the development of a well-defined research question, clear construct definitions, and a theoretical framework that explains the phenomenon. Having this framework allows for a decision on whether objective measures are appropriate and which objective measure can be used to answer the research question. Echoing Occam’s Razor, in terms of selecting an objective measure, the simplest method is often the most feasible way to ensure a successful data collection. The inclusion of objective data in studies should be done for a purpose (inevitably to improve the quality of answers to the research question, such as increasing the sensitivity of measures). At the individual level, dependent variables can be collected that directly relate to a specific research question. For instance, in researching the task performance (see Table 2) of call center workers, this might be the number of calls answered, the length of time to resolve a call, or the number of complaints. In other settings, research questions may consider core achievements such as the number of sales or the number of days late delivering completed tasks. At the macro level, again, the measures selected should be clearly linked to the research question.
4.1.2 The importance of understanding measures
The second issue to consider is developing a good understanding of the measure being used, including understanding a measure’s limitations. For example, galvanic skin response (GSR) is often used in studies examining stress at work. Researchers, however, acknowledge that the signal quality of GSR can be affected by external factors such as sweat, moisture, and movement (Montagu and Coles, 1966). Similarly, we may find that data collected from subjects vary depending on the time of day (e.g. cortisol can vary dramatically during the day) or between individuals (e.g. HRV can vary dramatically between individuals depending on baseline health factors such as weight, age, and fitness), and often the tolerances allowed in analysis require an understanding of the instrument and human physiology. For example, in relation to the study of sedentary behavior mentioned previously, Prince et al. (2020) found self-report significantly different from objective measurement. In that case, depending on the tolerances of the instrument collecting the objective measure, fidgeting or rocking in a chair may have been recorded as activity, whereas the desired measure (for a study on sedentary behavior) may more appropriately be the difference between sitting and standing. Determining the correct tolerances for instruments is a key issue for researchers, and in some cases the instrument may require individual calibration (in this case, differentiating fidgety from calm individuals). Similarly, at the organizational level, researchers collecting data on organizational performance, such as sales volumes or profitability, need to acknowledge the context and understand the limitations of the data being used in relation to the time of year or the economic conditions. Likewise, measuring socio-economic indices provides accurate objective data on social disadvantage but is too blunt an instrument for more fine-tuned conclusions (Poirier et al., 2020).
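The tolerance point can be made concrete. In accelerometer research, each short epoch of wear time is typically classified as sedentary when its activity count falls below a cutoff, and the chosen cutoff directly drives the sedentary prevalence a study reports. The sketch below is our own illustration (the counts and cutoff values are invented, not drawn from Prince et al.):

```python
def sedentary_fraction(counts_per_minute, cutoff):
    """Fraction of one-minute epochs classified as sedentary,
    i.e. with an activity count below the chosen cutoff."""
    return sum(c < cutoff for c in counts_per_minute) / len(counts_per_minute)

# Ten minutes of hypothetical activity counts: mostly sitting, some fidgeting.
counts = [12, 8, 150, 40, 95, 300, 20, 110, 60, 5]

strict = sedentary_fraction(counts, cutoff=50)    # fidgeting classified as activity
lenient = sedentary_fraction(counts, cutoff=120)  # fidgeting classified as sitting
```

With these invented numbers, the strict cutoff labels half the epochs sedentary while the lenient cutoff labels most of them sedentary, illustrating how instrument tolerances, not behavior, can change a study’s headline result.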
Understanding validity, reliability, sensitivity, and feasibility are essential aspects of the research decision-making process of selecting measures.
4.1.3 Caution on using proxy measures
Next, we urge caution for researchers who consider using proxies (an indirect measure inferred to be a direct measure of a variable) for specific constructs. An example can be drawn from stress research, where measures of increased adrenaline, a physiological reaction to stress-creating stimuli, are described as a stress hormone (Babisch, 2003). These relationships, however, are not always clear. Adrenaline is a hormone and a neurotransmitter, which has a direct impact on cardiovascular responses that may result in fight-or-flight behaviors. Wortsman (2002) urges caution in referring to adrenaline as a stress hormone when we do not understand the physiological mechanisms through which this relationship works. Similarly, we note that HRV and GSR are often interpreted as stress variables, but they do not measure stress. They are physiological reactions, which may be produced by a range of stimuli from cardio workouts to sleep apnea to consuming alcohol to illness to stress. While research suggests they may be related, and physiological explanations may be able to link these to stress, the variable as measured (HRV or GSR) is a physiological reaction and not specifically a stress reaction. On this basis, researchers need to be careful about the conclusions they draw from such measures. Similarly, while cortisol levels have been used in studies of fatigue as an objective measure (Klaassen et al., 2013), this does not make cortisol (a stress hormone) a proxy for fatigue.
For instance, in attempting to apply neuroscience to business research, researchers have claimed they can show that effective and less effective leaders differ neurologically (Rock, 2010), inferring links between human behavior and neurological activity in the brain (Murray and Antonakis, 2019). Lee et al. (2012) quite reasonably argue that this research makes reductionist leaps of faith, drawing links between leadership and neuroscience without an explanation of how the two are connected (a basic premise of good research). Lindebaum and Jordan (2014) note that management scholars undertaking neuroscientific studies link activation in different brain regions (and in some cases, the lack of activation) to task-related outcomes, despite neuroscientists themselves acknowledging that brain activation is not linked to how important that region is for a specific behavior (Bassett and Gazzaniga, 2011: 208). In this case, using neurological activity as a proxy for leadership cognition is not warranted.
This also occurs when researchers use organizational-level data or social data. In an example using social media, Yam et al. (2022) used the number of searches on job recruiting websites to measure general job insecurity, despite acknowledging that it is not a “perfect” proxy for the underlying construct of job insecurity. Judged against the definition of objective data we have provided, both these sources (brain activation and job searches) provide objective data, as the measures are based on a direct count of activity. They are not, however, direct measures of the constructs being researched (cognition and job insecurity) and, despite being objective data, are not valid measures of those constructs.
4.1.4 Considering ethical issues in measurement
An overarching issue that we wish to draw to researchers’ attention is the ethical implications and potential risks related to respondents’ safety and well-being, as well as the potential for researcher interference to affect the data collection. At the micro level, we need to be aware at every point of the decision tree that the use of some instruments can be physiologically invasive for participants and affect the results of the research. For instance, Choi et al. (2017) note that wearing a full vest to collect physiological responses may impact the subject more than wearing a watch. Similarly, measuring something like cortisol, which may require the subject to stop what they are doing to spit into a container or provide a swab, may affect the individual’s workflow and contribute to increasing experimenter interference in an applied setting. Finally, the psychologically invasive nature of the measurement also must be considered, with issues around privacy and ethical data collection being important.
Ethical challenges are also evident when making research decisions at the macro level. Given the increasing availability of big data and increasing use of information technology and artificial intelligence in the collection of data, we need to be cognizant of the impact of this on the collection of archival and organizational data (Herschel and Miori, 2017). While informed consent is at the heart of any research protocol, a question emerges around how researchers gain consent from individuals who provide data for other purposes such as their day-to-day employment. For instance, given the prominence of social media in organizations and big data and artificial intelligence, how do researchers deal with data that had dubious consent when it was provided for alternate purposes (Andreotta et al., 2022)? At present, speedy advances in technology are outstripping our ability to resolve this question. This is an issue that deserves attention.
5. Conclusion
Based on our discussion in this article, we encourage researchers to consider the definitional guidance we provide for differentiating between objective and subjective data. Our overriding message is that the research question and the existing body of empirical findings on a topic should drive what type of measurement is used, whether objective or subjective. Improving research quality is not just about using more objective measures. We hope that the decision-making tree (see Figure 1) may assist researchers in this decision-making process. Researchers are encouraged to assess the quality of measures (both subjective and objective data) using a lens of bias, validity, reliability, sensitivity, and feasibility to make a decision on the type of measurement used and data collected. In making this decision, researchers need to be clear about what constitutes an “objective” measure, ensure it is free of respondent and researcher bias, and confirm that this type of data is critical for their specific research inquiry. Objectivity (a lack of bias) is not the same as the independence (e.g. other-report data) of the data. Although other reports might be particularly useful when examining some constructs such as aspects of performance (e.g. supervisor rating), we do not consider these to be objective measures. Finally, we drilled down further into types of objective measures at the individual, group, and organizational levels to allow researchers to discover the wide range of variables available for study.
Although this article has focussed on the inclusion of objective measures in research, as already noted in our introduction, the inclusion of objective data in research is not new. However, given that we now have greater access to more user-friendly technology, our intent has been to encourage greater consideration of when to use these types of measures before they are adopted in applied business settings. Indeed, such measures may contribute to the increasing relevance of academic research for industry. Given the exponential growth in the development and impact of new technologies, new possibilities for objective measurement will continue to emerge in the future. We encourage researchers to engage with the potential sources of objective data we have identified in this article for their own applied research.
Key practical and research implications
A multi-lens framework (bias, validity, reliability, sensitivity, feasibility) offers a structured approach for selecting between objective and subjective measures in business research contexts.
Both objective and subjective measures have limitations. While objective measures reduce bias, their appropriateness depends on the specific research context; they are not inherently superior.
Careful consideration is advisable when using proxy measures as an objective measure. Understanding measure limitations helps avoid unwarranted interpretations.
