Access to research data supports a central tenet of open research, that “access to scientific knowledge should be as open as possible” (UNESCO, 2021, p. 36). Data availability enables the verification of past findings and accelerates the discovery of new findings through reanalysis and evidence synthesis (Fecher et al., 2015; Hardwicke et al., 2018; J. N. Towse, Ellis, & Towse, 2020). Accordingly, data availability is the basis of transparent, effective research systems that create credible conclusions, democratize access to knowledge, and underpin equitable innovation (Concordat on Open Research Data, 2016; G7 Science and Technology Ministers, 2023; UNESCO, 2021). The research community is increasingly recognizing the value of data sharing in these pursuits. Most significantly, the recent UNESCO Recommendation on Open Science positions open research and data as a global research priority that can improve the reliability of evidence needed for decision-making and policy (UNESCO, 2021). However, despite the role of shared data in addressing global environmental, economic, and social issues (UNESCO, 2021), many researchers are not yet engaging meaningfully with such behaviors (see Gabelica et al., 2022; Hardwicke et al., 2018; J. N. Towse, Ellis, & Towse, 2020). In the present qualitative research, we use a behavior-change framework to determine the barriers and enablers that researchers experience when (considering) engaging with data-sharing behaviors with a view to informing the design of future interventions.
Although formal data sharing has existed for more than 100 years (Branney et al., 2019; Karhulahti, 2023; Sieber, 2015), it was the digital age and electronic access to data that created the conditions for widespread sharing. The broad recognition of the value of data sharing has occurred simultaneously across the sector, and funders, journals, societies, universities, and researchers have all advocated for data sharing and created top-down initiatives. In the UK, the largest national funding agency, the UKRI (formerly RCUK), has had its Common Principles on Data policy since 2011 (UKRI, personal communication, June 6, 2023). Likewise, the country’s largest charity funder, Wellcome, launched its policy in 2007, the current iteration of which actively encourages data-management and data-sharing costs to be included in grant applications (Wellcome, 2017). A diverse group of stakeholders, including funders and publishers, developed the FAIR (Findable, Accessible, Interoperable, and Reusable) Data Principles, a set of guidelines for enhancing the reusability of data (Wilkinson et al., 2016). Many publishers have their own data-sharing policies (e.g., Bloom et al., 2014), and data sharing is also a key component of the Transparency and Openness Promotion (TOP) guidelines (Nosek et al., 2015), a tool to support the implementation of open-research practices at the journal level. Simultaneously, universities and other organizations have institutional-level policies and are providing support for data storage through managed external data archives (e.g., UK Data Service), institutional data repositories, or general-purpose services (e.g., Zenodo).
Although researchers are only part of this wider data-sharing ecosystem (Borgman & Bourne, 2022), ultimately, it is individual researchers who are responsible for the act of data sharing (Bezuidenhout & Chakauya, 2018). Research has consistently shown that overall, researchers view data sharing as positive and important (e.g., Cheah et al., 2015; Digital Science et al., 2022; Farran et al., 2020; Fleming et al., 2022; Soeharjono & Roche, 2021; Van den Eynden et al., 2016) and that lack of access to data is an impediment to research progress (Tenopir et al., 2011). Measures of engagement show progress, as illustrated by the global 2022 State of Open Data survey, in which 35% of respondents reported being familiar with FAIR principles, 1 up from 28% the previous year and the highest percentage since the question was first asked in 2018. Yet despite this positive momentum, implementation is often low (e.g., Farran et al., 2020; Fleming et al., 2022; Rowhani-Farid & Barnett, 2016) and may fall short of accepted standards. For example, when authors of articles with data-availability statements (indicating that data are available on request) were asked to share their data, 93% failed to reply or declined to share their data, and only 6.8% shared the requested data (Gabelica et al., 2022). This attitude-behavior gap raises questions about the barriers preventing researchers from sharing their research data (Fecher et al., 2015).
In the present research, we use UNESCO’s (2021) definition of open-research data as data that include, among others, digital and analogue data, both raw and processed, and the accompanying metadata, as well as numerical scores, textual records, images and sounds, protocols, analysis code and workflows that can be openly used, reused, retained and redistributed by anyone, subject to acknowledgement. Open research data are available in a timely and user-friendly, human- and machine-readable and actionable format, in accordance with principles of good data governance and stewardship, notably the FAIR (Findable, Accessible, Interoperable, and Reusable) principles, supported by regular curation and maintenance. (p. 9)
This broad definition allows us to consider data as the evidence that underlies research publications and therefore applies across a range of academic disciplines. In the present research, we use the term “data sharing” 2 to also include data that by necessity (i.e., for various legal, ethical, or commercial reasons) are not openly available but that are made accessible to specific users according to defined access criteria.
Benefits of Data Sharing
There are many potential benefits of data sharing at the individual-researcher (C. Allen & Mehler, 2019; McKiernan et al., 2016), research-community (e.g., Milham et al., 2018), and societal levels (e.g., Besançon et al., 2021). Access to data leads to a more equitable distribution of opportunities and promotes inclusion (Digital Science et al., 2022; UNESCO, 2021). Reuse of data facilitates greater efficiency, effectiveness, and innovation by using the same resources multiple times to create new knowledge (Burgelman et al., 2019; DuBois et al., 2018) rather than duplicating research efforts. Increased transparency and a greater focus on reproducibility enable verification of findings and reanalysis when improved methods are developed. Specifically at the researcher level, sharing data enhances the visibility of research, which can lead to a citation advantage (Piwowar et al., 2007; Piwowar & Vision, 2013) and more opportunities to collaborate (McKiernan et al., 2016). However, many of these potential benefits are distal compared with the more proximal challenges posed by sharing data and the ever-present pressure to publish frequently and thereby increase the chances of employment, promotion, and funding (Munafò et al., 2017).
Concerns About Data Sharing
Debates about data sharing commonly focus on qualitative human data (Karhulahti, 2023) and point particularly to concerns over epistemology, informed consent, and privacy (e.g., Parry & Mauthner, 2004). Issues of epistemology relate to the reflexive, subjective, and contextually bound nature of qualitative research that suggests that reuse could lead to misinterpretation (e.g., Broom et al., 2009). The key concerns raised about informed consent are whether researchers are less willing to be candid about sensitive topics (MacLean et al., 2019) and whether participants truly understand the implications of consent (Parry & Mauthner, 2004). Relatedly, concerns have been raised about ensuring anonymization of qualitative data, particularly for sensitive data or small, potentially reidentifiable communities (Broom et al., 2009; Parry & Mauthner, 2004). However, it is possible for these issues to be overcome with careful planning and sufficient resources (for proposed solutions, see Bishop, 2005; Branney et al., 2019, 2023; DuBois et al., 2018; Karhulahti, 2023). Furthermore, the majority of participants consent to share their deidentified data (e.g., Mozersky et al., 2020), even for research on sensitive topics such as abortion (VandeVusse et al., 2022) and general-practitioner–patient conversations (Amelung et al., 2020; discussed in Whitaker, 2021), citing helping others as their primary motivation (VandeVusse et al., 2022).
Data Sharing as Behavior
The term “data sharing” encompasses a range of behaviors that occur across the research life cycle, taking place before (e.g., preparing consent forms), during (e.g., recording exclusions), and after the research (e.g., depositing the data in a repository). Behaviors do not occur in isolation but in systems of behaviors that interact with and depend on one another (Michie et al., 2014). This interdependence means that if one data-sharing behavior does not occur, this may ultimately prevent data from being shared (see Norris & O’Connor, 2019). For example, omitting information about future data sharing from participant-consent forms or failing to secure suitable funding for data archiving may preclude the data from being shared.
For the purposes of the present research, we are interested in individual researchers’ data-sharing behaviors. Here, we provide a synthesized list of the key behaviors that constitute an idealized data-sharing process at the individual-researcher level. 3 Not all behaviors listed are required to meet the overarching behavior of data sharing (e.g., ethics approval is not required for all research); essential behaviors are noted. We used our interviews to explore this list of behaviors and check that we had not missed any behaviors:
Seek out skills and resources: seeking out and engaging with educational resources and/or participating in training to learn about what constitutes “data”; the benefits of sharing; how to share data within ethical, intellectual property (IP), and commercial constraints; and how to handle sensitive data. Reading and complying with university and funder mandates. Seeking practical, financial, or motivational support from peers, colleagues, ethics committees, prebid teams, funders, and other facilitators. For example, applying for funding to support data preparation and storage.
Create a data-management plan: creating a data-management plan that outlines what types of data will be collected and how researchers will handle the data during and after the study. The plan should address all stages of the research life cycle from planning through sharing. Data-management plans are required for some funding applications.
Obtain ethics approval: submitting an ethics application that includes plans to share data and details of how this will be done. For example, anticipating terms of access.
Precursor behaviors: carrying out data-sharing precursor behaviors throughout study design and the active project phase. For example, preparing participant information sheets and consent forms to gain consent from participants to share their data or acquiring agreement from other stakeholders to share the project data. Then during the active project phase, collecting and analyzing data with reuse in mind.
Prepare and manage data (essential behavior): preparing data for sharing by following relevant standards (e.g., FAIR) and disciplinary norms to ensure that data will be findable, accessible, interoperable, and reusable. This behavior includes storing, naming, and versioning the data in a format that can be shared and creating documentation and metadata. For personal sensitive data, this would include anonymizing it (i.e., removing identifying information to protect participants’ identities), or for commercial data/IP protection, this might include aggregation.
Deposit data (essential behavior): depositing the data and metadata in a repository and providing reuse guidance by adding a license. For sensitive data, shielding may be required in the form of access control, that is, specifying the conditions under which the data can be accessed. The data may be placed under a reasonable embargo, for example, to delay the release of the data to coincide with a publication or end of project or to protect first-use rights.
Ultimately, the aim of data sharing is to facilitate reusability and subsequent new knowledge. To enhance the value and reusability of data (A. S. Towse et al., 2021), they should comply with the FAIR data principles (Wilkinson et al., 2016). Therefore, the core data-sharing steps—preparing and depositing data (the fifth and sixth behaviors in the list above)—should be carried out with reuse in mind: ensuring that data are stored in a suitable permanent repository, with rich metadata, clearly labeled and described to ensure they can be independently understood, in a future-proof and ideally nonproprietary format, with a global persistent identifier and an appropriate, preferably open, license (e.g., CC BY). Without these provisions, data have limited reusability (J. N. Towse, Ellis, & Towse, 2020).
Whether researchers decide to adopt data-sharing behaviors is a behavioral question (Norris & O’Connor, 2019; Osborne & Norris, 2022), and behavior-change theory has the potential to help understand and improve the adoption and maintenance of such behaviors (Norris & O’Connor, 2019). The present research was developed using the capability, opportunity, motivation–behavior (COM-B) model from the behavior-change wheel (BCW; see Fig. 1; Michie et al., 2011, 2014). The BCW is a layered framework designed to guide the development of theory-based behavior change from analysis to intervention design (Michie et al., 2014). We selected this framework because it can be applied to behavior across different fields and contexts and was developed to overcome the limitations of 19 multidisciplinary frameworks (Michie et al., 2011). It has recently been applied in the domain of open research to develop interventions to increase the uptake of preregistration among researchers (Osborne & Norris, 2022) and to investigate the barriers and enablers to implementing the TOP guidelines (Naaman et al., 2023).

The behavior-change wheel from Michie et al. (2014). The green ring shows influences on behavior, the red ring shows intervention types, and the gray ring represents policy options.
The COM-B model is at the center of the BCW (Fig. 1, green ring) and is used to perform a behavioral diagnosis. This process involves identifying a target behavior; investigating individual, sociocultural, and environmental influences (i.e., barriers that decrease the likelihood of the behavior occurring and enablers that increase the likelihood); and assessing what needs to change in terms of capability, opportunity, and motivation. These three components are part of an interacting system and must be present in sufficient amounts for the behavior to occur: Capability is the individual’s physical and psychological ability to enact a behavior, opportunity refers to the physical and social environment that enables behavior, and motivation constitutes the reflective (i.e., rational choice) and automatic (i.e., feelings, habits) mechanisms that activate or inhibit behavior (Michie et al., 2011, 2014). To change behavior, one or more of the components must change to reconfigure the system. The choice of behavior-change intervention should be evidence-based and informed by the factors that influence current behavior so that the intervention is likely to be most effective in the specific setting (Hulscher & Prins, 2017).
In addition to COM-B, the theoretical-domains framework (TDF; Atkins et al., 2017; Cane et al., 2012) was used in the current study for the development of the interview schedule and analysis. This validated integrative theoretical framework (Cane et al., 2012) comprises 14 domains (knowledge; skills; memory, attention, and decision processes; behavioral regulation; social/professional role and identity; beliefs about capabilities; optimism; beliefs about consequences; intentions; goals; reinforcement; emotions; environmental context and resources; and social influences; Cane et al., 2012), which map to the three COM-B components (see Fig. 2) and can provide a granular understanding of behavior (Michie et al., 2014).

The theoretical-domains framework (TDF) mapped to the subconstructs of capability, opportunity, and motivation from the capability, opportunity, motivation–behavior (COM-B) model. Reproduced from Chater et al. (2022).
Barriers and Enablers to Data Sharing
Despite important reasons to share data, including individual career-based reasons (C. Allen & Mehler, 2019; Markowetz, 2015; McKiernan et al., 2016), many researchers do not share their data because of perceived costs (Abele-Brehm et al., 2019; Miyakawa, 2020) and a lack of incentives (Adimoelja & Athreya, 2022; Chawinga & Zinn, 2019). With data sharing becoming an increasing priority across the sector, the determinants of researchers’ attitudes and behaviors toward data sharing have received some scholarly interest. Existing research, spanning various disciplines and geographical areas, has largely focused on real and perceived barriers and has used survey formats. Below, we discuss current evidence categorized according to the three COM-B components.
Barriers
Opportunity
Lack of resources is regularly reported as a barrier to data sharing (Fecher et al., 2015). For example, in a survey of more than 13,000 scientists conducted in 2009–2010, insufficient time and funding were the most frequently named barriers to data sharing, cited by 55% and 40% of respondents, respectively (Tenopir et al., 2011). The fact that time is a frequently highlighted barrier (Astell et al., 2018; Chawinga & Zinn, 2019; Cheah et al., 2015; Farran et al., 2020; Houtkoop et al., 2018; Van den Eynden et al., 2016) is unsurprising because it is well acknowledged that academics have increasingly untenable workloads (Hostler, 2023; Long et al., 2020). Data sharing has the potential to increase research efficiency in the medium to long term at a systems level, but in the short term and at the individual level, such behaviors increase workload and require more time and effort compared with “closed” research (Gomes et al., 2022; Hostler, 2023). Other opportunity-related barriers relate to physical resources: In low- to middle-income countries, lack of specialized data-management expertise (Cheah et al., 2015) and infrastructure issues, such as lack of current hardware, software, and suitable internet access (Bezuidenhout & Chakauya, 2018), also pose a challenge.
Capability
Acknowledged barriers also include a lack of knowledge and skills (Chawinga & Zinn, 2019), resulting in researchers not feeling fully equipped to complete data-sharing tasks (Tenopir et al., 2015). Participants report that they have not learned how to share data (Houtkoop et al., 2018) and lack knowledge about how to share data in a useful way (Astell et al., 2018). The variety of available repositories and the lack of integration between them also pose a challenge in terms of selecting the most suitable storage (Astell et al., 2018). Researchers report a lack of knowledge about copyright, licensing (Astell et al., 2018; Farran et al., 2020), ethics, and confidentiality issues that can affect data sharing (Gownaris et al., 2022).
Motivation
In a survey of 600 psychologists asked about 15 barriers, data sharing being uncommon in their field was selected as the most relevant reason for not sharing data (Houtkoop et al., 2018). Other studies have shown that researchers might not share data because of fear of the implications, for example, the possibility of compromising confidentiality and harming research participants if they can be identified, particularly for sensitive data or stigmatized communities (Cheah et al., 2015). Researchers are also concerned that their research reputation could be harmed (Cheah et al., 2015) if they are scooped (Bezuidenhout & Chakauya, 2018; Soeharjono & Roche, 2021), if others who have insufficient information and context to understand the data misinterpret or misuse them (Bezuidenhout & Chakauya, 2018; Gomes et al., 2022; Sayogo & Pardo, 2013; Soeharjono & Roche, 2021; Tenopir et al., 2015; Van den Eynden et al., 2016), or if others find errors in the data (Gomes et al., 2022). Furthermore, previous research has found that a lack of credit and appropriate attribution when others reuse data is a barrier (Cheah et al., 2015; Farran et al., 2020; Gownaris et al., 2022).
Enablers
Opportunity
To be able to share data, researchers require opportunities, including suitable infrastructure, that is, technical, legal, financial, and time-allocation support from institutions and funders (European Commission, 2017). For example, the availability of a data repository has a significant influence on STEM researchers sharing data (Kim & Zhang, 2015), and Wellcome-funded researchers cited funding to cover the costs of data preparation as their biggest motivator (Van den Eynden et al., 2016). Researchers who work solely on research and do not have time-consuming teaching obligations are more likely to share their data (Tenopir et al., 2011). Likewise, researchers were more likely to share their data if minimal effort was required (Wallis et al., 2013). Opportunity also includes social opportunities, such as institutions providing a positive research culture in which data sharing is recognized and rewarded (Huang et al., 2012).
Capability
Researchers must have the necessary skills to carry out the various subbehaviors that comprise data sharing. This includes not just knowledge and skills about how to share data but also planning during study-design phases. Reanalysis of data from Tenopir et al. (2011) found that having data-management skills increased the likelihood of data sharing (Sayogo & Pardo, 2013).
Motivation
Researchers who perceive career benefits to data sharing are more likely to have positive attitudes toward it and to engage in more data-sharing behaviors (Kim & Zhang, 2015). Direct personal benefits, such as data sharing being looked on favorably in funding and promotion decisions, and enhanced reputation are also motivating factors (Van den Eynden et al., 2016). In the aforementioned survey of psychologists, mandates to share data from funders or institutions were ranked top of the conditions most likely to encourage data sharing (Houtkoop et al., 2018). Increased impact, visibility, and opportunities for collaboration are cited as incentives to share data (Digital Science et al., 2022; Farran et al., 2020; Van den Eynden & Bishop, 2014). When their data are reused, researchers consider acknowledgment or citation to be essential (Digital Science et al., 2022; Sayogo & Pardo, 2013; Tenopir et al., 2015). But researchers also recognize the broader incentives of public benefit, transparency, and reuse (Farran et al., 2020).
Research Questions
The majority of research on factors influencing researchers’ data-sharing behaviors is based on survey data and focuses on barriers; a more comprehensive and nuanced understanding is missing. For example, in survey responses, one cannot disentangle the often-cited barrier “lack of time” from a lack of motivation to prioritize data sharing because it is not incentivized. Like other behaviors, data sharing is not stable within an individual and may vary across time (Corker, 2018; Norris & O’Connor, 2019) based on internal factors, such as motivation and habit, and external factors, including resources and project priorities (Kwasnicka et al., 2016; Norris & O’Connor, 2019). Therefore, for researchers who are currently engaging or have engaged with data-sharing behaviors, we are interested in understanding what facilitated these behaviors and what needs to change in the system to ensure maintenance and adoption by others.
Given the centrality of shared data in accelerating knowledge and solving global social issues (UNESCO, 2021), more thorough insight into the barriers and enablers to data sharing is important. Such an understanding can help facilitate the future development of effective behavior-change interventions. From this perspective, we are particularly interested in participants from one university because the insights from this study will be used by the university to develop future interventions to encourage data sharing. The overall aim of this study is to draw on the COM-B model and TDF to explore the factors that help and hinder researchers in sharing their research data. To do so, we conducted qualitative interviews and analyzed them using thematic template analysis. Our research question is as follows: What are the barriers and enablers that researchers experience when engaging, or considering engaging, with data-sharing behaviors?
The results are presented in written format and synthesized visually in the form of a behavioral map that plots data-sharing behaviors and their dependencies within the broader university system and shows relationships between actors, behaviors, and influences (barriers and enablers).
Method
Design
The study is a qualitative Registered Report (Henderson et al., 2023). It consists of semistructured qualitative interviews with researchers carried out during November and December 2023. An interview design was selected to allow an in-depth exploration of the topic that extends beyond the strictures of quantitative surveys and enables participants to talk about their individual experiences and the barriers and enablers that are particularly pertinent for them. Interviews help to ensure that voices across different disciplines and career levels are given equal opportunity to be heard, and a semistructured approach allows for prompts to help obtain further details. Furthermore, because open-research-related terminology differs between disciplines, a one-to-one approach would help minimize misunderstandings that might have occurred in a focus-group setting or via a survey.
We supplemented the methodological details below by completing the Consolidated Criteria for Reporting Qualitative Research (COREQ; Tong et al., 2007), a 32-item checklist for reporting key aspects of qualitative research (see “Materials & Procedures” component on OSF, https://osf.io/w3sfq/files/osfstorage). We note that the COREQ is controversial; criticisms include the inability to replicate the development of COREQ (Buus & Perron, 2020) and a focus on data saturation (Braun & Clarke, 2021c). In the present study, the COREQ checklist did not guide our decisions but provides a quick summary of the research. In addition, we supplement interviewer characteristics by providing positionality statements (see “Positionality” component on OSF, https://osf.io/d4sjk/files/osfstorage).
The research received a favorable opinion from E. L. Henderson and R. Abrams’s university’s Research Ethics Committee (FHMS 22-23 072 EGA).
Recruitment and participants
Purposive sampling was used to recruit research-active staff and PhD students working at a university in the south of England. We deliberately recruited only researchers who are aware of or practice open research to ensure that participants could talk about their experiences of barriers and enablers to data sharing. Inclusion criteria were as follows: researchers who produce potentially shareable data in their research or work in a team that does so and who self-report one or more of the following: (a) have shared data once or more; (b) have experience using one or more of the following open-research practices: open software/code, preregistration or Registered Reports, preprints, open monographs, open educational resources; or (c) are aware of two or more of the aforementioned open-research practices and have considered data sharing but have not yet engaged with it.
Statistical generalizability is not the goal of qualitative research; rather, we aimed to provide rich knowledge that reveals the breadth of participant experiences (Smith, 2018). To maximize diversity in our target group, we recruited participants to include a range of the following characteristics: career stages, genders, disciplines, and experience with data sharing (the latter as per the inclusion criteria above). As a minimum, we ensured that our final sample included one female and one male participant from each of the four career stages (see Table 1), one participant from each of the three broad research disciplines (STEM, social sciences, and humanities), two participants from ethnic groups other than White British, and two participants who had not shared data.
Participant Demographics
The open-research practices we consider relevant are open software/code, preregistration or Registered Reports, preprints, open monographs, and open educational resources.
The first round of recruitment was conducted before submitting the Stage 1 Registered Report because an apt opportunity occurred for people to express interest in taking part in the study: Initially, potential participants were identified based on their contribution to a prior survey, led by the UK Reproducibility Network (UKRN), that ran in early 2023 and investigated attitudes toward and experience in open research. After completing the UKRN survey, if the potential participants were interested, they were directed to a short, separate sign-up survey in which they were asked, “How important do you believe open research is to your field?” and “Thinking about one of your recent research projects, did you/do you plan to make your research data open (i.e., information you collect, observe, generate or create as part of your research)?” Twenty people indicated their interest in being interviewed (one of whom did not work with data and was therefore not eligible). To ensure diversity on the characteristics mentioned above, we recruited additional participants by advertising the study internally at the university via email (see “Materials & Procedures” component on OSF, https://osf.io/w3sfq/files/osfstorage). This round of recruitment was conducted after in-principle acceptance of the Stage 1 Registered Report. All potential participants (including individuals who had already shown interest) completed a short screening survey to assess them against the inclusion criteria and to collect demographic information relating to our characteristics of interest: career stage, gender, discipline, and additional demographics, that is, age and ethnicity (see “Materials & Procedures” component on OSF, https://osf.io/w3sfq/files/osfstorage). Answers were assessed against the inclusion criteria. If all criteria were met, participants were invited for interview. To ensure that pseudonyms were allocated respectfully, the survey asked participants to provide their own pseudonym (R. E. S. Allen & Wiles, 2016).
Personal data from the recruitment and screening survey were password protected and stored in a separate folder to the pseudonymized participant interviews.
Sample-size justification
A priori, we set a minimum sample size and a maximum stopping rule. As described in the Data Analysis section below, our use of template analysis sits on the spectrum between codebook and reflexive thematic analysis, and therefore, data saturation is theoretically incoherent (see Braun & Clarke, 2021c).
Information power
Information power proposes that the more relevant information a sample holds, the fewer participants are required (Malterud et al., 2015). Five dimensions affect information power: (a) study aim—information power increases with a narrower research question and decreases with a broader question; (b) sample specificity—a sample comprising participants with characteristics and knowledge highly relevant to the research has high information power; (c) established theory—applying established theories increases information power; (d) quality of dialogue—if the data are rich, fewer participants are required; and (e) analysis strategy—single-case or cross-case analysis decreases information power (Malterud et al., 2015). In summary, studies with focused research questions, participants specific to the study aim, and rich data that are supported by theory and analyzed using in-depth exploration of narratives have higher information power and require smaller samples (Malterud et al., 2015).
In this study, sample specificity was dense because participants were purposively recruited based on their knowledge and/or experience of data sharing; the semistructured interview format promoted good quality of dialogue; we used established theory to design and interpret the study; and we did not use single-case or cross-case analysis. However, our research question was neither broad nor narrow: Although the topic, data sharing, is narrow, we asked about it across disciplines and career stages. Overall, information power considerations suggest a smaller sample size.
Previous research on qualitative sample sizes
Braun and Clarke (2013) recommended a typical sample size of 10 to 20 for a medium thematic analysis project using interviews. Notwithstanding our comments above about data saturation, we note that a recent systematic review of qualitative sample sizes found that, on average, 12 to 13 interviews reached saturation (Hennink & Kaiser, 2022), confirming previous work that also reported saturation at 12 interviews (Guest et al., 2006).
Pragmatic resource constraints
We also considered pragmatic constraints related to funding (limited internal funding) and time (Emma L. Henderson’s (ELH) temporary contract and the time pressure that researchers, our participants, are under). Because of these resource constraints, we set the maximum number of interviews to 20.
Sample size
Our aim was to capture the depth and nuances of the topic in relation to the research questions while avoiding research waste in terms of funding and participant time. Based on the above three considerations, we set an anticipated lower sample size of 12 and an upper sample size of 20. The final sample size was decided in situ via discussion with the research team, who considered "the adequacy (richness, complexity) of the data for addressing the research question" (Braun & Clarke, 2021c, p. 211). R. Abrams led this discussion after 14 interviews had been collected. This deviated from the original Stage 1 plan, which intended for the discussion to take place after 12 interviews; however, because 14 interviews were necessary to fulfill our sampling criteria, the discussion took place after 14. At this point, the team agreed that the sample size was adequate based on patterns in the data demonstrating a range of perspectives and similarities.
Participant demographics of our final sample are presented in Table 1. In total, we interviewed 14 participants (10 from our original recruitment method and four from additional recruitment) ages 32 to 83 years. Eleven participants had previously shared their data, and three had not.
Materials
A 26-item interview schedule was used to identify the barriers and enablers to data-sharing behavior (see Table 2). Interview questions were informed by the COM-B model (Michie et al., 2011, 2014) and the TDF (Atkins et al., 2017; Cane et al., 2012) and developed to extend previous work suggesting that opportunity-related factors, such as time and resources; capability-related factors, such as knowledge and skills; and motivation-related factors, such as incentives, are barriers and enablers to data sharing. The schedule covered all COM-B constructs and TDF domains apart from physical capability because we assumed that if researchers are physically capable of conducting research, they are also capable of sharing data. The interview schedule was piloted in May 2023 with a participant who is familiar with open-research practices, and the questions were subsequently modified to ensure clarity. For details of how the interview was introduced and closed, see the "Materials & Procedures" component on OSF, https://osf.io/w3sfq/files/osfstorage.
Interview Schedule Informed by COM-B and the TDF
Note: COM-B = capability, opportunity, motivation–behavior; TDF = theoretical-domains framework.
Procedure
One-to-one semistructured interviews were conducted by R. Abrams online via Microsoft Teams. Participants were provided with the information sheet and consent form (see "Materials & Procedures" component on OSF, https://osf.io/w3sfq/files/osfstorage) via email a minimum of 3 days before the interview. Participants were advised that they could withdraw their data at any point, and up to 1 month after interview completion, without providing a reason. The information sheet explained that pseudonymized transcriptions of the interviews would be made openly available.
Interviews lasted approximately 1 hr, during which both participants and the interviewer had their cameras on. At the start of the interview, the researcher explained the purpose of the research, mentioned that participants could ask for a break or withdraw at any time, and reminded them that the interview was being recorded. Questions were mostly asked in the same fixed order for all participants (Table 2). However, given the semistructured nature of the interviews and the need to respond to participants, questions were sometimes asked at different points, and earlier questions were returned to if they had not been covered in the intended order. After completion of the interview, participants were thanked and provided with a debrief (see "Materials & Procedures" component on OSF, https://osf.io/w3sfq/files/osfstorage). They were offered the opportunity to review their pseudonymized transcripts for the purpose of highlighting any parts that they did not wish to share; four participants took up this opportunity, but none wished for any redactions. Participants received a £50 Amazon voucher via email in return for participation.
Data analysis
Interviews were video recorded, transcribed using Otter.ai and Word's automated audio transcription, and stored in the university's research folder. Transcriptions were checked by R. Abrams against the recordings to ensure verbatim accuracy of all verbal utterances and that punctuation preserved the original meaning (Braun & Clarke, 2006). The shortest interview lasted 38.54 min, the longest 65.47 min, and the average duration was 50.13 min. The recordings were deleted at the point that the Stage 2 manuscript was accepted.
Pseudonymization
Pseudonymization was carried out by R. Abrams and followed the UK Data Service's guidance for qualitative data (UK Data Service, n.d.). We used the following steps: (a) When possible, we did not collect disclosive data. For example, we did not ask for names of people, departments, universities, or companies. When a participant volunteered this information, we deidentified it in the transcript and indicated that we had done so. (b) We had intended to use the UK Data Service's Text Anonymization Helper Tool (UK Data Service, n.d.), which runs MS Word macros to help find any disclosive information. However, information-technology security prevented this tool from being downloaded, so three members of the team (R. Abrams, E. L. Henderson, and E. K. Farran) reviewed all transcripts to identify any disclosive information instead. (c) Pseudonymization occurred once transcription was complete; the original, unedited version of the transcription was kept for use within the research team. (d) Finally, we replaced any identifying information rather than blanking it out, with replacements clearly indicated using brackets. For example, "My colleague Indiana Jones" would have been edited to "My colleague [name]." We kept a pseudonymization log of any edits and an identifying key, stored separately from the pseudonymized transcripts.
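The replace-and-log logic of step (d) can be sketched in a few lines of code. This is a hypothetical illustration only, assuming a simple string-to-category mapping; the study team worked manually in MS Word, and the function name, the log format, and the "Marshall College" identifier are our own inventions.

```python
import re

# Hypothetical sketch of step (d): replace identifying strings with
# bracketed category placeholders and keep a pseudonymization log.
# The function and its names are illustrative, not the authors' tooling.
def pseudonymize(text, replacements):
    """Return (edited_text, log).

    `replacements` maps an identifying string volunteered in an interview
    (e.g., a person or institution name) to a category label such as
    "name" or "university". The log records each edit so it can be stored
    separately from the pseudonymized transcripts, as described above.
    """
    log = []
    for identifier, category in replacements.items():
        placeholder = f"[{category}]"
        # Replace every literal occurrence and count how many were made.
        text, n = re.subn(re.escape(identifier), placeholder, text)
        if n:
            log.append((identifier, placeholder, n))
    return text, log

# Example mirroring the paper's own "Indiana Jones" illustration:
edited, log = pseudonymize(
    "My colleague Indiana Jones works at Marshall College.",
    {"Indiana Jones": "name", "Marshall College": "university"},
)
# edited == "My colleague [name] works at [university]."
```

In practice the log and the identifying key would be written to a separate, access-controlled location, keeping the mapping apart from the shared transcripts.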
Thematic template analysis
Broadly, the purpose of thematic analysis is to develop themes in the data set in relation to the research question (Braun & Clarke, n.d.). There are three main types of thematic analysis (Braun & Clarke, 2019, 2023), which sit on a spectrum from "coding reliability," which prioritizes coding accuracy; through "codebook," in which the coding structure is developed based on both the data and a priori theory; to "reflexive," which emphasizes "the inescapable subjectivity of data interpretation" (Braun & Clarke, 2021a, p. 37). Template analysis is a flexible form of thematic analysis that uses a hierarchical coding structure (Brooks et al., 2015). This style of thematic analysis was selected because it allows the theoretical underpinnings of the research, in this case COM-B and the TDF, to be used to develop a priori themes while those themes remain flexible. The coding template is further developed based on a subset of the data and then refined and advanced as it is applied to the full data set (Brooks et al., 2015).
Where template analysis sits on the spectrum of thematic analysis depends on researchers’ epistemological position. Because in this research, we aimed to explore what factors influence researchers’ data-sharing behaviors, we adopted a critical-realist ontology assuming that a meaningful reality exists but that one’s experience of it is subjective and socially influenced (Braun & Clarke, 2013). The analysis was also underpinned by a contextualist epistemology: Contextualism aims to understand truth but views knowledge as contextually located and influenced by the researcher’s position, and therefore, truth is bound to the context in which data are collected and analyzed (Madill et al., 2000). This position is consistent with a data-focused approach to thematic analysis that acknowledges the active role of the researcher (Brooks et al., 2015). From this philosophical position, template analysis sits on the spectrum between codebook and reflexive thematic analysis and on the continuum between deductive (initial themes are established before coding) and inductive (themes are developed and refined through engagement with the data) thematic analysis.
The pseudonymized transcribed data were coded using the software package NVivo (Version 12). The template analysis followed Brooks et al.’s (2015) and King et al.’s (2018) guiding framework. Initially, template analysis is typically carried out on a subset of the data. The subset should capture the variety of experiences covered in the full data set; therefore, the precise number cannot be determined in advance. We anticipated that it would be a nonrandom sample of around five interviews; in actuality, we analyzed four. We used King et al.’s understanding of “codes” as comments linked to extracts of text, indicating that they are relevant to the research question. Codes develop into themes, and “coding” is the process of assigning codes and themes to the text.
Stages 1 through 4 below were carried out independently by R. Abrams and A. Marcu. Both R. Abrams and A. Marcu are experienced qualitative researchers. However, both would typically engage in inductive coding, and neither had experience working with COM-B and TDF. Therefore, both researchers discussed the analysis in two meetings and involved the wider team in shaping the findings write-up. Throughout the process, coding was discussed with all authors for the purpose of developing a richer understanding of the data.
In Stage 1, we anticipated an iterative approach to the development of the coding template based on the data. However, because the interview questions had been mapped to COM-B and the TDF, no refinement was required. To see specific details of the original coding plan, see the Stage 1 document (Henderson et al., 2023). For deviations from our original Stage 1, see Table 3.
Deviations From Our Original Stage 1
Note: COM-B = capability, opportunity, motivation–behavior; TDF = theoretical-domains framework.
Step 1: familiarization with the data
Familiarization was a key step because template analysis requires that extracts of text are interpreted in the context of their meaning within the participant's complete account. The coders became immersed in the data by listening to the interview recordings and reading the transcripts while looking for meaning and patterns. Informal notes were made, for example, on quirks and connections in the data and on broadly what was going on.
Step 2: preliminary coding
Preliminary coding was carried out based on what appeared interesting in the data in relation to Research Question 1. We used a coding template of initial themes (Version 1) developed a priori based on the COM-B constructs and TDF domains (see "Materials & Procedures" component on OSF, https://osf.io/w3sfq/files/osfstorage). Although at Stage 1 we had anticipated removing or modifying a priori themes, we found that because the interview topic guide had already mapped questions onto COM-B and the TDF, the data were characterized against this framework from the outset, leaving little room for inductive interpretation.
Step 3: clustering
As mentioned above, despite anticipating at Stage 1 the grouping of codes and a priori themes into meaningful themes, in actuality, the COM-B constructs became the themes and the TDF domains became the subthemes. Thus, the process became one of checking that data did not overlap, repeat, or duplicate across themes, similar to the intended purpose of sorting, collating, and combining similar codes into clusters of meaning to capture significant patterns in the data set.
Step 4: developing the coding template
Having identified clusters, themes, and their relationships, the coding template was applied across all data. Because we did not change the a priori template, there was not a Version 2.
Step 5: apply and modify the coding template
R. Abrams applied the Version 1 coding template to all remaining interviews and considered whether the themes captured the meaning of all interviews. No changes were made to the coding template during this process. Because of the nature of the interview-topic guide, the COM-B and TDF structure remained intact throughout the coding process. For cases in which data did not map directly onto the a priori codes, the codes were left as intended until all analyses were completed.
Step 6: finalize the coding template
Steps 5 and 6 were not two distinct stages in the analysis because the coding framework left little to no opportunity for inductive coding. Therefore, the coding template was considered final at Step 4 and applied to all data.
Step 7: writing up
Findings are presented theme by theme. We had anticipated presenting the theme and subtheme in a table at the start of each thematic section; instead, we report all themes, subthemes, and the corresponding barriers and enablers in a single holistic table. In the write-up, we focus on the (sub)themes most relevant to the research question, illustrated with vivid examples that capture each theme's core meaning.
Credibility strategies
As described above, coding was led by R. Abrams. A. Marcu coded a subset of four transcripts, and these were discussed with R. Abrams before the remaining data set was analyzed. We did not use consensus coding or interrater reliability because these methods are inconsistent with the philosophical assumptions that underlie more reflexive or codebook types of thematic analysis (Braun & Clarke, 2021b). For example, interrater reliability assumes that there is a single accurate reality that should be coded in the data, whereas reflexive thematic analysis holds that the researcher is an active participant in meaning making and that codes are derived via a situated interaction between the researcher and the data (Braun & Clarke, 2013; Braun et al., 2019). The researcher's subjectivity is embraced as a resource that "sculpts the knowledge produced, rather than a must-be-contained threat to credibility" (Braun & Clarke, 2021b, p. 334). Explicating a researcher's motives, background, and perspectives via a positionality statement allows the reader to consider the researcher's influence on data collection and analysis, thereby increasing transparency and rigor (Steltenpohl et al., 2023). We have provided prestudy positionality statements (see "Positionality" component on OSF, https://osf.io/d4sjk/files/osfstorage). A second positionality statement, in which R. Abrams reflected on how their assumptions and position might have shaped the coding process, was completed once the data had been analyzed and written up. In addition, to establish the rigor and dependability of the work, we shared the raw transcripts (of all 14 consenting participants).
Data availability
The study materials are available in the OSF repository, https://osf.io/w3sfq/. To help maximize adherence to FAIR principles, the pseudonymized transcripts were archived with the UK Data Service (Abrams, 2025). For further details on how we ensured adherence to FAIR principles, see the "Data" component on OSF, https://osf.io/ejcp5/files/osfstorage.
Results
Following our coding template, we mapped themes to five of the six COM-B components (physical capability was not present in the interview schedule) and 12 of the 14 TDF domains (behavioral regulation and emotions were not present). Themes are presented below, supported with participant quotes, and summarized in Table 4.
Combined COM-B and TDF Analysis of the Influences on Data-Sharing Behaviors.
Note: COM-B = capability, opportunity, motivation–behavior; TDF = theoretical-domains framework.
Capability
Psychological capability
Capability is internal to individuals and reflects their ability to engage in a behavior (Michie et al., 2011). In our data, psychological capability was evident, reflected in the TDF domains of knowledge; skills; and memory, attention, and decision processes (behavioral regulation was not evident). We did not ask participants about physical capability because we assumed that if they were physically capable of collecting data, they were capable of sharing it.
Researchers in STEM disciplines who worked with quantitative data were typically familiar with data-sharing expectations from funders, journals, and research communities. This familiarity appeared to enable sharing behaviors and, in turn, facilitated the knowledge of how to share data effectively. However, in disciplines such as the social sciences and humanities, and/or for researchers working with qualitative data, expectations about data sharing were less established, encouraged, and recognized; sharing therefore occurred less frequently or was harder to enact. For participants already sharing data, decisions about when and how to share were largely driven by the need for shared data to be both useful and usable.
Knowledge
The TDF domain knowledge refers to awareness of something’s existence (i.e., data sharing) and procedural knowledge—understanding how to do it (Atkins et al., 2017). In cases in which data sharing was reported as being more embedded (e.g., researchers do it as part and parcel of their role), researchers reported more knowledge about which repositories were available to them and what processes they needed to follow. This appeared to be a crucial step in enabling data sharing. For example, knowing where to go to gain the knowledge required and gaining this knowledge early in the research cycle enabled researchers to tailor their outputs and format them as part of the research process rather than trying to do it retrospectively or once the funding period had ended: “For us, a lot of it’s very clear, we know exactly what [repository] everyone uses,” said Jennifer.
When researchers perceived data sharing as a normal and integral part of the research process, it was not seen as an additional activity but one that was incorporated into what they do as part of a project's life cycle:
So I guess I just see it as part of the normal publication process. So given that I have all this data anyway, it's not that much of an extra step really, to you know, once you know where you're sharing it or which repository you're submitting it to, or whether it's just a journal table, it doesn't take too long. I think we're pretty well set up to get all that sorted. (Amelia)
My PhD students don't even question it, it's not even something that they'd have to think about. . . . As we develop code, I'm going to put it on GitHub. It's part of that pipeline. (Jennifer)
We don't even consider open research. No, like we don't even have that conversation. . . . It's just how we do it. You publish your paper, you put your code and add the data in the repository. (Rick)
For individuals in social sciences/humanities and/or working with qualitative data, knowing where to share data and what their first step should be presented the initial barrier that was hard to work around. In these cases, researchers reported needing to think much harder about what to share and how because they did not have the background knowledge on which to act:
I think there's something that I have to actively think about and seek out and kind of think about what is the data that I could share and how could I make it available? I don't feel that something that's naturally in the process of my work at all. . . . I have no idea how to share it. Yeah. I'm still finding out how to do the first step and then hopefully, I'll find out about the sharing but I know basically, nothing. (Lara)
And then where do I put it? Am I creating my own archive? Where does that go? (Sophie)
How do you develop the right Research Data Management Plan, and there's sort of a muddle to me in that perspective, and it's taking on a lot of my initiative as a researcher to go and find. (Beckham)
Broadly speaking, researchers across all disciplines felt that it was the combination of both knowledge and skill that supported data sharing. Therefore, in the theme below, we discuss associated skills that enable or prevent data sharing.
Skills (cognitive and interpersonal)
"Skills" refers to the ability or competence to perform a behavior, developed through practice (Atkins et al., 2017). Researchers predominantly reported finding it easy to acquire the skills needed for data sharing. They also indicated that related guidance on how to share data had increased over time, which had consequently enhanced their skill set:
It's probably quite simple to do. It's just a matter of, you know, finding out how to do it and finding the time and pages to kind of go through the process. I wouldn't necessarily say it's, you know, it's a tricky thing to do, that requires sort of complex skills. And, but it's a matter of finding out how and putting the time into it. (Zainab)
There wasn't any guidance in terms of what to do with the data. So I just literally plugged it into my spreadsheet and it was fine. Which is interesting, because then 2 years later, I was submitting my second paper to the same journal, it was a completely different story. Because they had loads of guidelines. Again, the same thing, I plugged my dataset into whatever it was and then they go back to me saying like, "Oh, this is wrong. This is wrong. This is wrong. You should do the ABC," and I was like, oh, okay, well, this has changed. . . . I can now describe my data in a way that makes it reusable. So I've put together like a README file where I describe my variables. (Zainab)
Taken together with findings in the knowledge theme, the majority of participants felt that should they need to, they had the appropriate skills to increase or enhance their ability to share data.
Memory, attention, and decision processes
This TDF domain refers to the ability to retain information, focus attention, and make decisions or choices (Atkins et al., 2017). Researchers reported that decisions about what data to share and when were often made at the outset of a project, largely driven by a compulsory section in a funding application about how data would be shared once the project was finished. This requirement focused researchers' attention on data sharing at the beginning of the research process and encouraged them to prioritize data sharing in their project plans:
It all has to come in the beginning. The plan that we're writing for a bid right, which is what funders now [want], if you look I think most will have a section on and it will fall with ethics. (Jennifer)
However, participants working with qualitative data reported that data sharing would sometimes be a retrospective decision. This might be because of aspects that arose or changed throughout the course of the project or because of circumstances that enabled data sharing, such as available funding or the importance of the data found:
So it's almost not until the end that you kind of go wow, okay, this is really important. Oh my God, I wish we'd done X, which was exactly where we found ourselves . . . and I think that is not something I would have anticipated at the beginning . . . in the messy reality of research, things change. . . . The data you think you want to share is different to the data you're actually sharing all those things can morph but maybe you know what I know now I do the next project differently. (Sophie)
In some cases, data sharing was reported as being considered a lesser priority, one that did not drive the work but might be considered at a later point to leverage project data:
So the first and foremost priority is the integration of research, as you know. So this is the most important thing and the completion of it correctly, ethically, successfully. These are important priorities to be considered, sharing data comes second to that. (Beckham)
But it's like if I can publish a dataset, how can I do it and what would the data be? It doesn't drive, normally how I start the work . . . at some point when you're thinking about okay, this is the experiment that we're going to run and then it's like, well, okay, what could we maybe do to leverage that data for a wider purpose? (Frisby)
Ultimately, when a decision was made to share data, it was made with the aim of making the data both useful and usable:
When we think of a shareable format, I guess the main thing is that just because you share the data, doesn't mean that it's usable by the people. So if you're using headings, you know, what are those headings? When we have code, like how are we annotating that code, right? So I can share code very easily, but that's not usable for people. . . . But if you're going to put in the repository, then you have to clean it up a little bit. You're going to put some comments on it. And actually, you know, you can always read almost out of self-interest. It doesn't need to be because of other people. (Rick)
Having data that were both useful and usable required attention to detail and careful planning from the outset.
Motivation
In the COM-B model, motivation has two subcomponents. Reflective motivation involves conscious processes that influence behavior (Michie et al., 2011); in our data, this was demonstrated through all six associated TDF domains: beliefs about capabilities, beliefs about consequences, optimism, intentions, goals, and social/professional role and identity. Automatic motivation, in contrast, refers to unconscious processes that drive behavior; in our data, this was demonstrated by the domain reinforcement, and the TDF domain emotions was not evident.
Several researchers discussed whether data sharing was driven by self-interest or altruism. Many acknowledged that regardless of whether they engaged in data sharing, there was not much reward or recognition, which at times made it harder to prioritize or meant that researchers were left with a feeling of “could do better” at data sharing. Many also acknowledged that although confident in their existing skills, it was important to get data sharing “right,” especially when the need to deidentify data was involved.
Reflective motivation
Beliefs about capabilities
Beliefs about capabilities relates to self-confidence, perceived competence, self-esteem, and professional confidence (Atkins et al., 2017). Researchers largely believed that their capabilities for data sharing could be enhanced with practical support, including training. However, researchers mostly believed that if they were engaging in data-sharing activities, this would either be on top of existing workload or undertaken in a researcher's own personal time. Thus, although the wider organizational culture might encourage data sharing, applying it in practice might present a conflict because it is not an activity that is typically prioritized:
Data sharing is not how researchers are judged, there's no route to promotion, if you will, or even assessment of how a lab is organized. That isn't given the priority, it's just purely day to day to get the publication and that, sadly, is how academic research is predominantly judged. And so therefore, the stuff on the side you either do it out of hours on your own, or it doesn't happen, which I think is terrible. I'm constantly battling it, especially because I do believe in it strongly. But I have to do it largely in my own time. (Michael)
To this end, several researchers believed data sharing to be an administrative burden that did not have any support channeled into it. Some felt that for it to be taken seriously across the board, it would need to be mandated:
I guess it would need to come from above. And it would either have to be something that's mandated. Everybody has to do it, therefore everyone does it. Or it will have to be strongly encouraged by you know, [principal investigator's] leadership, etc. Obviously, you know, it could be that it becomes part of the culture as such, but I think there are other pressures on researchers in terms of you know, progressing their careers, etc. And I don't think that data sharing gives us every one of those things that are going to win unless it gets mandated. (Zainab)
Most researchers reported feeling confident that data sharing was a doable part of any project with the right resources in place, especially if considered and implemented at the start of a project (Eric: "It's not a particularly hard thing to do, but you'd have to do it at the outset"). Researchers who had been sharing data for a while felt confident in their skill set, and those who had not felt confident in their ability to find out which steps to take and whom they might need to approach for help.
However, they also acknowledged that it was an activity that they could also be better at and one that may suffer when it competes with other academic priorities: “I think we probably do more than the bare minimum, but not the most,” said Jennifer. Michael said, “Do I do enough of it? No, I don’t, I’m afraid. I’m kind of under constant pressure to continue evolving the research.”
Overall, the beliefs that researchers held about their capability to engage in data sharing were often a product of their working environment (see opportunity: physical opportunity theme).
Beliefs about consequences
This TDF domain relates to accepting the truth about the outcomes of a behavior, including outcome expectancies and potential regret (Atkins et al., 2017). Across the board, researchers felt that what data they shared and how they shared it required careful consideration because of the potential consequences of sharing. For example, several participants decided not to share data that they considered either messy (e.g., self-taught coding) or high risk (e.g., potentially distressing in vivo images):
I didn't share some of it because it's all very basic. So you'd be just clogging up your GitHub with like, how to do a normal plot or something. So I think some of it needs more thought in terms of which parts of my code are really needed for people to reproduce what I've done and which parts of my code is something they could do in 3 seconds and much better. (Amelia)
So if I'm doing in vivo work, and there's been animals used to collect it. The anti-vivisection movement is something that is quite scary. And to do any in vivo work, you do have to be a bit careful . . . not everyone believes in the benefits of in vivo research. (Michael)
Typically, researchers wanted data sharing to contribute something novel and/or helpful to the discipline at hand.
In addition, participants working with qualitative data reported feeling very cautious about deciding to share data because of its very nature (i.e., needing to make decisions to protect the anonymity of their research participants and consider the consequences of not doing this properly):

So I’m . . . not that 100% open to the sharing of data for the sensitivity of data that we have, or for the confidentiality on participants, their ideas, their views, you know, personal information, personal, personally identifiable information, which is really important for some not to be shared. (Beckham)

So we, I, work in a very psychologically unsafe world for people who talked to me. So there was huge caution about the data being made public in any way because it might come back to bite them, and they might lose their jobs. Given that we want them, I want them, to tell the truth. Yeah. And they have to trust me. And they have to trust me a lot, and the other interviewers to give that data and trust that I will do the right thing with it and not throw them under the bus by not deidentifying it correctly. (Sophie)
All researchers working with data capable of identifying the participant reported holding concerns about how to share while still protecting participants’ or patients’ identities. Some researchers felt concern about aspects of their data (i.e., it being imperfect, too basic, or containing a bug or mistakes) and about needing to manage risk (i.e., checking the data for identifying details). Others were concerned about the time and resources sharing might require. Still others had concerns about what others might do with their data (i.e., the risk of data being sold, e.g., to Google), about data-hosting repositories disappearing as a result of lack of funds, about having their ideas stolen, or about data being misinterpreted:

A bit cautious I would say probably sums it up. I think. I know, it is seen as good practice and indeed, highly desirable for publicly funded projects. And of course, I do agree with that where you’ve had public money to generate evidence. I would be fully behind that being available to others. And of course, we want open research. We don’t want data hidden where it can’t be seen and where it can’t be interrogated by others. That’s where kind of perhaps mistakes get made or even worse, you know, people might draw erroneous conclusions from data and falsify data. So broadly, I think having data open is a good idea. But and I guess there’s a big but for me, from my perspective is that that comes with quite a lot of both responsibility and what I’d probably call administrative work, although it’s not only administrative work, there’s quite a bit of ethical thinking and work that has to go on there. (Sophie)

I would say with data sharing I’m afraid of the biggest trouble with sharing is ensuring that I’m not unintentionally leaking personal data, this would be a huge trouble. (Leonid)
Data sharing was therefore seen as a vulnerable activity because it might expose researchers or errors in their work if not archived correctly.
Optimism, intentions, goals, social/professional role and identity
In this theme, we report on four TDF domains together because they are interrelated. “Optimism” refers to confidence (or lack thereof) that things will work out well or that goals will be achieved. Intentions involve conscious decisions to perform a behavior or act in a certain way. Goals are mental representations of the desired outcome a person aims to achieve. Social/professional role and identity encompass the behaviors a person adopts in social or work settings (Atkins et al., 2017).
Researchers felt that data sharing and certainly the ethos of sharing was something they considered part of their identity as a researcher. Although data sharing was not an activity they engaged in every day as part of their role, the guiding principle of being open and willing to share was something all researchers considered integral to their work. Researchers also felt that data sharing was not the only way to be open and that ensuring their articles were open access was another way in which they upheld this principle.
Researchers talked about having the goal of sharing all their data. They discussed wanting their intentions for sharing their data to be driven by morality rather than self-serving goals. They also discussed including data sharing in future grants when they had not previously while also acknowledging that it was either something they just had to do or wanted to do even if the wider system (i.e., appraisals, recognition, rewards) did not always acknowledge it:

It wasn’t something that was prioritized by anyone. So I would like this to become part of my practice as kind of being a good scientist doing the right thing. If you know, I mean, I don’t know whether my internal motivation would follow that. But when it comes to kind of being a good scientist, I would like to tick that box and share data because you know, it’s the right thing to do. (Zainab)
Even when researchers believed that sharing data was self-serving and they operated within a discipline in which data sharing was embedded, this was generally under the premise that it still helped to further the field or led to increased opportunities to collaborate on other research projects. Thus, there existed a degree of optimism that the consequences of data sharing could be good for both a researcher and the wider community if done well:

So I do think that leads to you know, higher citation rate and more opportunities to collaborate with people and work with them on different projects. I think I get invited to more . . . proposals or conferences or, you know, people ask me if I can share something with them, . . . but I think that in general, yeah. It’s very positive. (Amelia)
However, individuals who did not have their data in the public domain but had stated in their publications that data were available on reasonable request mentioned they had rarely, if ever, been asked to share the data. Only one participant mentioned being contacted for data sets. Most researchers stated that users of their open data would most likely be colleagues within the wider research community but that in reality, it was unlikely their data would be used by anyone else. This assumption did not affect their motivations to share their data. Thus, it may be that data sharing as a behavior signals to the wider research community the type of researcher one is (i.e., open and transparent) and the values the researcher holds.
Automatic motivation
Reinforcement
“Reinforcement” refers to rewards, incentives, and punishments that increase or decrease the probability of a behavior occurring (Atkins et al., 2017). Several researchers felt that despite data sharing being encouraged and supported in some fields (i.e., by funders and journals), there was not always external reinforcement before or once data had been shared. Some researchers felt that they could write data sharing into a data-management plan and then not action it because no one else followed it up or checked it. Others felt that there could be only personal gain from sharing data (i.e., it was self-interest to boost citations and reach). Others still felt that if data sharing were to be taken seriously, then it would need incentivizing through financial support; that it relied heavily on the goodwill of scientists or was altruistic/the right thing to do; and that it was an activity that competed with other, more pressing priorities and so would drop farther down the list because of not being enforced:

Well, I don’t think there are any incentives for it. You do it because you want to do it, or you’re forced to do it, you know, because of X, Y, Z right? I mean, if your funders force you or whatever, then people have to do it. For me we just do it and that’s it. Nobody incentivizes it, but it is taking from your time isn’t it? But it is the right way to do it. And as I said before, it might be self-serving too, because when you go back to your code a few years down the line, then if everything was clean and annotated, then that’s got to be better for you. (Rick)

It’s the stick rather than a carrot. . . . I guess my internal motivation could be an incentive. But when it comes to anything offered by the institution, or the school, there isn’t really anything that would incentivize me to share data, and if it wasn’t . . . for the journals requiring it, I would probably not be knowing about data sharing at all. (Zainab)
Regardless of whether there was reinforcement or incentivization to share data, researchers engaged with data sharing primarily because of their reflective motivation as opposed to automatic motivation.
Opportunity
Opportunity is external to the individual and encompasses the physical environment and social systems (Michie et al., 2011). In our data, both the TDF domains of social influences and environmental context and resources were present. Typically, researchers felt that data sharing was encouraged in principle but that in reality, very little institutional support was offered in terms of resources to facilitate a broader culture shift. Researchers did not make any specific references to how social opportunities facilitated or hindered their data-sharing behaviors above and beyond there being clear messages from heads of departments or schools that data sharing and indeed open research more generally was a practice all researchers should embrace.
Social opportunity: social influences
“Social influences” refers to interpersonal interactions that may lead people to modify their thoughts, feelings, or behaviors and includes social norms, group conformity, power, and modeling (Atkins et al., 2017). When referring to social influences, participants rarely talked about the influence of specific people (i.e., team members or colleagues). Some researchers referred to more nebulous examples, such as the expectations within any given research community. This included reference to an assumption that certain research communities expected data sharing to be undertaken and that therefore, researchers who shared their data were perceived as more trustworthy:

There are some people who still keep their proprietary data for a very, very long time, but I think it’s so frowned upon . . . people are much more willing to trust you and work with you if you’re more open about what you’re doing, I think. (Amelia)
However, in other fields in which there were fewer expectations around data sharing or it was less common (e.g., the social sciences), some researchers felt that there was very little point in sharing data because no one expects it or checks it:

Sometimes it’s probably the carrot or the stick and I suspect a bit more stick, a bit less carrot would actually do the job quite honestly. I do follow the rules as much as I humanly can and so if somebody said, “No, you’ve got to do this and we expect you to do this,” then I’ll do it. But if it’s less work, and not a requirement, then I’ve got to see something in it for me. (Frisby)
There were also instances in which data were either commercially or confidentially sensitive, which presented a barrier to being able to share them:

The employers actually came back and said, no, sorry, they’re working for this company and they are a direct competitor and that is our IP so no. So actually, in that case, I was kind of handcuffed because it was a company funded position and we were doing research for them and they were not comfortable sharing that knowledge. (Michael)
Overall, researchers spoke more about the external factors that influenced their ability to share data or not, as discussed below.
Physical opportunity: environmental context and resources
This TDF domain relates to circumstances in a person’s environment that either encourage or discourage development of skills and adaptive behaviors, such as (material) resources or organizational culture (Atkins et al., 2017). The lack of necessary infrastructure or resources led to challenges when trying to engage in data-sharing behaviors. This most commonly arose in conversation with researchers working with large data sets, for which hosting data was often prohibitively expensive for universities and not always supported by journals either:

It’s very expensive to host and maintain on the university server. So usually it ends up with us hosting them elsewhere because frankly, we don’t have the funds to buy that kind of backup and storage space from the university itself. (Amelia)
Several researchers acknowledged that cost for large data sets or indeed, cost for support (i.e., researcher time) was not always considered by individuals preparing grants:

They won’t put in a bid and forget to buy the mass spectrometer or the lab instrument they need. They won’t forget to charge you know, for the consumables and oratory, they won’t forget to charge for what they need to go out and do interviews. I think they can forget to charge for data management, open data management and the resources and expertise they need. They can forget to charge somebody’s time to do that. (Eric)

For example, you’ll put some money aside putting it onto a certain safe data resource database that has a charge for doing so. But you wouldn’t count, consider having to pay someone to do that. And you don’t always have someone in your team who knows how to deal with that . . . and it’s not until the day that you have to upload but you realize no one on the team knows how to use this platform. (Jennifer)
Time, or rather the lack of it, was also often seen as a barrier. Researchers across all disciplines felt that resources (i.e., cost, time, staff members) would need to be specifically allocated to the activity by the university, funders, and journals to truly embed data sharing.
Concrete examples of how data-sharing behaviors were enabled physically included the implementation of funding guidelines and reporting templates specifically relating to data sharing, data-sharing policies from funders and journals to set out expectations, data-sharing plans in funding applications, specific requirements to make data open (i.e., biological data), public bodies (e.g., observatories) putting time limits on how long data could remain proprietary to the team/researcher who collected them, and providing specific spaces to host data:

Definitely in the last 10 years, I’d say there has been more of a focus on demand from journals and our funders to make data available. Not all data, but some. And it’s no longer a question of whether we want to or not, it’s a requirement. It’s a requirement from funders as well. Some more than others. Charities, for example, don’t tend to necessarily make that requirement but the UK is moving into it. . . . It’s become more common practice. (Jennifer)
Thus, data sharing and the behaviors associated with it appear to be context- and discipline-specific, which may be an important consideration when implementing policies, guidelines, and support for researchers as a physical opportunity to foster data sharing.
Discussion
We interviewed a range of researchers across disciplines and career stages about their experiences of data sharing. Findings indicate that quantitative data-sharing behaviors were performed differently from qualitative ones, which affected the required skills. For example, researchers in STEM had noticed a definite culture shift toward data sharing among funders, journals, and their research communities. This was enabled through guidelines, specific sections on funding grants requiring data sharing, and journals requiring data. The culture shift had increased knowledge, awareness, and skills for this group of researchers, allowing data-sharing behaviors to become routine regardless of motivation and opportunity. However, this was not the same for researchers working with qualitative data; these researchers felt they lacked the knowledge about how, where, and indeed why they might share their data.
Findings indicated that the motivation of researchers to carry out data-sharing activities could be both self-serving and altruistic. Although many researchers felt that the data they had already shared or could/might share would not necessarily be used by others, the motivation to contribute to open research was an enabling factor. This could be because although potentially vulnerable and exposing, it signaled a certain identity and associated values about them as researchers (i.e., as someone who values transparency and openness). However, some researchers held concerns about being too open and the consequent risk of ideas being scooped or mistakes being found.
Findings also indicated that data sharing is context-dependent, that is, there are physical, environmental, and social opportunities that can both help and hinder it. The barriers most commonly associated with the physical or social opportunity to share data were similar across all disciplines. These barriers included a lack of time to undertake data-sharing activities, concerns over General Data Protection Regulation/correct deidentification of data, and limited infrastructure to host large data sets and the expense associated with this.
Of the six key data-sharing behaviors described in the introduction, researchers felt they had the capability to seek out skills and resources but that seeking out, preparing, managing, and depositing the data were all constrained by the physical opportunity to do so (i.e., lack of time and resources). Data-management plans were known and implemented by some but not all, and this was the same for securing ethics with a view to sharing data—not all researchers carried out behaviors to facilitate data sharing (i.e., preparing participant information sheets with a view to sharing data). This was because it was not always expected or common. Our findings accord with existing research in this respect—a lack of resources, including physical opportunity (time, funding, and in our case, infrastructure), is the most frequently reported barrier to data sharing (Astell et al., 2018; Chawinga & Zinn, 2019; Cheah et al., 2015; Farran et al., 2020; Hostler, 2023; Houtkoop et al., 2018; Long et al., 2020; Tenopir et al., 2011; Van den Eynden et al., 2016). Not sharing data because it is not expected or is uncommon within the field was also a finding previously identified (Houtkoop et al., 2018), and in our study, this was particularly the case for researchers with qualitative data. All researchers discussed their concerns or fears, including compromising confidentiality, working with sensitive data, reputational harm or risk (i.e., messy data, being scooped), or the misinterpretation or misuse of their data (Bezuidenhout & Chakauya, 2018; Cheah et al., 2015; Gomes et al., 2022; Sayogo & Pardo, 2013; Soeharjono & Roche, 2021; Tenopir et al., 2015; Van den Eynden et al., 2016). Although these concerns did not always prevent data sharing, they influenced how, what, and why researchers shared their data.
The ability to share data and the factors that enabled it included available guidance; access to infrastructure, including a repository; factoring in funding allocations; and having the necessary knowledge and skills. These factors were present in our data set and in the existing literature (Kim & Zhang, 2015; Sayogo & Pardo, 2013; Van den Eynden et al., 2016). However, researchers did not feel that data-sharing behaviors were recognized or incentivized. This aspect appeared to be overshadowed by a finding less discussed in previous literature and one that our study identifies: Researchers are driven to be seen as open researchers. This identity matters to them, both for the good of research and their discipline and for what it signals about them. It is a key enabling factor, potentially driving behavior even in the absence of other factors. This could be an interesting finding to expand on not only with participants without relevant knowledge of data sharing, who were excluded from our study, but also with participants for whom barriers and enablers might present differently.
In this qualitative Registered Report, we used the COM-B and TDF frameworks to identify data-sharing behaviors and determine these deductively. Although our methods also supported inductive findings being identified throughout analysis, it is likely that because the interview schedule was organized around COM-B and TDF, deductive analysis took precedence. Furthermore, although the interview schedule covered almost all COM-B constructs and TDF domains, physical capability was not included because it was assumed that if participants had the physical capability to collect data, they also had the capability to share it. This may have precluded physical capability from emerging as a finding and may warrant further research in this area using a more inductive approach.
Overall, participants believed that data-sharing activities could be mandated by institutions to enable more widespread behaviors. However, these activities need to be both discipline-specific and supported by institutions providing adequate resources (e.g., time, recognition, infrastructure, and support). For researchers working with qualitative data, energy could be invested into raising awareness of the benefits and practicalities through appropriate training and upskilling. Researchers themselves could consider embedding data-sharing behaviors from the start of a project (e.g., in data-management plans, consent forms, and research proposals) rather than treating them as an afterthought. However, data sharing should not be done without careful consideration of the implications for participants, researchers, and universities.
