Abstract
Keywords
Introduction
Autonomous Shipping and Human Supervisory Control
Systems with autonomous capabilities, typically based on Artificial Intelligence (AI) and Machine Learning algorithms, are proliferating across society and industries. In the maritime domain, ships are envisioned to deploy advanced automation, or ‘agents’, capable of sensing their environment and executing goal-directed behaviour using actuators, allowing for advanced functions to be performed with increasing levels of autonomy (IMO, 2018; Russell & Norvig, 2022). For example, in Japan, a commercial container ship conducted a 790-km trial to test its autonomous navigation capabilities without human intervention (Nippon Yusen Kaisha; NYK, 2022). In Norway, the Yara Birkeland container ship and the ASKO barges have commenced operation with the aim to navigate autonomously within a few years (AS Kolonialgrossistene; ASKO, 2022; Yara International, 2022). Here, operators are envisioned to work in positions from which single or multiple autonomous ships can be continuously monitored and supervised (e.g. see Massterly, 2023). In this context, supervisory performance is dependent on the operator’s ability to ‘[perceive] elements in the environment within a volume of time and space, [comprehend] their meaning, and [project] their status in the near future’, that is, to obtain and maintain situation awareness (SA; Endsley, 1995, p. 36). In other words, operators should be able to perceive critical parameters made available through the control and safety systems, analyse the ship’s current and planned behaviour, and evaluate the plan’s adequacy considering its context (van de Merwe et al., 2024a). To support operators in achieving and maintaining SA of an autonomous ship’s performance, it is critical to understand how effective human supervisory performance can be achieved whilst avoiding potential human performance pitfalls.
Challenges related to the human supervision of highly automated systems are well documented in the scientific literature (Endsley, 2017). For example, the out-of-the-loop (OOTL) performance problem is attributed to a loss of skills and SA, and occurs when operators are no longer an active part of a system’s information loop (Endsley & Kiris, 1995; Metzger & Parasuraman, 1999). In addition, transitioning back into the information loop often results in high workload because of the need to build SA and regain manual control (Endsley, 2017; Onnasch, Wickens, Li, & Manzey, 2014; Weaver & DeLucia, 2020). Taken together, these challenges are described as the ‘automation conundrum’, which states that ‘the more automation is added to a system, and the more reliable and robust that automation is, the less likely the human operators overseeing the automation will be aware of critical information and able to take over manual control when needed’ (Endsley, 2017, p. 8). The safe implementation of systems with autonomous capabilities thus depends on the degree to which humans can oversee the agent’s decisions and actions, and the agent’s ability to afford humans insight into its reasoning processes (J. Y. C. Chen, Procci, et al., 2014a).
Human Performance and Agent Transparency
‘Agent transparency’ (J. Y. C. Chen, Procci, et al., 2014a), ‘system transparency’ (Ososky et al., 2014), ‘display transparency’ (National Academies of Sciences, Engineering and Medicine, 2022), ‘automation transparency’ (Skraaning & Jamieson, 2021), or simply ‘transparency’ are terms used to describe the ‘understandability and predictability of [a] system’ (Endsley, 2023; Endsley, Bolté, & Jones, 2003, p. 146). Endsley (2017) defined transparency as a means to enhance the understandability and predictability of systems by making observable what a system is doing, why it is doing it, and what it will do next. J. Y. C. Chen et al. described agent transparency as ‘the descriptive quality of an interface pertaining to its abilities to afford an operator’s comprehension about an intelligent agent’s intent, performance, future plans, and reasoning process’ (2014b, p. 2). Finally, Lyons (2013) depicted transparency as the ability of an operator to perceive an agent’s abilities, intents, and situational constraints. The aim of agent transparency is to provide ‘a real-time understanding of the actions of the AI system’ (National Academies of Sciences, Engineering and Medicine, 2022, p. 31) and enable ‘the operator to maintain proper SA of the system in its tasking environment without becoming overloaded’ (Mercado et al., 2016, p. 402). In addition, transparency intends to facilitate human-agent collaboration when humans are tasked with supervising automated systems. That is, when an agent communicates what it does, why it does it, and what it will do next, human supervision should be supported (Endsley, 2023). Conversely, opaque agents can be challenging to supervise as they may be difficult to interpret because of a lack of information provision (Doshi-Velez & Kim, 2017; Lipton, 2017). In other words, when the agent’s inner workings are made apparent to the user, the user’s comprehension of the agent may be enhanced (Ososky et al., 2014).
In recent years, there has been increasing interest in understanding the effect of transparency on selected human performance variables, including SA (Selkowitz, Lakhmani, & Chen, 2017; Skraaning & Jamieson, 2021; Wright, Chen, & Lakhmani, 2020), decision making (Bhaskara et al., 2021; Loft et al., 2023), mental workload (Mercado et al., 2016; Stowers et al., 2020), and automation trust (J. Y. C. Chen et al., 2018; Ezenyilimba et al., 2023; Schmidt, Biessmann, & Teubner, 2020). Furthermore, a recent review of the transparency literature studied the relation between agent transparency and human performance variables, finding positive effects on SA and task performance, without negative effects on mental workload, for increasing levels of transparency (van de Merwe et al., 2024b). These findings indicate the potential benefit transparency can have in cases where operators need to understand the behaviour of a system and intervene manually when required. Thus, transparency can be especially relevant in safety-critical domains where understandability and predictability are essential for safe and effective control of processes (Endsley, 2023; Jamieson, Skraaning, & Joe, 2022).
Agent Transparency and Autonomous Shipping
Several recent studies have addressed agent transparency within the autonomous shipping domain. For example, Ramos et al. (2019) performed a task analysis to derive potential human failures when monitoring autonomous ships. Here, the study identified the importance of the supervisors’ ability to collect and evaluate information from the autonomous ship through ‘an adequate HMI’ (human-machine interface), such that a strategy for intervention could be determined should the automation fail (Ramos et al., 2019, p. 43). Van de Merwe et al. (2024a) identified specific information requirements for supervising autonomous collision and grounding avoidance (CAGA) systems based on a Goal-Directed Task Analysis (GDTA; Endsley et al., 2003). The study highlighted the need for continuous, sufficient, and adequate information about the CAGA system’s decisions, planned actions, and underlying information processing, that is, transparency information, to alleviate some of the human performance issues in supervision and support safe and effective oversight of CAGA systems. Furthermore, Porathe (2021) discussed the use of ‘expert systems’ to aid operators in supervising one or more autonomous ships. Here, HMI concepts were proposed to help operators obtain an at-a-glance understanding of how the system perceives and understands the nearby traffic and its intentions for solving collision situations. This includes showing how the CAGA system plans to solve a situation by graphically displaying the various options it has considered and which solution it intends to execute. Also, Van de Merwe et al. (2023a) operationalised transparency for autonomous ships by developing concepts for how an autonomous CAGA system may display its perception and analysis of its environment, determination of collision risk, and plans to resolve the situation. Moreover, Alsos et al. (2022) examined how the transparency concept could be operationalised for autonomous ships.
Here, the aim was to assess how autonomous ships can share intent information with external stakeholders, such as passengers, traffic services, and other nearby ships. Finally, operationalising this idea, Simic and Alsos (2023) developed a concept for autonomous urban ferries in which the ship’s perceptions, current state, and future intentions are communicated to external stakeholders through light strips and displays mounted on the outside of the ferry.
Although these studies address the potential benefits of agent transparency in relation to human supervisory control in an autonomous shipping context, they fall short of measuring its purported effects. That is, to the best of our knowledge, no studies have empirically tested the effect of transparency on human supervisory performance in an autonomous shipping context while considering the complexities that can arise in realistic traffic-dense environments. As such, given the concrete developments towards autonomy in the maritime domain, there is a need to understand how transparency can be applied within this context and how it affects human performance variables. Therefore, this study aims to extend the literature by empirically evaluating the application of transparency in a maritime autonomous shipping context. Specifically, this study asks what the effects of agent transparency and traffic complexity are on the supervisor’s (1) SA, (2) mental workload, and (3) task performance.
Situation Awareness, Mental Workload, and Task Performance
Summary of Predictions Regarding the Effect of Transparency and Complexity on Situation Awareness, Mental Workload, and Task Performance.
As agent transparency is about disclosing system-internal information, the degree of transparency can typically be varied by increasing or decreasing the amount of information presented about the system’s internal processes, decisions, and planned actions (see Bhaskara et al. (2021) and Pokam et al. (2019) for examples). Although increased levels of agent transparency imply increased insight into the agent’s reasoning, full disclosure of the system’s internal state may pose challenges in terms of the user’s cognitive processing capabilities (Bhaskara et al., 2020; Wickens, 2018). That is, although increased transparency may benefit SA, it may also add a cognitive processing burden due to the resources required for selecting and dividing attention and keeping information in working memory (Wickens & Carswell, 2021). This may be exacerbated in situations where the baseline level of information is already high, that is, in complex traffic situations (Moacdieh & Sarter, 2017). Here, increased levels of transparency information add to the information burden, and the risk of overloading the operator is high, especially when the additional information leads to display clutter (Moacdieh & Sarter, 2015a). However, despite these risks, recent studies have not found a clear relationship between agent transparency and workload (Ezenyilimba et al., 2023; Loïck, Guérin, Rauffet, Chauvin, & Éric, 2023; Tatasciore, Bowden, & Loft, 2023), possibly because of the use of graphical symbols and the integration of transparency information in task displays (Gegoff, Tatasciore, Bowden, McCarley, & Loft, 2023; van de Merwe et al., 2024a; van Doorn, Horváth, & Rusák, 2021).
Building on these findings, this study anticipates that when, first, information requirements are identified based on an iterative human-centred design approach (Endsley et al., 2003; ISO, 2019), second, symbology is developed based on context-specific industry standards (IEC, 2022), and third, transparency information is integrated in the primary task display (Skraaning & Jamieson, 2021; van Doorn et al., 2021), mental workload will not be affected by agent transparency (see Table 1).
Future supervisors of autonomous systems are likely to divide their attention between multiple units and/or have other concurrent tasks to perform (Cummings & Guerlain, 2007; Mercado et al., 2016; Wohleber, Stowers, Barnes, & Chen, 2023). Such roles may require shifting attention between one unit and another, or between one task and another, emphasising the need for rapid assessment of agent performance and the ability ‘to quickly get in-the-loop’ (Porathe, Fjortoft, & Bratbergsengen, 2020, p. 3). Assuming that human-centred design principles are adequately applied in this study (Endsley et al., 2003; ISO, 2019), this study anticipates that the availability of information that supports transparency, in the form of SA knowledge directly perceivable on the CAGA system’s interface, expedites the supervisor’s attainment of SA (van Doorn et al., 2021). Therefore, it is hypothesised that users need less time to comprehend the CAGA system’s reasoning when this information is available (Kunze, Summerskill, Marshall, & Filtness, 2019; Roth, Schulte, Schmitt, & Brand, 2020) (see Table 1). Furthermore, it is hypothesised that transparency is beneficial in situations where essential system-internal information may get lost among other information elements, that is, in complex traffic situations. In these cases, provided that transparency information is made salient, presented in a well-organised manner, and integrated in the user’s primary task display, it should facilitate comprehension of the system despite the increased complexity (Moacdieh & Sarter, 2015b, 2017).
Method
Participants
Participant Demographics and Selected Experience With Technologies.
Technical Setup
To maximise recruitment, the first author travelled to the locations most suitable for the participants to perform the study, including onboard a passenger ferry where participants worked and at various national nautical training institutes. Nevertheless, the technical setup, conditions, and conduct of the experiment were standardised and consistent regardless of the location where the data was gathered (see Figure 1). The experiment was conducted on a standard portable office computer using a 24″ screen with 1920 × 1200 resolution running Windows 10. E-Prime 3.0 served as the experimental platform in which the experimental stimuli were provided and primary data was recorded (Psychology Software Tools, Inc, 2023). Finally, post-experiment interviews were recorded using pen and paper. The technical setup used for the experiment: on location onboard one of the passenger ferries, and at the university’s experimental lab.
Execution of the Experiment
Procedure
Figure 2 depicts the execution of the experiment. After a brief introduction, participants signed an informed consent form stating that participation was voluntary and that they had the liberty to withdraw at any stage during the experiment, without reason or penalty. This research complied with the American Psychological Association Code of Ethics and was approved by the Norwegian Centre for Research Data, reference number 986652. Informed consent was obtained from each participant. Participants were briefed on the experimental procedure, what was expected of them, and the HMI used in the experiment. A practice session was performed to familiarise the participants with the execution of the experiment, including the stimuli and questionnaires. After this, the experiment commenced, and the experimental trials and measurements were performed. Two trials were performed that were identical in setup but used new traffic situations to avoid familiarisation. After the trials, the pairwise comparisons, as part of the workload measurements, were performed, and a semi-structured interview was conducted. Depending on the participant’s progress, the entire experiment lasted between one and two hours and the experimental trials between 10 and 30 minutes each. An illustration of the procedure for the experiment.
Experimental Tasks
Participants took the role of a supervisor of a ship equipped with an autonomous CAGA system. They were tasked with observing and understanding a traffic situation depicting own ship in conflict with a target ship, and own ship’s proposed solution to resolve it. Once the participants felt they had sufficiently understood the situation, including the system’s solution, they were to press a button on the keyboard, after which the screen was blanked and questions were presented concerning SA and mental workload. To provide participants with a sense of urgency, they were told they had a 90-second time limit to evaluate the traffic situation, after which the radar image would disappear automatically. In practice, however, no time limit was imposed by the researchers, to avoid a ceiling effect in the measurements. No time-keeping device was available to the participants. Once the questions were answered, the participant pressed a key to continue, and a new traffic situation was shown. This process was repeated until all traffic situations for all experimental conditions were completed.
The traffic situations were developed by a licensed navigator and reviewed by two independent, licensed navigators (see Van de Merwe et al. (2023a) for further details). The traffic situations were created on a desktop simulator at a maritime education and training institution. Each traffic situation was configured to represent a potential collision situation involving own ship and one other vessel in either a head-on, crossing, or overtaking situation. To avoid familiarisation with the traffic situations, multiple variations were developed, including conflict situations in coastal and confined waters, restrictions in the target ship’s ability to manoeuvre, and own ship as a stand-on vessel (IMO, 1977). However, to ensure equivalence in difficulty between the situations, they only consisted of one-to-one ship encounters. This meant that, although traffic situations could depict multiple ships, own ship was only in conflict with one other target ship. As such, traffic situations were created with variations in terms of the type of conflict situation (head-on, crossing, overtaking/overtaken), who has right of way (own ship as the give-way or the stand-on vessel), the type of avoidance action proposed by the CAGA system (route and/or speed change), and any restrictions in target ship manoeuvrability (restricted in ability to manoeuvre). In total, 20 unique traffic situations were used for the experiment: four for the familiarisation phase, eight for trial one, and eight for trial two, that is, 16 situations for the experimental trials in total. For readers interested in the traffic situations and their configurations, a table is made available as Supplemental Material on the journal’s web site.
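The variation dimensions described above can be sketched as a small data structure; note that the field names and values below are illustrative only and are not taken from the study materials:

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical encoding of the variation dimensions; field names are
# illustrative, not the study's actual configuration format.
@dataclass(frozen=True)
class TrafficSituation:
    conflict: str            # "head-on", "crossing", or "overtaking"
    own_ship_role: str       # "give-way" or "stand-on"
    avoidance: str           # "route", "speed", or "route+speed"
    target_restricted: bool  # target restricted in ability to manoeuvre

# Enumerate the full design space spanned by these dimensions; the
# experiment sampled 20 unique situations from variations like these
# (four for familiarisation, 16 for the experimental trials).
space = [
    TrafficSituation(c, r, a, m)
    for c, r, a, m in product(
        ("head-on", "crossing", "overtaking"),
        ("give-way", "stand-on"),
        ("route", "speed", "route+speed"),
        (False, True),
    )
]
print(len(space))  # 3 * 2 * 3 * 2 = 36 combinations
```

Enumerating the space this way makes it easy to verify that the sampled situations cover every level of each dimension at least once.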
Experimental Design
This study used a repeated measures approach in which all participants performed all eight experimental conditions: four transparency levels × two complexity levels. Participants were shown one traffic situation for each condition in each trial. Since the experiment comprised two trials, participants performed 16 experimental runs in total. The data for each experimental condition were averaged between trials one and two. To avoid familiarisation and order effects, the conditions were administered in random order within each trial.
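The 4 × 2 within-subjects schedule and the trial averaging can be sketched as follows; the condition labels are illustrative stand-ins for the levels described in this paper:

```python
import random

# Illustrative condition labels for the 4 (transparency) x 2 (complexity)
# within-subjects design; one traffic situation per condition per trial.
TRANSPARENCY = ("low", "medium-A", "medium-B", "high")
COMPLEXITY = ("low", "high")
CONDITIONS = [(t, c) for t in TRANSPARENCY for c in COMPLEXITY]  # 8 conditions

def trial_order(rng: random.Random) -> list:
    """All eight conditions in a fresh random order for one trial,
    to avoid familiarisation and order effects."""
    order = CONDITIONS[:]
    rng.shuffle(order)
    return order

def average_trials(trial1: dict, trial2: dict) -> dict:
    """Average a dependent variable per condition across the two trials."""
    return {cond: (trial1[cond] + trial2[cond]) / 2 for cond in CONDITIONS}

rng = random.Random(42)
print(trial_order(rng))  # a random permutation of the 8 conditions
```

Each participant thus contributes 2 × 8 = 16 runs, collapsed to one averaged score per condition before analysis.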
Independent Variables
Transparency
For this study, four levels of transparency were defined based on the amount and type of information to disclose to the supervisor. Which information to disclose was identified in an earlier study based on a GDTA of collision avoidance manoeuvring (van de Merwe et al., 2024b). These information requirements were subsequently structured based on an information processing model (Parasuraman, Sheridan, & Wickens, 2000; van de Merwe et al., 2023b) (see Figure 3). The framework for establishing transparency requirements for a CAGA system based on a model of human information processing (adapted from Parasuraman et al., 2000; and from van de Merwe et al., 2023b).
Information Elements Corresponding to Each Information Processing Step (van de Merwe et al., 2023b). Key: OT = overtaking/overtaken, HO = head-on, CR = crossing, GW = give-way, SO = stand-on.
Levels of Transparency.
Traffic Complexity
Two levels of complexity were defined for this study: traffic situations with low and with high complexity. Traffic complexity was defined by the degree to which own ship had the space to perform an avoidance manoeuvre. In cases where there was limited manoeuvring space, for example, because of another ship, the vessel was considered ‘boxed in’, and own ship may have needed to postpone an avoidance manoeuvre until the obstruction had been passed, change speed, or choose an alternative solution. Given the additional analysis and decision making required in such cases, these were considered more complex than those where a single and unobstructed solution could be implemented. As such, complexity was operationalised by adding objects to the traffic situation and ensuring own ship was boxed in.
Human-Machine Interface
During the experimental trials, participants were shown traffic situations in the form of a static radar image depicted on a radar display from a popular maritime equipment manufacturer (see Figure 4 for an example). On this image, vessels, objects, and other radar echoes were shown, representing a realistic traffic situation. Information such as settings, range, targets, and (time to) closest point of approach limits was also available and could be freely used by the participant to make sense of the traffic situation. A typical traffic situation representing a collision situation (overtaking) with high complexity (without transparency).
Information about the CAGA system’s information processing was added to the radar display (see Figure 5 for an example) and integrated in the primary task display as much as possible (Endsley, 2023; Endsley et al., 2003). The symbology representing information that supports transparency was developed by a licensed navigator using an iterative development process (ISO, 2019), based on the IEC 62288 standard for maritime navigation and radiocommunication equipment (IEC, 2022), and reviewed by two independent, licensed navigators (see Van de Merwe et al., 2023a for more details on the development process). In this case, central information regarding own ship actions, risk analysis, and detections was overlaid onto the primary information source for collision avoidance, that is, the radar display. This information varied with the experimental condition, that is, with the relevant level of transparency (see Table 4) and thereby which elements of the system’s information processing were depicted (see Table 3). An example of a traffic situation with four different levels of transparency is made available as Supplemental Material on the journal’s web site. A typical traffic situation representing a collision situation (overtaking) with high complexity (with transparency).
Dependent Variables
Situation Awareness
Example Situation Awareness Queries for the Traffic Situation Depicted in the Above Figures. Correct Answers Are in Bold Font.
Workload
Workload was measured using the NASA-TLX (Hart & Staveland, 1988). This scale measures the self-reported subjective experience of workload across six dimensions (mental demand, physical demand, temporal demand, performance, effort, and frustration level). As part of this scale, participants perform pairwise comparisons of the dimensions to create weights for each dimension. The sum of the weighted workload scores across all dimensions defines the total workload score. However, as setting the weights after each run is somewhat time-consuming, and as the type of task was constant across the experiment, a version of the NASA-TLX was used in which participants performed the pairwise comparisons only once, after all experimental trials had been completed. As such, the weights derived from the pairwise comparisons were applied to the workload scores of all individual runs.
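The weighted-score computation can be sketched as follows, assuming the standard NASA-TLX convention of 0-100 dimension ratings and weights derived from the 15 pairwise comparisons (each dimension can be chosen 0-5 times, so the weights sum to 15); the numbers below are invented for illustration:

```python
# Standard six NASA-TLX dimensions (short labels for illustration).
DIMENSIONS = ("mental", "physical", "temporal", "performance",
              "effort", "frustration")

def tlx_score(ratings: dict, weights: dict) -> float:
    """Overall workload: weighted mean of the six dimension ratings."""
    assert sum(weights.values()) == 15, "weights must come from 15 pairs"
    return sum(ratings[d] * weights[d] for d in DIMENSIONS) / 15

# One set of weights (from the single post-experiment pairwise comparison)
# is applied to the ratings of every individual run, as in this study.
weights = {"mental": 5, "physical": 0, "temporal": 3,
           "performance": 3, "effort": 2, "frustration": 2}
run_ratings = {"mental": 70, "physical": 10, "temporal": 55,
               "performance": 40, "effort": 60, "frustration": 30}
print(round(tlx_score(run_ratings, weights), 1))  # → 54.3
```

Reusing one set of weights across runs trades some per-run fidelity for a much shorter session, which is defensible here because the task type never changed.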
Task Performance
Task performance was defined as the time required for participants to feel they had obtained an understanding of the traffic situation through the information provided by the CAGA system, that is, time-to-comprehension (TTC). Similar to other time-related performance measures, such as eye-tracking, reading speed, search time, and time to task completion, this variable was chosen as an indicator of how quickly humans are able to process information (Gawron, 2019). TTC was self-guided and consisted of the participant deciding that the traffic situation and the visualised solution was sufficiently understood. The time measurement started at the moment the traffic situation was displayed and ended upon a key press by the participant after which the screen was blanked. Time was measured in seconds with no time limit imposed. Still, the participants were urged to be as quick and accurate as possible.
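A minimal sketch of the TTC measurement logic (in the study itself, E-Prime recorded this timing; the function and callback names below are illustrative):

```python
import time

# Sketch of time-to-comprehension (TTC): the clock starts when the
# traffic situation is displayed and stops at the participant's key
# press signalling that the situation is sufficiently understood.
def measure_ttc(display_situation, wait_for_keypress) -> float:
    display_situation()                   # show the static radar image
    start = time.monotonic()
    wait_for_keypress()                   # self-guided; no time limit
    return time.monotonic() - start       # TTC in seconds
```

A monotonic clock is the appropriate choice here because wall-clock adjustments must not distort reaction-time style measurements.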
Ranking
After the experimental trials, one representative high complexity traffic situation from the experiment was shown but with different levels of transparency presented. Participants were asked to rank the four variants for each of the dimensions of transparency: observability and predictability (MITRE, 2018). Definitions for these dimensions were read verbatim to the participants and were available on paper, including an example of its application in the collision avoidance context. A think-aloud protocol was used to record the participant’s verbal reasoning of the ranking (Eccles & Arsal, 2017). The traffic situation with four levels of transparency that was used for the ranking is made available as Supplemental Material on the journal’s web site.
Results
Data Analysis and Statistics
In the experiment, two trials were performed (trial 1 and trial 2) that were identical in experimental setup and execution, but for which different traffic situations were used. The data from these trials were averaged, screened for missing values and outliers, and tested for normality. Due to technical issues with the experimental setup, recording of TTC was incomplete for the initial set of participants, leading to missing values for six participants. This issue was corrected, and no missing values occurred for the remaining participants. As a result, of the 272 measurements for TTC, 20 measurements (7%) were missing. Finally, three participants had outliers for the TTC variable and were removed from the final data analysis. An outlier was defined as a data point lying outside 1.5 times the inter-quartile range of that variable. Thus, the data of 25 participants were used in the analysis of this variable.
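The outlier rule used above can be sketched directly; the sample values are invented for illustration, and note that quartile estimates can differ slightly between methods (this sketch uses Python's exclusive method):

```python
import statistics

# A data point is an outlier if it lies more than 1.5 times the
# inter-quartile range (IQR) below the first or above the third quartile.
def iqr_outliers(values: list) -> list:
    q1, _, q3 = statistics.quantiles(values, n=4)  # exclusive method
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lo or v > hi]

# Illustrative TTC values in seconds; the extreme value is flagged.
ttc = [12.1, 14.3, 13.8, 15.0, 11.9, 13.2, 14.7, 55.0]
print(iqr_outliers(ttc))  # → [55.0]
```

In the study, participants with any flagged TTC data point were excluded from the analysis of that variable rather than having single points removed.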
The dependent variables were tested for normality using the Shapiro-Wilk test (Shapiro & Wilk, 1965). Significant deviations from normality were found for the SA scores. However, the number of observations per cell for these variables was sufficient and equal for each cell.
Overall Means, Standard Deviations, and Pearson Correlations Between the Dependent Variables.
Means and Standard Deviations for the Dependent Variables as a Function of Transparency Level only. Note That TTC is Measured in Seconds.
Means and Standard Deviations for the Dependent Variables as a Function of Complexity only. Note That TTC is Measured in Seconds.
Means and Standard Deviations for the Dependent Variables as a Function of Level of Transparency and Complexity. Note That TTC is Measured in Seconds.
Situation Awareness
A main effect for transparency was found for level 1 SA. Mean scores for level 1 SA as a function of transparency and complexity. Note the error bars represent the 95% confidence interval.
A main effect of transparency on level 2 SA was found. Mean scores for level 2 SA as a function of transparency and complexity. Note the error bars represent the 95% confidence interval.
A main effect of transparency on level 3 SA was found. Mean scores for level 3 SA as a function of transparency and complexity. Note the error bars represent the 95% confidence interval.
Mental Workload
No main effect of transparency on mental workload was found (see Table 6). However, the individual dimensions measured through the NASA-TLX were analysed and showed an effect on the ‘Performance’ sub-dimension. Mean scores for mental workload as a function of transparency and complexity. Note the error bars represent the 95% confidence interval.
Task Performance
A main effect for transparency was found for mean TTC. Mean scores for TTC as a function of transparency and complexity. Note the error bars represent the 95% confidence interval.
Preference
A main effect of transparency was found on the subjective ranking of the transparency levels, F(3, 31) = 616.64. The participants’ preferences for the transparency levels. Note that a lower score indicates a higher preference (most preferred = 1 and least preferred = 4).
Results Summary
To summarise, the results from the experiment showed that SA improved with transparency: level 1 SA was highest in the high transparency condition, level 2 SA was highest in the medium (A) transparency condition, and level 3 SA was highest in the high transparency condition. For all SA measurements, high complexity traffic situations resulted in reduced levels of SA. Moreover, no significant effect of transparency on mental workload was observed, although a significant effect of complexity was found, showing that higher traffic complexity resulted in higher perceived mental workload. Furthermore, TTC was highest for the medium (A) and high transparency levels. TTC was also highest for the high complexity traffic situations. Finally, the medium (A) and high transparency levels were rated as the most preferred by the participants.
Discussion
Summary of Predictions and Results Regarding the Effect of Transparency and Complexity on Situation Awareness, Mental Workload, and Task Performance.
Situation Awareness
For level 1 SA, the highest SAGAT scores were achieved with the highest level of transparency. In Endsley’s (1995) definition of SA, level 1 SA concerns the perception of elements in the environment and provides the foundation for the higher levels of SA. In this study, it was anticipated that when the system provided information regarding its perception of its environment, that is, ‘condition detection’ (see Table 4), this would support level 1 SA. At this level of transparency, the CAGA system depicts which targets it has detected in the short and long range, the type of conflict with all detected targets, uncertainties in the sensor data, and the status of its sensors (see Table 3). This study anticipated that level 1 SA would be best for the transparency levels in which the ‘condition detection’ information was presented, that is, the medium (B) and high conditions. However, the results indicate that the highest level 1 SA scores were achieved only in the high transparency condition and not in the medium (B) condition. Furthermore, no significant difference was found between the high and medium (A) transparency conditions in terms of level 1 SA, indicating similar SAGAT scores. This may indicate that the information depicted in the ‘condition analysis’ step (e.g. risk objects, intended trajectories, and priorities; absent in the medium (B) transparency condition yet present in the medium (A) condition) may have played a role in achieving improved level 1 SA. Possibly, the additional information regarding collision risk may have made the participants more observant of the ship’s surrounding traffic and thus better able to achieve level 1 SA.
For level 2 SA, the highest level of SA was achieved with the medium (A) level of transparency regardless of complexity level. Again, this is as hypothesized, as it is at this level that the system’s analysis is depicted on the HMI and made available to the supervisor, for example, risk objects, risk priorities, intended trajectories, conflict type, and safe speed parameters (see Table 3 and Table 4). However, an equally high level of level 2 SA was achieved in the medium (B) level of transparency. At the medium (B) level, the CAGA system depicts which objects were detected in the short and long range, target type, relative motion, and the status and uncertainties of sensor data, that is, no analytical information. Yet participants achieved level 2 SA scores equal to those in the medium (A) level, where the system’s analytical information was readily available. For example, at the medium (A) level of transparency, the system depicts which objects it sees as posing a collision danger by extrapolating the objects’ current vectors and highlighting the level of risk using specific symbology and colours. This way, participants could directly perceive the outcomes of the system’s risk analysis process and use this information to understand the system’s interpretation of the traffic situation. In addition to the medium (B) results, it is somewhat unexpected that the same level of SA was not achieved in the high transparency condition. As the high transparency level includes all information from the medium (A) transparency level, that is, also the analytical information (see Table 3 and Table 4), one could reasonably expect that participants would score equally well on level 2 SA in both the medium (A)- and high transparency conditions.
As this was not the case, one explanation may be that the additional information about the system’s detection and sensor information, as shown in the high transparency condition (see Figure 5), distracted the participants from establishing an understanding of the system’s analysis.
Finally, for level 3 SA, the highest level was achieved with the highest transparency level. No differences were observed between the low and medium (A) transparency levels. To support level 3 SA, the system provided the future state prediction of own ship and target objects. The future state of own ship, that is, its future track and speed (see Table 4), was depicted for each level of transparency. The future state of target ships was depicted for the medium (A)- and high transparency levels but not for the other levels. As such, it would follow that either all transparency levels scored equally on level 3 SA, or that the medium (A)- and high transparency levels scored equally. The fact that only the high transparency level resulted in the highest level 3 SA scores therefore makes this finding somewhat challenging to interpret. One explanation is that the high transparency level provided the complete picture of the system’s interpretation of the traffic situation: its decision and future actions, its analysis, and its object detections, including sensor states. Possibly, providing participants with a complete information overview allowed them to understand own ship’s future state more adequately, as they had a more comprehensive information basis to build on. In addition, based on the full picture, participants may have been better able to reason towards the correct answer when answering the SAGAT.
For traffic complexity, SAGAT scores were lower for the high complexity traffic situations, indicating it was more challenging to achieve a similar level of SA in the high complexity cases compared to the low complexity ones. This finding is consistent with earlier observations that an increased number of objects presented to a supervisor, including their interactions, increases the number of goals and decisions to be made, which, given the limitations of human information processing capabilities, affects how well SA can be achieved (Endsley, 1995). In terms of interactions between transparency and complexity, an effect was found for level 2 SA, pointing towards a positive contribution of the depiction of the system’s reasoning (e.g. risk objects, intended trajectories, and priorities, as present in the medium (A) transparency level) in high complexity cases.
Comparing our results to similar studies in which the relationship between transparency and SA was investigated, we find comparable results. For example, Roth et al. (2020) found improvements in SAGAT scores when participants were evaluating agent-generated proposals in an unmanned-manned helicopter teaming operation. In their study, level 3 SA improved most in the high transparency condition compared to the low condition. Chen et al. (2014b, 2015) found improvements in SA when participants were supervising unmanned aerial vehicles in a search operation, and Selkowitz et al. (2017) reported improved SAGAT scores for level 2 and 3 SA, but not for level 1, when participants monitored an autonomous robot. However, some studies failed to identify a relationship between transparency and SA for supervision (Skraaning & Jamieson, 2021; Experiment 3) and monitoring tasks (Pokam et al., 2019; Selkowitz, Lakhmani, Chen, & Boyce, 2015; Wright et al., 2020). Taken together, these studies point towards a neutral to positive relationship between transparency and SA, and the present study strengthens these findings.
Mental Workload
No effect of transparency on mental workload was found. For complexity, increased workload scores were found for all high complexity traffic situations, but there was no interaction effect with transparency.
Still, for one sub-dimension of the NASA-TLX scale, ‘Performance’, a significant relationship between transparency and mental workload was found. Here, participants rated their own performance on the experimental task as better for the medium (A) transparency level compared to the other transparency levels. In other words, as the experimental task was to understand the traffic situation and the system’s handling of it, participants felt they achieved this best in the medium (A) transparency condition. Possibly, participants felt they had sufficient information in the medium (A) condition and therefore felt able to meet the goals of the experiment.
When comparing these results to similar studies where participants were tasked with monitoring an autonomous agent only, limited effects of transparency on mental workload were also reported (e.g. Du et al., 2019; Selkowitz et al., 2015, 2017; Wright et al., 2020). A study by Panganiban et al. (2020) found a reduction in mental workload, as measured through the NASA-TLX, when an autonomous agent communicated its intentions to support the participant in its task execution. Conversely, a study by Selkowitz et al. (2017) reported an increase in eye-fixation duration, a measure of visual search and mental processing (Di Nocera, Camilli, & Terenzi, 2007; Harris, Glover, & Spady, 1986), when participants monitored an autonomous robot’s display for its actions.
In studies where participants took the role of supervisor of an autonomous agent, reductions in workload were mostly found (e.g. T. Chen et al., 2014b, 2015; Skraaning & Jamieson, 2021; Experiments 1 and 2), although an increase (Guznov et al., 2020) and no effect (Skraaning & Jamieson, 2021; Experiment 3) were also reported. Finally, in studies where participants were asked to respond to system-generated proposals, no effect on mental workload was reported (e.g. Bhaskara et al., 2021; Loft et al., 2023; Mercado et al., 2016; Roth et al., 2020; Stowers et al., 2020).
This may imply that the relationship between transparency and mental workload depends on the type of task and role given to the participant (van de Merwe et al., 2024a). In this experiment, participants did not interact with the autonomous CAGA system, as they were only asked to perceive and comprehend its information. Although several of the studies mentioned above found a relationship between transparency and mental workload, 17 out of 23 indicators reported in the study by van de Merwe et al. (2024a) did not. This experiment’s result does not change the overall conclusion that adding information that supports transparency to an HMI has a limited effect on mental workload.
Task Performance
The results indicate that participants took more time to build up a mental picture in the medium (A)- and high transparency conditions and less time in the low- and medium (B) transparency conditions. Participants consistently took more time to comprehend the traffic situation in the medium (A)- and high levels compared to the low transparency level. This was the case for both the low- and high complexity conditions, indicating an equal effect of traffic complexity regardless of transparency level. The results were inconsistent with the hypothesis that the cognitive processes associated with developing a mental picture of the traffic situation would be supported when much of the information needed was readily available on the HMI in the higher transparency cases. It was also hypothesized that this effect would be stronger for the high complexity condition than for the low complexity condition, but this was not the case.
Earlier studies have shown inconsistent effects for time-related performance measures associated with transparency. A recent study investigating the impact of transparency on decision risk in human-agent teams measured the time it took for participants to choose between two options suggested by a recommender system (Loft et al., 2023). No differences between the various levels of transparency and decision time were found, except for an interaction between decision time and decision risk, indicating that transparency alleviated the negative effect of increased risk on response time. A study performed by Skraaning and Jamieson (2021) found reduced response times to events in a nuclear control room simulation study. Here, control room operators were tasked with controlling a simulated nuclear power plant and handling small to large system upsets, including taking corrective action. A reduction in response time to system upsets was found in the transparency condition, indicating better task performance when information that supports transparency was integrated in the primary task HMI. Conversely, a study by Stowers et al. (2020) found an increase in response time with increased levels of transparency. In this study, participants were tasked with monitoring and controlling multiple unmanned vehicles and evaluating plans for these provided by an intelligent agent. Here, the addition of information that supports transparency, in the form of basic projection and uncertainty information, significantly increased response time, albeit with a small effect size. Finally, Wright et al. (2020) found no difference in the time participants took to identify and assess events when monitoring an autonomous robot.
In our study, response time was driven by the instruction for the participants to ‘continue to the next step when you feel you have built up a sufficient understanding of the traffic situation’, that is, the time needed for comprehension. In contrast with the aforementioned studies, in which participants were asked to evaluate plans, respond to events, or monitor autonomous agents, this study asked participants to build a mental representation of the traffic situation only. The fact that there were no significant differences in TTC between the medium (A)- and high transparency conditions, and that both showed significantly higher TTCs than the low- and medium (B) conditions, indicates that the analytical information contributed to the time participants needed to comprehend the traffic situations. Conversely, this also implies that the addition of the system’s detection information did not contribute to the participants’ TTC.
Considering Table 3 and Table 4, the information presented in the condition analysis step, represented in the medium (A)- and high transparency conditions, depicts elements primarily concerned with collision risk, for example, objects that pose a risk, risk object priority, conflict type, and their predicted course and speed. This information is essential in understanding the CAGA system’s risk determination and is the primary basis for interpreting the reasoning behind its avoidance actions. The information in the condition detection step, represented in the low- and medium (B) transparency conditions, primarily consists of elements depicting what the ship has detected, for example, objects in the short and long range, object type and size, and basic classification of relative motion. That is, whereas the analytical information is specific to objects posing a risk, the detection information covers all objects irrespective of risk.
In this experiment, the participants, all experienced navigators, took the role of a supervisor of a ship equipped with a CAGA system, with the task to observe and understand the system’s depicted solutions to traffic conflict situations. Since the system’s analysis and avoidance actions are the most safety critical information to understand, participants may have taken additional time to evaluate the analytical information provided by the CAGA system, as presented in the medium (A)- and high transparency conditions, because they wanted to understand the situation as accurately as possible. The correlational results between TTC and SA support this assumption, as participants with higher TTC values also had higher level 1 SA and level 3 SA scores. In other words, those who spent more time observing, interpreting, and understanding the traffic situations also scored better on the SAGAT. Similar results have been reported in eye-tracking studies where increased focus on critical information elements was correlated with improved SA (van de Merwe et al., 2012). Alternatively, participants in the medium (A)- and high transparency conditions may have taken more time to analyse the traffic situations because they were comparing CAGA’s analysis with their own. That is, rather than taking the system’s interpretation of the traffic situation at face value, the participants may have performed their own analysis first to ensure they were equipped with sufficient knowledge to scrutinise the system’s. Also, given that the CAGA system’s analytical information was not depicted in the low- and medium (B) transparency conditions, the TTC was lower than in the medium (A)- and high transparency conditions because there was less critical information to evaluate and compare. Similar observations have been reported when operators are required to evaluate recommendations and need to compare these to system information and other information sources (Endsley, 2017).
As such, considering the potential role of humans in the ship autonomy context, where a thorough understanding of the CAGA system’s performance is essential for supervisory performance (van de Merwe et al., 2024b), this finding demonstrates the importance of addressing not only the amount of information when developing transparent agents, but also its type.
Practical Considerations
The results of this study imply that transparency has value as a design principle for CAGA systems, given the positive results for SA. In addition, the qualitative feedback from the navigators about which levels of transparency they preferred clearly indicates a positive attitude towards HMIs depicting, at minimum, the system’s analytical information. Conversely, these results also indicate which of the transparency levels were not preferred. For example, the low transparency level, that is, where the system only showed its decisions and planned actions, was the least preferred. In addition, the medium (B) transparency level, that is, where the system’s analytical information was not depicted, ranked only slightly better than the low level. Clearly, our participants preferred to have the system’s analytical information in addition to its decisions and planned actions, as indicated by the shared highest ranking of the medium (A) and high transparency levels. Nevertheless, there is no clear result pointing towards the optimal level of transparency across our dependent variables. This means that, when designing for transparency, it may be challenging to decide on which level to implement. Possibly, a more demand-driven transparency, that is, where users adjust the level of transparency depending on the task and context, can be used to provide the supervisor with control over the amount of system information presented. A study by Vered et al. (2020) demonstrated that such an approach could avoid the downsides of presenting transparency information whilst maintaining its benefits. For example, when applied to autonomous shipping, supervisors may select a low level of transparency in situations with little to no traffic whilst ‘dialling up’ the level of transparency for situations that require closer supervision.
In this way, such an approach may improve comprehension times compared to the sequential transparency approach used in our study. However, a risk associated with this approach is the potential for choosing an inappropriate transparency level and thereby overlooking important information. Furthermore, this approach allows for potentially large variation in how information is presented on the HMI and the possibility of confusion regarding which level is active. Although an iterative and human-centred design process should address these concerns when developing HMIs, future studies should investigate these risks further.
Limitations and Future Work
This experiment adjusted the transparency of a CAGA system whose information was overlaid onto static radar images. Our approach assumed that future operators of autonomous ships may need to divide their attention between multiple ships and/or tasks and may not continuously monitor a single ship. Therefore, when a ship requires attention, the supervisor may be ‘dropped into’ the specifics of the operational traffic situation. Our study hypothesized that transparency facilitates the sense-making process needed to quickly build SA. However, despite the significant effort put into making the traffic situations as realistic as possible, real-world situations are, of course, dynamic, and in such situations supervisors would be able to build a mental representation of the developing traffic situation over time. Although this study provided insights into the effects of transparency on human performance variables in a maritime collision avoidance setting, future research should focus on the implementation of transparency in dynamic settings, for example, by using real-time simulation facilities.
In this experiment, the CAGA system provided information about its perceptions, analysis, and future intentions regarding a traffic situation to the participants. Participants were only required to answer SA queries about the traffic situation and the system’s proposed handling of it. Through the development of the traffic situations and the transparency levels, significant effort was put into ensuring that the system provided sound conflict resolutions, such that disagreements between the participants’ solution to a situation and the system’s solution were kept at a minimum and would not confound the results (van de Merwe et al., 2023a). As such, this experiment did not study the effects of incorrect resolutions or of solutions with which the supervisor disagreed. However, given the body of knowledge available about the potential pitfalls for humans in supervising automation (Endsley, 2017; Onnasch et al., 2014; Strauch, 2018), future work should elaborate on the effect of transparency on the supervisor’s ability to detect and resolve performance deviations, especially when performing under concurrent task demands, such as supervising multiple autonomous ships (Burmeister et al., 2014; Gegoff et al., 2023; Porathe, 2014; Tatasciore et al., 2023).
Conclusions
This study highlighted the relationships between agent transparency and the human performance variables SA, mental workload, and task performance. Our overall findings point towards improvements in all levels of SA as a consequence of transparency, albeit with different levels of transparency affecting different levels of SA. In addition, this study found that more time was needed to create a mental representation of the situation when the system’s reasoning was depicted. Interestingly, no significant correlations were found between mental workload and SA, or between mental workload and TTC. Given the relationship between task performance, SA, and mental workload (Wickens, Hollands, Banbury, & Parasuraman, 2013), these findings indicate an effort-performance trade-off where participants with increased SA scores also used more time to comprehend the traffic situations, albeit without increased mental workload ratings. Moreover, this study showed clear and consistent effects of complexity on SA scores, workload ratings, and TTC, consistent with predictions from earlier models (e.g. Endsley, 1995, 2017). No interaction effects between transparency and complexity were found, except for level 2 SA, where transparency negated the effect of traffic complexity. Finally, the medium (A)- and high transparency levels were also the most preferred by the participants.
To summarise, as agent transparency is frequently operationalised through an HMI, our results imply that agent transparency has merit as a design philosophy when developing highly automated systems that require human supervision (e.g. see MITRE, 2018 for guidance). However, implementing transparency ‘is as much an art as it is a science’ given the risk of visual clutter and potential distraction caused by additional information (Wickens, 2018, p. 39). Also, the exact operationalisation of transparency depends on the domain it is applied to and the function allocation between humans and systems (Holder, Huang, Chiou, Jeon, & Lyons, 2021). Although there is limited evidence-based guidance available for designers developing transparent agents (Jamieson et al., 2022), this study demonstrated that, by basing the transparency design on a structured human-centred design approach, the purported effects of clutter and information overload were kept to a minimum whilst achieving improvements in SA. Hence, given that supervisors have sufficient time available to process the additional transparency information, improved levels of SA may be achieved without burdening supervisors with additional mental workload. As such, if effort is made to integrate information supporting transparency into the primary task interface, human performance benefits can be expected.
Supplemental Material
Supplemental Material - The Influence of Agent Transparency and Complexity on Situation Awareness, Mental Workload, and Task Performance
Supplemental Material for The Influence of Agent Transparency and Complexity on Situation Awareness, Mental Workload, and Task Performance by Koen van de Merwe, Steven Mallam, Salman Nazir, and Øystein Engelhardtsen in Journal of Cognitive Engineering and Decision Making
