Abstract
1. Introduction
Computational epidemiology is a multidisciplinary field that investigates issues related to epidemiology, such as the spread of diseases and the potential effectiveness of public health interventions, by using computational tools such as computer-based simulations. In computational epidemiology, simulations traditionally rely on mathematical compartmental models (e.g., the susceptible–exposed–infectious–recovered—SEIR—model) that characterize the disease being studied (e.g., its infectivity and transmissibility) and where the (relative) sizes of the compartments change through differential equations. While these models have become quite sophisticated over the years and advanced techniques have been applied to optimize the SEIR parameters (e.g., via swarm optimization 1 ), they mostly rely on assumptions such as homogeneity of the individuals in a population and the distinction of agents solely based on their disease state, that limit the accuracy of the predictions that can be made.2–5 In the real world, however, the spread of a disease is not only affected by the properties of the virus itself, but also by socio-psychological behavioral dynamics that heavily rely on aspects that are heterogeneous across individuals (including spatial and demographic heterogeneity 6 ). Squazzoni et al. 7 point out that, given the significant adverse effects that behavioral interventions can have on individuals, decisions “cannot be solely based on epidemiological knowledge, because the efficacy of implementation depends on people’s reactions, pre-existing social norms, and structural societal constraints.”
To overcome this limitation and introduce complexity and heterogeneity in the models of individuals, agent-based modeling and simulation (ABMS) has been proposed as a key approach for more realistic computational epidemiology. 8 ABMS allows to simulate actions, decisions, and social interactions of individual agents, providing a powerful tool for studying and explaining complex social phenomena related to the spread of diseases.9–11 The importance of ABMS for studying the impact of behavioral health interventions has become particularly evident during the recent COVID-19 pandemic. As a consequence, in the last few years a large number of agent-based simulations (see Tables 1 and 2 for an overview) have been used to predict and study the effect of interventions on disease spread by taking into account aspects such as age, profiles of people and households, interaction patterns along with their evolution in time and space, and individual attitudes.
Despite the recent proliferation of epidemic agent-based models, however, the modeled dynamics of individual decision-making are still limited.12–14 Compliance of individuals with behavioral interventions is typically modeled via stochastic processes that uniformly apply a certain level of compliance across the population.
15
In reality, compliance can be highly nonuniform as it is the result of complex sociological dynamics and of individual decision-making that depend on a number of factors, including demographics, peer influence, political orientation, risk assessments, and beliefs about the efficacy of the behavior.16,17 To build realistic simulations that can accurately anticipate the acceptance or rejection—and thus the effectiveness—of behavioral interventions, agents in epidemic simulations should be capable of autonomous decision-making. This requires more complex and detailed models of agents, that, for the purpose of this work, we will call
Deliberative agents should be capable of deliberation that realistically characterizes the decision-making of humans. Depending on the research question at hand, this could include the capability to take into account the effects of such concepts as habits, fatigue, and political preferences. Moreover, they should be norm-aware, i.e., explicitly capable of taking into account existing norms such as behavioral interventions or laws when deciding their actions. They should further be socially aware, i.e., take into account the behavior and mentalities of other agents, in their decisions whether to comply with behavioral interventions.18,19 Finally, to ensure accurate reflection of real-world decision-making, the design of deliberative agents should be backed by theory and grounded in data.
Various theories of deliberative decision-making have been proposed over the years, from rational decision theories 20 to psychologically based approaches such as the theory of planned behavior (TPB) 21 and the self-determination theory.22,23 These theories conceptualize decision-making behavior in terms of motivational, informational, and deontic attitudes, together with decision rules for the agents to select actions.24–26 Based on these theories, various agent decision-making models have been proposed. The belief–desire–intention (BDI) model, 27 for example, attributes to agents’ mental states such as beliefs, desires, and intentions, and characterizes the agents’ deliberation and reasoning in terms of these mentalistic notions.28,29 Other examples include cognitive models such as Adaptive Control of Thought -- Rational (ACT-R) 30 and SOAR. 31
Various surveys on the use of decision-making agents in social simulations identify scalability as a key issue that limits their applicability.18,32–35 The difficulty lies in scaling these individual models to the population sizes required for meaningful studies due to their computational complexity. Indeed, the vast majority of solutions for epidemic agent-based simulation in the literature falls in one of two categories: frameworks or platforms that can simulate millions of individuals with simple behaviors (e.g., based on simple state machines “triggers”-based behavior), and frameworks or platforms that consider more complex behaviors and social dynamics (e.g., agents that act according to their own agenda and preferences, in a social context, and deliberate about norm compliance) but have limited scalability. As a result, the current literature lacks a unified solution for epidemic simulations that can both scale and support deliberative agents that can model realistic human behavior, for example their deliberate response to interventions.
In this paper, we address this issue by introducing a novel framework that enables large-scale distributed epidemic simulations with deliberative agents. The proposed framework supports simulations with millions of agents that can individually deliberate about their own knowledge, goals, and preferences, and can adapt their behavior during the simulation based on observations of other agents’ behaviors and on their (possibly evolving) attitudes and intentions to comply with norms. To implement deliberative agents, we propose a novel adaptation of A Practical Agent Programming Language (2APL)
36
—a well-known Java-based programming language that is designed to support the implementation of autonomous agents—which we call
We demonstrate the applicability and scalability of the framework by modeling the spread of COVID-19 in the US state of Virginia with ~8 million deliberative agents. Our simulation takes a synthetic population with realistic demographics, weekly activity schedules, and activity locations drawn from real location data as its inputs. Each individual in the synthetic population is modeled as a software agent, which decides its intentions regarding compliance with nonpharmaceutical interventions (NPIs) such as mask wearing and physical and social distancing that were introduced in Virginia in the period March through June 2020. In building and evaluating such a simulation, we report for the first time experiments about the spread of COVID-19 resulting from an agent-based simulation that involves individual deliberative agents at this scale.
This paper significantly extends and integrates our previous work,38,39 where we presented a preliminary version of the framework and a COVID-19 simulation experiment, respectively. In this paper, we describe an improved version of the framework that does not suffer from communication overhead issues highlighted in our preliminary experiments, 38 and supports simulations with ~8 million agents. Moreover, we extend the simulation experiment 39 in two ways: (a) we provide a formal characterization of how normative reasoning takes place in the deliberation cycle of the Sim-2APL agents; (b) we refine and recalibrate the model of norm-reasoning agents, and present more realistic results quantifying the effects of normative interventions in the state of Virginia.
The rest of the paper is organized as follows. We start with a discussion of the state of the art, followed by a presentation of the simulation platform PanSim and the agent programming language Sim-2APL, respectively. Note that our platform is distinct from a similarly named platform that has recently been published. 40 Since our platform was presented earlier,38,39 we have not changed the name in this paper. We then explain how the proposed simulation framework is used to design, develop, and calibrate a simulation experiment for COVID-19 regulations, and report on the strong and weak scalability of our proposed framework as well as the results of the conducted simulation experiments. Finally, we conclude the paper with a discussion of some remaining issues and pointing out directions for future research.
2. State of the art
The past 20 years have seen the development of a number of agent-based (often referred to as individual-based) epidemic and social simulations. Table 1 gives an overview of some of the most influential work based on the three aspects that are the most relevant to this paper: (a) the scalability of the approach; (b) the type of mechanism used to determine compliance with behavioral interventions; and (c) whether the simulation involves deliberative agents.
An overview of agent-based epidemic and other social simulations.
BDI: belief–desire–intention; TPB: theory of planned behavior.
In the great majority of the epidemic simulations that we analyzed, the disease progression is modeled via S(E)I(A)R-like models at the level of individual agents. Each agent is characterized by its disease state (which determines the so-called
An overview of other COVID-19 simulations using individual agent models with agent-based S(E)I(A)R-like infection mechanisms.
BEN: behavior with emotions and norms; MPI: message passing interface.
GAMA supports BDI agents through the BEN architecture. However, to the best of our knowledge, this has not been used for epidemic simulations.
Several works have focused on the scalability of the simulation by utilizing a relatively simple model of the agents. For example, Ferguson et al.
47
adapted an earlier simulation platform created in the context of Influenza
41
to study COVID-19. By leveraging OpenMP,
61
an application programming interface that provides support for multiplatform shared-memory multiprocessing programming, the authors were able to simulate approximately
In the context of COVID, the CityCOVID simulator
50
was built on
Abadeer et al. 53 have developed a flexible tool that simulates the physics of disease transmission—based on factors such as speech volume, ventilation, agent movement, and proximity—in which users can change simulation parameters at run time. However, their agent models are pedestrian locomotion models, extended with SEIR states, and agent decision-making is limited to collision avoidance.
Another strand of work has considered more complex models of agents with the intent to characterize additional information, such as age, economic characteristics, social relationships, personal attitudes, and political preferences that can influence the behavior of individuals, their potential response to interventions, and, therefore, the spread of a disease.
Koehler et al.
52
developed a COVID model in NetLogo and simulated specific counties in the United States. In their model, agents visit a number of locations given by a precalculated contact network degree, alternating one location for 12 h with staying at home for the other 12 h of the day. The network degree was calculated from a contact network derived from data provided by the American Community Survey (ACS) 2020,
63
following the approach of Meyers et al.
64
The proximity of agents is based on agents co-locating at a
COMOKIT 8 is a COVID-19 disease simulation system based on the GAMA 65 multi-agent simulation environment which uses a synthetic population of agents (which can be generated from statistical data or created using gen*). The activity performed by an agent at each hour in the day is determined either by being enrolled in an activity by other agents, or by selecting the activity corresponding to the current day and time. Compliance with NPIs (i.e., norms) is determined by a central authority that the agents query to determine whether they can perform an activity or not. Normative reasoning is, therefore, delegated to the central authority, and individual agents comply with the received directives.
In the work summarized above, agents’ decisions about their daily activities are affected by a number of largely external factors (e.g., NPI telling agents to exhibit behavior like mask wearing and physical distancing, closure of public locations, and the health status of the agent causing them to quarantine). These approaches do not consider agents that explicitly and individually reason about compliance with norms during the simulation. To the best of our knowledge, the only framework to explicitly consider what motivates individuals to comply with an intervention is Agent-based Social Simulation of the Coronavirus Crisis (ASSOCC).
51
In the ASSOCC framework, the reasoning of the agents is grounded in behavioral theories from psychology and sociology, and agents make their decisions based on their needs, such as belonging, compliance, financial stability, health, and leisure. These needs are implemented in a so-called water tank model,
66
in which the satisfaction of needs slowly decays. Agents try to keep the water tanks for their needs filled, but the importance of the various needs differs from agent to agent, based on societal and individual-level values.67,68 However, in their work, only
Employing deliberative agents allows for more realistic modeling of human decision-making. Increased detail and heterogeneity in agent characteristics improve the quality of prediction and forecasting of those models. Even more so, frameworks that can explicitly model individual agent decisions allow for
As has been widely observed,18,32–34 the absence of deliberative agents is largely due to the difficulty of scaling multi-agent simulations involving deliberative agents. Further support for this observation can be found by considering the scale of nonepidemiological (social) simulations that do make use of various types of deliberative agents.
For example, both Bordini and Hübner
56
and Caballero et al.
58
developed social simulations with BDI agents programmed in Jason, a platform that supports the distributed execution of a multi-agent system. The performance of distributed Jason applications has been studied by Pérez-Carro et al.
70
and Fernández et al.
71
However, Pérez-Carro et al. distributed just
From the brief review of the state of the art presented above, we observe that: (a) the literature on agent-based epidemic simulations is split between models that consider only individuals with simple behaviors (e.g., based on simple finite state machines or drawn from distribution) but scale to hundreds of millions of individuals, and models that consider more complex behaviors (e.g., considering age matrices, social differences, and individual choices) but have limited scalability; (b) there is currently no framework for large-scale distributed epidemic simulation with deliberative agents, and one of the reasons is the difficulty of scaling such models to sizes that are adequate for providing meaningful answers in an epidemiological context; and (c) in the existing literature on agent-based epidemic simulations, models that scale lack mechanisms for explicit and adaptive normative reasoning in the decision-making of agents.
In this work, we address the above observations. In our framework, the epidemic simulation platform PanSim calculates the physical disease progression, and the Java-based agent programming language Sim-2APL deals with the individual deliberation of agents. Our framework can distribute epidemic simulations on HPC systems, and employ a realistic synthetic population and a contact network to characterize the population of agents. Moreover, Sim-2APL is designed to impose few restrictions on the simulation author, allowing the use of different constructs to model more complex types of decision-making not natively present in the core library. For example, we will show how Sim-2APL can be used to introduce explicit normative reasoning directly in the individual agents’ decision-making regarding compliance with behavioral interventions, without requiring changes to the core library. Indeed, the experiments reported in this paper are the first large-scale simulations of COVID with deliberative norm-aware agents with adaptive behavior. Moreover, our experiments demonstrate that simulations with millions of deliberative agents are feasible, with a degree of scalability comparable to other approaches in the literature that consider much more simplified models of agents.
3. PanSim: a framework for large-scale distributed epidemic simulation
In this section, we introduce PanSim, a distributed agent-based epidemic simulator with support for complex agent behavior models. PanSim simulations leverage extensive existing work on developing realistic synthetic populations of US states. 78 In these synthetic populations, each agent is assigned demographic variables, such as age, sex, race, household income, and political orientation. The baseline behavior of these agents is characterized by weekly activity schedules. Activities in the agent activity schedules comprise of them visiting specific locations during specific times of the week. Section 5 describes the particular synthetic population used for our COVID-19 case study in more detail. In a PanSim simulation, the transmission of the disease happens probabilistically when two agents are visiting the same location at the same time. In addition, agents are able to observe visible behaviors of other agents (such as coughing, mask wearing, and social distancing) who are at the same location as them at the same time. PanSim allows custom behavior models to be incorporated in the simulation that can modify an agent’s baseline activity schedule based on the decision-making logic.
3.1. Design objectives
PanSim has been built to satisfy the following design objectives:
3.2. Conceptual model
PanSim is a distributed discrete-time agent-based simulation framework. A simulation in PanSim progresses sequentially in discrete time steps, and within a given time step, the simulation progresses in multiple sequential phases. However, within any given phase, computations corresponding to different agents progress concurrently and in parallel. Thus, PanSim’s design can be described as bulk synchronous parallel design. 79
PanSim simulations use realistic synthetic populations of US states
78
for agent data and baseline agent mobility information. In these synthetic populations, agents visit different locations at specified times. These visits can be modeled as a temporal bipartite agent–location graph
To facilitate distributed computation at the beginning of every PanSim simulation, the agent–location bipartite graph
In a PanSim simulation, an agent’s state is comprised of two parts, its disease state (corresponding to the disease model) and its behavioral state (corresponding to the behavior model). The agent behavior model is responsible for deciding the activities that an agent undertakes—which locations the agent visits and when—and how the agent behaves during those visits. Outward behavior exhibited by the agents during the location visits is categorized into two classes: (a) disease modifier behaviors and (b) visible attribute behaviors.
In a PanSim simulation, during a given simulation time step, the following steps are executed. First, every agent in the system decides which locations to visit, and at what times, as well as how to “behave” during each of those visits. These behaviors include disease modifier behaviors and visible attribute behaviors. Second, when visiting a location, the agents come into contact with each other. During this step, disease transmission takes place probabilistically from infectious to susceptible agents. Also, while the agents interact, they observe each other’s visible attributes.
3.3. Structure of a PanSim simulation
A PanSim simulation consists of four major modules: the behavior module, the social interaction module, the disease transmission module, and disease progression module. Figure 1 shows the overall organizations of the modules and their communication patterns.

Structure of a PanSim simulation.
The behavioral module and the social interaction module together represent the behavior model component of a PanSim simulation, while the disease transmission and progression modules together represent the disease model/epidemic component of the simulation. Another way of organizing the modules is to think of them from the perspective of the dynamic agent–location bipartite graph
To write a custom PanSim simulation, the simulation authors only need to provide the code for the behavior module. The rest of modules are provided by PanSim itself. PanSim provides a generic language-agnostic interface, written using Apache Arrow (https://arrow.apache.org/), that can be used to write the behavioral module in many high-level popular programming languages, including Java, Python, and R. The rest of the modules are implemented in PanSim itself in fast low-level languages. A formal description of PanSim is provided in Appendix A (Online Supplemental Material).
3.4. Disease model implementation
PanSim implements compartmental disease models, 80 also known as SIR-like disease models, for epidemic simulations. In these models, each agent can be in one of a finite set of disease states. The model also specifies the probabilities of transition between states—disease progression. In addition, when a “susceptible” agent comes in contact with an “infectious” agent, there is finite probability that the susceptible agent contracts the disease and moves to one of the “infected” states—disease transmission. The particular disease model used in the current work is a SEIAR model with five disease states: susceptible, exposed, infected symptomatic, infected asymptomatic, and recovered. In Section 5, we will show the specific parameters of the COVID-19 disease model. We refer readers to Brauer 80 for a more thorough discussion of compartmental models.
Since disease progression is time-dependent, we, in addition, annotate each state transition in the disease model with dwell time distributions. For each state transition, the disease model author also provides a distribution of transition times for the current state to the next state. During the disease progression step, once the specific next disease state has been selected, the transition time distribution is used to sample the duration after which the transition will occur. This addition makes the compartmental disease models used in PanSim a special case of Probabilistic Time Automata models. 81
PanSim provides a custom domain-specific language to specify the disease model that it automatically parses and implements using a fast implementation written in a low-level language.
3.5. Distributed software implementation
PanSim is a distributed memory HPC application. For implementing PanSim we chose message passing interface or MPI (https://www.mpi-forum.org/) as the distributed memory HPC messaging framework. PanSim makes heavy use of the nonblocking MPI communication primitives to overlap computation and communication phases. While many other distributed memory HPC messaging frameworks have been developed over the years (such as Charm++, 82 UPC++, 83 and Legion and Regent 84 ), MPI remains the most widely supported HPC framework on HPC platforms.
PanSim is implemented in a mix of Python and C++. In PanSim, a C++ process (MPI rank) runs on each CPU core of each available compute node. The behavioral module, written in an arbitrary language, is run as a separate process and shares the CPU cores with the PanSim processes.
To ensure that the behavioral module processes and PanSim processes don’t compete for CPU resources, we use MPI implementation-specific configuration to make PanSim processes sleep during the execution of the behavioral module. The PanSim and behavioral module processes on the same compute node communicate with each other using Apache Arrow tables using the Unix Interprocess Communication (IPC) mechanism.
To be able to utilize distributed computing hardware, the vertices in the agent–location bipartite graph

Partitioning of agents/individuals and locations for distributed processing on PanSim.
We have experimented with using Metis and ParMetis 85 for this partitioning. However, we found that our simple approach was much faster and produced adequately good partitions. Note that the partitioning method leverages the fact that agents in the simulation are modeled after real people, who only frequently visit a small number of places. This partitioning, however, would significantly increase the overall runtime of the simulations if used in a scenario where each agent could visit any location uniformly at random.
PanSim uses a bulk synchronous parallel design. 79 A PanSim simulation progresses in discrete time steps. Within a time step, the execution progresses in five distinct phases, as described formally in Appendix A. The exploitation of parallel hardware in PanSim comes during each phase, during which computation related to all agents and locations can proceed in parallel.
Figure 3 shows the different phases of computation of a PanSim simulation for a single time step. First, in the behavioral decision phase, every agent decides the locations to visit, and how to behave during those visits. This is followed by data exchange among MPI ranks to transfer information to the rank corresponding to the location of the visits. Second, in the social interaction phase, the interactions of the individuals at every location are computed. Third, in the disease transmission phase, the probability of susceptible agents getting infected from visits is computed. After the third phase, data are again exchanged among the MPI ranks to send the social interaction and transmission updates back to the agents they correspond to. Fourth, in the disease transmission phase, the disease state of the agent is updated based on the transmission and progression models. Finally, in the behavioral belief update phase, the behavioral agent state is updated based on the social interaction and the updated disease state of the agent.

Different phases of computation in a single time step of a PanSim simulation.
As shown in Figure 3, the first, fourth, and fifth phases of the simulation are collectively referred to as the individual phases. The computation of these phases can progress concurrently for every agent. Similarly, the second and third phases are location-specific and can be executed concurrently for every location.
3.6. Discussion
For the design of PanSim, we chose a discrete time architecture as opposed to the discrete-event architecture due to the relative simplicity of implementing efficient parallel discrete time models as opposed to parallel discrete-event models, which require additional event conflict detection and rollback logic (in case of an optimistic parallel discrete-event design) or event conflict avoidance/prevention logic (in case of a pessimistic parallel discrete-event design). In addition, due to the multilanguage nature of PanSim’s design, a discrete-event design would incur high cost of going back and forth between the behavior model (implemented using an arbitrary language) and the disease and mobility model (implemented in C++).
PanSim uses static partitioning of the agent–location graph as opposed to dynamic partitioning. 86 While dynamic partitioning can potentially provide better load balancing, this comes at a cost of programming complexity as well as expensive migration and tracking costs. Migrating agents to different compute nodes involves serialization/deserialization of agent state data, destruction of agent objects on the source compute node, and reconstruction of the agent object at the destination compute node. Further additional dynamic tracking and routing mechanisms must be maintained to track the location of agents on compute nodes. In addition, since agent state in PanSim consists of two parts: disease model state (maintained by PanSim) and behavioral state (maintained by user code written by the simulation authors in higher level languages), this task is cumbersome. Thus, we trade off the potential benefits of dynamic load balancing for significant simplicity in implementation and ease of use for simulation authors.
4. Sim-2APL
We now discuss the behavior model of our framework, in which the individual software agents decide their visits based on their disease and behavior states. The agents implement the
We introduce a novel agent programming language that we call
We first provide some necessary background on 2APL and discuss its limitations for simulation. Then, we present the key changes that make Sim-2APL. Finally, we illustrate how we model normative reasoning in the decision-making of Sim-2APL agents to support reasoning about compliance with NPIs during epidemic simulations.
4.1. Background: 2APL
2APL is a Java library for implementing autonomous software agents using object-oriented design patterns (see Dastani and Testerink
36
and Dastani
87
for technical details). A 2APL agent is programmed through four major components. First, the
The execution of a 2APL agent occurs through a cyclic process, called the

The 2APL deliberation cycle.
During the
In 2APL, the execution of agents and delivery of messages is handled by a so-called
4.1.1. Limitations of 2APL for simulation
As the agents can dynamically update their belief, goal, and plan bases, 2APL allows the implementation of many forms of complex proactive and reactive behavior. 36 However, 2APL has two main limitations that make it unsuitable for simulation.
4.1.1.1. Limitation 1 (scalability)
2APL is no exception to other agent programming languages with respective to limited scalability (as discussed in Section 2). More specifically, in 2APL, the amount of CPU resources that can efficiently be used in parallel is severely limited by the tight integration of agent execution and the execution of actions in the environment. This approach essentially makes 2APL a discrete-event-based system, where the events are any action performed by any agent in the environment. Because of the reasons outlined in the previous section, this poses an issue for efficiency if considered in a (distributed) simulation context.
4.1.1.2. Limitation 2 (lack of time-step synchronization)
As discussed above, 2APL allows agents to directly change the environment by their plan execution. As a consequence, the agents are executed asynchronously both from each other and from the environment, and 2APL does not contain an explicit notion of time. This means that agents may have different—possibly inconsistent—information about the environment’s current state. Furthermore, because compute time of agents can differ due to the scheduling of the Java Virtual Machine (JVM), in a simulation context, some agents may be able to perform more deliberation cycles than others in the same (simulated) time, posing difficulties for the repeatability of a simulation.
4.2. Sim-2APL
Sim-2APL addresses the two limitations of 2APL outlined in the previous section. First, Sim-2APL separates the agent deliberation from the execution of actions by deferring the execution of actions until
Together,
4.3. Normative reasoning in Sim-2APL
We now present the normative reasoning process (i.e., reasoning about
We define a norm
The normative reasoning process is implemented in the agents’ plan schemes, and its integration in the agent deliberation cycle is shown in Figure 5. During deliberation, before selecting a plan for some goal, the agent first finds all norms that

The normative deliberation cycle of a Sim-2APL agent starts with instantiating a plan for each trigger (transformed by the normative reasoning in the “Normative Reasoning” box) through the plan application rules in the plan schemes (“Plan Selection”) after which the instantiated plans are executed (“Plan Execution”) resulting in a ordered list of external actions.
The agents’ attitudes are determined by particular beliefs they hold regarding their capabilities, the risks, and (observations of) the behavior of other agents. We call these beliefs
where
In the next section, we describe the norms and factors used in our epidemic simulation case study based on the NPIs implemented in Virginia, USA, against the spread of COVID-19, and how trust is computed.
5. A simulation of compliance with COVID-19 regulations
To illustrate and evaluate our framework, we describe the development and calibration of a simulation of behavioral responses to interventions during a pandemic in the context of COVID-19 (the full simulation model is available at https://github.com/A-Practical-Agent-Programming-Language/Normative-COVID-19-Simulation 92 ). The key components of this simulation are illustrated in Figure 6.

COVID-19 simulation setting.
We start with a synthetic population of the US state of Virginia, where individuals are assigned realistic demographics, weekly activity schedules, and activity locations drawn from surveyed behavior and real location data. In our simulation, each individual in the synthetic population is represented by a Sim-2APL agent that reasons about whether to comply with the various NPIs that were implemented in the state of Virginia and that we model as
In the following, we provide details about the data used to develop the simulation, and about the implementation of the agents and disease models.
5.1. Data sets used in the simulation
To model the agents and to calibrate and evaluate the simulation, we make use of the five data sets described below.
5.1.1. Synthetic population of Virginia, USA
Agents in our simulation are drawn from a synthetic population of the state of Virginia, USA. This synthetic population has been constructed from multiple data sources including the ACS, the National Household Travel Survey (NHTS), and various location and building data sets, as described in Adiga et al.
78
This gives us a very detailed representation of the region we are studying (multiple counties within Virginia). Agents are assigned demographic variables drawn from the ACS, such as age, sex, race, household income, or, optionally, a designation as an essential worker, e.g., medical or retail. The behavior of agents is characterized by weekly activity schedules, a set of typical daily activities the agents perform over the course of 1 week obtained by integrating data from the NHTS. The activity schedule defines the location, start time, and duration of the agent’s activities as one of seven distinct high-level
5.1.2. Mobility data
To model the changes in agents’ mobility due to various executive orders (EOs) implemented between March and July 2020, we use cellphone-based mobility data provided by Cuebiq. This data set contains location pings generated from the cellphones of a large number of anonymous and opted-in users throughout the United States. Cuebiq collected data with informed consent, anonymized all records, and further enhanced privacy by replacing pings corresponding to home and work locations with the centroids of the corresponding Census blockgroups. We aggregate the data to the county level as follows. First, we calculate the average
5.1.3. COVID-19 case data
We use county-level COVID-19 case data from US facts to calibrate the disease model in our simulation. A caveat is that the number of confirmed cases likely under-counted the number of actual cases substantially, especially early in the epidemic, due to limited testing. To account for this, in Section 5 we determine, via calibration of the simulation, a scale factor which we use to multiply the number of reported cases.
5.1.4. Disease model parameters
Table 3 shows the parameters with which we instantiate the disease model formalized in Section 3. The infectivity of symptomatically and asymptomatically infected agents (isymp and iasymp resp) determine the probability of an infected agent infecting a susceptible agent if they spent one unit time (300 s here) in the same location, and are obtained through calibration, which is further explained in Section 5. After being exposed, the agent automatically transition to the next state after a
Simple contagion model (COVID-19 disease model).
Obtained through calibration.
5.1.5. EOs in Virginia
We use a data set of EOs that were implemented in each state in the United States,
96
collected from the Johns Hopkins Coronavirus Resource Center.
97
From this, we extract the EOs that were implemented in Virginia in the period between 1 March and 30 June 2020. In our simulation, EOs are represented by one or more

Timeline of the nine major EOs implemented in Virginia, USA, between March and June 2020. The column on the left indicates the norms belonging to the nine EOs. For each norm in an EO (e.g.,
Table 4 provides the semantics of these norms. The parameters that are associated with the norms in the norm specify the applicability of a particular instance of the norm to particular types of activities or agents. For example, the
A brief explanation of the norms enforced in our simulation and of their parameters.
HE: higher education.
5.2. Instantiating the behavior model
We instantiate the model such that each agent in the simulation represents one individual in the simulated counties of Virginia. These agents draw their demographic characteristics from the synthetic population and adopt (to-do) goals to perform activities from representative weekly activity schedules. The simulation progresses in discrete time steps that each represent 1 day. During the deliberation for a time step
In the following subsections, we describe the norms and the factors that we implemented in our COVID-19 simulation.
5.2.1. Norms and normative reasoning
We distinguish two types of norms:
91
regimented norms (R)—which cannot be violated by the agents—and nonregimented (NR)—which the agents can autonomously decide whether to comply with or not. In the case of R norms, the agents are regimented to always exhibit an
Which activities are affected by regimented (R) and nonregimented (NR) norms, and the factors influencing the decision to comply with each norm.
Applies only if the parameter
If the agent has the essential designation
Unless the agent as any essential designation.
Unless the activity location is marked
If agent has the essential designation
As a proxy for agents communicating or being refused at the door, we stochastically apply this norm to the percentage of agents that results, on average, on the maximum allowed still continuing with the activity.
This norm is applied only to visits with a duration of at least 20 min. Of those, we arbitrarily apply to only 10% because we have no data on what actual visits are restaurant visits. The transformation that is applied is
We have identified which of the activity types each norm that we simulate applies to. These activity types are shown in Table 5 (activities of type
As a special case, we mark the primary work location of each agent that is designated as an essential worker (i.e., with a belief indicating so), with the same designation. We also mark all locations that are used for the activity
For each norm that is not regimented and that applies to a goal of the agent, the agent calculates its attitude
In the case of Transformation
5.2.2. Trust
We assign each agent a value of prior
We assign each household either conservative with a probability equal to the fraction of Republican votes in the 2016 presidential elections in their county, and liberal otherwise. For each group, we create one beta distribution. For each agent, we sample their initial trust value at the start of the simulation from the beta distribution associated with the group of the household they belong to. The beta distributions are characterized by
The Cuebiq mobility data show an increase in mobility after the initial reduction in the first few weeks when the first set of measures was instigated, without relaxations to these measures being applied. This suggests decreased compliance with norms over time. While the cause of this increase in mobility is speculation, we chose to model it as a form of
5.2.3. Factors
We use five factors representing the beliefs that agents use as the evidence supporting norm compliance. All factors are real values in the range
In our model,
In the current work, the value for
5.3. Calibration
We now describe the calibration of the two components—the behavior model (as described in this section) and the disease model (Section 3)—of the simulation we have presented. We calibrate the key parameters of both models separately. The disease model is calibrated against the cumulative confirmed number of cases in Virginia to estimate the infectivity of symptomatically and asymptomatically infected agents. The behavior model is calibrated against the mobility index, i.e., the percentage change in radius of gyration in each county.
Although both behavior and disease components are enabled in every simulation run that we perform for calibration, we calibrate them independently in two separate processes, by fixing the component that is not the target of calibration for that process. The dynamics of one component affect the other, because agents base their decisions on observations of the behavior of other agents and in response to the disease progression, which in turn also depends on the behavior of agents. For this reason, we perform two rounds of calibration, where the first round serves to find initial estimates for both components, and the second round serves to obtain the final parameters.
Both calibration processes are performed by means of Nelder–Mead (NM) minimization.
101
NM iteratively refines an initial configuration of parameters until it finds a local optimum that minimizes a given objective function, in this case the root mean square error (RMSE) between observations of each day in the simulated period in the real world and in the simulation initiated with the tested parameter configuration. The calibration simulations were performed using data from the counties of Charlottesville (
5.3.1. Agent parameters
We calibrate the four parameters of the agent model introduced in Section 10.2.2, i.e., the
We perform the first round of behavior calibration simulations with the parameters of the disease model fixed to
5.3.2. Disease model parameters
We calibrate two parameters of the disease model, the base level infectivity of symptomatic
5.3.3. Calibration results
Both models were calibrated until the objective function did not improve for
Calibration results.
RMSE: root mean square error.

The mobility index observed in the simulation (solid lines) plotted against that recorded by Cuebiq (dashed lines) in each of the simulated counties of Charlottesville (a), Fluvanna (b), Goochland (c), and Louisa (d), and the cumulative percentage of the infected simulated population plotted against the number confirmed cases (
Table 6 shows that the best parameters for the disease model were obtained in the first round, and that the performance of both the behavior model and disease model was worse in the simulation where the final calibrated models were combined than what was obtained with the last calibration round. We believe this is due to minor changes in the starting configuration having a big impact on the final outcomes, so the calibration results in a precarious optimum that depends on the values with which the fixed component was instantiated. In the combined round, the fixed model had slightly different parameters than were used to obtain the calibrated values, but no further calibration took place, so a small negative impact on the results is to be expected. From the results shown in Table 6, it further appears the behavioral model is not very sensitive to the configuration of the disease model, with all parameters and the RMSE being very similar after both calibration rounds (although very different from the starting configuration). This also seems a valid conclusion from the small standard deviation (which is plotted, but hardly shows up in Figure 8) between the five simulations. The outcome of the disease model calibration does differ more starkly between simulation rounds, which we believe indicates a higher sensitivity to both the behavioral model and to the random perturbations resulting from the very low probabilities it concerns. In fact, looking only at the isolated RMSE of the disease component, the best fit appears to be obtained with the uncalibrated behavior model, and the worst fit with the behavior model fixed to the parameters obtained from the second round of calibration. However, both models improved on previous work, where the best RMSE obtained were
Comparing Figure 8(a)–(d), it can be seen that the recorded mobility changes varied widely from county to county. Despite our simulation using the same parameters for each county, it was still able to reproduce some of this variation, albeit less strongly than was recorded. The simulation was able to approximate the changes in mobility best in the county of Louisa (8d), closely followed by Goochland (8c). The observed mobility change in Charlottesville seems to be an outlier compared with the other simulated counties, with mobility
From Figure 8(e), it can be seen that initially the outbreak in our simulation grows slower than what was recorded, but that the simulated number of recovered agents overtakes the (scaled) number of recorded cases after about two-thirds of the simulation. The disease progression in the simulation is clearly exponential, while the recorded progression appears to be more linear (or even logarithmic). Whether this is (a) the result of under-testing in that period of time, (b) the result of randomness caused by the small numbers reported, (c) only showing the very start of a curve that is in actual fact exponential, or (d) an accurate reflection of the actual disease progression is unclear, but since the disease progression is exponential by necessity in our model, we do not expect we can obtain a much better fit with the current data.
5.4. Discussion
We have attempted to minimize the number of free parameters from which the model can be instantiated, to reduce the chance of over-fitting during calibration, but intercounty variations in mobility patterns suggested that mechanisms for decision-making should not be uniformly applied to all agents in the population. The heterogeneity of demographics that is spatially realistically assigned to agents should be able to account for some of this variation. However, on top of that, we looked for a mechanism specifically aimed at trust—which underlies all agents’ attitudes—that we could use to distinguish different counties. Early in the pandemic, the topic of mask wearing has been considered highly polarizing in the United States. Although the media acknowledged many factors to be at play, partisanship was cited as highly influential in the context of mask wearing102,103 and social distancing,104,105 and correlations between the type of news consumption and inclination to wear masks have also been reported
106
(although it should also be noted that media polls suggest the effect of partisanship is grossly mis-estimated by opposing sides of the spectrum
107
). We have not been able to find detailed statistics about types of news consumed in each county, so instead generalize to political voting data, which is available on a county-by-county level. However, the values for the means
To the best of our knowledge, our calibration approach is the first in the context of computational epidemiology that explicitly takes mobility data into account to calibrate the behavior of agents. Contrary to other data-driven models at scale, human decision-making is a key component of this model, and the parameters of that behavior should be grounded in data through calibration. Direct data on human decision-making in the context of NPIs are not available at the scale required for calibration, but changes in mobility provide a suitable proxy,93–95 because a large number of the NPIs under consideration were aimed at reducing mobility.
6. Experiments
We now present two experiments to evaluate our framework, for which we use the calibrated simulation model presented in Section 5.
6.1. Scalability
For the purposes of the scaling experiments, we selected synthetic populations of eight counties in the state of Virginia, USA, with varying sizes, as well as the whole state of Virginia. Table 7 shows the number of persons, households, and their weekly activity schedule (location visits) in the synthetic populations of the selected counties, and of the entire synthetic population of Virginia. To understand the scalability of PanSim + Sim-2APL, we ran individual simulations for each of the eight counties, with each simulation simulating 1 week, representing the week starting from 2 March 2020, at which time the most interventions were active. For the eight counties, simulations were run with 40, 80, 160, and 320 CPU cores, and for the whole state of Virginia, the simulations were run on 320 and 640 CPU cores. All simulations were run on compute nodes that each have two Intel Xeon Gold 6148 CPUs with 20 CPU cores each. The compute nodes used to run the experiments also had 384 GB of DDR4 RAM Memory and were connected to each other with Mellanox ConnectX-5 network adaptors. Each simulation was run 10 times, and the running times were noted.
The counties of the state of Virginia and the whole state used for the experiments, along with the number of persons, households, and weekly location visits in the synthetic population.
We study scaling in two ways. First, we study strong scaling by keeping the problem size fixed and increasing the number of CPU cores. This is done by running the simulation for each county with the four sets of core counts listed above, and for the whole state of VA with the two sets of core counts. The expectation is that the running time should decrease smoothly as the computational resources increase.
Second, to study weak scaling, we keep the computational resources fixed and increase the problem size. This is done by comparing the running times for simulations of increasingly larger counties, while keeping the number of CPU cores fixed. We carried out this experiment for all four sets of CPU core counts as well. The expectation is that the running time should not increase too sharply as the problem size increases.
In both cases, the resulting performance curves should ideally be linear. However, communication overheads can make the curves nonlinear. There is also inherent nonlinearity in the structure of the problem, as the disease spread computation is quadratic in the number of agents simultaneously present at a location. It is also expected that at some point, the overhead of communication between distributed parts of the simulation becomes higher than the efficiency gained by distributing the computation across multiple cores. For smaller problem sizes (i.e., smaller counties), this should become apparent with fewer cores.
6.1.1. Strong and weak scaling results
Figure 9 shows the variance in the runtime of the simulations when run with different numbers of CPU cores. We can see in Figure 9(a) that when the same simulation is run with increasing numbers of CPU cores (strong scaling), all eight counties show an almost linear decrease in run time up to 160 CPU cores on a log–log scale. In the case of the four smaller counties (the four lowest curves), increasing the number of CPU cores to 320 actually increases the runtime due the communication overhead becoming apparent, as discussed above. However, for a larger county like Henrico, the strong scaling results hold even with 320 CPU cores. For the simulation of the whole state of Virginia, increasing the number of CPU cores from 320 to 640 provides a speedup in runtime of 1.963.

The variance in mean run time of PanSim + Sim-2APL simulations with respect to (a) the number of CPU cores used and (b) the number of simulated individuals, for eight counties in the state of Virginia and the whole state of Virginia.
A similar story can be seen for the weak scaling results shown in Figure 9(b), which shows the runtime of simulations with increasing compute load (number of individuals in the county simulated). We can see that for counties with more than 100K individuals, increasing the number of CPU cores to 320 shows definite benefits. However, for the rest of the counties simulated, the benefits of increasing CPU cores are observed only up to 160 CPU cores as there the compute load is insufficient for the benefits of distributing it to offset the additional overhead of MPI communication.
These results demonstrate that PanSim + Sim-2APL simulations integrate well, and can be used to simulate large populations by distributing the computational load.
The PanSim platform has been designed to efficiently scale complex multicontagion simulations. It assumes that the socio-psychological models that are being simulated using it are computationally heavy. In particular, for the current implementation the compute times—per agent, per time step—needs to be in the order of 1 ms for good scaling. The deliberation cycles of agents in Sim-2APL took in our experiments on average 500 ms. Our scaling studies have shown that our agent models were computationally too “light” to fully achieve the benefits of distributing the simulation, and the communication overhead dominated execution time. In our experiments we also tested with simpler and computationally faster models that took approximately 1 ms on average per deliberation cycle. In that case, overhead of the PanSim system started to dominate the runtime of the simulations for the smaller six of the eight counties tested with 160 CPU cores. Traditionally the community appears to exhibit some restraint in using individual agent-based models as opposed to stochastic models due to their computational demand. Our results, however, suggest that individual agent models are not only feasible, but that more computationally demanding agent models (i.e., many and complex reasoning behavior) provide more favorable conditions for scaling than more simple (individual agent) models.
6.2. Counterfactual simulations
In this subsection, we report experiments aimed at illustrating how the proposed framework can be applied for policy study. In particular, we conduct a demonstrative policy study in which we quantify the efficacy of the various EOs in Virginia, taking into account the behavioral response of its residents to interventions, through counterfactual runs with our calibrated model described in the previous section. For this experiment, we employ the same
Given the list of
In each experiment, we compute the total number of agents that have been infected at the end of the simulation. We run each experiment 5 times to account for stochastic variation in the simulation.
Figure 10 shows the number of recovered agents at each time step in the simulations (SIR plots available in the code repository
92
), with the standard deviation of the five runs shown as the confidence interval.

Cumulative cases in our simulation (solid line, with standard deviation plotted as confidence interval) and in the real-world (
Table 8 shows the total number of infected or recovered agents over the course of the simulation. The experiment
The number of cumulative cases in each experiment.
From the results, the single most effective EO appears to be the second
The second most influential single EO appears to be the fifth, which extended the order to stay home when possible
Both EOs announcing relaxations of behavior interventions resulted in a slight increase in the number of infections. For the last EO, this increase was a difference of
The simulations presented in this work have served as a proof of concept to demonstrate the applicability of the proposed framework. It should be noted that various simplifications have been applied to the motivations of the agents and to the actual norms enforced. Moreover, while we use an experiment where we study the effectiveness of the various EOs to demonstrate how our framework can be applied, in our simulation the time when the various EOs were issued (including relaxations) were fixed to the dates they were implemented in Virginia, while in reality they have been issued in response to the actual spread of COVID-19 at that time. Nevertheless, the results reported above show that behavioral responses of individual agents to normative interventions, and not just the effect of an a priori uniform assumption on the level of compliance, can be studied through our proposed simulation framework.
7. Conclusions
In this paper, we have introduced a novel epidemic simulation framework that allows for large-scale distributed simulations with deliberative norm-aware agents. First, in Section 3, we have introduced PanSim, a novel epidemic simulation platform that can (a) integrate realistic behavior models that can capture the complex social dynamics of individual decision-making, (b) incorporate existing agent behavior modeling frameworks written in high-level languages such as Python or Java, (c) seamlessly combine fast epidemic simulation code—written in a low-level language but easily specified in a simple declarative language—with these behavior models, and (d) exploit distributed memory HPC systems. Second, in Section 4, we have presented Sim-2APL, an agent programming language, based on 2APL, that supports the flexible implementation and execution of deliberative agents in discrete time distributed simulations executed with PanSim to implement a realistic human-like behavior model of agents’ reasoning about compliance with nonpharmaceutical intervention norms. To demonstrate our framework combining PanSim + Sim-2APL, in Section 5, we have proposed a COVID-19 simulation with a population of agents instantiated from a representative synthetic population that explicitly models the normative reasoning of individuals about the NPIs issued in the state of Virginia in the United States in the period of March to June 2020. In contrast to similar simulations that uniformly apply compliance across the population, the agents in this work autonomously determine their attitude toward compliance with the NPIs based on a number of factors, most of which are dynamic and heterogeneous across the population. This attitude determines what locations agents visit which directly drives the disease progression. In this way, the (changing) behaviors of the population directly influence the progression of the studied disease. We have further shown that a data-intensive calibration process is feasible with the proposed framework and model. In Section 6, we have (a) demonstrated both the strong and weak scalability of this framework through simulations with a population of 8 million individuals from the state of Virginia, as well as through several smaller simulations with population size ranging from 20,000 to 180,000 agents, and (b) shown how such a simulation can be used to study the efficacy of interventions under varying and heterogeneous compliance.
PanSim has been designed with the intention that the behavior module can be instantiated by arbitrary high-level behavior modeling frameworks written in, e.g., Python or Java. This allows simulation authors to use other, perhaps existing, models to take the place of Sim-2APL in this paper, or even use precomputed activities as its input. Moreover, many social phenomena can be modeled as contagion processes similar to the spread of disease, such as the spread of information 108 and misinformation, 109 the spread of technologies,110,111 fashions, 112 changes in language113,114 and more.115–118 PanSim allows, in its simple declarative language, the specification of “behaviors” that can also be used to represent, e.g., outwardly communicated ideas, information, and use of technologies. Moreover, the semantics of what we have called “locations” in this paper are given by the agents and in PanSim are just nodes in a graph. These locations might equally well be substituted for, e.g., newspapers or web pages (where “visits” become “reads”). All this allows PanSim to be used as a more generic environment and distribution platform than for just computational epidemiology. Sim-2APL supports similar generality. It can be easily connected to other existing simulation environments and platforms (that may or may not already contain complex environmental dynamics) different from PanSim.
In future work, we intend to synchronize the message passing between 2APL agents. In addition, we intend to make both our approach for normative reasoning and our proof-of-concept COVID-19 simulation more realistic, leveraging additional theories from the field of ABMS, psychology, and sociology. For example, while the effect of trust on compliance with NPIs is supported by the literature, the trends on mobility in Virginia show that trust is not static. In this work, we have taken
Supplemental Material
sj-pdf-1-sim-10.1177_00375497231184898 – Supplemental material for A framework for modeling human behavior in large-scale agent-based epidemic simulations
Supplemental material, sj-pdf-1-sim-10.1177_00375497231184898 for A framework for modeling human behavior in large-scale agent-based epidemic simulations by Jan de Mooij, Parantapa Bhattacharya, Davide Dell’Anna, Mehdi Dastani, Brian Logan and Samarth Swarup in SIMULATION
Footnotes
Authors’ Note
Funding
Supplemental material
Author biographies
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
