Abstract
Keywords
Introduction
With the advent of high-precision transducers in conjunction with multi-faceted communication and computation hardware,
Besides physical sensing,
(a) Examples of physical sensing applications; (b) Examples of social sensing applications.
While physical sensing has an established reputation for accurately capturing raw data from the environment, it suffers from several fundamental limitations: (i) physical sensors are designed to be application-specific and are limited by the events they can sense (Khalil et al., 2014), restricting their sensing scope (e.g., a temperature sensor can only capture the surrounding temperature while a microphone is designed to only record sound); (ii) autonomous mobile physical sensing systems such as networks of unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) still require some form of human assistance to locate events of interest, despite being autonomous (Rashid et al., 2019a); (iii) physical sensors are typically scarce resources and need to be deployed sparingly, making their sensing coverage limited (e.g., a group of ground robots might not be able to cover a large forest during a wildfire) (Casbeer et al., 2005); (iv) stationary physical sensors such as proximity sensors and surveillance cameras are installed in particular locations and cannot be relocated easily (Rashid and Wang 2021); and (v) physical sensors have an initial deployment cost as well as periodic maintenance costs (Blaszczyszyn and Radunovic 2008).
Social sensing enjoys an array of benefits not typical in physical sensing, such as: (i) multifaceted information acquisition (e.g., people who report traffic incidents on social media can also report crime incidents) (Wang et al., 2019a); (ii) greater mobility (e.g., human sensors tend to spontaneously move from one location to another in contrast to stationary physical sensors) (Zhang et al., 2019b); (iii) lower management costs (e.g., hardware sensors require periodic maintenance and repairs, whereas human sensors do not require such service from the application end) (Li et al., 2019); and (iv) wider sensing coverage due to the pervasive nature of social signals and the active participation of individuals (e.g., any person possessing a smart device with Internet connectivity can post on social media from any part of the world) (Wang et al., 2012b). However, despite its immense benefits, social sensing also has a number of drawbacks: (i) inconsistent reliability since social sensing innately relies on noisy social signals contributed by unvetted human users (e.g., people can report observations that are biased or influenced by personal views) (Zhang et al., 2018b); (ii) uncertain data provenance since human sensors tend to be correlated and may propagate rumors or falsified facts initiated by other users (Shang et al., 2019); (iii) limited sensing availability since social sensing relies on the participatory nature of individuals (e.g., people may be less interested in certain types of public occurrences and not report them through crowdsensing platforms) (Zhang et al., 2018g); (iv) privacy concerns whereby the personal information of the participants of social sensing remains at risk of falling into the wrong hands (e.g., the whereabouts of an individual may be obtained from crowdsensing apps and used by criminals to threaten them) (Pournajaf et al., 2016); and (v) unstructured data since human sensors can use any combination of text (which can further consist of emojis, special characters, and different languages), images, or video to report on social data platforms (Zhang et al., 2016).
Motivated by the complementary virtues of social and physical sensing,
The architecture of a social airborne sensing (SAS) system.
A few other notable application domains empowered by SPS include urban search and rescue (Dubey 2019), smart healthcare (Chen et al., 2018), simultaneous localization and mapping (Jiang et al., 2019), human mobility modeling (Noulas et al., 2012), and anomaly detection (Lyu et al., 2016). Figure 3 highlights several recent examples of representative SPS applications which encompass: (i) anomalistic crowd detection with social media and surveillance cameras; (ii) social vehicular sensor network (S-VSN)-based plate recognition; (iii) fire monitoring with UAV and crowdsensing; (iv) road damage detection with satellites and social media; (v) crime reporting with wireless sensor networks (WSN) and crowdsensing; and (vi) contact tracing with social media and wearable sensors. The key design philosophy of such SPS applications is to harness the complementary information from social and physical sensors and draw a complete picture of real-world occurrences that otherwise might not be possible with standalone sensors. For instance, in an anomalistic crowd detection application based solely on networked surveillance cameras, the cameras might only be able to detect crowd events of interest (e.g., election campaigns, protests) and estimate their size without deducing the key attributes of the crowds, such as nature and cause. In contrast, people might post their plans for gathering in public places across social media platforms (e.g., Twitter, Facebook) and post real-time updates on the progress of the crowds. However, the size and exact duration of the crowds might not be attainable from just the social media reports. When the complementary information from the social and physical sensing sources is merged, it can potentially be used to infer the critical attributes of the crowd (e.g., duration, nature, and cause of the crowd) and tell the complete story behind the crowd gathering in the first place (e.g., for staging a public demonstration in support of a protest).
Examples of representative SPS applications: (a) Anomalistic Crowd Detection with Social Media and Surveillance Cameras; (b) S-VSN-based Plate Recognition; (c) Fire Monitoring with UAV and crowdsensing; (d) Crime Reporting with WSN and Crowdsensing; (e) Damage Detection with Satellites and Social Media; and (f) Contact Tracing with Social Media and Wearable sensors.
While SPS promises the groundwork for a paradigm shift in sensing and data collection, it also brings new challenges to address. Examples of such challenges include: (i) How to simultaneously collect relevant data from multitudes of social and physical sensors scattered around the world and reliably relate the collected data to each other given their diverse characteristics? (ii) How to efficiently handle the complex interactions between the human, cyber, and physical components in SPS when melding social sensing with physical sensing? (iii) How to handle the data and device heterogeneity originating from the two distinct sensing paradigms (e.g., text data from social media vs. image data from cameras)? (iv) How to characterize the dependency and correlation between the data sources when physical and social sensors are melded together? (v) How to ensure end-user privacy and security considering the diverse sets of complementary information contained in the social and physical sensing mediums (e.g., geo-location data from mobile devices can be combined with information from social media posts of users to reveal sensitive information)? (vi) How to adapt to the intricate dynamics that arise when jointly exploring the physical world and the social domain (e.g., how to concurrently cope with the rapidly evolving physical world events and the escalating social media reports during an emergency response)?
Although the above challenges impose difficulty in developing effective SPS systems, they also set forth opportunities to instigate future research directions. To address the highlighted challenges, we envision the potential to incorporate techniques from multiple disciplines, such as networked sensing, communication systems, estimation theory, control theory, artificial intelligence (AI), distributed systems, and cryptography. Several current survey papers on physical sensing have investigated the functionality and features of recent physical sensing approaches (e.g., roadside surveillance systems, wildfire monitoring systems, indoor localization using wireless networks) (Lee and Gerla 2010; Zafari et al., 2019). On the same note, several survey papers on social sensing have provided comparative studies on representative social sensing schemes (e.g., fuel availability finder using crowdsensing apps, social media-driven interesting place discovery) (Ferreira et al., 2019; Xintong et al., 2014; Li et al., 2016). While a few survey papers have explored some sensing approaches that fall at the intersection of social sensing and physical sensing and are partially related to SPS (Shi et al., 2011; Zeng et al., 2020; Dressler 2018), they do not focus on an extensive overview of the SPS paradigm itself or present a comparative study of existing SPS applications. Most importantly, past studies have not fully addressed the need for highlighting the key challenges prevalent in emerging SPS systems, which are necessary for designing, implementing, and evaluating emerging SPS systems and applications. This survey paper aims to reduce this knowledge gap in the existing literature and extensively explore SPS.
The rest of the paper is organized as follows. Section 2 presents an in-depth overview of SPS. Section 3 outlines the key enabling technologies for SPS. In Section 4, we identify the different applications propelled by SPS and discuss the corresponding state-of-the-art solutions. Section 5 elucidates the key potential research challenges in constructing reliable and pervasive SPS. In Section 6, we highlight a few research directions and opportunities for future work in SPS to mitigate the identified challenges. Lastly, in Section 7, we reflect on our findings and conclude our survey of SPS.
Overview of SPS
This section provides a detailed overview of social-physical sensing (SPS). Specifically, we discuss the deficiency of earlier literature in defining SPS and describe the possible formats of SPS.
Before detailing the underpinnings of SPS, it is essential to highlight why prior studies have not acknowledged the need for a generalized definition of SPS. First, depending on the application context, the lines between social and physical sensors often tend to be blurred. For example, at first glance, an urban air quality monitoring application that uses a crowdsourcing app and social media to take user inputs for assessing the air quality might appear to be a purely social sensing application. However, if the application utilizes the GPS and accelerometers of the users’ smartphones to determine the location and position of the users or relies on images taken by the users through the crowdsensing app (e.g., pictures of the sky or surroundings), the application also involves physical sensors. As such, it can be categorized as an SPS scheme. Since there are diverse ways of intertwining the plethora of social and physical sensors in applications that can be classified as SPS, there is no single widely accepted definition of SPS. Second, while SPS is a versatile sensing paradigm, it is a relatively new sensing paradigm that has not been extensively explored by existing literature. A few early survey papers have attempted to discuss sensing approaches that incorporate social and physical sensors such as cyber-physical-social systems (CPSS) (Dressler 2018) and cyber-social systems (CSS) (Wang et al., 2019c). However, such papers solely discuss mapping physical and social sensors to cyberspace by considering the entities as black-box information retrieval tools. Moreover, survey papers on CPSS and CSS primarily focus on controlling or monitoring physical processes through feedback loops without explicitly defining SPS.
As illustrated in Figure 4, SPS encompasses several diverse domains based on the application requirements and the data acquisition tools involved. While there are no strict classification criteria for SPS schemes, the applications in SPS may be broadly classified into a few significant types, as discussed below.
The first major type of SPS involves information acquisition from reports obtained from
The second major type of SPS melds
The third major type of SPS involves
The fourth major type of SPS combines
While the discussed categories represent the major formats of SPS applications, different variants of SPS can be further combined based on the application criteria since there are no absolute boundaries across the application types. For example, in a search and rescue application in the aftermath of an earthquake, locations of potential victims can be collectively gathered from social media posts and crowdsensing-based crisis reporting apps. Subsequently, ground robots might be dispatched to the reported locations to validate the information from the social data platforms.
By leveraging the collective wisdom of social and physical sensors, SPS can sense the real world and help control and actuate critical real-world processes. Examples of such control processes include mitigating traffic accidents, reducing the spread of diseases, and preventing crimes in high-risk areas. While traditional social and physical sensing systems focus on acquiring environmental stimuli, SPS applications aim to bridge the gap between the social and physical worlds by establishing a closed-loop system connecting the human, cyber, and physical worlds. To accomplish the above objectives, SPS requires careful coordination and interaction between essential enabling technologies, which are discussed in the following section.
An overview of the SPS paradigm.
Enabling technologies
This section discusses the key enabling technologies that form the foundation of SPS. In Figure 5, we present an abstraction model comprising the fundamental enablers for SPS. The bottom-most layer is the
Abstraction layers making up SPS.
Figure 6 illustrates a few examples of the enabling technologies: (i) the data acquisition platforms can be any combination of sensor-fitted autonomous UAVs, surveillance cameras, social media websites like Twitter, or crowdsensing apps; (ii) the communication technologies and protocols can comprise WiFi, Bluetooth, LTE, or MQTT; and (iii) the computing paradigms can be made up of distributed compute nodes and edge devices like smartphones. In the following section, we detail the functionality of each key SPS enabler.
Examples of the key enabling technologies for SPS:
Data acquisition platforms
An essential component of the sensing process in SPS is data collection. The key drivers for data acquisition in SPS can be classified broadly into social and physical data platforms. The details of the platforms are discussed below.
Social data platforms
Intuitively, social data platforms embody the mediums of information retrieval where human sensors are directly involved in synthesizing knowledge. Recent literature such as (Batrinca and Treleaven 2015; Olteanu et al., 2019) has extensively reviewed solutions incorporating social data platforms. Social data platforms can be further subdivided into two types.
The first type of social data platform is
The second type of social data platform is
Crowdsensing can be further divided into two subcategories. One variant of crowdsensing is non-monetized crowdsensing, where individuals perform small sensing tasks on a
Physical data platforms
As the name implies, physical data platforms are made of hardware sensing devices for data capture (e.g., cameras and thermal scanners) (Khalil et al., 2014). Considerable effort has gone into developing energy-efficient and high-resolution transducers and electronic devices for physical sensors. Examples of such schemes can be found in (Babiceanu and Seker 2016; Stavropoulos et al., 2020).
The first form of physical data acquisition tools is based on
While the physical data platforms are shared with other applications, such as IoT, one crucial distinction exists. In IoT and other related applications, the data acquisition platforms only consist of fixed and mobile physical sensors (Yasumoto et al., 2016) and often do not entail social media portals or crowdsensing apps. However, in SPS, the data sources additionally require social media and crowdsensing platforms as the fundamental drivers of knowledge. In SPS, the confluence of the social and physical data platforms helps to collect an extensive and comprehensive representation of the physical world. As an example of how the complementary information from social and physical data platforms in SPS can be leveraged to retrieve knowledge from the real world, let us consider a post-disaster resource monitoring application based on social vehicular sensor networks (S-VSN). Following a disaster (e.g., hurricane or flood), locating vital resources such as fuel and pharmacies is critical. People often report information about such resources on social media websites such as Twitter. However, the availability of fuel at gas stations or the chances of a pharmacy being open might change at any time following the disaster. Car drivers driving nearby can be dispatched to the reported locations of the vital resources based on the tweets. Afterwards, the onboard sensors of the cars (e.g., dashboard cameras) can be used to confirm or debunk the information about the availability of the resources. Thus, the mutual information exchange between the social and physical data acquisition platforms enables SPS applications to perceive and interpret real-world phenomena with greater fidelity.
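The post-disaster S-VSN example above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an implementation from the cited literature: the `ResourceClaim` record, the `pick_vehicle` and `verify_claim` helpers, and the planar coordinates are all hypothetical names introduced here, and the "sensor observation" is reduced to a single boolean.

```python
from dataclasses import dataclass


@dataclass
class ResourceClaim:
    """A social media claim about a resource (hypothetical record layout)."""
    location: tuple          # (x, y) position of the reported resource
    resource: str            # e.g., "fuel" or "pharmacy"
    reported_available: bool
    status: str = "unverified"  # becomes "confirmed" or "debunked"


def pick_vehicle(vehicle_positions: dict, target: tuple) -> str:
    """Choose the vehicle closest to the target (squared Euclidean distance)."""
    return min(
        vehicle_positions,
        key=lambda v: (vehicle_positions[v][0] - target[0]) ** 2
        + (vehicle_positions[v][1] - target[1]) ** 2,
    )


def verify_claim(claim: ResourceClaim, observed_available: bool) -> ResourceClaim:
    """Compare the social claim with what the dispatched vehicle observed."""
    if observed_available == claim.reported_available:
        claim.status = "confirmed"
    else:
        claim.status = "debunked"
    return claim


if __name__ == "__main__":
    claim = ResourceClaim(location=(2.0, 3.0), resource="fuel", reported_available=True)
    vehicles = {"car_a": (0.0, 0.0), "car_b": (2.0, 2.0)}
    print(pick_vehicle(vehicles, claim.location))
    # Assume the dashboard camera shows the station is closed.
    print(verify_claim(claim, observed_available=False).status)
```

A real system would replace the boolean observation with the output of an image classifier and would weigh several, possibly conflicting, reports per location; the sketch only shows the direction of the information flow from social claim to physical verification.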
Communication technologies and protocols
The data exchange between the entities in SPS is enabled by diverse communication technologies and protocols (Al-Fuqaha et al., 2015). Based on the application context (e.g., critical vs. non-critical), nature of the environment (e.g., outdoor vs. indoor), and energy profiles of the data sources (e.g., battery-powered UAVs vs. fixed surveillance cameras), appropriate networking standards and protocols can be incorporated, a selection of which are discussed below.
Ubiquitous local wireless connectivity and cellular technology
In SPS, communication across the entities (e.g., UAVs, data centers, and smartphone apps) relies on ubiquitous local wireless connectivity and cellular technology. One can read more about local wireless standards and cellular technology in (Mahmood et al., 2015; Sidhu et al., 2007). Commonly used connectivity methods in SPS include WiFi and Bluetooth, which utilize radio waves to transfer data among connected devices (Rashid et al., 2015). For longer-range communication in SPS or fast-traveling mobile physical sensors (e.g., cars, UAVs, and UGVs), cellular technology is preferred, specifically the LTE (Long-Term Evolution) and the newer 5G standards, which are treated as the norm for high-speed data transfer (Sesia et al., 2011). We note that the above ubiquitous local wireless connectivity and cellular technology can also be used in other related applications such as IoT and WSNs. However, in an SPS context, human sensors do not directly use such connectivity options (e.g., WiFi or LTE) to communicate their observations. Instead, human sensors leverage user interfaces (UI) on their personal devices (e.g., smartphone apps, websites on laptops) to input knowledge, which is eventually communicated through the highlighted ubiquitous local wireless connectivity and cellular technology. Figure 7 summarizes the state-of-the-art wireless connectivity standards enabling SPS, highlighting short-range standards such as WiFi and Bluetooth and longer-range standards such as LTE and 5G.
Overview of wireless connectivity that enables SPS.
Internet of Things (IoT) standards and protocols
The interconnection of the sensing devices in bandwidth-constrained SPS applications (e.g., vehicular sensors and surveillance cameras in an anomaly detection application) is facilitated by several Internet of Things (IoT) messaging standards and protocols (Al-Fuqaha et al., 2015). In the recent past, several energy-efficient IoT protocols have been developed, such as
Computing paradigms
Given the colossal amount of data generated in SPS applications, it is imperative to process and analyze the sensing signals to interpret valuable information in a scalable and efficient manner (Hashem et al., 2015). This paper focuses on two major computing paradigms that enable such analytics:
Cloud computing
Cloud computing is a distributed computing paradigm consisting of high-performance clustered computing nodes in a networked environment capable of processing huge volumes of data in parallel (Qian et al., 2009) and thus can serve as a powerful platform for analyzing the deluge of multi-modal data in real-time for SPS applications. Readers can find a comprehensive study of cloud applications in (Rimal et al., 2009).
Cloud computing provides global service interfaces to the heterogeneous entities in SPS applications (e.g., vehicular sensors, smartphones, and human sensors) to upload their data which is processed using specialized hardware in conjunction with efficient task scheduling frameworks. Recent advances in cloud computing that facilitate SPS applications include: (i) serverless computing, where cloud providers allocate machine resources for on-demand sensing tasks such as anomalistic crowd investigation using IoT sensors and crowdsensing (Hendrickson et al., 2016); and (ii) ThingSpeak, an open-source cloud framework for processing, analyzing, storing, and visualizing real-time sensing data concurrently from wearable sensors (e.g., fitness trackers and smartwatches) and social media platforms (e.g., Twitter, Facebook) (Maureira et al., 2011).
Edge computing
Edge computing is an efficient computing paradigm to conduct localized data processing on devices at the edge of the network (Zhang et al., 2019b) and is best suited for time-critical SPS applications such as disaster response. An extensive study on edge computing-based applications can be found in (Yu et al., 2017). In contrast to cloud computing, edge computing administers computation at the “edge” of the network, closer to the social and physical data sources. One key feature of edge computing is
We note that both cloud and edge computing paradigms are also incorporated in IoT and other similar applications in which they need to analyze continuous-time signals (Mahmud et al., 2017) along with images, videos, and audio data from physical sensors (Al-Fuqaha et al., 2015). However, SPS applications not only involve the above computation tasks but also require processing text data generated by human sensors, which is associated with greater computational complexity (Barkovska et al., 2021). Moreover, the text is often unstructured in nature and might contain misleading or sarcastic remarks that can further increase computational overhead.
The following section discusses a collection of existing representative SPS applications.
State-of-the-art SPS applications
Summary and comparison of representative SPS applications.
Contact tracing of infectious diseases using crowdsensing and smartphone sensors
In the field of epidemiology,
Concept of contact tracing with crowdsensing and smartphone sensors.
Several recent studies have attempted to meld non-monetized crowdsensing with Bluetooth and WiFi radios found in smartphones for COVID-19 contact tracing applications (Panduranga and Hecht 2020; Altuwaiyan et al., 2018; Bay et al., 2020). For example, Google and Apple launched a decentralized COVID-19 contact tracing framework called Exposure Notification System (ENS) that logs interactions with other ENS users using their smartphones’ Bluetooth radio (Google 2020) and augments it with crowdsensed data provided through mobile apps (Michael and Abbas 2020). MIT Media Lab further enhanced the ENS framework by developing a privacy-preserving location extrapolation mechanism with a smartphone’s GPS to deduce the approximate geographical location of a contacted person (Raskar et al., 2020). The scheme also allows healthy users to determine if they have “crossed paths” with any infected person (Panduranga and Hecht 2020).
The Singaporean government launched BlueTrace, a privacy-aware open-source COVID-19 contact tracing application based on Bluetooth-based localization and a voluntary crowdsensing application (Bay et al., 2020) that logs Bluetooth interactions between participating devices. When two devices “meet,” they trade encrypted messages with temporary identifiers, and anyone suspected of infection will be requested to share their contact history with the concerned authority. Altuwaiyan et al. proposed a contact tracing scheme with integrated WiFi and Bluetooth-based localization technology from smartphones combined with crowdsensing through a mobile app (Altuwaiyan et al., 2018). Once users test positive, they are presented with a questionnaire through the app to input their memory of historical contacts. A contact tracing project called A-Turf was undertaken to accurately detect “encounters” between users within close proximity (e.g., less than six feet) using user feedback reported through a crowdsensing app and acoustic signals emitted by smartphones (Luo et al., 2020). By determining the “footprint” of infected individuals, crowdsensing and smartphone sensor-driven contact tracing systems help to test, isolate, and treat potential contacts of infected people.
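To make the idea of trading temporary identifiers more concrete, the following Python sketch shows one way rotating identifiers and a local encounter log could work. It is a simplified illustration and not the BlueTrace or ENS protocol: the HMAC-based derivation, the `Device` class, the interval index, and the matching step are all assumptions introduced here, and real deployments add encryption, expiry, and an authority-mediated upload step.

```python
import hashlib
import hmac


def temp_id(secret: bytes, interval: int) -> str:
    """Derive a rotating identifier from a device secret and a time-interval index."""
    msg = interval.to_bytes(8, "big")
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()[:16]


class Device:
    """A hypothetical participating smartphone."""

    def __init__(self, secret: bytes):
        self.secret = secret
        self.encounters = []  # locally stored (interval, peer_temp_id) pairs

    def broadcast(self, interval: int) -> str:
        """The identifier this device would advertise during the given interval."""
        return temp_id(self.secret, interval)


def meet(a: Device, b: Device, interval: int) -> None:
    """Two devices in proximity log each other's current temporary identifier."""
    a.encounters.append((interval, b.broadcast(interval)))
    b.encounters.append((interval, a.broadcast(interval)))


def was_exposed(device: Device, infected_secret: bytes) -> bool:
    """Check whether any logged identifier was derived from the infected secret."""
    return any(temp_id(infected_secret, t) == tid for t, tid in device.encounters)


if __name__ == "__main__":
    alice, bob, carol = Device(b"alice-key"), Device(b"bob-key"), Device(b"carol-key")
    meet(alice, bob, interval=5)
    print(was_exposed(bob, alice.secret))    # bob logged alice's identifier
    print(was_exposed(carol, alice.secret))  # carol never met alice
```

Because the advertised identifier changes with every interval, an observer who records broadcasts cannot link them to one device without the secret, which is the property the temporary identifiers are meant to provide.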
Integrated social sensing and satellite-based environmental monitoring
Several recent studies in SPS have focused on applications integrating satellite-based remote sensing with social media and crowdsensing for capturing a wide range of visual features of the objects residing on the earth’s surface. Examples of such applications include urban land usage classification (Chi et al., 2017), poverty prediction in underdeveloped areas (Zhao et al., 2020), post-disaster damage assessment (Huang et al., 2017), risky traffic location identification (Zhang et al., 2018f), and flood inundation mapping (Rosser et al., 2017). Figure 9 exemplifies an integrated social sensing and satellite-based environmental monitoring scheme for analyzing human mobility in urban areas (Shao et al., 2021). Harnessing the mutual efforts of human sensors and physical sensors installed on satellites results in: (i) a more pervasive and fine-grained representation of the objects residing on the earth’s surface (Zhang et al., 2018f), (ii) a reduction of their individual weaknesses (e.g., slow update interval of satellites, poor location accuracy of social sensing) (Zhang et al., 2016), (iii) localized and real-time information for closely monitoring the environment, which is helpful for applications involving emergency response, smart cities, and environmental hazards (Ghamisi et al., 2018), and (iv) a greater spatial resolution, which is crucial for applications like land cover classification, distinguishing urban-rural regions, damage assessment, target identification, and geological mapping (Chi et al., 2017).
Scenario of integrated social sensing and satellite-based environmental monitoring.
The fusion of social sensing with empirical measurements from satellite-based remote sensing has opened opportunities for various interesting SPS applications. For example, Zhang et al. developed RiskSens, a multi-view learning approach to identify locations with high traffic risk by combining social media data with satellite imagery data (Zhang et al., 2018f). Chi et al. proposed Crowd4RS, a land usage and land cover classification scheme that combines satellite images in urban areas with geo-tagged social media photos for a more localized and fine-grained analysis (Chi et al., 2017). Rosser et al. designed a framework to infer flood inundation levels on different terrains by melding geo-tagged images from social media, optical satellite imagery, and high-resolution terrain mapping using a Bayesian statistical model (Rosser et al., 2017). Wang et al. presented an early warning system that fuses Twitter data with historical remote sensing data for detecting and predicting weather-driven natural disasters in near real-time (Wang et al., 2018). A Twitter-driven remote sensing approach has been developed to convert geo-tagged tweets into high-resolution raster images and integrate them with satellite-based nighttime lights to infer socioeconomic activities (Zhao et al., 2020). Another study has presented a framework to incorporate multi-sourced data from social media, remote sensing, and online databases through spatial data mining and text mining for post-disaster damage assessment (Huang et al., 2017). More recently, an integrated crowdsensing and remote sensing scheme has been proposed that combines remote sensing imagery and mobile phone positioning data for urban land usage mapping (Ghamisi et al., 2019). By exploiting the collective benefits of social sensing and satellite-based environmental monitoring, the above schemes facilitate a fine-grained interpretation of the earth’s geological features.
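As a rough illustration of how geo-tagged posts can be brought into the same grid as satellite imagery, the sketch below bins latitude/longitude points into a raster of counts and pairs each cell with a co-registered satellite value (e.g., a nighttime light intensity). It is not the method of any cited study: the function names, the regular degree-based grid, and the simple per-cell pairing are assumptions made for illustration only.

```python
import math


def rasterize(points, lat_min, lon_min, cell_deg, n_rows, n_cols):
    """Count geo-tagged points per grid cell; points outside the grid are ignored."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for lat, lon in points:
        # floor (not int) so that points below the grid origin get a negative index
        row = math.floor((lat - lat_min) / cell_deg)
        col = math.floor((lon - lon_min) / cell_deg)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] += 1
    return grid


def stack_features(post_grid, satellite_grid):
    """Pair each cell's post count with the satellite value of the same cell."""
    return [
        [(count, sat) for count, sat in zip(post_row, sat_row)]
        for post_row, sat_row in zip(post_grid, satellite_grid)
    ]


if __name__ == "__main__":
    posts = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.2), (-0.5, 0.5)]  # last one is off-grid
    counts = rasterize(posts, lat_min=0.0, lon_min=0.0, cell_deg=1.0, n_rows=2, n_cols=2)
    lights = [[0.1, 0.2], [0.3, 0.4]]  # made-up satellite values on the same grid
    print(counts)
    print(stack_features(counts, lights))
```

The stacked per-cell pairs could then serve as input features to whatever downstream model a given application uses; the choice of model is outside the scope of this sketch.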
Anomaly detection using social airborne sensing (SAS) and social vehicular sensor networks (S-VSN)
Social airborne sensing (SAS)
Social airborne sensing (SAS) is emerging as a new SPS application domain where social signals are used to dispatch unmanned aerial vehicles (UAVs) for perceiving anomalous occurrences in time-sensitive applications (e.g., disaster response, wildfire monitoring) (Rashid and Wang 2022b). Figure 2 in Section 1 illustrates the concept of representative SAS schemes (Terzi et al., 2020). SAS is motivated by the agility and empirical sensing capabilities of UAVs fitted with physical sensors (e.g., camera, LiDAR, and thermal scanner) (Casbeer et al., 2005) and the ubiquity of social data platforms (i.e., social media and crowdsensing). Thus, SAS attempts to leverage the collective benefits of UAVs and social signals to provide a more rapid response and wider sensing scope than other SPS approaches (e.g., approaches that use satellite imagery or fixed sensors like surveillance cameras). Specifically, SAS can deliver more rapid and timely data acquisition, especially in critical scenarios such as search and rescue missions, post-disaster response and recovery, and tracking potential suspects around crime scenes.
An SAS system collects and analyzes data from social media and crowdsensing platforms to locate probable events of interest (e.g., a person injured on a roadside, an area getting flooded, or buildings damaged by an earthquake) (Terzi et al., 2020; Rashid et al., 2019b). Afterward, UAVs are selectively dispatched to the extracted locations using various resource management policies (e.g., game theory, supply chain management, and reinforcement learning) to verify the authenticity of the event reports using their onboard physical sensors and augment the knowledge acquisition. Examples of SAS frameworks from recent literature include: (i) a path cheapest arc-based SAS scheme that incorporates calls for help from Twitter and dispatches UAVs for search and rescue missions (Terzi et al., 2020); (ii) a semantic web and machine learning-based SAS design for disaster management in urban areas (Sukmaningsih et al., 2020); (iii) a correlation-driven SAS solution for conducting disaster damage assessment in the aftermath of hurricanes (Yuan and Liu 2018); and (iv) a spatiotemporal-aware SAS framework that identifies latent correlations among reported event locations to dispatch UAVs selectively (Rashid et al., 2019a).
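The dispatch step described above can be illustrated with a greedy routing sketch in the spirit of a path-cheapest-arc heuristic: starting from the UAV's base, the route is repeatedly extended by the cheapest remaining arc, here the nearest unvisited event location. This is a simplified stand-in and not the implementation of any cited SAS framework; the function name, the planar coordinates, and the use of Euclidean distance as the arc cost are assumptions for illustration.

```python
import math


def cheapest_arc_route(start, targets):
    """Greedily visit targets, always flying the cheapest arc from the current position."""
    route = []
    remaining = list(targets)
    current = start
    while remaining:
        # The cheapest arc is the one leading to the nearest unvisited location.
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        route.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return route


def route_length(start, route):
    """Total distance flown along the route, starting from the base."""
    total, current = 0.0, start
    for point in route:
        total += math.dist(current, point)
        current = point
    return total


if __name__ == "__main__":
    base = (0.0, 0.0)
    # Hypothetical event locations extracted from social media reports.
    events = [(5.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    route = cheapest_arc_route(base, events)
    print(route, route_length(base, route))
```

A greedy route of this kind is fast to compute but not guaranteed to be optimal, and it ignores battery limits and report credibility, which the resource management policies mentioned above are designed to account for.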
Social vehicular sensor network (S-VSN)
While SAS schemes offer pervasive and accurate information retrieval in critical scenarios, they still require dedicated UAVs, which are expensive and scarce resources with limited flight times (Rashid et al., 2021). On the other hand, vehicular sensor networks (VSNs) have matured into a dependable networked sensing paradigm for vigilance and situational awareness along roadways that uses cars equipped with physical sensors (e.g., dashboard cameras) to opportunistically identify event occurrences (e.g., accidents on roads) (Zhang et al., 2008). In contrast to UAVs, harnessing existing vehicular infrastructure requires no additional dedicated sensing equipment or agents, which makes it less obtrusive and reduces deployment cost and time. However, one limitation of traditional VSNs is that the information collected by vehicles is restricted to the regions traversed by car drivers, which limits the sensing scope of VSNs and their adaptability in uncovering new events.
To this end, an integrated SPS paradigm, namely, social vehicular sensor network (S-VSN), has recently been studied to integrate social sensing with existing ground-based VSN to provide more scalable and widespread anomaly detection (Rashid et al., 2019c, 2020a). Figure 10 shows the concept of an S-VSN scheme where social media users report events of interest (Rettore et al., 2019). A social signal distillation model analyzes the reports to determine the locations of the events, while a vehicular task allocation model assigns exploration tasks for car drivers to travel to specified locations and analyze the events using car sensors. The concept of social vehicular sensor networks (S-VSN).
By augmenting the outreach of vehicular sensors with the ubiquity of social sensors, S-VSNs attempt to provide widespread sensing coverage and greater sensing accuracy than standalone VSNs. In specific scenarios, such as identifying risky traffic regions or discovering essential resources in the aftermath of a disaster in large areas (e.g., locating gas availability at gas stations), an S-VSN might be more feasible than an SAS. Recent examples of S-VSN frameworks include: (i) a community-aware S-VSN architecture for road traffic anomaly detection (Qiu et al., 2018); (ii) an S-VSN system for performing accident investigation in smart cities (Rettore et al., 2019); and (iii) a road damage-aware S-VSN scheme that uses a Markov Decision Process (MDP)-based damage discovery scheme to locate roads affected by damage after a disaster (Rashid et al., 2020a).
Automatic license plate recognition using crowdsensing and physical sensors
One recent SPS application domain is automatic license plate recognition (ALPR) based on crowdsensing (e.g., smartphone apps) and physical sensors (e.g., roadside units, vehicular sensors, and smartphone sensors). Figure 11 illustrates an SPS-based ALPR application where information from traffic monitoring devices (e.g., roadside cameras and dashboard cameras) is melded with human inputs from crowdsensing apps (e.g., Citizen, Waze, and Neighbors) to track down the plate number of a potential suspect’s vehicle fleeing a crime scene (e.g., a hit-and-run) (Ang et al., 2018). The analytics are typically conducted using deep-learning algorithms (Zhang et al., 2019; Ang et al., 2018). Thus, observations contributed by drivers, passengers, and commuters on roads might be integrated with knowledge from hardware sensors to narrow down searches by law enforcement personnel and swiftly locate perpetrators.
Overview of automatic license plate recognition using crowdsensing and physical sensors.
One crucial concern of SPS-based ALPR applications is their real-time requirements, where plate detection tasks are expected to be accomplished within certain time bounds in resource-constrained environments (e.g., the devices might have limited network bandwidth). Existing standalone ALPR approaches primarily focus on analyzing large volumes of video footage collected from surveillance cameras and stored on cloud platforms (Zhang et al., 2017c). However, such schemes often introduce a non-trivial data transmission delay when offloading the videos to the cloud, which is unfavorable for real-time plate detection. More recently, a growing number of ALPR schemes harness crowdsensing combined with existing vehicular sensors and IoT devices (e.g., vehicles equipped with dash cameras and smart devices owned by citizens) to form a city-wide video surveillance network that tracks moving vehicles by their license plates (Du et al., 2012). Zhang et al. developed EdgeBatch, an SPS-based ALPR task management framework, where reports about license plates of probable suspects from concerned citizens in crowdsensing apps are combined with inputs from IoT sensors (e.g., surveillance cameras) using collaborative edge computing resources to detect the license plates (Zhang et al., 2019a). Trottier presented the concept of a dashboard camera and crowdsensing platform-driven ALPR scheme for smart cities where video footage from dashboard cameras is analyzed by image processing algorithms and further augmented with inputs from crowdsensing participants through an app to recognize the number plates (Trottier 2014).
Despite their usefulness, ALPR approaches also raise privacy concerns in the collaborative sensing context of SPS applications. For example, car drivers might not be willing to share the metadata from their devices with the cloud for fear that such data may reveal their private information (e.g., location, speed, and driving behavior). To address such concerns, Alcaide et al. proposed a privacy-aware ALPR scheme that maintains the confidentiality of the users’ data and prevents unauthorized usage of private devices that are used for capturing and recognizing images of plate numbers (Alcaide et al., 2014). Another privacy-aware ALPR scheme masks and protects the identity of the owners of license plates recognized using data from crowdsensing apps and roadside monitors (Yan et al., 2011). By exploiting the knowledge from crowdsensing and physical sensors, SPS-based ALPR applications aid in tracking down potential criminals on roads (Zhang et al., 2019).
Situational awareness using social media and crowdsensing melded with IoT (Social/CrowdIoT)
The prevalence of IoT alongside social media and crowdsensing has opened new domains for situational awareness in SPS. Examples of such applications include real-time crowd density measurement, search and rescue operations, and urban anomaly detection (Kucuk et al., 2019; Atzori et al., 2010; Zanella et al., 2014; Brabham 2013). By integrating social media and crowdsensing with the IoT paradigm, the emerging areas of SocialIoT and CrowdIoT, respectively, can achieve results beyond what is possible with traditional standalone situational awareness approaches. Figure 12 illustrates a SocialIoT-based situational awareness application where information from Twitter and IoT-enabled flood measurement sensors can be combined to estimate the density of a flood (Mirza et al., 2022). The following subsections discuss a few variants of SocialIoT and CrowdIoT.
Overview of situational awareness with SocialIoT.
Integrated social media sensing and IoT (SocialIoT)-based indoor localization and tracking
In recent times, there has been a surge in SPS applications that focus on indoor localization based on contextual information provided on social media and raw signals from IoT devices. While GPS provides fairly accurate outdoor location tracking, the applicability of GPS for indoor tracking is limited primarily due to the inaccessibility of satellite signals inside confined spaces and lower degrees of precision. As such, accurate indoor localization schemes require additional infrastructure support (e.g., ranging devices) or extensive training before system deployment (e.g., WiFi signal fingerprinting). In indoor localization, networks of IoT devices are used to track people or objects in confined places where GPS and other satellite technologies usually lack precision or fail entirely, such as inside multistory buildings, airports, alleys, parking garages, and underground locations. Location-based services, such as targeted advertisement, geosocial networking, and emergency services, are becoming increasingly popular for mobile SPS applications (Jun et al., 2013; Liu et al., 2010).
To help existing localization systems overcome their limitations or enhance their accuracy, approaches have been developed that combine social media sensing with IoT for accurate indoor location tracking. For example, a scheme called Social-Loc has been proposed that integrates the physical traces of an individual posted through social media (e.g., check-ins to a particular shop in a shopping mall) with RSSI signals from WiFi routers to potentially derive the exact location of individual users within a building (Jun et al., 2013). Chu et al. designed SBOT, a social media and sensor network-driven indoor localization scheme which combines user statuses and updates posted through social media (using text mining techniques) with telemetry data from smartphone sensors (e.g., altitude, speed, and heading of the users) to pinpoint the location of the users (Chu et al., 2020). Liu et al. proposed a social-driven IoT system consisting of backpacks equipped with 2D laser scanners and inertial measurement units augmented with historical social network traces of the users to perform indoor localization and visualization of complex environments such as staircases or corridors (Liu et al., 2010). However, as the means of geo-locating people through their digital footprints multiply, concerns for individuals’ privacy also grow. As we will discuss later in Section V-E, the metadata obtained from the social and physical sensors in SPS for locating people risks revealing their private information. A few privacy-preserving SocialIoT schemes have been developed which aim to protect people’s identities while geo-locating them indoors (Hamza et al., 2020; Perera et al., 2015).
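As a rough illustration of how a social check-in might narrow an RSSI-based indoor estimate, the sketch below converts RSSI readings to distances with the standard log-distance path-loss model and then picks the venue whose router distances match best. It is not the Social-Loc algorithm: the function names, the reference power, the path-loss exponent, and the venue and router layout are all assumptions made for illustration.

```python
import math


def rssi_to_distance_m(rssi_dbm, ref_power_dbm=-40.0, path_loss_exponent=3.0):
    """Log-distance path-loss model.

    ref_power_dbm is the (assumed) RSSI measured 1 m from the router;
    path_loss_exponent is an assumed indoor attenuation factor.
    """
    return 10 ** ((ref_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def locate(rssi_by_router, router_positions, venue_positions, checked_in_venues=None):
    """Return the venue whose distances to the routers best match the RSSI readings.

    If checked_in_venues is given (e.g., venues named in recent social media
    check-ins), only those venues are considered as candidates.
    """
    candidates = checked_in_venues or list(venue_positions)
    estimated = {r: rssi_to_distance_m(v) for r, v in rssi_by_router.items()}

    def mismatch(venue):
        vx, vy = venue_positions[venue]
        total = 0.0
        for router, dist in estimated.items():
            rx, ry = router_positions[router]
            total += (math.hypot(vx - rx, vy - ry) - dist) ** 2
        return total

    return min(candidates, key=mismatch)


# Hypothetical floor plan (coordinates in meters).
routers = {"r1": (0.0, 0.0), "r2": (20.0, 0.0)}
venues = {"cafe": (10.0, 0.0), "shop": (10.0, 5.0), "bookstore": (0.0, 10.0)}
readings = {"r1": -70.0, "r2": -70.0}

best_signal_only = locate(readings, routers, venues)
best_with_checkin = locate(readings, routers, venues, ["shop", "bookstore"])
```

Here the check-in acts only as a filter on the candidate set; a real system would also have to weigh how recent and how reliable the check-in is.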
CrowdIoT-based context awareness
Several exciting CrowdIoT applications have emerged that are crucial to society’s well-being, including criminal identification and disaster response (Dunphy et al., 2015). Dunphy et al. proposed an integrated crowdsensing and CCTV-based video surveillance framework where surveillance footage collected from CCTVs spread across a city is assigned to Amazon MTurk participants to tag instances of abnormal occurrences in real-time (e.g., traffic accidents, crimes) (Dunphy et al., 2015). Abu-Elkheir et al. designed an integrated risk assessment framework using crowdsensing apps and fixed urban IoT sensors (e.g., proximity sensors, acoustic sensors, radars, and air quality monitors) that predicts the possibility of crises such as multi-vehicle accidents, major weather events, and large fires (Abu-Elkheir et al., 2016). Vital information from the framework might assist emergency personnel such as firefighters and first responders. Beyond surveillance-centric context awareness applications, stationary CrowdIoT-based SPS schemes are also used for locating regions of adverse weather and climatic conditions. For example, Horita et al. developed a flood inundation mapping (FIM) system that integrates crowdsourcing data with data from in situ weather radars to infer probable locations of a flood (Horita et al., 2018). Thus, building upon the tight integration of crowdsensing and fixed-sensor IoT devices, stationary CrowdIoT solutions (like the ones discussed above) facilitate rich context-aware SPS applications.
Another emerging context awareness sub-domain within SPS involves integrating crowdsensing with mobile devices and portable IoT devices, otherwise known as mobile crowdsensing (MCS). Applications integrating mobile sensors with crowdsensing in MCS utilize users with mobile devices capable of data capturing, computation, and communication to collectively share data and extract information to measure, assess, estimate, or predict processes of shared interest (Ganti et al., 2011; Ma et al., 2014; Guo et al., 2015; Wang et al., 2014a). Such mobile devices include smartphones, wearables, and tablet computers equipped with hardware sensors (e.g., GPS, microphones, and heart rate monitors) and sufficiently robust processing units (e.g., CPU, FPGA, and GPU). The ubiquity of such
In addition to the above critical mobile crowdsensing schemes, there have been significant works on utilizing smartphones and wearable sensors (e.g., sociometric badges, smart glass, fitness trackers, and smartwatches) for less critical applications such as: i) monitoring environmental conditions like noise (Zappatore et al., 2016) and air quality (Vahdat-Nejad and Asef 2018); ii) assessing infrastructural conditions such as traffic congestion (Yan et al., 2017) and road damage (Li and Goldberg 2018); and iii) determining most fuel-efficient travel routes (Ganti et al., 2010). The integration of crowdsensing and mobile sensors has also opened up new possibilities for exciting applications in disaster response. Han et al. (Han et al., 2019) proposed a crowdsensing and mobile-IoT integration model that aims to improve disaster response by using important metrics such as weather conditions, damage reports, and infrastructure accessibility derived from crowdsensing apps and portable devices equipped with RFID technology. Driven by the unification of crowdsensing with sensors contained in mobile devices, mobile crowdsensing schemes aim to provide a more holistic representation of the environment in SPS applications.
The following section discusses key research challenges prevalent in current SPS applications.
Fundamental challenges in SPS
Summary of schemes targeting the challenges in SPS and open research questions.
Data collection challenge
Before valuable knowledge can be interpreted in SPS, the relevant data must first be located, extracted, and organized. Thus, one of the critical challenges in SPS lies in simultaneously harvesting the raw sensor data from myriads of social and physical sensors (Stieglitz et al., 2018; Wang et al., 2011c; Zhang et al., 2019c).
The first obstacle in data collection is to systematically locate useful data from the inherently
Example of data collection challenge in SPS Applications.
Existing literature on social sensing has proposed methods to overcome the noise from social data platforms with techniques such as machine learning (ML) (Nur’Aini et al., 2015), artificial neural networks (ANNs) (Jagannatha and Yu 2016), estimation theory (Wang et al., 2019a), and adaptive sampling (Zhang et al., 2018h). Studies on physical sensors have proposed methods to reduce sensor noise using approaches like image enhancement with super-resolution (Zhang et al., 2020e), deep learning-driven noise reduction (Lai et al., 2018), and graph neural network-based data extrapolation (Wang et al., 2014d). However, such standalone approaches fail to address the intrinsic interdependence between the noise from social and physical signals in SPS, which is non-trivial to quantify and model.
The second obstacle is gaining access to sensing data from devices owned by individuals. While there is an abundance of connected devices that are able to perform a wide range of data capture, computation, and communication tasks, a significant number of them are privately owned (e.g., smartphones, IoT devices, and surveillance cameras) (Johnsen et al., 2018). Consequently, gaining access to such sensors’ data is difficult primarily because the individual entities might not be willing to share their personal devices due to reasons such as inconvenience, draining of battery on mobile devices, usage of cellular data, and privacy concerns (Wang et al., 2009).
Recent literature has presented several solutions, such as: i) privacy-aware schemes such as game-theoretic task allocation (Zhang et al., 2020a) and non-invasive distributed private data collection (Zhang et al., 2011); ii) energy-preserving data transmission schemes such as Bluetooth low energy (Heydon and Hunn 2012); and iii) bandwidth-conserving data sharing tools such as signal compression (Johnsen et al., 2018) and hop-by-hop flow control (Hull et al., 2003). These approaches might help convince people to provide access to their devices for obtaining sensor data. However, beyond the willingness of people to share their personal devices, the devices in SPS might be unavailable for capturing or providing access to the data. For example, a user might be using her smartphone to play video games or watch videos, making the device unavailable for capturing images and processing them efficiently. Therefore, collecting data from the social and physical realms that leads to the appropriate information remains an open challenge in SPS.
Human-cyber-physical interactions challenge
In SPS, one elemental challenge is handling the complex interactions between the human, cyber, and physical (HCP) components when integrating social sensing with physical sensing. As events in the real world play out, human and physical sensors are expected to spontaneously contribute knowledge through the social and physical data platforms to recover the truthful states of real-world occurrences. Given this basis, developing a closed-loop system that seamlessly integrates the social and physical sensing paradigm is crucial.
In such a closed-loop system, the social and physical sensors effectively communicate and complement each other to accomplish the assigned sensing tasks jointly. Existing research on human-computer interaction (HCI) has explored the need for designing effective interfaces to connect the human and cyber worlds, which include examples such as web interfaces, mobile applications, online forms and survey questionnaires, virtual reality (VR), and motion capture (Wich and Kramer 2015). In recent times, there has been a surge in research on cyber-physical systems (CPS), which explores the interactions between the cyber and physical worlds with a focus on the problems in sensing, computation, and control of a CPS (Zeng et al., 2020). Recent studies in CPS have proposed techniques such as embedding human intelligence into cyberspace and augmented reality-driven assistive technology for humans to reduce the gap between the human and cyber worlds (Hu et al., 2012; Zhang et al., 2020b). However, handling the HCP interactions in SPS is much more complex and challenging than the problems studied by existing HCI and CPS research.
While human users typically act as sensors in SPS applications, their roles as actuators must also be carefully considered. Consider the example in Figure 14, which shows a smart water monitoring application where crowdsensed water quality measurements are combined with physical water quality sensor data (Fascista 2022). If the participants do not contribute data of sufficient quality (i.e., not enough reliable data or a low participation level), incentives can be applied to encourage them to provide better-quality data. The incentives serve as control signals, and the participants act as actuators. Upon receiving higher incentives, the participants might respond in the physical world by: (i) collaborating to contribute more data; (ii) validating the data of their peers; or (iii) encouraging more people to participate by referring them to the app (Peng et al., 2015). Thus, the incentive serves as a signal from the cyber world (i.e., through smartphone apps) to control the response in the human world (i.e., the human participants), which in turn produces an action in the physical world (i.e., collecting and contributing higher-quality data). Such an adaptive closed-loop system requires a careful design that systematically models the complex HCP interactions.
Example of human’s role as actuators in SPS.
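The incentive loop described above can be sketched as a simple proportional controller. This is only an illustration under stated assumptions: the quality score in [0, 1], the target value, the gain, the incentive bounds, and especially the linear `participant_response` model are hypothetical and are not taken from the cited works.

```python
def update_incentive(incentive, quality, target=0.8, gain=5.0,
                     min_incentive=0.0, max_incentive=10.0):
    """One proportional control step.

    Raises the incentive when observed data quality is below the target,
    lowers it when quality is above, and clamps the result to the budget.
    """
    error = target - quality
    new_incentive = incentive + gain * error
    return max(min_incentive, min(max_incentive, new_incentive))


def participant_response(incentive, max_incentive=10.0):
    """Hypothetical actuator model: data quality grows linearly with incentive."""
    return 0.3 + 0.6 * (incentive / max_incentive)


def run_loop(steps=20, incentive=1.0):
    """Simulate the cyber -> human -> physical loop for a few rounds."""
    history = []
    for _ in range(steps):
        quality = participant_response(incentive)      # humans act on the incentive
        history.append((incentive, quality))           # cyber side observes quality
        incentive = update_incentive(incentive, quality)  # control signal is adjusted
    return history


history = run_loop()
```

With this toy response model the loop settles near the target quality; real participants respond with delays, saturation, and noise, which is part of what makes the HCP modeling problem hard.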
Current literature has proposed methods to develop closed-loop systems encompassing various sensors (e.g., cameras, GPS sensors), actuators (e.g., robotic arms, motorized doors), and controllers (e.g., proportional–integral–derivative (PID) controllers, fuzzy logic controllers, and reinforcement learning) for establishing effective cooperation between them using techniques such as linear quadratic Gaussian (LQG) control (Lee et al., 2019), supply chain theory (Zhang et al., 2019a), and blockchain-based smart contracts (Sathiyanarayanan and Sokkanarayanan 2019). However, the closed-loop challenge at the intersection of human, cyber, and physical spaces in SPS has not been fully addressed by existing research for several reasons. First, current solutions often do not explicitly model the human participants as actuators, which is a crucial feature of SPS applications. Second, current literature on incentive design in crowdsensing frameworks has not addressed how to use the physical sensors to validate the information contributed by human sensors. Third, existing approaches have not fully explored measures to leverage social signals to effectively control the performance of the physical sensors. Last, current solutions have not explicitly considered the joint dynamic nature of the human, cyber, and physical worlds to tightly coordinate their interactions. As such, addressing the HCP interaction prevalent in SPS systems remains an open challenge.
Device and data heterogeneity challenge
While the abundance of physical and social sensors in SPS provides a rich influx of knowledge across various sensing applications, an inherent challenge in SPS lies in managing the diverse range of devices involved in the sensing process and the different types of data they generate. We deem this challenge as
Scenario of data and device heterogeneity challenge in SPS Applications.
As identified in Section 3, SPS applications are centered around a diverse collection of devices that encompasses data acquisition, communication, and computation. In particular, the physical sensing components rely on the capabilities of hardware sensing devices (e.g., cameras, UAVs, and robots), while the social sensing components obtain observations from human sensors through crowdsensing and social media by implicitly leveraging user devices (e.g., connected tablets, laptops, and smartphones). Such devices have distinct characteristics in terms of sensing and computation capabilities, sensitivity, power requirements, frequency of data capture, communication protocols, access control and authentication methods, and runtime environments (Zhang et al. 2019a, 2019e, 2020e; Chu et al., 2016; Shang et al., 2019), which often presents a unique difficulty in managing them in SPS applications. For example, in the SAS application of Figure 2 in Section 1 (Rashid et al., 2019a), smartphones capture human observations and send them to social media platforms which are then used to dispatch UAVs to recover the veracity of the reports. Standalone social or physical sensing applications are unlikely to have such diverse devices working together. As such, little work has been done in earlier research to bridge the knowledge gap in SPS and construct a unified framework that can efficiently manage such diverse devices.
A few efforts have attempted to mitigate device heterogeneity in sensor networks and distributed systems primarily using abstraction-based approaches such as: (i) sensor emulation, device clustering (Shao et al., 2018), and sensor abstraction layer (Gigan and Atkinson 2007) for data acquisition devices; (ii) containerization (Scheepers 2014) and dynamic binary translation (Jun et al., 2019) for computation devices; and (iii) software-defined networking (Kirkpatrick 2013) and sensor network virtualization (Khan et al., 2015) for communication devices. However, in the context of SPS, existing solutions are inadequate in addressing the device heterogeneity challenge for several reasons: (i) the devices in SPS are mostly privately owned (e.g., smartphones, IoT devices), which makes it hard for an SPS application to apply global policies and control the devices from a central authority (Zhang et al., 2019a) (e.g., it might not be possible to install a middleware application on a personal device); (ii) the heterogeneity of the devices in SPS is more pronounced due to the added heterogeneity of tasks and architectures, which current solutions overlook (Zhang et al., 2019a); and (iii) the tasks running on SPS devices often have complex interdependencies (Zhu et al., 2019), which existing solutions might not preserve (Wei et al., 2019).
Beyond the diversity of the devices, the social and physical sensors in SPS typically generate data that widely vary across modalities and formats. For example, the input data type can range across text, image, location, audio, and video (Birke et al., 2014), and each type can further encompass different dimensionality, making the data heterogeneity even more pronounced (Zhai et al., 2014). For example, for image data, the dimensionality can be edges, corners, blobs, and ridges, while for text data, the dimensionality can be document frequency and
Dependency and correlation challenge
One fundamental challenge in SPS lies in characterizing the dependencies between the social and physical data sources and correlating the collected data across the two sensing paradigms (Stieglitz et al., 2018). While this challenge has been studied in social and physical sensing independently, it is more pronounced in the context of SPS applications and more challenging to solve due to several hurdles.
The first hurdle is building a unified analytical framework to model the source dependency and data provenance in SPS, given the diversity of source dependencies in social and physical sensing. For example, human sensors tend to be naturally correlated through social networks (e.g., Twitter followers tend to re-tweet their friends’ tweets). In contrast, physical sensors do not typically inherit any such social correlations and are more likely to be correlated through the underlying physical phenomena or geographic locations (e.g., two air quality monitors are likely to report similar measurements if they are in close proximity). Such disparity in source dependency and data correlation makes it non-trivial to seamlessly integrate the diverse social and physical sensor measurements under a principled framework (Wang et al., 2014b). Current knowledge discovery and data mining approaches in social and physical sensing such as semantic pattern recognition (Dey et al., 2018), trust and influence modeling (Asim et al., 2019), and covariance intersection (Ahn and Park 2011) model the dependencies across social and physical sources independently. However, due to the different source dependencies within the social and physical sensors, such approaches are largely inadequate for SPS applications. A unified source dependency modeling framework to meld the social and physical sensors in SPS is yet to be developed.
The second hurdle is imposed by the presence of strong causal relationships between the physical and social sensor data in SPS applications (Giridhar et al., 2016). For instance, during a traffic accident, as illustrated in Figure 16, people might report the accident along with its location on Twitter, while traffic flow monitoring units placed at a different segment on the same road might detect unusually slow traffic movements (Giridhar et al., 2016). While the traffic accident and congestion reported by different sensing channels might be seemingly unrelated at first glance, aligning the temporal and spatial information from the input signals (e.g., geo-location information and timestamps of the events) might reveal an inherent causality between them (i.e., the traffic congestion was probably caused by the traffic accident) (Tsapeli et al., 2017). Thus, even though there might not be any direct relation between the reported events across social and physical data platforms, the sensors across the two paradigms might report the same chain of occurrences or the same context but in different formats. While this context information might help to explain the cause of anomalous incidents, it is a non-trivial task to explore such causality across social and physical data platforms.
Example of causality among sensors in SPS.
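The spatial and temporal alignment mentioned above can be sketched as a simple pairing step. The example below links each social report to physical-sensor anomalies that occur nearby and shortly afterward. The `Observation` type, the distance and lag thresholds, and the sample coordinates are assumptions for illustration; the output is only a list of candidate links, since co-occurrence in space and time does not by itself establish causality.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    source: str       # "social" or "physical"
    label: str        # e.g., "accident report", "slow traffic"
    lat: float
    lon: float
    timestamp: float  # seconds since an arbitrary reference time


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def candidate_links(social, physical, max_km=2.0, max_lag_s=1800.0):
    """Pair social reports with physical anomalies that follow them closely.

    A pair is kept when the physical anomaly is within max_km of the report
    and occurs between 0 and max_lag_s seconds after it.
    """
    links = []
    for s in social:
        for p in physical:
            lag = p.timestamp - s.timestamp
            if 0.0 <= lag <= max_lag_s and haversine_km(s.lat, s.lon, p.lat, p.lon) <= max_km:
                links.append((s.label, p.label, lag))
    return links


# Hypothetical observations.
social = [Observation("social", "accident report", 41.700, -86.240, 0.0)]
physical = [
    Observation("physical", "slow traffic (nearby)", 41.705, -86.240, 600.0),
    Observation("physical", "slow traffic (far away)", 42.700, -86.240, 600.0),
    Observation("physical", "slow traffic (earlier)", 41.705, -86.240, -100.0),
]
links = candidate_links(social, physical)
```

Any pair surviving this filter would still need a proper causal analysis before the congestion could be attributed to the reported accident.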
Given the diverse source dependency profiles and the potential presence of causality across the physical and social sensors in SPS, it is challenging to design a holistic framework that can effectively connect the disparate social and physical sensors for interpreting real-world event occurrences. Consequently, extensive exploration and modeling of the dependency and correlation within the social and physical domains remain an outstanding challenge in SPS research.
Social and physical privacy challenge
Due to the integrated nature of the social and physical sensors in SPS, one critical challenge in SPS applications is to efficiently address the privacy issues of their end users (Liu et al., 2019). Figure 17 illustrates an SAS application where UAVs need to be dispatched based on locations derived from social media (Terzi et al., 2020). However, due to privacy concerns, a good proportion of users refrain from sharing their GPS data, leaving the UAVs unable to determine where to fly. Existing literature has proposed several privacy-aware sensing approaches for social sensing, which include source identity obfuscation (Toch et al., 2012), blind signatures and data shuffling (Liu et al., 2019), ring signatures (Vance et al., 2018), and data perturbation (Ganti et al., 2008). Similarly, to alleviate privacy issues in physical sensing, current approaches have developed schemes such as slice-mixed aggregation (Li et al., 2009), isolated virtual networks (Al-Fuqaha et al., 2015), trace-free location tracking (Toch et al., 2012), and routing with random walk (Li et al., 2009). Despite the effectiveness of the above approaches in preserving user privacy in social and physical sensing separately, several unique difficulties in SPS restrict their usefulness in solving the privacy challenge in SPS systems.
Example of social and physical privacy challenge in SPS.
First, social and physical sensors in a connected environment often deliver complementary information that can be exploited to expose the users’ personal information. For example, in a fitness tracker application using social media and wearable sensors, reports of daily exercise activities posted by people through social media (e.g., jogging in a park) might be correlated with user-shared historical health data from wearable sensors (e.g., blood pressure, pulse rate, body temperature) to potentially infer the medical history of an individual (e.g., whether a person has a chronic illness).
Conversely, in SPS applications, the data from physical sensors might also be exploited to maliciously extract the private information of individuals when augmented with social signals. For example, in an anomalous crowd detection application that combines images captured by surveillance cameras with reports of crowd gatherings posted on social media to infer the onset of sudden crowds, the surveillance cameras can only capture the image of a person at a specific location without further details of that person. However, if that particular person periodically shares their shopping history alongside geo-location information on social media, the image data from the cameras might be correlated with the additional data to unravel the socioeconomic status of the individual (Xiong et al., 2019).
Second, due to the inherent heterogeneity of the devices and data in SPS, it is a non-trivial task to apply unified privacy-preserving policies in SPS applications. As discussed in the
Interrelated dynamics challenge
A pivotal challenge in SPS is handling the interrelated dynamics induced by the fusion of the social and physical domains. SPS applications innately rely upon the tight integration between social and physical realms, both of which are dynamic in nature and exert impact over one another.
The dynamics arising from the social domain tend to influence the performance of physical sensing directly. Let us consider an integrated social media and UAV-driven crowd analysis application as shown in Figure 18 (Kaiser et al., 2017). If social events related to public protests are initiated and organized on social media, dynamics in the social domain (e.g., more people tweeting, different locations being targeted, people publicizing the activities to a greater level) might cause dynamics in the physical world (e.g., more new activities related to protests such as speeches and concerts, more people joining, events taking place in locations far away from one another). Given that mobile physical sensors such as UAVs and robots often suffer from constraints such as energy, communication, and speed, such physical sensors might not be able to explore or investigate all the events reported by social sensors within set deadlines. Such a scenario is also illustrated in Figure 18, where the initiated crowd events are located at various locations with different deadlines. Due to the presence of the social domain dynamics, the UAVs, with their physical constraints, might not be able to sense all the crowd events before their deadlines. Thus, careful choices need to be made on which subset of reports from the social data platforms to prioritize for the physical sensors, which existing solutions have not addressed.
Example of how social domain dynamics affect physical sensors in SPS.
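To make the prioritization problem above concrete, the sketch below applies a simple earliest-deadline-first heuristic for a single UAV with a speed and range limit. It is a deliberately naive baseline, not a solution from the literature: the event tuples, the planar coordinates, and the `plan_route` function are hypothetical, and the heuristic ignores report credibility and changing social dynamics.

```python
import math


def plan_route(start, events, speed_km_per_min, max_range_km):
    """Greedy earliest-deadline-first selection of events for one UAV.

    events: list of (name, x_km, y_km, deadline_min) tuples.
    An event is visited only if the UAV can arrive before its deadline
    and the total distance flown stays within max_range_km.
    Returns the names of the visited events in visiting order.
    """
    position = start
    elapsed_min = 0.0
    travelled_km = 0.0
    visited = []
    for name, x, y, deadline in sorted(events, key=lambda e: e[3]):
        dist = math.hypot(x - position[0], y - position[1])
        arrival = elapsed_min + dist / speed_km_per_min
        if arrival <= deadline and travelled_km + dist <= max_range_km:
            visited.append(name)
            position = (x, y)
            elapsed_min = arrival
            travelled_km += dist
    return visited


# Hypothetical crowd events reported on social media.
events = [
    ("A", 3.0, 4.0, 10.0),
    ("B", 3.0, 16.0, 12.0),
    ("C", 6.0, 8.0, 30.0),
    ("D", 6.0, 28.0, 100.0),
]
route = plan_route((0.0, 0.0), events, speed_km_per_min=1.0, max_range_km=20.0)
```

In this example, event B is skipped because its deadline cannot be met and event D is skipped because it would exceed the range budget, which shows how physical constraints force a choice among socially reported events.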
On the other hand, the dynamics from the physical world might affect the performance of both the social and physical sensors (Hu et al., 2012). For example, let us consider an S-VSN application in the aftermath of a disaster (Rettore et al., 2019) as shown in Figure 19. The disaster might cause road damage around the affected area. Such damage can restrict the travel of cars across certain roads, leaving car drivers unable to locate and report events of interest on social media (e.g., gas availability in a gas station). Moreover, network infrastructure can also get damaged, causing the car drivers to lose network connectivity and be unable to post any event reports (Wisitpongphan et al., 2007). Eventually, fewer observations might be reported by car drivers across social media, yielding poor coverage from the human sensors. In the physical world, the event occurrences might also be accompanied by unforeseeable circumstances such as unfavorable weather (e.g., extreme temperatures) or damaged infrastructure (e.g., disconnected power lines), which might impede physical sensing. For example, strong wind or cloud cover might affect the readings from sensors such as cameras or gyroscopes on UAVs, and bumpy roads might degrade the performance of vehicular sensors on cars (e.g., shaky images captured by dashcams) (Li et al., 2019; Wang et al., 2013b). Therefore, careful consideration must be given to adapting SPS systems to accommodate such physical world dynamics on the fly, which has not been extensively explored in the current literature.
Figure 19. Example of how physical domain dynamics affect social sensors in SPS.
Figure 20 provides an overview of the fundamental challenges in SPS. We note that while some of these challenges might also be studied in AI literature, the two areas are sufficiently different and not directly comparable to each other for several reasons. First, SPS is a sensing paradigm that leverages the collective knowledge from human and physical sensors to perceive the state of the world (Qiu et al., 2016; De et al., 2017). By contrast, AI is a much broader topic that encompasses the theories and algorithms driving systems that can perform tasks that typically require human intelligence (Joiner 2018). Second, SPS and AI have fundamentally dissimilar problem contexts. For instance, while
Figure 20. Overview of fundamental challenges in SPS applications.
Roadmap for future work
In this section, we present several exciting avenues for future work in the domain of SPS. As we outline each avenue, we list a few potential research directions to pursue.
Uncertainty quantification in SPS
Since SPS applications often rely on the noisy social and physical signals contributed by a diversified set of human and physical sensors, one potential direction for future work lies in quantifying the uncertainty generated by the diverse sensors in SPS applications. As discussed in the
It is essential first to understand why the existing literature has not extensively explored uncertainty quantification in SPS. Several disparities between social and physical sensors make it difficult to rigorously quantify the uncertainty of their signals. First, the social and physical sensors in SPS generate dissimilar types of data (e.g., social sensors typically generate text data while physical sensors generate continuous and discrete time signals) (Mitchell and Chen 2014; Wang et al., 2019a). Second, the dependencies among the sensors in social data platforms differ from those among the sensors in physical data platforms (Stieglitz et al., 2018). Third, the dynamics in the social domain differ in character from the dynamics in the physical world (Hu et al., 2012; Zeng et al., 2020). Fourth, social and physical sensors generate data at different rates (e.g., the speed at which UAVs capture images is different from the frequency at which people report incidents on Twitter) (Misra et al., 2020; Mourtzis et al., 2016). In addition, factors such as biased opinions from human sensors in social sensing and the failure cases of physical sensors (e.g., out of battery or affected by bad weather) further complicate uncertainty quantification in SPS (Wang et al., 2019a; Diez-Gonzalez et al., 2020).
One direction for further research in SPS is to rigorously quantify the uncertainty of social and physical signals and leverage the quantification results to jointly improve the social and physical sensing components of SPS systems. For example, in anomaly detection with an SAS application, if the uncertainty of the social signals can be determined, it may help to dispatch the UAVs more effectively. Similarly, if the uncertainty in the captured UAV data can be measured, it can be used as a feedback signal to improve reliable source selection in social sensing. Another possible research direction is to design schemes that can deduce the uncertainty in the social and physical sensing data while simultaneously considering SPS challenges such as the data heterogeneity, the diverse source dependencies, the social and physical world dynamics, and the contrasting social and physical sensor data generation rates. Existing studies on statistical analysis have proposed principled approaches based on estimation theory. Examples of uncertainty quantification approaches include maximum likelihood estimation (MLE) and Cramér-Rao lower bounds (CRLB) (Wang et al., 2011b, 2012a, 2011a, 2013c, 2015b). Alongside quantifying the uncertainty of estimation results, future SPS schemes can focus on incorporating the accompanying factors (e.g., human bias, physical constraints) in the uncertainty propagation models. We envision that techniques from multiple disciplines might help to alleviate the above hurdles and model the uncertainty in SPS applications, including Bayesian networks (Zhang et al., 2018d), Monte Carlo methods (Harris and Cox 2014), evidence theory (Bae et al., 2004), Markov chain formulation (Abdar et al., 2020), mixed integer linear programming (MILP) (Constantinescu et al., 2010), and polynomial chaos expansion (Kaintura et al., 2018).
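As a minimal illustration of how MLE and the CRLB relate, consider a deliberately simplified setting that is an assumption of this sketch and not the model used in the cited studies: a single source whose reports are independently correct with unknown probability p, and for which `total` past reports have been checked against ground truth. The MLE of p is then the fraction of correct reports, and the CRLB on the variance of any unbiased estimator of p is p(1 - p)/n. The function names below are hypothetical.

```python
import math


def reliability_mle(correct, total):
    """MLE of a source's reliability p under an i.i.d. Bernoulli model."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    return correct / total


def crlb_variance(p, total):
    """Cramer-Rao lower bound on the variance of an unbiased estimator of p.

    For n Bernoulli trials the Fisher information is n / (p * (1 - p)),
    so the bound is its inverse, p * (1 - p) / n.
    """
    return p * (1.0 - p) / total


def confidence_interval(correct, total, z=1.96):
    """Approximate interval obtained by plugging the MLE into the CRLB."""
    p_hat = reliability_mle(correct, total)
    half_width = z * math.sqrt(crlb_variance(p_hat, total))
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


p_hat = reliability_mle(80, 100)
low, high = confidence_interval(80, 100)
print(p_hat, round(low, 4), round(high, 4))  # 0.8 0.7216 0.8784
```

In practice, ground truth is rarely available for social signals, and source dependencies violate the independence assumption, which is where the SPS-specific difficulties discussed above arise.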
Handling trade-off between privacy and sensing quality in SPS
As identified in Section 5-E, mitigating privacy issues is critical in SPS applications. However, in attempts to ensure user privacy, current SPS schemes often have to compromise the sensing quality. For example, metadata such as geo-location information from privately owned devices might be concealed to protect the identity of human sensors on social media. However, the location information might be critical for dispatching mobile sensors, such as robots, to events of interest. Figure 21 shows an example scenario where concealing private data affects sensing quality. Thus, SPS applications often require supporting information such as locations, timestamps, and contextual information from reported social sensing data, which often conflicts with the end users’ privacy requirements. Since ensuring user-level privacy and maximizing sensing quality often turn out to be two potentially conflicting objectives in SPS (Xu et al., 2018), it is imperative to design schemes that carefully strike trade-offs between privacy and sensing quality for an optimized SPS system.
Figure 21. Example of the trade-off between privacy and sensing quality in SPS.
Existing approaches in data-driven social and physical sensing schemes have proposed techniques to manage user privacy by obfuscating identifying information such as geo-location tags in the raw data from personal devices (e.g., laptops, smartphones) (Park et al., 2017; Zhang et al., 2018e). However, existing privacy-preserving schemes have not effectively addressed the trade-off between privacy and sensing quality in SPS. Several reasons make it difficult to simultaneously achieve privacy and sensing quality in SPS. First, the unpredictable nature of human users in SPS applications makes it difficult to ensure that the users will strictly abide by policies to protect their privacy. Second, given the unique data and device heterogeneity in SPS applications, designing a unified framework to enforce individual privacy policies across all the devices is a challenging task (Vance et al., 2019). Third, regardless of the robustness of privacy-preserving schemes, the complementary aspects of the data contributed through social and physical data platforms can be exploited to steal sensitive user information (Vance et al., 2018).
One future research direction for optimizing the privacy and sensing quality in SPS applications is to design multi-faceted cryptographic techniques such as blockchain technology (Ali et al., 2017), smart contracts (Christidis and Devetsikiotis 2016), and ring signatures (Vance et al., 2018). While existing cryptographic approaches have come a long way in balancing privacy and sensing quality individually in social and physical sensing (Henry et al., 2018), it is difficult to apply unified cryptography-based solutions in an SPS setting that involves a diverse range of devices (Marin et al., 2015). Future work in this domain can involve developing cryptographic SPS approaches that cater to the heterogeneity of the devices in SPS and effectively trade off privacy and sensing quality. Another potential direction for future work on quality-aware privacy preservation in SPS is to explore and incorporate approaches like differential privacy, where noise is deliberately added to the user data to conceal the sensitive information of users (Abadi et al., 2016; Kairouz et al., 2015). While current differential privacy techniques have been applied in participatory sensing and crowdsensing, such approaches have not considered the data heterogeneity issue prevalent in SPS. Due to the wide variety of the data generated by the social and physical sensors in SPS (e.g., text, image, audio, and location data), injecting deliberate noise into different types of data to conceal user identity might be computationally intensive and resource-demanding. As such, further work can concentrate on developing efficient differential privacy techniques that cope with the data heterogeneity in SPS applications.
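The privacy versus sensing quality tension can be seen in even the simplest differential privacy construction, the Laplace mechanism. The sketch below is a minimal example under stated assumptions and not a scheme from the cited works: it perturbs per-cell counts of social reports, assuming each user contributes at most one report to one cell, so that the L1 sensitivity of the counts is 1. The function names and the grid-cell setting are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Laplace mechanism: add noise with scale sensitivity / epsilon.

    `counts` maps a grid cell to the number of reports in it. A smaller
    epsilon gives stronger privacy but noisier counts.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return {cell: n + laplace_noise(scale, rng) for cell, n in counts.items()}


def mean_abs_error(true_counts, noisy_counts):
    """Average absolute distortion introduced by the mechanism."""
    return sum(abs(noisy_counts[c] - true_counts[c]) for c in true_counts) / len(true_counts)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
true_counts = {cell: 50 for cell in range(500)}
for eps in (0.1, 1.0, 10.0):
    noisy = private_counts(true_counts, eps, rng=rng)
    print(eps, mean_abs_error(true_counts, noisy))
```

The expected absolute error equals the noise scale, so moving from epsilon = 10 to epsilon = 0.1 increases the distortion of each count by roughly two orders of magnitude. For a mobile sensor deciding where to go, this is the sensing quality cost of the stronger privacy guarantee. Extending such mechanisms to text, image, and audio data is the open issue noted above.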
Ensuring fairness and accuracy of detection in SPS
While SPS applications deliver a multifaceted sensing package using a combination of social and physical sensors, one remaining issue is ensuring fairness alongside accuracy for the data obtained from diverse demographics (Kairouz et al., 2019; Zhang et al., 2020c). With the advent of numerous data acquisition platforms and processing techniques, there is growing concern among civil rights organizations, governments, and analysts regarding the fairness of the detection process in SPS applications and their prevalent algorithmic bias towards specific demographic groups (Roselli et al., 2019). For example, in a contact tracing SPS application, as illustrated in Figure 22, overrepresented classes of data might cause a certain age group (e.g., teenagers) to be incorrectly represented as the prime sources of the disease. One issue that arises when trying to ensure fairness in the input data distribution is the loss of model accuracy. Specifically, in order to reduce the bias, it is crucial to incorporate a wider distribution of data from different classes (e.g., race, ethnicity, and nationality) of data contributors. However, doing so requires the inference models in SPS to train over a larger sample of data, which can cause overfitting or underfitting and often reduces model accuracy (Dressel and Farid 2018). The fairness and accuracy issue in SPS is further exacerbated by the fact that specific demographics might be more inclined to use smart devices than others. For example, in an anomalous crowd investigation application using SAS, younger people might use their mobile devices to post crowd-related events more frequently on social media, while senior people might report their observations on social media less often. As such, a crowd inference model might be overfitted to the younger demographic. Current fairness and accuracy-optimizing schemes are limited in addressing such diverse device usage scenarios present in SPS applications.
Figure 22. Example of algorithmic bias affecting fairness in an SPS-based contact tracing application.
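One common pre-processing heuristic for the overrepresentation problem described above is to reweight samples inversely to the size of their demographic group, so that every group carries the same total weight during training. The sketch below illustrates that heuristic only. It is not a technique proposed in this survey, the group labels are hypothetical, and it does nothing about the sensor noise that complicates fairness in SPS.

```python
from collections import Counter, defaultdict


def group_balanced_weights(groups):
    """Per-sample weights n / (k * n_g), where n is the number of samples,
    k the number of groups, and n_g the size of the sample's group.
    Weights sum to n and every group receives total weight n / k.
    """
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]


def group_weight_totals(groups, weights):
    """Total weight carried by each group, for checking the balance."""
    totals = defaultdict(float)
    for g, w in zip(groups, weights):
        totals[g] += w
    return dict(totals)


# Eight reports from younger users, two from senior users.
groups = ["young"] * 8 + ["senior"] * 2
weights = group_balanced_weights(groups)
print(weights[0], weights[-1])               # 0.625 2.5
print(group_weight_totals(groups, weights))  # {'young': 5.0, 'senior': 5.0}
```

Weights of this form can typically be passed to a weighted loss function during training. Upweighting a small group also amplifies whatever noise its few samples contain, which is one way the fairness and accuracy trade-off shows up in practice.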
Existing schemes have attempted to reduce algorithmic bias by using heuristic approaches such as genetic algorithms (Kosmidis and Frith 2010), optimizing the model’s loss function (Iosifidis and Ntoutsi 2019), or ensuring that the model training process satisfies the given fairness constraints (Zhang and Ntoutsi 2019). The problem with current approaches is that they were originally designed for reasonably good-quality input data. However, in the context of SPS applications, both social and physical sensors are prone to systematic noise, which is hard to quantify and model due to their complex interdependence (Qu et al., 2020). Thus, further research can concentrate on optimizing the fairness and accuracy of SPS applications while concurrently offsetting the noise generated by the social and physical sensors. Techniques such as deep learning (DL)-based collaborative filtering (Bobadilla et al., 2020), discrimination-aware channel pruning (Zhuang et al., 2018), and selective adversarial networks (Adel et al., 2019) could be explored to develop such fairness and accuracy-optimizing methods. In addition to mitigating the noise contained in the input signals in SPS, one strand of research can focus on developing user-friendly and accessible interactive interfaces (e.g., interactive kiosks, smartphone applications, responsive websites, and augmented reality experiences) for collecting fair data samples in SPS given the potential demographic bias in the participants.
Harnessing adaptive artificial intelligence in SPS
One route for future research in SPS is to address the interrelated dynamics in SPS. As discussed in Section 5-F, a critical task in SPS applications is handling the interrelated dynamics caused by the constantly changing social and physical environments. Adaptive Artificial Intelligence (AI) algorithms are known to adjust their parameters in response to changing stimuli (McMahan et al., 2021). As such, adaptive AI algorithms might help SPS systems adapt to the constantly changing social and physical environments. However, several limitations prevent off-the-shelf AI algorithms from being directly applied to SPS applications to mitigate the dynamics challenge.
First, as identified in Section 5-B, one recurring issue stemming from the human-cyber-physical interactions challenge in SPS applications is the inconsistent availability of the social and physical sensors, known as
Several future avenues for research can be explored to tackle the above difficulties. One potential realm of further work can focus on using
Conclusion
In this paper, we present a comprehensive survey of SPS, an emerging integrated sensing paradigm that exploits the collective strengths of physical and social sensing to acquire and interpret observations from the environment. Empowered by the ubiquity of versatile data capture, communication, and computing technologies, SPS melds the human wisdom-driven data acquisition from social sensors with the multifaceted sensing capabilities of physical sensors to deliver a deeper perception of the real world, both physically and socially. In particular, this paper surveys the aspects that are important for constructing compelling SPS systems, including a detailed overview of SPS, the key motivation behind its origin, the crucial technologies and protocols that enable SPS, real-world SPS applications and state-of-the-art solutions, the key challenges prevalent in SPS, and the potential avenues for further work to address the challenges. We hope this paper will bridge the knowledge gap in the current literature on SPS and motivate future studies to design novel SPS systems for a more holistic perception of real-world phenomena.
