Abstract
The analysis of human mobility behavior through computational techniques finds applications in various domains and provides valuable insights for urban planning, transportation services, and a deeper understanding of human interactions in Smart Cities and Smart Environments. In this scenario, this study presents a Systematic Literature Review (SLR) with the following main question: How are computational techniques being used to analyse human mobility behavior in Smart Cities and Smart Environments? A total of 5989 articles were initially found and filtered, resulting in 56 articles reviewed. As the main contributions, this study provides responses to 19 research questions. A list of the challenges and the computational techniques identified is provided. The algorithms, machine learning techniques and data-sources used by the reviewed studies are also presented and organized through taxonomies. A comprehensive discussion of the identified techniques is conducted, finishing with a compilation of challenges, open issues and research opportunities. To the best of our knowledge, this is the first study that reviewed human mobility behavior covering a wide range of scenarios, including urban mobility, public transport, points and regions of interest, ridesharing, bike-sharing, traffic analysis, driving behavior, electric vehicle charging stations planning, mobility on demand, crowd analysis and others.
Introduction
In the contemporary era, Information and Communication Technologies (ICT) play a pivotal role in addressing diverse human challenges across sectors such as health, energy, transport and others. In recent decades, the advent of Smart Environments has become increasingly evident, a consequence of advancements in Ubiquitous Computing through the Internet of Things (IoT), focusing on different application areas such as healthcare [6,21,60], monitoring [12], assisting [19,49], and executing everyday tasks. In various types of Smart Environments (e.g., stadiums, convention centers, shopping malls), the analysis and monitoring of human mobility behavior offer a wide range of applications. These applications include statistical monitoring and analysis of the most frequently visited stores, monitoring air quality and temperature to automatically activate corresponding devices, recommending indoor routes and visits to locations based on past user data, and optimizing energy consumption through the regulation of lighting and air conditioning in unoccupied areas [80]. In these examples, the historical movement data of individuals within the environments serves as a valuable foundation for conducting behavior analysis.
Smart Cities are also Smart Environments on a large scale, as a result of the widespread use of technology to improve the well-being of citizens [10,72]. The establishment of Smart Cities presents multiple challenges, offering extensive research opportunities in areas with high socio-economic impact, such as public transport and urban mobility, security, public inspection, and the management of public and private services [3]. The utilization of smart devices and IoT for monitoring and delivering these services results in the generation of substantial datasets. Examples include the monitoring of smartphones to identify population density in certain regions [25], the use of public transport ticketing data from smart cards for mobility pattern analysis [11], the use of telecommunications big data for mobility behavior analysis, and the analysis of security videos to identify anomalous human behavior (e.g. violence, robberies) [10,55].
Mobility issues are part of the main challenges in big cities. According to the Brazilian Institute of Geography and Statistics (IBGE),1
The growth of the active fleet, especially of small private vehicles, has a wide impact on traffic and mobility planning in cities: alternatives to traffic jams must be planned, streets need to be widened to improve traffic flow, parking space is restricted in large business and economic centers, and public transport is often ineffective, either failing to cover people's travel needs or being economically impracticable.
Effective planning of transport services and the implementation of Mobility-on-Demand (MoD) strategies demand a thorough comprehension of travel behavior patterns and their evolution over time. This understanding should be coupled with insights into city functional regions and land-use dynamics. As an example of behavior changes, reports from the Center for Studies in Regulation and Infrastructure at FGV [20,58] highlighted the profound impact of the COVID-19 pandemic on the public transport sector, resulting in a significant decline of 70% to 85% in passenger numbers in certain capitals. The rise of remote work practices, including the adoption of hybrid work models (combining office and remote work), has further contributed to a profound shift in the usage patterns of public transport. Relatedly, the absence of effective transport routes and schedules in regions experiencing significant demand, coupled with a dearth of information regarding public transport options (especially buses), leads individuals to seek alternative modes of transportation, potentially leading to the acquisition of personal vehicles.
The movement of individuals and vehicles, when recorded, is usually stored through logs containing georeferenced coordinates with timestamps, generated by equipment supporting the Global Positioning System (GPS) service. Databases containing these historical records constitute a rich terrain for conducting behavior analyses, searching for mobility patterns through computational models. The application of clustering [11,36,38] and Dynamic Time Warping (DTW) analysis [36,43,68] on this historical data are some examples of context and behavior pattern exploration.
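As an illustration of how Dynamic Time Warping can compare two recorded trajectories, the following is a minimal sketch of the classic dynamic-programming formulation. It assumes trajectories are lists of planar (x, y) points (for example, GPS coordinates already projected to metres); the function name `dtw_distance` is hypothetical and the code is not taken from any of the reviewed studies.

```python
from math import hypot, inf


def dtw_distance(a, b):
    """Dynamic Time Warping cost between two sequences of (x, y) points.

    Assumption: points are planar coordinates, so Euclidean distance is
    a reasonable local cost. Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # Extend the cheapest of: insertion, deletion, or match
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because DTW allows one point to be matched against several points of the other sequence, two trips along the same path recorded at different sampling rates can still obtain a low cost, which is what makes the measure useful as a similarity input for clustering.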
The use of Machine Learning (ML) techniques on different types of datasets (e.g. images, videos, travel history) to identify objects (vehicles, pedestrians, traffic signs etc) represents an important strategy for multiple areas, such as assisted and autonomous driving [28], plate recognition [71] and data selection and filtering [84].
In this scenario, we conducted a Systematic Literature Review (SLR), with the following central question: “How are computational techniques being used to analyse human mobility behavior in Smart Cities and Smart Environments?”. The review used a consolidated methodology described in [37], considering 6 databases of articles. A total of 5989 articles were initially found and filtered, resulting in 56 articles reviewed. To the best of our knowledge, this is the first study to review human mobility behavior covering a wide range of scenarios, including urban mobility, public transport, points and regions of interest, ridesharing, bike-sharing, traffic analysis, driving behavior, electric vehicle charging stations planning, mobility on demand, crowd analysis and others.
This article is organized as follows. Section 2 provides an overview of related work, specifically other literature reviews addressing similar research questions about computational solutions for identifying and predicting human mobility behavior. Section 3 outlines the applied methodology, while Section 4 presents the results of this study. Section 5 engages in a discussion of the findings, highlighting gaps and identifying research opportunities. Finally, Section 6 presents the conclusions.
This section presents reviews found during the filtering process of this study. We considered as related work the studies that also review computational techniques for human mobility behavior analysis.
The state of the art of Crowd Analysis was reviewed in [10], with two subdivision axes: crowd statistics and crowd behavior analysis. The study summarizes and presents taxonomies and models of crowd behavior from other studies, focusing on identifying the use of deep neural networks (but not limited to them), which datasets are used (public/private), and data annotators. Pedestrian and group detection are treated in a specific section of the review, as the authors consider them essential tools for crowd analysis. A table is presented for each study, with date, research axis (such as group behavior analysis, motion tracking, crowd statistics, anomaly behavior detection, crowd recognition, motion prediction, and others), dataset used, whether deep learning is used, and availability of source code. A table detailing sensor and dataset types is also presented. The purpose of the review is to find subareas in crowd analysis that are still unexplored or that seem to be rarely addressed through the prism of Deep Learning. Finally, the authors conclude that: data annotation tools are important but somewhat neglected by the research community; group analysis-related tasks are not widely explored using Deep Learning methods, despite their widespread use in crowd analysis; and massive crowd analysis for motion tracking and/or anomaly detection is not widely explored by the Deep Learning literature, due to the non-existence of relevant datasets.
Data generated by Telecommunication Companies (Telco) was explored in [14], which presents a short tutorial to convey a basic and advanced understanding of the unique characteristics, challenges and opportunities of Telco big data management. The study gives an overview of the state of the art in Telco big data analytics, organized around a set of basic pillars, namely: (i) background and respective architectures; (ii) real-time analytics and detection; (iii) experience, human behavior and retention analytics; (iv) privacy; and (v) storage. To quantify the data-size challenges, the example of the city of Shenzhen, China, is presented, which produces about 5TB of data per day from 10 million users. Studies exploring customer churn prediction and group behavior are also presented. The authors conclude with open problems and future directions.
Video techniques for human activity recognition were surveyed in [55], which reviewed 20 studies and classified which algorithms/techniques recognized which activities. Going further, the authors evaluated some of the techniques (e.g. LSTF – Local Space Time Features, MIMM – Multiple Instance Markov Model, MRF – Markov Random Field, SFG – String of Features Graph) with three annotated datasets to measure their precision. Finally, a table with the accuracy results is presented, with average values between 58% (SFG) and 93% (MIMM).
The use of GPS installed in taxis was reviewed in [13], to analyze mobility from various perspectives. The authors provided a formalization of the datasets and an overview of data processing methods. A timeline table was shown, categorizing the found articles into three main categories: social dynamics, traffic dynamics, and operational dynamics. Within social dynamics, articles were classified into techniques for extracting frequently visited locations and urban computing. Within operational dynamics, articles were categorized into passenger/taxi-finding strategies, route planning, route prediction and anomaly detection. Finally, discussions on the findings and directions for future research are presented.
Different from the aforementioned works, we reviewed the literature covering a wide range of scenarios related to human mobility behavior and computational techniques. This study explored six databases to contemplate more articles, following the guidelines proposed in [37], which recommend that systematic reviews encompass as much of the literature as possible. The main contributions of this study are:
An overview of the challenges associated with analyzing behavior through computational techniques;
A list of the techniques used by the studies;
A compilation and taxonomy of the algorithms currently employed;
A compilation and taxonomy of the machine learning techniques identified;
A taxonomy of the data sources utilized, accompanied by a list of the identified datasets;
A list of the devices and sensors employed in the reviewed studies;
A list of the behaviors that are analysed by the techniques;
Research opportunities and open issues related to the study theme.
Methodology
This section explains the methodology employed in this Systematic Literature Review (SLR), addressing the central question: “How are computational techniques being used to analyse human mobility behavior in Smart Cities and Smart Environments?”. The review used the methodology proposed in [37], considering 6 databases of articles to investigate 19 research questions, divided into 1 General Question, 16 Specific Questions and 2 Statistical Questions. A total of 5989 articles were initially found and filtered, seeking to identify the answers to the questions in the literature.
A systematic literature review seeks to find, organize, centralize and summarize the knowledge found in existing works regarding a specific research topic, aiming to identify open gaps, trends and opportunities in the subject through a systematic approach that provides unbiased results. Following the guidelines proposed in [37], the review was executed according to the following protocol:
Define the research questions;
Choose article bases;
Define the criteria for inclusion and exclusion of articles;
Perform selection, sorting, analysis and data extraction;
Organize and summarize the results found in a final report.
The research protocol was planned and executed entirely in English, because the article databases that were searched are also in this language.
Research questions
Table 1 presents the research questions defined for this study. The General Question seeks to understand how studies are using computational techniques to analyse human mobility related behaviors in Smart Cities and Smart Environments. Specific questions focus on extracting data and relevant information about the challenges, technologies, techniques, data, and other resources that were found. Finally, statistical questions are asked to evaluate publication sources and years.
Research questions
We defined the search string with the following focus: search for “smart environments in general” (e.g. Smart Cities, smart homes) and “behavior”. To clarify, a Smart City is also a Smart Environment, but in initial searches we observed that only a limited number of articles related to “smart cities” used the term “smart environment”. The same was observed for “smart homes”.
The first part of the string contains terms related to Smart Cities and their synonyms. The second and third parts have terms related to Smart Environments and smart homes, respectively. The last part is related to ‘behavior’ and its synonyms. Table 2 shows the search terms organized by domains, alternative terms, the string summary semantics and the full string that was applied.
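The structure described above (three alternative environment domains combined with a behavior domain) can be sketched as a small string builder. The terms used in the usage example are placeholders chosen for illustration only; the terms actually applied are those listed in Table 2, and the function name `build_search_string` is hypothetical.

```python
def build_search_string(context_groups, behavior_terms):
    """Combine term groups into a boolean query.

    Assumed structure: (group1 OR group2 OR ...) AND (behavior terms),
    where each group is itself an OR of quoted synonyms.
    """
    def or_block(terms):
        return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"

    context = " OR ".join(or_block(group) for group in context_groups)
    return f"({context}) AND {or_block(behavior_terms)}"


# Placeholder terms, not the actual terms of Table 2
query = build_search_string(
    [["smart city", "smart cities"], ["smart environment"], ["smart home"]],
    ["behavior", "behaviour"],
)
```

Keeping each domain as its own group makes it straightforward to adapt the query to databases with different syntax rules, since only the joining logic needs to change.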
Search terms
At the beginning of the protocol definition, initial searches helped to define the search terms, covering a wide range of scenarios related to behavior and computational techniques. After defining the search strings, the databases were selected, performing the search and review in April 2022. Table 3 shows the list of article databases and summarizes the search strategy applied in each one.
Databases reviewed
After completing the search process, all data from the articles were organized and passed through a filter, considering Inclusion Criteria (IC) and Exclusion Criteria (EC). This review considered only the articles that met all the inclusion criteria and excluded the articles that met at least one exclusion criterion. This method enables the exclusion of studies unrelated to the research topic, such as noise caused by database search algorithms, resulting in a refined selection of articles closely aligned with the review’s scope. Table 4 shows all criteria.
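The selection rule stated above (keep an article only if it meets every inclusion criterion and no exclusion criterion) can be expressed as a short predicate. This is a minimal sketch: the criteria below are hypothetical stand-ins written as functions over a dictionary, not the actual IC and EC definitions of Table 4.

```python
def passes_filter(article, inclusion_criteria, exclusion_criteria):
    """Return True if the article meets ALL ICs and NO EC."""
    meets_all_ic = all(check(article) for check in inclusion_criteria)
    meets_any_ec = any(check(article) for check in exclusion_criteria)
    return meets_all_ic and not meets_any_ec


# Hypothetical criteria, for illustration only
inclusion_criteria = [
    lambda a: a["language"] == "en",
    lambda a: a["has_search_terms"],
]
exclusion_criteria = [
    lambda a: a["is_duplicate"],
    lambda a: not a["full_text_available"],
]

candidate = {
    "language": "en",
    "has_search_terms": True,
    "is_duplicate": False,
    "full_text_available": True,
}
selected = passes_filter(candidate, inclusion_criteria, exclusion_criteria)
```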
Research criteria

Articles classified by study area. Selected articles in green.
An algorithm was developed to filter studies according to criterion IC2, which requires the article to have at least one term of each group in the title, keywords or abstract, since IEEE Xplore and SpringerLink restrict the search method to full text. Regarding EC2, the interpretation of the main criteria is based on the selection of studies whose main objective is to analyse behaviors (e.g. identify and/or predict), based on any computational method found. Articles describing changes in user behavior only as a validation method were discarded. EC3 relates only to study access, and EC4 is mainly interpreted as requiring that the domain of the study be related to Smart Environments, Smart Cities or smart homes.
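The algorithm itself is not reproduced in this article; the following is only a sketch of how such a check could work, assuming each article is available as a dictionary with `title`, `keywords` and `abstract` fields and that matching is a case-insensitive substring test. The function name `meets_ic2` and the term groups in the usage example are hypothetical.

```python
def meets_ic2(article, term_groups):
    """Check that title, keywords or abstract contain at least one
    term from EACH group (case-insensitive substring match)."""
    searchable = " ".join(
        article.get(field, "") for field in ("title", "keywords", "abstract")
    ).lower()
    return all(
        any(term.lower() in searchable for term in group)
        for group in term_groups
    )


# Placeholder groups: one for the environment, one for behavior
groups = [["smart city", "smart environment", "smart home"], ["behavior", "behaviour"]]
example = {
    "title": "Mining travel behavior in a Smart City",
    "keywords": "mobility; clustering",
    "abstract": "",
}
```

A substring test is deliberately permissive (for instance, "behavior" also matches "behavioral"); a stricter variant could tokenize the text first.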
After applying the EC1, EC2, EC3 and EC4, a total of 357 articles resulted, related to behavior analysis in Smart Environments and Smart Cities. Figure 1 shows the articles classified by study area. The majority of studies focus on Ambient Assisted Living (AAL) with 266 articles (74.5%). Activity Recognition (AR) and ‘Health and Elderly Support’ emerge as the two primary subareas within AAL, featuring 128 and 111 articles, respectively. Human mobility related behavior represents 15.7% of the studies, with 56 articles. Other identified areas include Security (14–3.9%), Social Dynamics (13–3.6%), Database Behavior Exploration (5–1.4%), and Mental Health (3–0.8%). Some studies could be classified into two or more subareas, with the most specific association being selected.
Figure 2 presents other reviews found during this study, classified by focus area. Despite representing 15.7% of the studies identified, the analysis of human mobility behavior covering a wide range of scenarios was not yet explored by other systematic literature reviews. Therefore, the exclusion criterion EC5 was added to concentrate the research in this area. The application of this criterion removed 301 studies, resulting in 56 articles. Figure 3 shows the complete filtering process described in this section.
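The counts and percentages reported for the filtering step can be cross-checked with a few lines of arithmetic. The sketch below only restates figures given in the text above; the variable names are illustrative.

```python
# Article counts per study area, as reported in the text
area_counts = {
    "Ambient Assisted Living": 266,
    "Human mobility behavior": 56,
    "Security": 14,
    "Social Dynamics": 13,
    "Database Behavior Exploration": 5,
    "Mental Health": 3,
}

total = sum(area_counts.values())

# Share of each area, in percent, rounded to one decimal place
shares = {area: round(100 * count / total, 1) for area, count in area_counts.items()}

# Articles remaining after EC5 removed 301 studies
remaining_after_ec5 = total - 301
```

The per-area counts add up to the 357 articles retained after EC1 to EC4, and subtracting the 301 studies removed by EC5 gives the 56 articles reviewed.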

Reviews classified by focus areas. Related works in green.

Review study filtering overview.
This section aims to answer the research questions presented in Table 1. All articles selected in this review are listed in Table 14, located in the Appendix. A code was assigned to each article for better organization within the text, for example, ‘A01’.
SQ01 – What are the challenges to analyse human mobility behavior through computational techniques?
The following items present the main challenges found in the articles:
Data generated from experiments with volunteers or crowdsourcing can also produce inaccurate results, since the forms may not always be filled out correctly, or some data may be missing because users forget to record it [63]. On platforms that are gamified, or that have some aspect of competition, users may also tend to report only advantageous data about their behavior [35]. Studies that use sensing on the observed entity itself (e.g. person or vehicle) have the same cost problem, for example: cameras in vehicles [56]; the use of GPS in vehicles and accelerometer sensors [67,68], among others. Another problem is the lack of annotated datasets. In the study of [77], researchers manually annotated 1000 vehicles to generate the training database. This problem is also discussed by [10], a review in which the authors express concern about the non-existence of these annotated datasets. DiDi,3
Techniques used to identify and predict behaviors
Table 5 presents the techniques applied in each article. The study found several combinations of techniques, types of sensors and data crossings, ranging from the use of smartphone sensors to identify aggressive driving and the use of public transport ticketing databases for analysis and prediction of travel patterns, to the use of Telco databases for urban mobility analysis.
SQ03 – What algorithms are being used?
Table 6 lists the algorithms used in the articles. Not all studies identify the names of the algorithms used. The aim of this review is not to extensively and comprehensively explore the algorithms used, but rather to identify the methods employed. Figure 4 presents a taxonomy of the algorithms grouped by type. The article IDs associated with each algorithm are enclosed in brackets.
Multi-Object Tracking algorithms are represented in Table 6 only as MOT, because that is a vast area of research, which explores video processing techniques for tracking objects, involving a wide range of algorithms (e.g. FairMOT, TransCenter, CenterTrack, CTracker). This link5
Algorithms found

Algorithms being used on studies.
This question delves deeper into understanding the Machine Learning (ML) techniques identified in this review. Despite this, the objective of this research is not to comprehensively explore the entire field of ML, which would require dedicated research on the subject. Table 7 lists the ML techniques identified.
During the review, we observed that 20 articles used ML, with a total of 35 techniques. The articles that used ML the most are: i) [45], with 10 techniques; ii) [38], with 8; iii) [78], with 7; iv) [39], with 4. Most of these algorithms were used as comparison baselines to validate the proposed models. The articles in ii) and iv) have authors in common, suggesting a potential affiliation within the same research group. Figure 5 provides an additional taxonomy of Machine Learning techniques organized by type.
Figure 6 presents a bar chart of techniques ordered according to their frequency of usage. The Support Vector Machine (SVM) emerges as the most frequently employed method in the reviewed articles, with a total of 4 occurrences. This is followed by Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), each appearing 3 times. Support Vector Regression (SVR), Diffusion Convolutional Recurrent Neural Network (DCRNN), and Convolutional Neural Network (CNN) are each utilized twice. All other techniques are represented by a single instance of usage.
Figure 7 illustrates the temporal evolution of ML usage. The data show an initial increase in use beginning in 2015 and continuing until 2017, followed by a subsequent increase from 2019 to 2022, indicating a trend of growing interest.
Machine Learning algorithms

Machine Learning techniques being used on studies.

Machine Learning techniques: usage count.

Yearly count of articles using machine learning.
Figure 8 presents a taxonomy of the data sources. The usage of each dataset type is described below:
Telecom databases were explored by 8 articles.
Big data containing GPS trajectories collected through industry applications was used in 8 studies.
6 articles used ticketing data through Bus/Public transport.
One article used a shared-bike pick/drop dataset.
One article used an electric vehicle recharge dataset.
Local Position Registering was explored by 8 articles, with 5 using Wi-Fi technology, 3 using Bluetooth (two of which also used Wi-Fi), and 2 using RFID readers.
User-filled data was used by two articles, with one utilizing Location Based Social Networks (LBSN) and one using form/survey data.
Smartphone sensors were explored by 9 articles, with 4 using accelerometers, 6 using GPS, and 1 using a microphone.
Image-based sources were used by 11 articles, with 8 using public/security cameras, 1 using thermal cameras, 3 using traffic cameras, 1 using vehicle internal camera, 2 using LIDAR, 1 using a vehicle counter, and 1 using a vehicle plate recognizer.
Government or Public Data was explored by 13 articles, including one using GTFS, one using land-use datasets, 3 using public surveys, and 7 using Points of Interest (POIs) datasets.
GPS Travel History datasets were the most explored data source, with 16 articles: 8 used users’ smartphones to generate the datasets, 1 used shared-bike history, 1 used goods transport history, 1 used public transport, 4 used ridesharing datasets, and 3 used private vehicle datasets.
In the studies of [62] and [63], data was generated by volunteers through crowdsourcing with a smartphone app, with 1,000 people generating data on 125,000 trips [62], and 8,000 generating 150,000 trips [63]. Both articles have authors in common, and the second probably reflects the evolution of the research over time. Data collection through volunteering also poses major challenges, such as coordination, motivation, and others already reported in SQ01.
Figure 9 presents a bar chart of data-source type usage by articles, demonstrating the predominant interest in utilizing GPS Travel History datasets from various segments in research, with 16 articles. Nonetheless, the importance of public datasets and open data policies is also evident, as identified by the use of Government or Public Data in 12 articles. The same rationale can be applied to Usage/Ticketing databases, used by 11 articles, and mobility databases based on Telecom data, utilized in 8 studies. Together, these four groups represent 35 articles (some articles use more than one type of data-source), highlighting the importance of strengthening partnerships with industry and public open data policies to support scientific research.

Data sources used on the studies.

Total of articles using each data source group.
In both industry and academia, trajectory storage is typically accomplished by collecting multiple GPS coordinates along with the record date-time (Timestamp), forming the contextual history of the observed moving entity. This question is intended to explore other types of data that are utilized and analyzed in conjunction to identify behaviors. Table 8 lists data-types identified.
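A trajectory stored in this way can be represented as a time-ordered list of timestamped coordinates. The sketch below is one possible representation, not a format used by any specific reviewed study: the class `TrajectoryPoint`, the helper names and the mean Earth radius constant are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; an approximation


@dataclass(frozen=True)
class TrajectoryPoint:
    entity_id: str       # observed moving entity (person, vehicle, ...)
    timestamp: datetime  # record date-time
    lat: float           # latitude in decimal degrees
    lon: float           # longitude in decimal degrees


def haversine_m(p, q):
    """Great-circle distance in metres between two trajectory points."""
    dlat = radians(q.lat - p.lat)
    dlon = radians(q.lon - p.lon)
    a = sin(dlat / 2) ** 2 + cos(radians(p.lat)) * cos(radians(q.lat)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def trajectory_length_m(points):
    """Total travelled distance, after ordering points by timestamp."""
    ordered = sorted(points, key=lambda p: p.timestamp)
    return sum(haversine_m(a, b) for a, b in zip(ordered, ordered[1:]))
```

Additional data types analysed in conjunction with the trajectory (for example, speed or sensor readings) could be attached as extra fields on the same record.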
Data-types identified in review
Installing sensors to cover entire cities and geographically wide environments is a challenge that demands many resources, as already explored in SQ01. Table 9 shows the devices and sensors identified in the review. Common types of equipment used for data collection were identified, including Telecom using Radio Network Controllers (RNC); big data using location data generated by users’ smartphones; videos generated by cameras; and smart card transactions processed by bus ticketing equipment, among others.
The use of less explored sensors was also observed, such as smartphone accelerometers to identify driving behaviors [5,67,68]. The accelerometer is also explored by [80] to monitor displacement behavior between indoor points, in addition to the use of audio collected by the microphone to identify crowding levels around a pedestrian.
Figure 10 presents a bar chart of sensor type usage, demonstrating a strong tendency towards the usage of smartphones and GPS data, with 23 and 18 studies, respectively. Cameras were used by 9 studies, followed by Radio Network Controller (RNC) and ticketing equipment, each used in 7 studies.
Sensors/devices used to collect data by articles

Sensor type use count by article.
As already explained in SQ01, one of the great challenges is the scarcity of databases to be explored by researchers, annotated or not. During the execution of the research the following databases were found.
The publicly available source codes produced by the articles are listed below. Source codes related to Multi-Object Tracking (MOT) were already mentioned in SQ08.
Context can be understood as any specific information about some entity stored in a certain period of time. This entity can be anything, such as a vehicle, a person, an object, a place, etc. [66]. This information can be its geolocation, its size, speed, temperature, color, weight, or other information about its state at the given moment [27]. Storing a collection of these contexts over time allows exploration of the past behavior of the entity being analyzed. In the literature, the name “Contexts History” is given to this collection [16].
Context histories allow the exploration of patterns in the past [23], the analysis of similarity [26] and the prediction of future contexts [16]. Predictive models use past contexts combined with the present context to make predictions of future contexts [16,48].
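To make the idea of predicting a future context from past and present contexts concrete, the sketch below uses a first-order transition count over a history of visited places. This is a deliberately simple stand-in chosen for illustration, not the predictive model of [16] or [48]; the function names and the place labels are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(history):
    """Count how often each context is followed by each other context."""
    transitions = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        transitions[current][following] += 1
    return transitions


def predict_next(transitions, current):
    """Most frequent successor of the present context, or None if unseen."""
    if current not in transitions:
        return None
    return transitions[current].most_common(1)[0][0]


# Hypothetical context history of one entity (sequence of visited places)
history = ["home", "work", "gym", "home", "work", "home", "work", "gym"]
model = fit_transitions(history)
next_place = predict_next(model, "work")
```

Here the past contexts are summarized in `model` and the present context is the argument `current`; richer contexts (time of day, speed, and so on) would require a richer model.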
None of the articles explicitly mentions the use of Context History and Context Prediction in the terms presented above. Nevertheless, for the ‘History’ criterion, articles using historical data, such as the geolocated trajectories of users/vehicles, were considered, and for the ‘Prediction’ criterion, articles that predicted future contexts. The result of this classification is shown in Table 10.
Articles that use context history and context prediction
Of all the studies reviewed, 51 (91.1%) were conducted and validated in the context of Smart Cities. Study A08 (1.8%) was conducted in Large Crowded Buildings (LCB), while A19 and A25 are related to any type of Smart Environment (3.6%). Studies A22 and A34 make reference to any Indoor/Outdoor Crowded Space (3.6%).
Figure 11 presents a chart categorizing the studies by their environment type. As the figure shows, mobility-related behavior is extensively studied within the context of Smart Cities.

Studies classified by Smart Environment.
Table 11 lists the identified areas in which the studies have been applied. Many articles are associated with more than one area, demonstrating the interdisciplinary nature of the research. For example, articles related to Crowd Analysis and Regions of Interest often overlap with Urban Mobility and Traffic Analysis.
Areas in which the studies are applied
Figure 12 presents the number of articles by area. The broad field of Urban Mobility has a significant number of articles related to mobility behavior, distributed across various subareas, such as Traffic Analysis, Public Transportation, Crowd Analysis, Regions of Interest, and others.
Electric Vehicle Charging Stations Planning is under-explored within the mobility behavior perspective. There is a need for more targeted research to address the unique challenges in this field, such as sustainable infrastructure development, given the growing interest in the topic.
The use of technologies such as cameras, LIDAR, and GPS in areas like Crowd Analysis and Traffic Analysis indicates the integration of advanced technologies in research. Future discussions could explore how emerging technologies like AI (Artificial Intelligence) and IoT can further enhance these studies.
Research in areas like Public Transport, Risk and Emergency Management, and Traffic Safety has direct implications for public policy and urban planning. Modern mobility solutions like Bike-sharing, Ridesharing and Mobility-on-Demand are also being studied within the mobility behavior perspective, with 3, 4 and 2 articles respectively.

Area of application by studies.
Table 12 lists the behaviors that are being observed by the studies. The meaning of each behavior is also described.
The substantial number of studies related to Movement Patterns (29), as shown in Fig. 13, indicates a significant interest in understanding the movement of people and vehicles across different environments. Additionally, the studies related to Transport Modal (14), Semantic Travel Intention (2), Presence, Stay-time or Arrive-Stay-Leave (15), and Boarding and Disembarking Patterns (7) demonstrate a growing interest in comprehending these behaviors, which assists in planning and guiding public policies related to urban mobility.
Analyzing the mobility behaviors of electric vehicles and measuring the demand for recharging represents an important research area, given its potential to contribute to the development of sustainable urban mobility solutions.
Many behaviors overlap with multiple articles, demonstrating the interdisciplinary nature of research in this domain. For example, Transport Modal and Movement Patterns often intersect with other behaviors like Presence, Stay-time or Arrive-Stay-Leave.
Behaviors observed by the studies

Number of studies by observed behavior.
Studies that allow the identification of a specific entity and then extrapolate this analysis to the collective (micro to macro) were identified in Table 13 as observing single and multiple entities. Studies that conducted analyses of specific groups of entities within the observed population were categorized accordingly. Studies that exclusively analyzed the collective, without the ability to identify individual entities, were designated as only observing crowd behavior.
Articles that use context history and context prediction
The studies [9] and [4] used data collected through an API called Sii-Mobility [52], built on the KM4City [8] ontology. These three articles share common authors, suggesting that they were likely conducted by a related research group. Additional information about this ontology can also be found in a specific disclosure link.20
The study [70] utilized the FIESTA-IOT ontology with the specific M3-lite taxonomy [1]. The objective was to provide seamless interoperability and information transparency from IoT systems to crowd management applications. These two articles also share common authors, suggesting that they were conducted by a related research group.
Various methodologies were identified and applied in the validation of the studies. The following list provides an overview of these techniques:
Studies that applied computational models to datasets available to the researchers generally validated their results in one of two ways:
Comparison with related models: the results of the proposed computational model were compared with those of related models, as in [45], where the proposal was compared with 9 other computational models.
Hold-out validation: some studies based on Machine Learning models used part of the dataset for training and reserved another part for validating the results.
Studies that used smart card transaction data from public transport [50,65] validated their boarding and disembarking prediction algorithms against data generated by users on subsequent days.
Computational models that observe aggressive driving behavior [67,68] using the smartphone accelerometer were validated in real time, with tests in vehicles monitored by the researchers.
The study that used computer simulation [85] analyzed the simulation data itself to validate the models.
One study [56] had access to multiple datasets and used one dataset for training and another for validation.
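As an illustration of validating against data from subsequent days, the following minimal Python sketch splits a set of records chronologically. It is not taken from any reviewed study: the record fields (`card`, `stop`, `day`), the sample transactions and the cutoff date are hypothetical and serve only to show the idea.

```python
from datetime import date

def chronological_split(records, cutoff):
    """Split records by day: those strictly before `cutoff` form the
    training set, those on or after `cutoff` form the validation set."""
    train = [r for r in records if r["day"] < cutoff]
    validation = [r for r in records if r["day"] >= cutoff]
    return train, validation

# Hypothetical smart card transactions (illustrative data only).
transactions = [
    {"card": "A", "stop": "S1", "day": date(2020, 3, 2)},
    {"card": "B", "stop": "S2", "day": date(2020, 3, 3)},
    {"card": "A", "stop": "S1", "day": date(2020, 3, 4)},
    {"card": "B", "stop": "S3", "day": date(2020, 3, 5)},
]

train, validation = chronological_split(transactions, date(2020, 3, 4))
```

A chronological split avoids training on observations that occur after the ones being predicted, which a random split would not guarantee.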
STQ01 – Where are the studies being published?
Figure 14 illustrates the number of publications by database and year. In ScienceDirect, 12 publications related to mobility and behavior were selected. In the ACM, 3 publications on the subject were found. IEEE Xplore represented the majority of the publications, with 23 articles selected. In Springer Link, 3 articles were selected, and one from Wiley Online Library.
The Scopus database indexes articles from other databases, potentially leading to duplicate articles. Of the 28 articles retrieved from Scopus, 14 were found only there and 14 were identified as duplicates. These duplicates were attributed to their original databases (ScienceDirect, ACM, IEEE Xplore, and Springer Link).

Publications per year and databases.
Figure 14 also illustrates the number of publications per year. The initial publications related to mobility and behavior emerged in 2014, showing fluctuations and a general upward trend in subsequent years. In 2017, a total of 9 articles were identified. The number of articles peaked in 2020, with 13 publications. In 2021, a total of 6 articles were found. This review collected articles until April 2022, with 7 additional articles identified during that period.
Discussion
This section explores and discusses the results presented in the previous section. Firstly, the central question of the review is addressed. Subsequently, a discussion on challenges and open research questions is provided.
How are computational techniques being used to analyse human mobility behaviors in Smart Cities and Smart Environments?
The analysis of mobility behaviors through computational techniques finds applications in various domains and provides valuable insights for urban planning, transportation services, and a deeper understanding of human interactions in Smart Environments. A significant number of studies (14 in total) examined how different transportation modes are being utilized. Research efforts include the analysis of incentives to use environmentally sustainable modes of transportation, such as bike-sharing, resulting in benefits for individuals (health) as well as for the population, through the reduction of CO2 emissions and improvements in traffic flow [62,63].
The identification of origin-destination patterns through passenger boarding and disembarking data provides valuable information for public transportation demand planning, achieving accuracy rates of up to 85% in predicting these behaviors in specific regions [46,50]. With known origin-destination patterns, the implementation of mobility-on-demand services is facilitated, enabling the deployment of other last-mile transport modes near mass transit boarding and disembarking stations (such as trains and buses) [65]. Taxis and ridesharing services are crucial modes for providing this last-mile transportation. Analyzing the mobility patterns of these vehicles offers even more granular and specific insights into the reality of urban mobility [33,39,78].
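To make the notion of an origin-destination pattern concrete, the sketch below counts trips per (boarding stop, disembarking stop) pair and uses the most frequent destination as a naive predictor. It is only an illustration under stated assumptions: the field names `board` and `alight`, the sample trips and the frequency-based predictor are hypothetical and are not the methods used in [46,50].

```python
from collections import Counter

def od_matrix(trips):
    """Count trips for each (origin, destination) pair."""
    return Counter((t["board"], t["alight"]) for t in trips)

def top_destinations(matrix, origin, n=1):
    """Return the n most frequent destinations observed for an origin
    (a naive frequency-based predictor, for illustration only)."""
    counts = Counter({d: c for (o, d), c in matrix.items() if o == origin})
    return [d for d, _ in counts.most_common(n)]

# Hypothetical boarding/disembarking records (illustrative data only).
trips = [
    {"board": "S1", "alight": "S3"},
    {"board": "S1", "alight": "S3"},
    {"board": "S1", "alight": "S2"},
    {"board": "S2", "alight": "S3"},
]

matrix = od_matrix(trips)
predicted = top_destinations(matrix, "S1")
```

In practice, ticketing data often records only the boarding event, so the disembarking stop has to be inferred before such a matrix can be built.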
The analysis of trajectories from small private vehicles provides crucial information about traffic patterns and congestion in large economic districts [77]. Specifically, a significant subset of articles, totaling 18, focuses on traffic behavior analysis. The techniques employed to achieve these objectives are diverse. Private vehicle datasets were utilized in [45,77] to cluster urban regions, evaluating Arrive-Stay-Leave patterns in these areas. Wi-Fi routers were used in [25] and [9] to collect people’s crossing information by capturing the MAC addresses of smartphones passing within the routers’ range. With a similar purpose, [71] employed vehicle counters and vehicle type identifiers for traffic analysis.
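The Wi-Fi based counting described above can be sketched as follows. This is a minimal illustration, not the pipeline of [25] or [9]: the observation format `(timestamp_seconds, mac)`, the window length, the salt value and the salted hashing step are all assumptions introduced here.

```python
import hashlib
from collections import defaultdict

def pseudonymize(mac, salt):
    """Replace a MAC address with a salted hash so that raw
    addresses are not stored (an assumption of this sketch)."""
    digest = hashlib.sha256((salt + mac.lower()).encode("utf-8")).hexdigest()
    return digest[:16]

def unique_devices_per_window(observations, window_seconds, salt):
    """Count distinct devices seen in each fixed-length time window.
    `observations` is an iterable of (timestamp_seconds, mac) pairs."""
    seen = defaultdict(set)
    for ts, mac in observations:
        window_start = (ts // window_seconds) * window_seconds
        seen[window_start].add(pseudonymize(mac, salt))
    return {w: len(devices) for w, devices in sorted(seen.items())}

# Hypothetical router observations (illustrative data only).
observations = [
    (0, "AA:BB:CC:00:00:01"),
    (30, "aa:bb:cc:00:00:01"),  # same device, different letter case
    (45, "AA:BB:CC:00:00:02"),
    (70, "AA:BB:CC:00:00:01"),
]

counts = unique_devices_per_window(observations, 60, "example-salt")
```

Counts obtained this way are only a proxy for the number of people, since many smartphones randomize the MAC address they expose while scanning and some people carry several devices or none.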
Eight studies explored algorithms applied to Telecom databases, with various focuses such as the analysis of Points and Regions of Interest (POIs and ROIs) and the mobility of users between these regions. Accessing this type of dataset involves challenges and resources, as some Telecom companies commercialize this access.21
Individual behavior analysis, specifically driving behavior, was explored by 4 articles. Data collected from the smartphone’s accelerometer and GPS were used by [68] and [67] to identify driving behaviors and traffic situations. Videos from vehicular cameras and image processing were used in [56] to identify traffic violations and illicit maneuvers performed by other drivers. A test vehicle with various sensors installed was used by [5] to collect comprehensive information about driving behavior, including pedal pressure, gear changes, braking, and acceleration.
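As a simple illustration of how accelerometer readings can be turned into driving events, the sketch below labels each longitudinal acceleration sample with a fixed threshold. It does not reproduce the models of [67,68]: the threshold of 3.0 m/s^2, the label names and the sample values are assumptions chosen only for the example.

```python
def label_events(samples, threshold=3.0):
    """Label longitudinal acceleration samples (m/s^2).
    Values at or above `threshold` are harsh accelerations, values at or
    below `-threshold` are harsh braking; the threshold is illustrative."""
    labels = []
    for a in samples:
        if a >= threshold:
            labels.append("harsh_acceleration")
        elif a <= -threshold:
            labels.append("harsh_braking")
        else:
            labels.append("normal")
    return labels

# Hypothetical accelerometer samples (illustrative data only).
labels = label_events([0.5, 3.2, -4.0, -1.0])
```

A fixed threshold is the simplest possible rule; real smartphone data would first require filtering and alignment of the sensor axes with the vehicle.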
A total of 15 articles explored crowd mobility behaviors (crowd analysis). The use of cameras and object tracking processing algorithms is one of the most commonly employed techniques [84]. A thermal camera was used in [53] for this monitoring, thus preserving users’ privacy. The use of LIDAR in indoor movement tracking was explored by [80]. In addition to LIDAR, the authors used the accelerometer and gyroscope of the mobile phone to monitor this movement, combined with the use of the microphone to identify the volume of conversations around the person, thus estimating occupancy and visitation of indoor locations.
Regarding Machine Learning (ML) techniques, Fig. 7 illustrates the temporal evolution of ML usage. The data show an initial increase in use from 2015 to 2017, followed by a second increase from 2019 to 2022, indicating a trend of growing interest.
Regarding the data sources used, Fig. 9 demonstrates the predominant interest in utilizing datasets of GPS Travel History from various segments, with 16 articles. Nonetheless, the importance of public datasets and open data policies is also evident, as shown by the use of Government or Public Data in 12 articles. The same rationale can be applied to “Usage / Ticketing data” databases, used by 11 articles, and mobility databases based on Telecom data, utilized in 8 studies. Together, these four groups represent 35 articles (the individual counts sum to 47 because some articles use more than one type of data source), highlighting the importance of strengthening partnerships with industry and public open data policies to support scientific research.
The majority of studies related to mobility behavior are conducted within the context of Smart Cities, as shown in Fig. 11. Another observation is the low number of studies that shared the source code produced in their research. Only [17] and [83] made their model source code available in online repositories, a practice that seems to be more common in the field of Multi-Object Tracking [84].
Regarding the behaviors observed, the substantial number of studies related to Movement Patterns (29), as shown in Fig. 13, indicates a significant interest in understanding the movement of people and vehicles across different environments. Additionally, the studies related to Transport Modal (14), Semantic Travel Intention (2), Presence, Stay-time or Arrive-Stay-Leave (15), and Boarding and Disembarking Patterns (7) demonstrate a growing interest in comprehending these behaviors, which assists in planning and guiding public policies related to urban mobility.
The strategies for analyzing behavior, in general, are directly related to the datasets used in the research. Figure 8 provides an overview of the types of datasets explored, and Fig. 9 presents the usage count of each type. The complete list of techniques employed is provided in Section 4 (Results and answers – SQ02, in Table 5).
During the review, we noticed the limited use of ontologies. The studies [9] and [4] utilized the KM4City ontology through the Sii-Mobility API, while [70] employed the FIESTA-IOT ontology. In both cases, the studies share common authors with the publications describing the ontologies.
This section aims to discuss the open issues and research opportunities, especially related to SQ01 (Section 4.1).
Limitations
This review has limitations that may affect the scope of the results. We reviewed six databases to reduce bias, and defined the search string considering major terms and the broader spectrum of computational techniques to analyse behavior. The search process requires a combination of string parts that form the search string, which may limit the selection of studies, since related studies can address behavior, Smart Cities and Smart Environments without specific terms, such as “smart spaces” or “behavioral”. This review considers only studies published in English and thus may exclude relevant articles published in other languages. Finally, the search engine algorithms of each database may influence the results, as some databases could apply synonyms automatically or use other search strategies.
Conclusion
This study presented a Systematic Literature Review (SLR) with the following central question: How are computational techniques being used to analyse human mobility behaviors in Smart Cities and Smart Environments? A total of 5989 articles were initially found and filtered, resulting in 56 articles reviewed. As the main contribution, this study provides responses to 19 research questions, ranging from an exploration of challenges to a listing of the computational techniques utilized. The algorithms, machine learning techniques and data-sources used by the reviewed studies are also listed and presented through taxonomies. A list of sensors/devices employed in the reviewed studies and the data-types utilized are also presented. Finally, a comprehensive discussion of the identified techniques is conducted, finishing with a compilation of challenges, open issues and research opportunities.
The analysis of mobility behaviors through computational techniques finds applications in various domains and provides valuable insights for urban planning, transportation services, and a deeper understanding of human interactions in Smart Environments. A significant number of studies have focused on examining how different transportation modes are being utilized. Research efforts include the analysis of incentives to use environmentally sustainable modes of transportation, resulting in benefits for individuals as well as for the population. The identification of origin-destination patterns through passenger boarding and disembarking data provides valuable information for public transportation demand planning, achieving accuracy rates of up to 85% in predicting these behaviors in specific regions [46,50].
The analysis of taxi and ridesharing mobility patterns offers granular and specific insights into the reality of urban mobility [33,39,78]. Trajectories from small private vehicles also provide crucial information about traffic patterns and congestion in large economic districts [77], with a significant number of studies in this area.
The exploration of Telecom databases affords important mobility insights, allowing the identification of Points and Regions of Interest (POIs and ROIs) and the mobility of users between these regions, especially when cross-referenced with additional geospatial information [73]. Smartphone GPS and accelerometer data also allow the analysis of individual driving behavior [67,68]. The use of cameras and object tracking processing algorithms is one of the most commonly employed techniques for crowd mobility behavior analysis [84], with some studies using thermal cameras [53] and LIDAR sensors [80] for indoor and outdoor applications. These large datasets offer a rich foundation for the exploration of patterns in the data through Machine Learning techniques, as identified in this study.
Future work
Future reviews may extend this study by considering deeper behavior-specific research by application area. New studies can explore the challenges and open issues listed in Section 5.2. Strategies to analyse human mobility behavior are directly related to the datasets available. Research efforts can therefore focus on generating annotated mobility datasets, developing annotation tools, and advancing behavior analysis, Machine Learning techniques and big-data computational challenges.
