Abstract
Keywords
Introduction
In the Internet of things (IoT), the Internet serves as the medium for controlling physical devices that collect real-time environmental data and transfer it to function-specific middleware modules; the middleware then returns the results to users. At present, the IoT is widely employed in smart homes, smart cities, smart healthcare, intelligent transportation, and other fields. For example, in intelligent transportation, smartphone apps use many sensors to monitor traffic jams and optimize routes. B-mode ultrasound uses ultrasonic sensors to track the movement and development of a fetus, which helps avoid unexpected complications. Numerous sensors are applied in our daily lives.
Due to the development of communication technology and the emergence of various intelligent objects, a large number of physical objects are connected to the IoT. 1 Cisco Corp. predicted that the number of objects in the IoT would increase to 50 billion by 2020, and a large amount of sensor data is expected to be generated continuously. The generated data fall into two categories: one describes the sensors’ attributes, and the other captures real environmental data. However, different sensors store data in different data structures, which causes heterogeneity between sensors. Although the scale of the IoT is huge, the heterogeneity of sensor data and environments hinders interaction between sensors, which is not conducive to the development of the IoT. To solve this problem, the concept of the semantic sensor network 2 was proposed, together with specific definitions and models. Many traditional sensor recommendation algorithms that process large-scale sensor data, such as collaborative filtering and convolutional neural networks, must first convert all of the descriptive semantic sensor data into a normalized form. This conversion takes a lot of time and energy, so these traditional algorithms are not suitable for semantic sensor networks. How to recommend sensors based on semantic sensor data remains an open problem.
In recent years, many studies 3–6 based on SPARQL retrieval have made significant progress in sensor recommendation under the semantic sensor network. Zhang et al. 4 proposed a scheme that expresses sensor data in Resource Description Framework (RDF) form based on the semantic sensor network standard, including an XML-based mapping language algorithm (SASML) and the Sensor Data to RDF Mapping (SDRM) algorithm. Bermudezedo et al. 5 reduced the complexity and processing time of semantic descriptions in the semantic sensor network. They listed 10 semantic model design principles and then designed models (such as IoT-Lite) with better scalability by following these principles. These works addressed semantic data modeling in semantic sensor networks and laid a foundation for sensor recommendation. Rasyid et al. 6 created a project to provide users with sensor data. The PROTEGE editor was used to model the sensor data in a csv file as entities and store them in the database as tuples, and the SPARQL support integrated in the Sesame framework was used to search the database and display the results to users. Gong et al. 8 constructed the CASSF framework, which uses SRSF to analyze the semantics of user needs and SPARQL to retrieve sensors. After semantic analysis, the Comparative Priority-based Weighted Index (CPWI) was used to characterize and rank sensor options. However, relying heavily on SPARQL to parse semantic sensor data has a major drawback: it ignores sensor attributes and user preferences, so the recommended sensors may fail to meet user needs.
To address these problems, we propose a graph-based sensor recommendation model. The model parses semantic sensor data based on the semantic sensor network ontology to build a weighted data graph, and it takes the user’s preferences for sensor attributes into account to build a weighted search graph. From the user’s perspective, the weighted data graph is characterized and pruned using a threshold pruning algorithm. Then, we use the improved fast non-dominated sorting algorithm and the Simple Additive Weighting (SAW) algorithm to sort the sensor options. The main contributions of our model are as follows:
We redefine the matching method between two weighted graphs: we parse semantic sensor data to build a weighted data graph and use the user’s preferences for sensor attributes to build a weighted search graph.
We propose a threshold pruning algorithm to narrow the matching range and improve the matching efficiency.
We use the improved fast non-dominated sorting algorithm to obtain the local optimal solutions of the pruned sensor data set, which improves the accuracy of the algorithm.
The rest of this article is organized as follows: section “Related work” investigates the related work of sensor recommendation algorithms under the IoT. The proposed graph-based sensor recommendation model is introduced in section “Sensor recommend model.” Section “Experiment evaluation” reports and discusses the experimental results. Finally, we present our conclusions and future work in section “Conclusion.”
Related work
In this section, we discuss related work on sensor recommendation algorithms in the IoT and summarize the differences between our proposed model and previous work. Zhou et al. 7 divided existing sensor recommendation algorithms into two categories: context-based algorithms and content-based algorithms.
Context-based sensor recommendation algorithms rely on the accessible context information in the user’s description of the sensor. Existing algorithms approach sensor recommendation from two directions. On the one hand, previous work on sensor recommendation in the semantic sensor network falls into two groups: some studies simplify the sensor ontology to make it easier to apply, and others use contextual content to sort sensors after retrieving semantic sensor data with SPARQL. On the other hand, sensor recommendation algorithms based on general data sets aim to increase precision or reduce response time.
Parsing semantic sensor data is the key task in the semantic sensor network, which is why many studies are based on SPARQL. Rasyid et al. 6 created a project to provide users with semantic sensor data. For data processing, the information collected by the sensors was stored in a csv file, and the PROTEGE editor was used to transform the sensor information into entities, which were stored as tuples in the database. The Sesame framework was used to query the database after the user’s needs were converted into SPARQL form, and the search results were returned to the user. The CASSF 8 framework enriched the sensor attributes of the semantic sensor network ontology and used RDF to store entity data in order to preserve as much semantic information as possible; it also used SPARQL to retrieve sensor data. Perera et al. 9 proposed a context-based sensor recommendation model, CASSARAM, which considers user preferences for sensor attributes such as reliability, accuracy, and battery life. This model employs CPWI to characterize and sort sensor options. Chirila et al. 10 proposed a proxy-based architecture that performs sensor discovery and recommendation at the same time and uses a web service clustering method during recommendation to reduce the search space of candidate services. Mecibah et al. 11 proposed a mechanism based on concept catalogs to speed up the search for semantic resources. When users search for semantic resources, the catalog is searched first, and SPARQL is used to search the database only if the query is found in the catalog, which avoids invalid searches. Gomes et al. 12 proposed the semantic-based service discovery architecture QoDisco, which is composed of a set of independent repositories and provides both synchronous and asynchronous retrieval mechanisms. All of the above studies perform sensor recommendation on the semantic sensor network and use SPARQL to retrieve sensor data.
Many models have been proposed to improve the accuracy of traditional sensor recommendation algorithms. Neha et al. 13 proposed the Vector Based Sensor Ranking (VBSR) model, which calculates a preference-based weighted index (PBWI) for each sensor option from the sensor’s attribute values and the context attribute values input by the user. Nunes et al. 14 treated sensor recommendation as a multi-criteria decision analysis problem. To combine the high accuracy of the fast non-dominated sorting algorithm with the low response time of the TOPSIS algorithm, 15 they proposed the ES algorithm, which reduces the time cost of the fast non-dominated sorting algorithm by limiting the size of its input sensor data set. Kertiou et al. 16 used the dynamic skyline algorithm to filter sensor data, which improved the accuracy of the multi-criteria decision analysis algorithm. After filtering, the sensor data were characterized, sorted, and recommended to the user. To select the most desired sensor according to user needs, Nithya et al. 17 proposed a clustering method to optimize sensor selection in the IoT. Bharti et al. 18 proposed a value of information-based sensor ranking mechanism (VoISRAM), which models sensor context information and sensor service level as information attribute values and attempts to balance the services’ QoS requirements against energy consumption. The above research concerns non-semantic sensor recommendation and uses algorithms such as the fast non-dominated sorting algorithm, the dynamic skyline algorithm, and clustering to improve recommendation accuracy.
Content-based sensor recommendation algorithms retrieve the sensors’ historical output data based on the user’s demand for the sensor. 19 They must process a large amount of historical sensor data. Processing semantic sensor data with such algorithms would consume a lot of the sensors’ time and energy, so content-based sensor recommendation algorithms are not suited to semantic sensor data. Existing content-based algorithms focus on increasing accuracy and decreasing response time, and in recent years machine learning models have also been involved. Truong et al. 20 borrowed the idea of Google image retrieval and adopted a “case-by-case search,” which avoids inaccurate keyword input by users. It takes a sensor’s historical output as the comparison object and uses fuzzy sets to efficiently calculate a similarity score, yielding a ranked list of matching sensor options. Ostermaier et al. 21 proposed Dyser, a real-time prediction model for the IoT that supports publishing sensor entity data through web infrastructure and retrieving data by sensor type. When Dyser returns search results, it outputs part of the relevant sensor data in descending order of the predicted results, which greatly reduces communication overhead. Truong and Romer 22 proposed CSS, a lightweight prediction model based on fuzzy logic that estimates the probability of a sensor option matching a search query. This model implements content-based sensor search in the IoT with low communication overhead and high computational efficiency. Zhang et al. 23 proposed a low-cost, high-precision prediction model based on quantitative values. This model uses a multi-step prediction method and applies approximate values to evaluate the next state of the sensor at any time. Zhang et al. 24 proposed a sensor state prediction method to estimate a sensor’s short-term state. This method makes the best use of the temporal correlation among sensor data to accurately predict the future trend of sensor readings. Vasilev et al. 25 proposed a scalable model based on a hypergraph representation for evaluating the cooperative relationships between sensor nodes, which covers the dynamic characteristics of complex sensor networks well. Tang and Zhou 26 proposed the SMSTK search engine, whose encoding index enables efficient query processing when searching for object devices by spatiotemporal keywords. Chen et al. 27 used a latent probability model to learn user preferences, embedded the social relationships of smart objects in a shared low-dimensional space to estimate the social similarity of smart objects, and used item-based collaborative filtering to generate a recommendation list. Later, Chen et al. 28 proposed a physical store recommendation model that learns user preferences from user-generated heterogeneous information.
In recent years, machine learning has also been applied to sensor recommendation in the IoT. Mietz and Romer 29 exploited the high correlation among many sensors’ outputs to learn the correlation structure from the sensors’ historical data and modeled it as a Bayesian network (BN). This model can estimate the probability that a sensor option should be recommended without knowing the current sensor output, and it recommends the sensor options with the highest probability to the user. Zhang et al. 30 established a prediction model based on historical temperature and humidity information by training a back propagation (BP) network on the collected environmental data. When the change trend of the environmental parameters exceeds a threshold, an early warning is issued so that hidden equipment hazards can be discovered in advance. Li et al. 31 first introduced deep learning to the edge computing environment and designed a new offloading strategy to optimize the performance of deep learning applications given the limited processing power of existing edge nodes. Although machine learning algorithms have been applied to sensor recommendation, they are not mainstream because the IoT cannot afford to spend a lot of time training models.
At present, there are relatively few sensor recommendation algorithms for semantic data, although this direction is of great significance for resolving device heterogeneity in the IoT. Most recommendation algorithms for the semantic sensor network rely excessively on SPARQL to retrieve semantic sensor data, so the recommended sensor options are only locally optimal. Querying a semantic database with SPARQL is equivalent to querying a relational database with SQL statements: because the user’s preference information is not involved, the query results may fail to meet the user’s needs. Therefore, we propose a graph-based sensor recommendation model. This model makes better use of RDF semantic data stored as graphs, and it considers both the sensors’ attributes and the user’s preferences when recommending sensors.
Sensor recommend model
In this part, we first give the problem definition of our model. Then, we introduce background knowledge on sensor recommendation in the semantic sensor network. Finally, we present our graph-based sensor recommendation model.
Problem definition
For ease of the following presentation, we define the key data structures and notations used in the proposed model. Table 1 lists the relevant notations used in this article.
Notations used in this article.
Definition 1 (data graph)
In this article, we parse the semantic sensor data and store them in different files according to the sensor type, which is convenient to access. Then, we read the specific file based on the user required sensor type to model a weighted data graph
And each

An example of data graph.
Definition 2 (search graph)
We model a weighted search graph

An example of search graph.
Definition 3 (the first non-dominated front)
In multi-criteria decision analysis problems, there may be conflicts and incomparability between multiple criteria. One solution may be the best on one criterion and the worst on another. We select two sensor options (
With the aforementioned definitions, the problem of graph-based sensor recommendation model can be formally stated as follows:
Given semantic sensor data set and user input sensor preferences of sensor’s attributes, the problem of graph-based sensor recommendation model aims to select top-
Background
RDF
RDF is a W3C-recommended model for describing resources on the World Wide Web and their relationships. The core of the RDF data model includes resources, properties, and RDF statements, where each resource has a Uniform Resource Identifier (URI). Describing data as triples with the RDF data model helps eliminate heterogeneity between devices.
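As a minimal sketch of the triple form described above, the snippet below stores a few RDF statements as plain Python tuples and queries them. The `http://example.org/` URIs, the `responseTime` property, and the helper names are hypothetical and only serve the illustration; `sosa:Sensor` and `sosa:observes` are terms from the W3C SOSA/SSN vocabulary.

```python
# Hypothetical namespaces and sensor identifiers, for illustration only.
SOSA = "http://www.w3.org/ns/sosa/"
EX = "http://example.org/sensors/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each RDF statement is a (subject, predicate, object) triple.
triples = [
    (EX + "s1", RDF_TYPE, SOSA + "Sensor"),
    (EX + "s1", SOSA + "observes", EX + "temperature"),
    (EX + "s1", EX + "responseTime", "0.8"),
    (EX + "s2", RDF_TYPE, SOSA + "Sensor"),
    (EX + "s2", SOSA + "observes", EX + "humidity"),
]


def objects(statements, subject, predicate):
    """Return every object attached to `subject` through `predicate`."""
    return [o for s, p, o in statements if s == subject and p == predicate]


def subjects_of_type(statements, rdf_class):
    """Return all subjects declared as instances of `rdf_class`, sorted."""
    return sorted({s for s, p, o in statements if p == RDF_TYPE and o == rdf_class})
```

Because every device is described with the same subject, predicate, and object pattern, sensors that store data in different native structures can be queried in a uniform way.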
Sensor ontology description
According to the Stimulus-Sensor-Observation (SSO) design pattern, the semantic sensor network ontology can be described from four main perspectives: sensors, observations, systems, and features and attributes. This article mainly describes the sensor information from the sensor perspective. The main concepts and relationships are shown in Figure 3.

Sensor ontology description.
Graph matching algorithm
The graph matching algorithm is mainly to find the subgraph isomorphism of the search graph
Given search graph
The adjacency matrices of
where
If there is an isomorphic matrix of figure
Therefore,
Finally, whether this is right of mapping matrix
As shown in Figure 4, there is a mapping relationship between the data graph

An example of isomorphic subgraph.
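The matching idea in this subsection can be sketched with the classical mapping-matrix test used in Ullmann-style subgraph isomorphism, which is an assumption about the exact variant intended here. Given the adjacency matrix `a` of the search graph, the adjacency matrix `b` of the data graph, and a 0/1 mapping matrix `m` with exactly one 1 per row and at most one 1 per column, the mapping is accepted when every edge of `a` is also an edge of C = M(MB)^T. The brute-force enumeration and all function names are illustrative only.

```python
from itertools import permutations


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(x):
    return [list(col) for col in zip(*x)]


def is_valid_mapping(a, b, m):
    """Check that every edge of the search graph appears in C = M (M B)^T."""
    c = matmul(m, transpose(matmul(m, b)))
    size = len(a)
    return all(c[i][j] == 1 for i in range(size) for j in range(size) if a[i][j] == 1)


def find_mappings(a, b):
    """Enumerate injective node mappings (exponential; small graphs only)."""
    p, n = len(a), len(b)
    found = []
    for targets in permutations(range(n), p):
        m = [[1 if targets[i] == j else 0 for j in range(n)] for i in range(p)]
        if is_valid_mapping(a, b, m):
            found.append(targets)
    return found


# Search graph: node 0 linked to nodes 1 and 2.
search = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
# Data graph: node 0 linked to nodes 1, 2, and 3.
data = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
mappings = find_mappings(search, data)
```

Each tuple in `mappings` lists, for every search-graph node, the data-graph node it is mapped to. A practical implementation would prune candidate mappings instead of enumerating all of them.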
The sensor recommendation model
Current semantic sensor recommendation models rely too heavily on SPARQL retrieval. When semantic data are retrieved this way, neither the user’s preferences for sensor attributes nor the semantic information of the sensor itself can be considered, so the recommended sensor options may not meet the user’s needs. To avoid the problems of SPARQL retrieval and make better use of the semantic information in the RDF graph, we use a graph matching algorithm to filter sensors and provide users with sensor options. When matching the weighted search graph against the weighted data graph, we not only use the classic graph matching method to obtain the isomorphic subgraph but also propose a threshold-based pruning method to filter the sensors. The hierarchical structure of the sensor recommendation model based on graph matching is shown in Figure 5.

The hierarchical structure of sensor recommendation model.
Step 1: filtering nodes in data graph
1. Data processing. We parse the RDF data and write the sensor options to different files according to sensor type. To obtain the value ranges of the different sensor attributes, we create a table while parsing the RDF data that records the attributes of each sensor type and their value ranges. The content of the table is shown in Table 2, where “qualitative” indicates the direction of a sensor attribute. A qualitative value of 1 means that the larger the attribute value, the better; a qualitative value of −1 means that the smaller the attribute value, the better. When the data are updated, only the corresponding sensor file needs to be changed.
2. Constructing the data graph and search graph. We provide a visual user interface in which the user enters the required sensor information, including sensor type, location, attributes, and corresponding preferences. We then extract the user input as a weighted search graph, where the weights are the user’s preferences for the sensor’s attributes. When constructing the data graph, we first match the sensor type input by the user against all sensor types in the sensor attribute table. If the matching is unsuccessful, the result is returned directly. If the matching is successful, the corresponding sensor file is parsed according to the sensor type, and the sensor attribute table is read. Then, the data graph’s weights are calculated from the sensor semantic information and the value ranges of the sensor attributes. Finally, the data graph is constructed
where
3. Threshold-based pruning. We first use the traditional graph matching algorithm for pre-pruning by comparing the degree of the search graph with the degree of each node in the data graph. When the degree of a node in the data graph is larger than the degree of the search graph, the node is retained; when it is lower, the node is deleted. In addition, to prune the data graph according to the user’s preferences for the sensor’s attributes, we propose a threshold pruning algorithm, which mainly considers two aspects:
where
where
Sensor property table.
The pruning algorithm above requires the sensor type
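The weight calculation and the degree-based pre-pruning in Step 1 can be sketched as follows. This is only an illustration under stated assumptions: the min-max scaling is a common choice for turning a raw attribute value and its value range into a weight and is not confirmed as the article's weight equation; a node whose degree equals the search graph's degree is assumed to be retained; and the data graph is represented as a hypothetical dictionary from sensor node to its attribute weights.

```python
def attribute_weight(value, low, high, qualitative):
    """Scale a raw attribute value to [0, 1] so that higher is always better.

    `qualitative` follows the sensor property table: 1 means larger raw
    values are better, -1 means smaller raw values are better.
    The min-max form itself is an assumption made for this sketch.
    """
    if high == low:
        return 1.0
    scaled = (value - low) / (high - low)
    return scaled if qualitative == 1 else 1.0 - scaled


def degree_preprune(data_graph, search_degree):
    """Keep sensor nodes whose number of attribute edges is at least the
    degree of the search graph (equal degree is assumed to be retained)."""
    return {
        node: attrs
        for node, attrs in data_graph.items()
        if len(attrs) >= search_degree
    }


# Hypothetical data graph: sensor node -> {attribute: weight}.
example_graph = {
    "S1": {"response_time": 0.5, "sensitivity": 0.0},
    "S2": {"response_time": 1.0},
}
kept = degree_preprune(example_graph, search_degree=2)
```

Here `S2` is dropped because it carries fewer attributes than the user asked about, so it cannot match the search graph.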
Step 2: recommending top-k sensors
Sensor recommendation can be considered a multi-criteria decision analysis problem. Balancing user requirements across multiple sensor attributes is the primary task of a sensor recommendation algorithm. In previous studies, many scholars used the local optimal solutions of the sensor data set to improve traditional multi-criteria decision analysis algorithms, for example with the fast non-dominated sorting algorithm and the dynamic skyline algorithm, but the high time complexity of these algorithms greatly increases the response time of sensor recommendation. To overcome this problem, we proposed the improved fast non-dominated sorting algorithm 1 last year. This algorithm incorporates the idea of the quick sort algorithm into the fast non-dominated sorting algorithm, and it reduces the time complexity of the fast non-dominated sorting algorithm from
Assuming all possible inputs are equally likely, all partitioning cases are equally likely. We will take each value in the interval, 1 with equal probability, so the average time complexity of the quick sort algorithm is as follows
The initial case of the recursion:
Then, the time complexity of the quick sort algorithm is
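For reference, the average-case argument in this passage matches the standard textbook recurrence for quick sort, sketched below. The symbols T(n), the expected cost on n elements, and c, a constant per-element partitioning cost, are assumptions introduced here and may differ from the article's notation.

```latex
% Every pivot rank k is equally likely, so for n >= 2:
T(n) = cn + \frac{1}{n}\sum_{k=1}^{n}\bigl[T(k-1) + T(n-k)\bigr]
     = cn + \frac{2}{n}\sum_{k=0}^{n-1} T(k),
\qquad T(0) = T(1) = 0 .

% Multiply by n and subtract the same identity written for n-1:
n\,T(n) = (n+1)\,T(n-1) + c\,(2n-1)
\;\Longrightarrow\;
\frac{T(n)}{n+1} \le \frac{T(n-1)}{n} + \frac{2c}{n+1}

% Telescoping the inequality gives a harmonic sum:
T(n) \le 2c\,(n+1)\sum_{k=3}^{n+1}\frac{1}{k} = O(n \log n).
```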
The improved fast non-dominated sorting algorithm requires data set
After using the improved fast non-dominated sorting algorithm to obtain the local optimal solutions, we use the SAW algorithm 33 to characterize and sort them. The SAW algorithm is one of the most commonly used multi-criteria decision analysis algorithms 34 and is the basis of other multi-criteria decision analysis algorithms. It mainly includes the following three steps:
1. Normalizing the sensor data set
where equation (10) is for the maximization criterion, and equation (11) is for the minimization criterion.
2. Calculating the score for each sensor option
where
3. Sorting the sensor data set in descending order of the sensor options’ scores
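The three SAW steps above can be sketched as follows. The normalization used here is the common textbook form (value divided by the column maximum for a maximization criterion, column minimum divided by value for a minimization criterion) and is assumed rather than taken from equations (10) and (11); all attribute values must be positive. The function name, option names, and sample values are hypothetical.

```python
def saw_rank(options, weights, benefit):
    """Rank options with Simple Additive Weighting.

    options: {name: [attribute values]} with positive values
    weights: one preference weight per attribute
    benefit: True where larger is better, False where smaller is better
    """
    names = list(options)
    columns = list(zip(*(options[name] for name in names)))
    scores = {}
    for name in names:
        total = 0.0
        for j, value in enumerate(options[name]):
            # Step 1: normalize according to the criterion direction.
            if benefit[j]:
                normalized = value / max(columns[j])
            else:
                normalized = min(columns[j]) / value
            # Step 2: accumulate the weighted score.
            total += weights[j] * normalized
        scores[name] = total
    # Step 3: sort in descending order of score.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values: [response time (smaller is better), sensitivity].
candidates = {"S1": [20.0, 0.6], "S3": [10.0, 0.9]}
ranked = saw_rank(candidates, weights=[0.5, 0.5], benefit=[False, True])
```

In this toy input `S3` is best on both attributes, so it receives the maximum score and is ranked first.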
Example of model applications
To simplify the computation, we assume the sensor data set contains five sensor nodes in different regions. We use the graph-based sensor recommendation model to obtain the optimal recommendation from this sensor data set.
Step 1: filtering nodes in data graph
According to the sensors’ locations, we divide the sensor data set and record the data in different files. After analyzing the information entered in the user interface, we obtain the sensor location and the user’s preferences for the different sensor attributes, and we construct the search graph from this input. After filtering the sensor data set by the sensor location given by the user, we obtain three sensor nodes that meet the requirements; their specific information is shown in Table 3. According to equation (4), the response time weights of sensor nodes S1, S2, and S3 are 0.5, 1, and 0, respectively, and their sensitivity weights are 0, 1, and 1, respectively. We construct the data graph with this weight information; the data graph and the search graph are shown in Figures 6 and 7, respectively.
Sensor property table.

The construct process of data graph.

The construct process of search graph.
Next, we apply the threshold pruning algorithm, which reduces the size of the data graph and is mainly intended for large data sets. When the data set is small, this step can be omitted. In the current example, there are only three sensor nodes in the data graph, but to clearly show how the threshold pruning algorithm works, we apply it as set in the model. According to equations (10)–(12), the scores of nodes S1, S2, and S3 in the data graph are −4.72, −13.84, and 1.13, respectively. According to the threshold setting in the original model, these three sensor nodes do not meet the requirements; however, because the original model mainly targets sensor recommendation under big data, we only delete the worst node, S2.
Step 2: recommending top-k sensors
According to the improved fast non-dominated sorting algorithm, both the response time and the sensitivity of sensor node S3 are better than those of S1, so S3 dominates S1. Therefore, we recommend sensor node S3 to the user. The process of recommending the optimal nodes is shown in Figure 8. When the number of local optimal solution sets obtained using the fast non-dominated sorting algorithm is greater than

The process of recommending optimal nodes.
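The dominance relation used in this example can be sketched as follows. This is a naive pairwise check of the first non-dominated front, not the improved fast non-dominated sorting algorithm itself. The scores are hypothetical "higher is better" values chosen so that S3 beats S1 on both criteria, and they are not taken from Table 3.

```python
def dominates(a, b):
    """Return True if option `a` dominates option `b`.

    Both are sequences of scores where higher is better: `a` must be no
    worse on every criterion and strictly better on at least one.
    """
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def first_front(options):
    """Return the names of options that no other option dominates."""
    return sorted(
        name
        for name, scores in options.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in options.items()
            if other != name
        )
    )


# Hypothetical (response time score, sensitivity score) pairs.
remaining = {"S1": (0.5, 0.0), "S3": (1.0, 1.0)}
front = first_front(remaining)
```

With these values the front contains only `S3`, which mirrors the recommendation made in the example.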
Experiment evaluation
In this part, we describe the experimental setup in terms of the data sets, comparison methods, and evaluation indicators, and we assess the performance of the graph matching-based sensor recommendation model and the comparison methods using the test results.
Experimental settings
Data sets
There is currently no large-scale public data set that provides both sensor and context information. To obtain large-scale data, we simulate sensor data according to the 22 different sensor property rules described by the public sensor website “Array of Things”; these rules are shown in Table 4. We integrated real data and simulated data to construct sensor data sets at four scales: 50,000, 100,000, 150,000, and 200,000. This combination provides a large amount of sensor data, which helps us better understand the behavior of the sensor recommendation algorithms.
The value range of sensor attributes.
Following the definition of the sensor ontology in the semantic sensor network, we use Protege to construct the sensor semantic model after constructing the sensor data sets. The four sensor data sets are then imported into Protege and converted into semantic data. In Protege, an ontology can be expressed in various formats, including RDF/XML, N3, and N-Triples. We use RDF/XML, which is also the simplest form.
Comparative methods
TOPSIS: the TOPSIS algorithm 15 first normalizes the sensor data set matrix. Next, it obtains the best point and the worst point according to the objective function and calculates the distance from each sensor option in the matrix to the best point and the worst point. Finally, it characterizes, ranks, and recommends sensor options.
ES: after using TOPSIS to sort sensor options, the ES algorithm 14 sets SR parameters to limit the number of sensor options input to the fast non-dominated sorting algorithm. Finally, top-
Dynamic skyline algorithm: this method 16 obtains the user request, and it calculates the local dynamic skyline for reducing the size of sensor data set. Next, it calculates the global dynamic skyline to filter sensor options again. Finally, it uses the SAW algorithm to sort the sensor options.
An efficient preference-based sensor selection algorithm: this algorithm 1 narrows the sensor data set based on the user’s preferences for sensor attributes and limits the number of sensors input to the improved fast non-dominated sorting algorithm. Reducing the number of sensors to process makes the algorithm’s response time lower than that of the original algorithm. The obtained results are then sorted and recommended to users through the TOPSIS algorithm.
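The TOPSIS baseline described in the list above can be sketched as follows. This is a minimal textbook version (vector normalization, weighted best and worst points, Euclidean distances, relative closeness), not the exact implementation used in the experiments. The function name and sample values are hypothetical, and every attribute column is assumed to contain at least one non-zero value.

```python
import math


def topsis_rank(options, weights, benefit):
    """Rank options by relative closeness to the ideal point.

    options: {name: [attribute values]}
    weights: one preference weight per attribute
    benefit: True where larger is better, False where smaller is better
    """
    names = list(options)
    m = len(weights)
    # Vector-normalize each attribute column, then apply the weights.
    norms = [math.sqrt(sum(options[n][j] ** 2 for n in names)) for j in range(m)]
    v = {n: [weights[j] * options[n][j] / norms[j] for j in range(m)] for n in names}
    columns = [[v[n][j] for n in names] for j in range(m)]
    # Best and worst points depend on the direction of each criterion.
    best = [max(col) if flag else min(col) for col, flag in zip(columns, benefit)]
    worst = [min(col) if flag else max(col) for col, flag in zip(columns, benefit)]
    scores = {}
    for n in names:
        d_best = math.dist(v[n], best)
        d_worst = math.dist(v[n], worst)
        total = d_best + d_worst
        scores[n] = d_worst / total if total > 0 else 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values: [response time (smaller is better), sensitivity].
sensors = {"A": [10.0, 0.9], "B": [20.0, 0.6], "C": [15.0, 0.7]}
ranking = topsis_rank(sensors, weights=[0.5, 0.5], benefit=[False, True])
```

Option `A` coincides with the best point on both criteria, so its closeness is 1, while `B` coincides with the worst point and ends up last.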
Evaluation methods
In the IoT, the computing and storage capacities of sensors are not as good as those of devices on the Internet, which requires our algorithm to provide optimal sensor recommendations with low time complexity. Considering the characteristics of the IoT and user satisfaction, we use two evaluation indicators, precision and response time, to evaluate our model. Response time is the interval from the moment a user submits a request in the application to the moment the result is returned. The lower the response time, the better the algorithm. Precision is the ratio of the number of recommended sensors that satisfy the user to the number of sensors recommended by the algorithm. A higher ratio indicates higher precision, as follows
where
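The precision indicator described in this subsection, the share of recommended sensors that satisfy the user, can be sketched as follows. The function and argument names are hypothetical, and returning 0 for an empty recommendation list is an assumption made for this sketch.

```python
def precision(recommended, satisfied):
    """Fraction of recommended sensors that the user is satisfied with.

    recommended: list of sensor identifiers returned by the algorithm
    satisfied: set of sensor identifiers the user is satisfied with
    """
    if not recommended:
        return 0.0
    hits = sum(1 for sensor in recommended if sensor in satisfied)
    return hits / len(recommended)


# Two of the four recommended sensors satisfy the user.
example_precision = precision(["s1", "s2", "s3", "s4"], {"s1", "s3", "s9"})
```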
Experimental results
In this part, we first introduce the effect of parameter settings on the graph-based sensor recommendation model. After determining the parameters of the model, we will compare the performance of our proposed model with the above four sensor recommendation algorithms in terms of the response time and precision.
Impact of parameter setting
In the graph-based sensor recommendation model, we need to determine two parameters including the maximum number of sensors
As described above for the threshold pruning algorithm, the threshold pruning algorithm characterizes the nodes in the data graph. The total score

Score distribution.
We assume the threshold is 0.9, and we conduct the maximum number of sensors

The precision and response time of the maximum number of sensors selection experiment.
In the above experiment, we determined that the maximum number of sensors is 250, and we conduct the threshold selection experiment based on this result. To eliminate the influence of data set size on the threshold selection, we use all four data sets in this experiment. In addition, the total score of nodes in the data graph computed by our threshold pruning algorithm is less than 1.2, so we vary the threshold from 0 to 1.2 in steps of 0.01. The precision and the response time of our model are shown in Figure 11(a) and (b), respectively. As shown in Figure 11(a), the precision of our model becomes more stable as the data set grows, and the precision curves for the 150,000 and 200,000 data sets almost overlap. When the threshold is less than 0.92 on the four data sets, the preference score is less than the latitude and longitude score, as shown in Figure 9(a), so the precision remains unchanged. From Figure 11(a), the precision is 79.8% and 92.3% when the size of the data set is 50,000 and 100,000, respectively, and 92.3% when the size of the data set is 150,000 or 200,000. The precision on the four data sets changes when the threshold is greater than 0.92. For the 100,000, 150,000, and 200,000 data sets, the precision stabilizes when the threshold is greater than 0.95. At this point, the preference score is greater than the latitude–longitude score, which raises the precision to 94.87%. When the threshold is greater than 1, the precision on the four data sets drops sharply, because the number of sensors selected is then less than 500 (as shown in Figure 9(a)) and the coverage area of the sensors is too small. 
As shown in Figure 11(b), the number of sensors that pass the filter decreases as the threshold increases on the four data sets; therefore, the response time of the algorithm also decreases. Based on the above analysis, we choose a threshold of 0.95, at which the precision remains stable and the response time is very low.

The precision and response time of threshold selection experiment.
After the two experiments above, we set the maximum number of sensors to 250 and the threshold to 0.95. With these settings, we compare the proposed graph-based sensor recommendation model with four other algorithms in terms of precision and response time. To reduce experimental error caused by equipment or chance, we run each experiment five times and take the average precision and the average response time as the final values. To reduce uncontrollable human error, we use quantitative indicators to evaluate user satisfaction with the recommended sensors, which both avoids operational error and improves the efficiency of the experiments. The following subsections report the results for precision and response time.
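The averaging procedure can be sketched as below. This is a minimal sketch under stated assumptions: `recommend` and `score` are hypothetical callables standing in for a recommendation algorithm and the quantitative precision indicator, and only the recommendation call itself is timed.

```python
from statistics import mean
from time import perf_counter


def average_metrics(recommend, score, runs=5):
    """Run a recommender several times and average precision and latency.

    recommend: hypothetical zero-argument callable returning recommendations.
    score:     hypothetical callable mapping recommendations to a precision.
    """
    precisions, times = [], []
    for _ in range(runs):
        start = perf_counter()
        recommended = recommend()
        # Time only the recommendation call, not the scoring step.
        times.append(perf_counter() - start)
        precisions.append(score(recommended))
    return mean(precisions), mean(times)
```

Keeping the scoring step outside the timed region means the reported response time reflects the recommendation algorithm alone.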
The precision comparison of sensor selection algorithm
We evaluate the precision of the proposed graph-based sensor recommendation model and the other four recommendation algorithms on four data sets, as shown in Figure 12. As the number of recommended sensors grows from one to ten, the precision of the five algorithms follows different trends on different data sets, but the precision ordering among the algorithms is relatively stable. Among the five algorithms, the graph-based sensor recommendation model has the highest precision and the TOPSIS algorithm has the lowest. The remaining three, in descending order of precision, are the efficient preference-based sensor selection algorithm, the ES algorithm, and the dynamic skyline algorithm. Taking the data set of 150,000 as an example, we analyze the precision differences among the five algorithms when recommending one to three sensors. The precision of the proposed graph-based sensor recommendation model, the efficient preference-based sensor selection algorithm, the ES algorithm, the dynamic skyline algorithm, and the TOPSIS algorithm is 98.29%, 81.26%, 76.99%, 63.88%, and 46.44%, respectively.

The precision in different data sets: (a) Dataset1, (b) Dataset2, (c) Dataset3, and (d) Dataset4.
The TOPSIS algorithm is not suitable for recommendation on large-scale data sets. As the data set grows, its precision decreases and becomes unstable. As shown in Figure 12, when recommending one sensor to the user, the precision of the TOPSIS algorithm is 50.00%, 46.15%, 42.30%, and 38.46% on the four data sets. Moreover, on the data sets of 150,000 and 200,000 its precision first increases and then decreases, which indicates that the TOPSIS algorithm characterizes sensors poorly on large-scale data sets. The precision of the other four algorithms (the graph-based sensor recommendation model, the efficient preference-based sensor selection algorithm, the ES algorithm, and the dynamic skyline algorithm) increases with the size of the data set, so these four algorithms represent sensor options well on large-scale data sets. For the graph-based sensor recommendation model, the efficient preference-based sensor selection algorithm, and the ES algorithm, the precision decreases as the number of recommended sensors grows from one to ten. Consequently, when these algorithms recommend sensors, users can find a satisfactory sensor among as few recommendations as possible. In contrast, although the dynamic skyline algorithm is suitable for recommendation on large-scale data sets, it cannot recommend satisfactory sensors when only a few sensors are recommended.
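This section does not restate how precision is computed when recommending one to ten sensors. As an illustration only, the sketch below assumes the common top-k definition, in which precision is the fraction of the first k recommended sensors that the user judges satisfactory, averaged over requests. The function names and the `(recommended, relevant)` input format are hypothetical.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended sensors that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for sensor in top_k if sensor in relevant)
    return hits / k


def mean_precision_at_k(requests, k):
    """Average precision@k over (recommended, relevant) pairs.

    Each pair holds the ranked recommendation list for one request and the
    set of sensors judged satisfactory for that request (assumed format).
    """
    return sum(precision_at_k(rec, rel, k) for rec, rel in requests) / len(requests)
```

Under this definition, an algorithm whose precision falls as k grows places its satisfactory sensors near the top of the list, which matches the behavior described above for the first three algorithms.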
The response time comparison of sensor selection algorithm
The response time is the interval between the user submitting a request to the sensor recommendation model and the model returning the recommendation results. To obtain the response time of each sensor recommendation algorithm, we average the response times of its experiments on each of the four data sets; the results are shown in Figure 13. Because the response times of the five sensor recommendation algorithms differ widely, Table 5 lists them in more detail. The performance of the five algorithms is relatively stable across data sets, and their response times rank from lowest to highest as follows: the graph-based sensor recommendation model, the efficient preference-based sensor selection algorithm, the TOPSIS algorithm, the ES algorithm, and the dynamic skyline algorithm. These results can be explained by the time complexity of the algorithms. The time complexity of the graph-based sensor recommendation model is

The response time in different data sets.
The stability of each sensor recommendation algorithm in response time is measured by the difference in response time between the data set of 50,000 and the data set of 200,000, as shown in Table 5. The proposed graph-based sensor recommendation model is the most stable, while the dynamic skyline algorithm is the least stable. Specifically, the difference in response time is 0.0605 s for the graph-based sensor recommendation model, 0.618 s for the efficient preference-based sensor selection algorithm, 1.373 s for the TOPSIS algorithm, 2.223 s for the ES algorithm, and 14.546 s for the dynamic skyline algorithm.
The detailed response time on different data sets.
Conclusion
To deal with the heterogeneity of devices in the IoT, researchers proposed the semantic sensor network, which moved the development of the IoT a giant step forward. However, sensor recommendation based on the semantic sensor network remains an open problem. Existing research mainly relies on SPARQL queries over the semantic database to recommend sensors for users, but SPARQL matches data in the database without considering the user’s preferences for sensor attributes, so this type of sensor recommendation model fails to meet the needs of users. Therefore, we propose a graph-based sensor recommendation model. For the first time, a weighted search graph and a weighted data graph are employed for sensor recommendation in the semantic sensor network, which makes the most of the RDF semantic graph. To narrow the scope of the sensor search, we propose a threshold pruning algorithm; experiments show that it efficiently characterizes the nodes in the data graph. We then adopt an improved fast non-dominated sorting algorithm to obtain the local optimal solutions in the pruned sensor data set and use the SAW algorithm to sort them. Experiments show that the proposed graph-based sensor recommendation model outperforms the other algorithms in both precision and response time. In future work, we intend to parse the semantic data dynamically and extend the existing model.
