Abstract
Keywords
Introduction
With the continuous development of the Internet, more and more devices and everyday objects can be networked and can communicate with their owners in real time, which provides the foundation for the Social Internet of Things (SIoT). The SIoT, a combination of social networks and the Internet of Things, applies the characteristics and principles of social networks to the Internet of Things, creates social interaction between objects, and uses attributes of social networks to solve practical problems in the Internet of Things. It uses Internet of Things technology, including radio frequency identification (RFID) and sensors at the perception layer, and forms a network of objects on the basis of the topological structures of the social networks at the upper application layer. This combination also brings a new set of technologies into use: item data from the perception layer can be integrated, and items and data can be networked in the form of a social network.
With the rapid development of SIoT, an increasing number of users record their lives and communicate with friends through it. To communicate better and obtain information of interest, users can add friends by searching for educational background, interests, and similar information in the SIoT. Users who share the same information and interests are more likely to become friends. In SIoT, the network structure formed by a user’s friends and the relationships among those friends is called the network of relationships. Within it, the user’s friends can be divided into groups according to closeness: members of the same group are closer to one another than to members of other groups. These groups are the communities of the network of relationships.1 Studies have shown that many users of online SIoT pay little attention to personal privacy; some show a degree of concern but are thought to underestimate the risk. To understand how much attention users pay to their privacy and how personal privacy is leaked in the SIoT, researchers have studied user behavior, cognition, policy, and law.2,3 Among these studies, a survey of Carnegie Mellon University students on the well-known social networking site Facebook showed that 80% of users used personal photos as their avatars, while only 2% had turned on the privacy settings for their accounts. According to other surveys, people’s views about their privacy depend on their own environment and on whether they share characteristics with the group they communicate with. In this sense, if users think that they and the group they communicate with have common attributes (such as hobbies or interests), or have certain characteristics that attract the interest of the people around them, their attitude toward privacy is strongly affected.
Related work
Research on the reasoning attack (inference attack)4–6 studies the indirect disclosure of personal information through social relations. When a user chooses to hide private information, that information is inferred from other information related to the user. For example, if a user’s check-in location is hidden, an attacker can guess the location from the user’s public friend information and thereby obtain the hidden information. The literature5,6 uses Bayesian networks to perform this kind of reasoning and studies the factors that affect its accuracy; privacy is better protected through plausible privacy information. The literature4 also shows that the online groups a user joins, whose member lists are public, may be used to infer sensitive hidden attributes. Zheleva and Getoor7 inferred users’ private information from the users’ friends and groups and compared different attack models. They concluded that the results obtained using group relations are more accurate. In their experiment they assumed that the information of 50% of users’ friends is public, whereas the experimental data used in this article come entirely from real SIoT. In the literature,8 a modified Naive Bayes method is used to infer other user information, such as political standpoint, by analyzing users’ personal information and their relationships with friends. The author also compared the results obtained using personal information only, friend relationships only, and both combined. Dey et al.9 inferred users’ ages and similar attributes by applying different methods according to the level of information the users disclose.
These methods are: (1) using the school time disclosed by the users themselves; (2) using the ages of the friends in a user’s disclosed friend list; and (3) for users who disclose neither their school time nor their friend list, retrieving their friends and then applying step 2. The authors derive an iterative algorithm that uses not only the information of a user’s friends but also that of friends of friends and third-layer friendships to infer the user’s age. In Dey et al.,9 within the community classification structure, information can be published on third-party platforms through sensitive link relationships, and private information in the community can be found. In other applications,10 users’ private medical data are protected by dividing them into multiple layers with encryption controls. Community applications11–14 use community classification to carry out related research work.
Although these works use the public information and friends disclosed by users to infer users’ private information, they all treat a user’s friend network as a whole. Unlike them, this paper infers users’ information from the information disclosed at critical nodes. A key node in the SIoT is the central point of contact among the members of a community and carries the largest amount of information. If the information of a community’s key node is disclosed, the information of other nodes can be inferred from it. If the key node’s information is not disclosed, it is first inferred from the secondary key nodes or from the nodes connected to the key node, and the inferred key node information is then used to infer the private information of other non-public nodes. The schematic diagram of the SIoT-based structural model proposed in this article is shown in Figure 1.

Structural model diagram of SIoT.
The main work is as follows: (1) defining critical nodes and proving convergence; (2) defining the community division; (3) inferring the private information of other users from the critical node private information already disclosed; and (4) inferring the private information of critical nodes from other nodes. All four tasks proposed in this article are described in terms of inferring private location information. The first step is to determine the key nodes, which requires defining the relevant nodes and proving convergence. The second step is community division, after which key nodes are identified within each community; the purpose of community partitioning is to reduce the amount of computation and make key node identification more efficient. The third step uses the identified key nodes to infer non-critical nodes. The fourth step infers the key nodes from non-critical nodes, where a key node is determined from a relatively large number of non-critical nodes.
SIoT and related proof
SIoT structure
With reference to the three-layer architecture model of the Internet of Things, the application layer of the Internet of Things is expanded and a hierarchical architecture model of the Social Internet of Things is proposed, as shown in Figure 2. The perception layer solves the problem of acquiring data (including video, identifiers, various physical quantities, audio, etc.) from the physical world and the human world. It is deployed in the environment: wireless sensors and other sensing devices use Bluetooth, infrared, industrial fieldbus, and other transmission methods to upload the collected sensing data to the network layer. The layer includes the human machine interface, the object interface, and the service application programming interface (API), through which people, machines, and services sense each other and collect data in a timely manner.

SIoT structural model.
The network layer is mainly responsible for information transmission over the physical communication network, and includes object analysis, owner control, service discovery, integrity management, ID, service combination, and the like.
The base layer implements storage management of data and related descriptors, and records the description information and social relationships of nodes as well as the activity information of objects in the real and virtual worlds. It includes metadata, ontology, and semantics, which implement the basic and logical relationships for related services.
The component layer includes functions such as relationship management, service discovery, service composition, and trust management. It also covers the combination of network services such as cellular networks and WLAN.
The application layer supports project-based social behavior to develop interactive applications that target people, projects, and third-party services.
SIoT key node definitions
The ideal core node of an SIoT is a node connected to all other nodes in the network; such a node is the most important core node. For example, the central node of a star network is obviously the most important “core node” of that network. The core node is an important guarantee of the stability of the entire network: if there is a security problem at the core node, the entire network is not secure, and the privacy of the entire network can be inferred by inferring the privacy of the core node. However, the adjacency matrix of an SIoT is sparse: there are few connections between communities and a large amount of information exchange within each community, as shown in Figure 3.

SIoT key node model.
Definition 1
If a node is a critical node of an SIoT, it is also the critical node of a community, and vice versa. The set of critical nodes is represented by
“
PageRank algorithm based on key nodes
Eigenvector centrality and its variants are widely applied. The most famous example is the PageRank algorithm,15 which is the core algorithm of the Google search engine. Initially, each node (web page) is given the same PR value, and the values are then updated iteratively. At each step, the current PR value of each node is distributed equally among all the nodes it points to, and the new PR value of each node is the sum of the PR values it receives. Thereby, the PR of Pi at time t is
Among them,
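The update rule described above can be illustrated with a minimal Python sketch. It is not the article's implementation: the function names, the three-node example graph, the initial value 1/N, and the handling of nodes without outgoing links are assumptions made for illustration, and no damping factor is used because the text does not mention one.

```python
def pagerank_step(out_links, pr):
    """One synchronous update: every node splits its PR equally among its targets."""
    new_pr = {node: 0.0 for node in out_links}
    for node, targets in out_links.items():
        if not targets:
            # Assumption: a node without outgoing links keeps its own value.
            new_pr[node] += pr[node]
            continue
        share = pr[node] / len(targets)
        for target in targets:
            new_pr[target] += share
    return new_pr


def pagerank(out_links, steps=50):
    """Start from equal PR values and apply the update a fixed number of times."""
    n = len(out_links)
    pr = {node: 1.0 / n for node in out_links}  # same initial PR for every node
    for _ in range(steps):
        pr = pagerank_step(out_links, pr)
    return pr


# Hypothetical directed graph: a -> b, a -> c, b -> c, c -> a
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

In this small example the values settle near 0.4 for nodes a and c and 0.2 for node b, and the total PR stays equal to 1 at every step.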
PageRank algorithm convergence proof
Given enough iterations, the PageRank algorithm is likely to reach stable PR values. However, the number of iterations needed depends on the precision required by PageRank, and
Theoretically
Definition 2
If the values in every column of a matrix sum to 1, the matrix is called a convergence matrix
where
where

The
when
so
In this article, it is determined whether the PageRank algorithm can be implemented by proving that
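As a hedged illustration of the convergence question discussed in this section, the sketch below checks the column-sum condition of Definition 2 and counts how many iterations are needed before two successive PR vectors differ by less than a chosen precision. The matrix `M`, the tolerance `eps`, the iteration cap, and the function names are assumptions for illustration and do not reproduce the article's proof.

```python
def is_column_stochastic(matrix, tol=1e-12):
    """True if all entries are non-negative and every column sums to 1."""
    n = len(matrix)
    non_negative = all(x >= 0 for row in matrix for x in row)
    sums_ok = all(abs(sum(matrix[r][c] for r in range(n)) - 1.0) < tol
                  for c in range(n))
    return non_negative and sums_ok


def iterations_to_converge(matrix, eps=1e-8, max_iter=10_000):
    """Iterate x <- M x from the uniform vector until the L1 change is below eps."""
    n = len(matrix)
    x = [1.0 / n] * n
    for k in range(1, max_iter + 1):
        y = [sum(matrix[r][c] * x[c] for c in range(n)) for r in range(n)]
        if sum(abs(a - b) for a, b in zip(x, y)) < eps:
            return k, y
        x = y
    return None, x  # did not reach the requested precision


# Hypothetical transition matrix: column j holds the shares node j gives away.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]

steps, vector = iterations_to_converge(M)
```

Column sums equal to 1 do not by themselves guarantee that this iteration settles for every graph (a strictly periodic graph keeps oscillating), which is why the sketch caps the number of iterations and reports `None` when the precision is not reached.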
SIoT community-based division of user privacy leak
SIoT community structure definition
A community structure in a network refers to groups of nodes in which the connections between nodes within a group are relatively dense, while the connections between groups are relatively sparse. It is defined as follows:16
The graph
If the connections of node
Suppose that the attribute eigenvector of the community
The parameter “modularity” is a measure proposed by Newman specifically for community structure. It is the difference between the community structure in the network and the community structure in a random network, and it is expressed17 as follows
Where, Ki and Kj are the degrees of node
The
Many ideas and algorithms for community discovery have emerged over years of development. The main representative algorithms are the spectral clustering algorithm,18,19 the hierarchical clustering algorithm,20 the label propagation algorithm,21,22 and the modularity optimization algorithm.23,24 This article uses the modularity-based community discovery algorithm proposed by Blondel et al.25 The algorithm first treats each node in the network as a community, and then repeatedly moves a node to the community that maximizes modularity until modularity no longer increases or only one node remains. According to the literature,26,27 SIoT have the same characteristics.
Users who share information are more likely to become friends and to have strong connections, so they form communities. Conversely, in SIoT, users can be divided into different communities according to how closely they are connected. Users in the same community have common information, such as educational experience, hobbies, and so on.
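The modularity measure and the node-moving procedure described above can be sketched as follows. This is a simplified illustration, not the article's code: it uses the standard Newman modularity for an undirected, unweighted graph, implements only the local moving phase of the algorithm of Blondel et al. (the later community aggregation phase is omitted), and recomputes modularity from scratch on every trial move for clarity rather than speed. The function names and the example graph are assumptions.

```python
def modularity(edges, community):
    """Newman modularity: sum over communities of L_c/m - (d_c/(2m))^2."""
    m = len(edges)
    intra = {}       # number of edges inside each community (L_c)
    total_deg = {}   # sum of node degrees in each community (d_c)
    for u, v in edges:
        total_deg[community[u]] = total_deg.get(community[u], 0) + 1
        total_deg[community[v]] = total_deg.get(community[v], 0) + 1
        if community[u] == community[v]:
            intra[community[u]] = intra.get(community[u], 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in total_deg.items())


def local_moving(edges):
    """Start with one community per node; move nodes while modularity increases."""
    nodes = sorted({n for edge in edges for n in edge})
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    community = {n: n for n in nodes}
    improved = True
    while improved:
        improved = False
        for n in nodes:
            best_c, best_q = community[n], modularity(edges, community)
            for c in sorted({community[x] for x in neighbors[n]}):
                if c == community[n]:
                    continue
                trial = dict(community)
                trial[n] = c
                q = modularity(edges, trial)
                if q > best_q + 1e-12:  # accept only strict improvements
                    best_c, best_q = c, q
            if best_c != community[n]:
                community[n] = best_c
                improved = True
    return community


# Hypothetical graph: two triangles joined by the single edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = local_moving(edges)
q_value = modularity(edges, community)
```

On this example the procedure groups nodes 0, 1, 2 into one community and nodes 3, 4, 5 into another, which matches the intuition of dense connections inside communities and sparse connections between them.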
User privacy information speculations
In SIoT, a large number of users disclose personal information.28 The frequency with which each disclosed item of personal information appears within a community can therefore be counted to infer the information shared by the users of that community, and hence the private information that other users have not disclosed.
For user
where
where
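The frequency-based idea of this section can be sketched in a few lines of Python. The sketch is an assumption-laden illustration rather than the article's formula: the function name, the `min_share` threshold, and the example users and values are hypothetical.

```python
from collections import Counter


def infer_hidden_attribute(community_members, disclosed, min_share=0.5):
    """Guess the hidden value of an attribute from its frequency in a community.

    disclosed maps a user to the publicly visible value of one attribute.
    min_share is a hypothetical threshold: the most frequent value is used
    only if at least this share of the disclosing users has it.
    """
    values = [disclosed[u] for u in community_members if u in disclosed]
    if not values:
        return {}
    value, count = Counter(values).most_common(1)[0]
    if count / len(values) < min_share:
        return {}
    # Assign the dominant value to every member who hides the attribute.
    return {u: value for u in community_members if u not in disclosed}


# Hypothetical community in which u4 hides the location attribute
members = ["u1", "u2", "u3", "u4"]
public_location = {"u1": "Beijing", "u2": "Beijing", "u3": "Shanghai"}
guesses = infer_hidden_attribute(members, public_location)
```

Here two of the three disclosing users share the same value, so that value is assigned to the user who hides it; with a stricter `min_share` the sketch would return no guess.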

Key node information speculation.

Key nodes and other nodes hidden.

Speculation of other hidden nodes.
If

Key nodes used to speculate other nodes.

Public key nodes and node speculation.
In this paper, if the
The algorithm in this paper is as follows
Through the key node, the private information of its connected nodes is inferred, and then other key nodes are inferred; alternatively, all key nodes are inferred first, and then the other nodes;
If
Through the connecting nodes, the private information of the key node is inferred, and then the other connecting nodes are inferred;
If
Through the key nodes, the private information of the connecting nodes is inferred, and then the other critical nodes are inferred.
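The two directions of inference listed above can be sketched as follows. Because the conditions of the algorithm are not reproduced here, the sketch makes its own assumptions: a hidden key node is guessed by majority vote among its public neighbors, and a known or guessed key node value is copied to its hidden neighbors. All names and the example data are hypothetical.

```python
from collections import Counter


def infer_from_key_node(key_node, neighbors, known):
    """If the key node's value is available, assign it to its hidden neighbors."""
    if key_node not in known:
        return {}
    return {n: known[key_node] for n in neighbors[key_node] if n not in known}


def infer_key_node(key_node, neighbors, known):
    """Guess a hidden key node by majority vote among its public neighbors."""
    votes = Counter(known[n] for n in neighbors[key_node] if n in known)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def speculate(key_node, neighbors, known):
    """Infer the key node first if it is hidden, then its hidden neighbors."""
    guesses = {}
    if key_node not in known:
        guess = infer_key_node(key_node, neighbors, known)
        if guess is None:
            return guesses
        guesses[key_node] = guess
    combined = {**known, **guesses}
    guesses.update(infer_from_key_node(key_node, neighbors, combined))
    return guesses


# Hypothetical community: key node "k" connected to a, b, c
links = {"k": ["a", "b", "c"]}
hidden_key = speculate("k", links, {"a": "X", "b": "X"})
public_key = speculate("k", links, {"k": "Y"})
```

In the first call the key node is hidden and is guessed from its two public neighbors before the guess is passed on to node c; in the second call the key node is public and its value is passed to all three hidden neighbors.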
Experimental analyses
We select the popular online SIoT platforms Twitter and Sina Weibo as two of the network topologies for this article; the related data all come from Internet resources. In addition, two simulation networks are generated with the relevant algorithms: an ER random network and an NW small-world network. We assume that the edges of each network are undirected and unweighted. The relevant topological characteristics of each network are shown in Table 1.
Basic structure.
ER: Erdos-Renyi; NW: Newman and Watts; Com: community; Coe: coefficient.
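For readers who wish to reproduce the two simulated topologies, the following sketch generates an ER random network and an NW small-world network as undirected, unweighted edge sets using only the Python standard library. The parameter values used by the article are not stated, so `n`, `p`, `k`, and `seed` below are hypothetical, and the generators follow the textbook definitions of the two models rather than the article's own code.

```python
import random


def er_random_graph(n, p, seed=None):
    """Erdos-Renyi G(n, p): each of the n(n-1)/2 possible edges exists with probability p."""
    rng = random.Random(seed)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p}


def nw_small_world(n, k, p, seed=None):
    """Newman-Watts model: ring lattice plus random shortcuts, no edge removed.

    Each node is linked to its k nearest neighbors on each side (assumes n > 2k).
    For every lattice edge, one shortcut between two random distinct nodes is
    added with probability p; duplicates of existing edges are absorbed by the set.
    """
    rng = random.Random(seed)
    edges = set()
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            edges.add((min(i, j), max(i, j)))
    for _ in range(len(edges)):
        if rng.random() < p:
            u, v = rng.sample(range(n), 2)
            edges.add((min(u, v), max(u, v)))
    return edges


# Hypothetical sizes chosen only for illustration
er_edges = er_random_graph(n=50, p=0.1, seed=1)
nw_edges = nw_small_world(n=50, k=2, p=0.1, seed=1)
```

Both generators store each edge once as a pair `(u, v)` with `u < v`, which matches the assumption of undirected and unweighted edges.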
Social structure analysis
In this article, 5%, 10%, 15%, and 20% of the attribute values are randomly selected as anonymous attribute sets on the above four data sets. Depending on the data publisher’s anonymization requirements, the anonymous attribute set can include sensitive attribute values, irrelevant attribute values, and other highly identifying attribute values. The experiments examine the changes in social structure and attribute distribution of the data sets before and after anonymity, and the experimental analysis shows that the algorithm performs well. In addition, the experiments examine how the error rate of attribute speculation changes with the node’s social degree. Comparing the data before and after anonymity shows that, after anonymity, the ability of an attacker to determine the target attribute through the social structure is greatly reduced, so the user’s private attributes can be protected.
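The random selection of anonymous attribute sets described above can be sketched as follows. The function name, the synthetic attribute names, and the use of a seed are assumptions for illustration; how the article samples attribute values in practice is not specified.

```python
import random


def select_anonymous_attributes(attribute_values, ratio, seed=None):
    """Randomly pick the given share of attribute values to be anonymized."""
    rng = random.Random(seed)
    values = sorted(attribute_values)      # fixed order for reproducibility
    count = round(len(values) * ratio)     # number of values to anonymize
    return set(rng.sample(values, count))


# Hypothetical data set with 100 distinct attribute values
all_values = [f"attr{i}" for i in range(100)]
anonymous_sets = {ratio: select_anonymous_attributes(all_values, ratio, seed=0)
                  for ratio in (0.05, 0.10, 0.15, 0.20)}
```

Each entry of `anonymous_sets` then holds the attribute values that would be anonymized at the corresponding proportion.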
The ER random network, NW small-world network, Twitter, and Sina Weibo data sets were analyzed for the degree of change in the SIoT structure of users before and after anonymity. Figures 10–13 show the aggregation coefficient and Q-modularity of the users’ social structure before and after anonymity (note: the integer part of each X-axis coordinate indicates the proportion of the anonymous attribute set, namely 5%, 10%, 15%, or 20%; the fractional part indicates the attribute correlation threshold, namely 0.3, 0.5, or 0.7; and the original item is the initial state of the data set).

Analysis of SIoT structure before and after ER network anonymity.

Analysis of SIoT structure before and after NW Network anonymity.

Analysis of SIoT structure before and after Twitter anonymity.

Analysis of SIoT structure before and after Sina anonymity.
The figures above analyze how the proportion of the anonymous attribute set and the choice of attribute correlation threshold relate to the availability of the anonymized results. Figures 10–13 show the structural analysis of the ER random network, NW small-world network, Twitter, and Sina Weibo data sets, respectively. The four figures show that as the proportion of the anonymous attribute set selected by the user grows, the availability of the data gradually decreases, whereas for the same proportion, choosing different threshold parameters has no significant effect on the anonymized result. This indicates that attribute correlation is pronounced locally, so that choosing different thresholds perturbs the segmentation of social structure and attributes only slightly.
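One common way to compute an aggregation (clustering) coefficient such as the one compared in Figures 10–13 is the average local clustering coefficient, sketched below. Whether the article uses this variant or another (for example, global transitivity) is not stated, so the choice is an assumption, as are the function name and the example graph.

```python
def average_clustering(edges):
    """Average local clustering coefficient of an undirected, unweighted graph.

    For a node with k >= 2 neighbors, the local coefficient is the number of
    edges among its neighbors divided by k(k-1)/2; nodes with fewer than two
    neighbors contribute 0. Only nodes that appear in an edge are counted.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    total = 0.0
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs
                    if a < b and b in neighbors[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neighbors)


# Hypothetical graph: a triangle (0, 1, 2) with a pendant node 3 attached to 2
triangle_with_tail = [(0, 1), (1, 2), (0, 2), (2, 3)]
coefficient = average_clustering(triangle_with_tail)
```

Computing this value on the network before and after anonymization gives one number per setting that can be compared in the way the figures do.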
Figures 14–17 analyze the social degree distribution of the data sets before and after anonymity.

Analysis of SIoT degrees before and after ER network anonymity.

Analysis of SIoT degrees before and after NW Network anonymity.

Analysis of SIoT degrees before and after Twitter anonymity.

Analysis of SIoT degrees before and after Sina anonymity.
The algorithm maintains the characteristics of the degree distribution well. Figures 14–17 show the ER random network, NW small-world network, Twitter, and Sina Weibo data sets, respectively. For the first two data sets, the trend of the node social degree distribution is almost unchanged before and after anonymity. For the latter two data sets, the degrees of the leading nodes show an upward trend, and the trend of the social degree distribution is also consistent. However, splitting a node may cause some of the social connections of the newly generated nodes to be lost, and when many nodes are split in a local area, the social connections of nodes with more attributes increase. Therefore, the anonymized result set contains more nodes with lower degrees and more nodes with higher degrees than the original data sets. Nevertheless, the figures show that the anonymized results still faithfully reflect the node distribution trends of the data sets.
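The comparison of degree distributions before and after anonymity can be made concrete with a short sketch. The total variation distance used here to compare two distributions is an assumption chosen for illustration, not a measure taken from the article, and the two example graphs are hypothetical.

```python
from collections import Counter


def degree_distribution(edges):
    """Map each degree to the fraction of nodes (appearing in an edge) that have it."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {k: c / n for k, c in sorted(counts.items())}


def distribution_distance(p, q):
    """Total variation distance between two degree distributions (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical "before" and "after" graphs: a star and a path on four nodes
before = degree_distribution([(0, 1), (0, 2), (0, 3)])
after = degree_distribution([(0, 1), (1, 2), (2, 3)])
change = distribution_distance(before, after)
```

A value of `change` close to 0 would indicate that anonymization left the degree distribution almost unchanged, which is the behavior reported for the ER and NW data sets.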
Community analysis in space SIoT
The method proposed in this article infers private information on the basis of the core nodes of communities. The change in the number of communities reflects the change of the whole SIoT before and after anonymity (note: the integer part of each X-axis coordinate represents the proportion of the anonymous attribute set, namely 5%, 10%, 15%, or 20%; the fractional part represents the attribute correlation threshold, namely 0.3, 0.5, or 0.7; and the original item is the initial state of the data set). The number of communities changes under the different anonymization settings, as shown in Figure 18.

Number of communities before and after anonymity in the four spatial SIoT networks.
As can be seen from Figure 18, the number of communities in all four networks decreases after anonymity. The ER random network and NW small-world network data sets have relatively few nodes and communities, and their number of communities decreases as the proportion of anonymous attributes increases. The Twitter and Sina Weibo data sets have relatively many nodes and communities, and the corresponding change is more pronounced. The number of communities also decreases as the attribute correlation threshold increases after anonymity.
Speculative comparison of key node privacy information
To infer network privacy attributes, this article selects the core nodes in the four networks, anonymizes 5%, 10%, 15%, and 20% of the attributes, and takes the average value for comparative analysis. Key nodes and non-critical nodes are used to infer user privacy on the four networks; the resulting ratios of privacy nodes discovered are shown in Figures 19–22.

Relations between ER stochastic network and privacy node discovery rate.

Relations between NW small world network and privacy node discovery rate.

Relations between Twitter and privacy node discovery rate.

Relations between Weibo and privacy node discovery rate.
Comparing the four networks shows that the number of privacy nodes increases with the size of the network. When the network is small, the time needed to discover privacy nodes is relatively short; when the network is relatively large, the discovery time is long. As the network size and the community structure change, the number of privacy nodes discovered in the same amount of time differs.
Conclusion
In this article, we propose a method for inferring private user information based on SIoT key nodes, which infers the information of other privacy nodes through the key nodes of SIoT communities. The method relies on the public information of the community and the number of key nodes, and is relatively simple. The latter part of the work identifies the role of key nodes in information dissemination by learning from the complete social information. The similarity rules between the information of non-key nodes and key nodes are derived to obtain an inference rule data set, from which the method for inferring other non-key nodes from key nodes, or key nodes from non-key nodes, is deduced. Finally, four network models are used to analyze the inference of key nodes and non-key nodes, and inferring private nodes through key nodes shows obvious advantages. Future work will focus on the data characteristics of different data sets and on analyzing unknown information through known information. For social data sets with the same characteristics and structure, this inference method makes it easier to analyze unknown private information.
