Abstract
Introduction
With the remarkable development of modern wireless technologies, researchers and automobile companies have applied vehicular ad hoc networks (VANETs) for inter-vehicle communications (IVC). The nodes in VANETs stand for the vehicles and the communication is done through wireless connections or with infrastructure. 1 For example, vehicles with the purposes of comfort, safety, and entertainment for the drivers and passengers will communicate with each other, using the communication devices installed in the vehicles by the automobile companies or with the access point on the road (i.e. roadside unit (RSU)). Furthermore, numerous attracting applications of vehicular networks are designed and envisioned,2–5 such as safety warning, intelligent navigation, and mobile infotainment. Data dissemination is a particular promising type of applications which is related to both safety and commercial services. The main goal of data dissemination is to perform efficient data transmission from a source (e.g. access points (APs)) to other vehicles located in the area of interest (AoI). The specific applications of this type include multimedia advertisements, weather report, an update of the global positioning system (GPS) map about a city or a scenic area, and sometimes emergency information and eventual accident. Whereas, data dissemination represents a major challenge in VANETs due to the high node mobility and the lossy wireless medium under vehicular environments. The high node mobility can cause frequent changes in the network topology and constant fragmentation, 6 while the lossy wireless links cause often packet losses and collisions, leading to prolonged downloading delay and decreased efficiency. 7
Toward solving these problems, relay-assisted retransmission strategies have been introduced to data dissemination in VANETs. The relay selection is generally divided into two kinds, namely, centralized and distributed relay selection. For the centralized relay selection, relays are selected by a central server (CS) according to various selection criteria. Since the CS needs to collect all the channel information and process the selection criteria, a high complexity burden is posed on it, rendering the centralized method additional overhead.8,9 Whereas, distributed relay selection does not require unified control center, and without global information gathering system, each node only needs to collect local network information, thus could elect relay by simple calculation.10,11 Therefore, a distributed selection scheme can alleviate the burden of a control center and improve the system efficiency greatly, as well as reliability. In VANETs, due to the dynamic network topology characteristic, and that each vehicle has the capability of processing information and completing tasks independently, distributed relay selection is preferred.
In this article, we propose the non-collision relay selection with network coding (NCRS-NC), a distributed relay selection strategy with network coding multicast for data dissemination in VANETs. Relay vehicles are selected dispersedly to enhance the reliability of wireless multicast in VANETs and to ensure system scalability. The data dissemination consists of two stages, the relay selection stage and the relay retransmission stage. The novelty of this work lies in the following sides. First, in contrast with existing relay selection algorithms, the proposed NCRS-NC scheme jointly considers the vehicles’ geographical locations and moving velocities, the channels’ conditions and the packets’ reception statuses of vehicles, and combines the multiple parameters into the design of relay selection process. With elaborate study, the NCRS-NC relay selection method can help each unsatisfied vehicles find a suitable vehicle which can relay packets efficiently. Second, in view of the fact that the selected relay vehicles will apply instantly decodable network coding (IDNC) technique 12 for the retransmission stage, the packets’ usability is studied and exploited thoroughly through the relay selection stage, to guarantee the maximization of network coding gain. The packets’ usability is described from two aspects, one is the availability of packets which are lost by the unsatisfied vehicles, and the other is measured by the number of each relay candidate’s targeted vehicles. High packets’ usability indicates that the corresponding vehicle has received a large amount of packets that are lost by other vehicles, and the packets' combination which is constructed on this vehicle's received packets set is thus able to target more vehicles. Therefore, vehicles with high-packet usability are preferred in the relay selection process, under the same channel conditions. Third, different from conventional approaches based on the random-channel-access protocol, the NCRS-NC advocates a relay establishment process for acknowledging the relay relationship among the selected relay vehicle and its serving vehicles. Using an explicit acknowledgement message, the selected vehicle can be informed of the selection result and its serving vehicles, and thus performs the relay retransmission stage without collisions. In this way, the overhead of the proposed approach can be controlled and the waiting time before the selected relay that starts to multicast is minimized accordingly. Finally, the proposed NCRS-NC utilizes the IDNC technique to generate the optimal coded packet for each multicast during the relay retransmission stage. In contrast to classical randomized linear network coding (RLNC), the network coding related decoding delay of NCRS-NC is reduced to zero and the average dissemination delay is minimized.
The main contribution of this work can be summarized as follows:
We design a data dissemination strategy for multicasting packets efficiently in the vehicle-to-vehicle (V2V) communications in VANETs. The V2V communication consists of two phases: (a) relay selection phase, where unsatisfied vehicles select relay vehicles among their neighboring vehicles based on multiple measurements and (b) relay retransmission phase, where each selected relay vehicle retransmit lost packets for unsatisfied vehicles with the assistance of IDNC technique.
We propose a non-collision relay selection algorithm in the relay selection phase, which is distributed and based on multiple parameters such as packets’ reception statuses, vehicles’ positions and speeds, and packet erasure probabilities. Due to specific consideration for the maximization of the serving capability of relay vehicles, the algorithm is guaranteed to find at least one group of optimal relays at each time, with affordable overhead, to retransmit packets for unsatisfied vehicles.
We apply the IDNC coding technique for relay multicast in the relay retransmission phase. Using the IDNC technique, the proposed method is able to reduce the average downloading delay for data retransmission, and lower the encoding/decoding delay in the mean time. We present how to migrate the IDNC method into relay multicast in VANETs, so that the selected relays can help unsatisfied vehicles more effectively.
We implement the proposed strategy and evaluate its performance by extensive simulations. We compare our approach with the representative relay content distribution scheme for VANETs, namely, CodeOn, which is a push-based, network-coding-based data dissemination protocol and represents the current state-of-the-art. We also evaluate the compared algorithms’ performance with and without network coding. Significant improvement in average downloading rate and in average dissemination delay is obtained for highway scenarios with different parameter settings.
The rest of this article is organized as follows: Section “Related works” outlines the related works on data dissemination in VANETs. Section “System model” describes the system model. The proposed relay-assisted data dissemination strategy is provided in detail in section “NCRS-NC.” Section “Performance evaluation” provides the simulation results and discussions. Finally, section “Conclusion” concludes this article.
Related works
Relay selection for data dissemination in VANETs
Nandan et al. 13 first introduced a swarming protocol, known as “SPAWN,” for VANETs. Using SPAWN, a vehicle requests data from an RSU if it is within the RSU’s range, and it requests data from neighboring vehicles on the contrary. A peer-to-peer protocol was used when the vehicle requests from neighboring nodes. After that, SPAWN was extended to the method AdTorrent. 14 With AdTorrent, RSU continuously multicasts the common information to the vehicles within its coverage. When vehicles move out of the coverage, the provider delivers top-ranked content to the end user using swarming. However, this transmission control protocol can be highly overhead yet poorly performed due to the highly mobile lossy wireless links in VANETs. Various protocols are proposed to improve the transmission reliability by repeatedly multicasting messages or selecting the farthest node to relay messages.15–17 However, repeated multicast cannot completely guarantee the transmission reliability but degrades the resource utilization.
The relay selection approaches for improving the content dissemination fall into two categorizes. One is the centralized methods and the other is the decentralized approaches. For the centralized methods, it is of great importance to have a control center which is responsible of collecting channel information and processing the relay selection. In VANETs, the control center is represented by the AP/RSU/source and is often inclined to become the system bottleneck. In view of this, the distributed selection schemes have been studied and proposed. Ma et al., 18 proposed a scheme to enhance the multicast reliability, including the preemptive priority in safety services. The suggested protocol selects the backbone vehicles as relay nodes, considering vehicle movement dynamics and link quality between vehicles. Yoo et al. 11 designed an opportunistic relay selection scheme which chooses the candidate relay with high chance to access channel and forward packets for the source vehicle. However, the scheme performance suffers from collisions for that when the network gets densified, the transmission collisions happen with high probability. Given the random-channel-access protocol, the retransmissions which are induced by the collisions further intensifies the collisions. Bi et al. 10 proposed a novel composite relaying metric for relaying node selection by jointly considering the geographical locations, physical layer channel conditions, and moving velocities of vehicles. A distributed relay selection scheme based on these composite metrics was designed to guarantee that a unique relay is selected to reliably forward the emergency message in the desired propagation direction. However, the proposed scheme failed to consider the receiving statuses of vehicles, which is a great impact factor for the efficiency of relay selection and data dissemination.
Network coding for data dissemination in VANETs
Network coding (NC) was first proposed in 2000. 19 It has been proven that a source node can send packets at the theoretical upper bound of transmission rate by employing network coding.20,21 Therefore, NC has great potential to substantially improve transmission efficiency, throughput, and delay over multicast erasure channels. The traditional RLNC 22 allows nodes to combine different data packets together and achieves the maximal network throughput. However, the optimal throughput is often obtained at the price of large decoding delay. Since each encoded packet requires to be decoded before the delivery to higher layers,23,24 the successful decoding of each encoded packet asks for receiving a certain number of encoded packets. In contrast with RLNC, IDNC12,25,26 is proposed to reduce the decoding delay by carefully choosing the set of data packets to be coded together in each transmission. By doing so, IDNC guarantees a subset of (or if possible, all) receivers be able to instantly decode one of their wanted packets upon successful reception of the coded packet.27–29 Moreover, IDNC encoding can be implemented using binary XOR, which eliminates complicated operations over large Galois fields as required by RLNC.
To improve the efficiency of packet delivery to different mobile receivers, network coding has been widely used in VANETs.7,30–34 Wang and Yin 30 studied the network-coding-based retransmission scheme on RSU and proposed an IDNC multicast retransmission algorithm that considers multiple vehicular communication characteristics. Simulation results that approved the algorithm are efficient in improving the data dissemination delay and download rate in VANETs. However, the overall system performance can be degraded since all works need to be done by the RSUs, which is the potential system bottleneck. In Wu et al., 31 the random network coding technique is incorporated with the cooperative multicast to address the bottleneck problem. A moving window network coding technique is used to upper bound the average decoding delay of a receiver. The decoding complexity scales as the window size increases. Liu et al. 32 analyzed the V2V data dissemination with network coding in two-way road networks, where the vehicles move in opposite directions. The problem of multi-hop data dissemination in VANETs was considered in Wu et al., 33 where the authors proposed a lightweight protocol which employs dynamically generated backbone vehicles to multicast packets so as to reduce the MAC-layer contention time at each node while maintaining a high packet dissemination ratio. Khlass et al. 34 proposed using cooperative relaying of vehicle for the poor channel quality between the nodes and the RSU to improve the connectivity of the network. For the bidirectional communication between the vehicle and RSU, it further uses analog network coding in relay node to improve the communication efficiency. Li et al. designed a method called CodeOn 7 which is based on RLNC. In CodeOn, each node’s waiting time before data relaying depends on the node’s data remulticasting utility for its neighbors, which can be calculated after each node exchanges the decoding status with their one-hop neighbors. The protocol works well for lossy wireless problems in VANETs. However, frequent collisions may arise since CodeOn assigns each vehicle a back-off duration that is inversely proportional to its utility value, and thus degrade the system performance. Besides, the RLNC used in CodeOn requires complex encoding techniques and results extra decoding delay.
In this article, we propose a collision-free, distributed relay selection algorithm to coordinate the transmissions of vehicles. The proposed algorithm first selects a number of relay nodes among all vehicles based on the vehicular nodes’ speeds, locations, and multicast contents. After the relay selection, the IDNC scheme is applied at each relay node while constructing the NC packets for retransmission. Unlike the RLNC which requires sufficient coded packets to realize the full-rank decoding condition, the IDNC enables the instantly decodable property by including one wanted packet in the coding combination at most, thus minimizing the decoding delay and significantly improving the downloading rate and progress performance.
System model
We consider the general system model of data dissemination in VANETs that have been considered in many related works.7,9 The APs (or RSUs) are the data sources, and vehicles inside the AoI are the end receivers. Each RSU is connected with the core Internet through the wired backhaul and is deployed in a straight road. Specifically, we consider the highway scenario where vehicles pass through the front of the RSU in lanes of the road at various speeds. Two types of data dissemination involved in the communication process: (1) vehicle-to-infrastructure (V2I) communication, where the roadside infrastructure (APs or RSUs) acts as information source that multicasts the data to the vehicles within its coverage and (2) vehicle-to-vehicle (V2V) communication, where the selected relay vehicle shares data with other vehicular nodes. APs can be placed either deterministically or randomly. The service architecture is illustrated in Figure 1.

System model.
We make the similar assumptions as those in Li et al. 7 and Shen et al. 9 To model the coexistence of safety and commercial applications, we consider two representative channels. One is the control channel and the other is the service channel. Safety messages which may contain information of vehicles’ locations, speeds, and so on are multicasted through the control channel, and common messages are transmitted through the service channel. The interval between two consecutive safety messages should be smaller than 100 ms to guarantee the success transmissions. 35 Time is divided into periodical 100 ms/slot, and the transmission of each packet consumes one time slot. Besides, synchronization is required among all vehicles and RSUs to switch simultaneously between the control channel and the service channel. The utilization of time and channels is depicted in Figure 2.

Time and channel utilization of each vehicle and each AP.
In the control channel, each AP and each vehicle multicast one beacon message in each slot.7,36 A vehicle will merely listen to an AP’s content multicast in the service channel if it is in the range of the AP. Otherwise, a vehicle may share its received content cooperatively with its neighboring vehicles. Besides, the content distribution process involves only vehicles within the AoI.
Assume that vehicles can obtain their real-time locations and synchronize their clocks through the equipped GPS devices when they are within satellite coverage, and use auxiliary techniques to determine their locations when they are temporarily out of satellite coverage. 7 Due to the highly dynamic topology of VANETs and the lossy nature of the vehicular wireless channels, some vehicles may not receive the complete information when APs/sources disseminated the messages.
Further assume the number of packets for transmission is
The relay vehicles are selected based on multiple parameters that are contained in the exchanged messages. Once the relay nodes are determined, they apply the IDNC coding technology on the set of retransmission packets and multicast the coded packet to all the vehicles which have non-empty Wants sets. The coding and retransmission procedure repeats until all vehicles receive their Wants sets successfully.
NCRS-NC
After the
Construction of relay candidates set
For any vehicle
In order to construct the relay candidates set,
where
The vehicle
Utility calculation
In the construction of relay candidates set, a vehicle can be selected by multiple nodes as their relay candidate, and receives multiple
where
Given
After the utility calculation, the relay candidate
Relay establishment
When the requesting vehicle
Applying IDNC coding method
At each retransmission, the selected relay generates a coded retransmission packet using the IDNC coding method. To achieve the appealing advantage of NC in network throughput without introducing too much decoding delay as the RLNC does, the minimum decoding delay problem for NC is formulated as a maximum weight clique problem over a well-structured graph in the IDNC method.
To minimize the decoding delay in IDNC, all possible packet combinations that are instantly decodable by any subset or all requesting vehicles should be explored. Generally speaking, if two packets in the Wants sets of two receivers are combinable in IDNC, the resulting coded packet can be decoded instantly by these two receivers. This can only happen when these two receivers are requesting the same packet or the packet wanted by each receiver is in the Has set of the other receiver. The combinations between all packets in the Wants sets of all vehicles can be represented as a graph model, denoted as the IDNC graph.28,37
The IDNC graph
Here, we give an example to explain the construction of IDNC graph
Packets’ reception status before the relay retransmission phase.
Applying the vertex generating and edge drawing rules of the IDNC graph construction, we have the IDNC graph of the given example as shown in Figure 3. In this graph, the black dot represents the lost packets of requesting vehicles. For example, vertex

IDNC graph
Generally speaking, let
The pseudo code of the proposed NCRS-NC data dissemination strategy is presented as in Algorithm 1. The relay selection stage is executed independently by each vehicle. During the process, vehicle
Based on the relay selection strategy and the IDNC network coding method, the overall procedure of the proposed non-collision relay selection with network-coding-based data dissemination approach is shown in Figure 4. The relay selection phase and relay retransmission phase are performed orderly and periodically in the data dissemination procedure. Note that if all vehicles in the AoI have already received the whole packets set

Flow chart of the overall proposed data dissemination approach.
Performance evaluation
We evaluate the performance of the proposed data dissemination strategy in this section. Specifically, we compare the proposed strategy with the classical random-access method, CodeOn-Basic,
7
which is based on the RLNC, and the purely multicasting methods with relay. With the CodeOn-Basic method, the number of right-decoded packets
Simulation settings
We consider the highway scenarios and simulate the proposed algorithm, as well as the compared algorithms in Java Development Kit. The mobility model we use is similar to the Freeway Mobility Model (FMM) proposed in Mahajan et al., 38 which is well accepted for modeling the traffic in highway scenarios. In FMM, the simulation area includes many multiple lane freeways without any intersections. At the beginning of the simulation, the vehicles are placed uniformly at random in the road area and move at different speeds. The vehicles randomly accelerate or decelerate with security distance between two subsequent vehicles in the same lane, and no change of lanes is allowed. In the highway environment, the map has been simplified to the AoI which consists of a unidirectional four-lane highway, each lane with a length of 6 km, as shown in Figure 5. The channel model is considered as the simple yet effective packet erasure channel with erasure probability generated randomly between [0.3, 0.4].

Highway scenario.
To evaluate the impact of topology and traffic density, we simulate sparse and dense traffic for the scenario. The simulation parameters’ setting is as follows:
Vehicles’ speeds are randomly drawn from [20, 30] m/s.
The total number of vehicles in sparse highway scenario is 100, and 300 vehicles in the dense traffic scenario.
The number of packets that the AP/source multicasts varies between the range [20, 50].
In the simulations, we evaluate the delay performance of each strategy from two aspects: (1) average downloading progress, which is the average downloading percentage of the packets from the beginning of data dissemination to the end of it and (2) average dissemination delay, which is the average elapsed time from the start of data downloading to full completion.
Sparse versus dense
We evaluate the average downloading progress in Figure 6 and the average dissemination delay in Figure 7 for both sparse and dense traffic scenarios. There are two key observations. First, the average downloading progress of NCRS-NC is the fastest that NCRS-NC always completes its downloading progress first and achieves the shortest average dissemination delay in all shown cases. Second, NCRS-NC maintains high downloading rates with different numbers of packets and for both sparse and dense traffic scenarios. The downloading rate of NCRS-NC is three times quicker than the silver medal winner. Moreover, the NCRS-NC is less affected by the traffic density than the CodeOn-NC does, since the increment in completion times of the NCRS-NC in sparse and dense networks is much smaller than of the CodeOn-NC (with the same number of packets), meaning that NCRS-NC is more robust to variations in topology and vehicle density.

Average downloading progress in sparse and dense scenarios: (a) sparse highway (20 packets), (b) sparse highway (50 packets), (c) dense highway (20 packets), and (d) dense highway (50 packets).

Average dissemination delay in sparse and dense scenarios: (a) 20 packets and (b) 50 packets.
Number of packets
We study the effect of packets’ number on the average dissemination delay in Figure 8 with (a) 100 vehicles and (b) 300 vehicles. The dissemination delay of NCRS-NC is much lower than those of the three others with different numbers of packets. Furthermore, the curve of NCRS-NC’s dissemination delay ascends little when the number of packets increases, while the compared strategies’ curves raise rapidly, which indicates that the performance gain of NCRS-NC over the compared algorithms grows gradually with the packets’ number.

Average dissemination delay with different number of packets: (a) 100 vehicles and (b) 300 vehicles.
Number of vehicles
Figure 9 depicts the average dissemination delay with different numbers of vehicles. It is shown that the number of vehicles affects the dissemination delay of compared strategies little due to the good scalability among the compared distributed strategies. Meanwhile, it is clear that the NCRS-NC has the shortest dissemination delay among all strategies with

Average dissemination delay with different number of vehicles: (a) 20 packets and (b) 50 packets.
Comparisons of strategies
To sum up, we can see that the proposed NCRS-NC significantly outperforms the CodeOn, CodeOn-NC, and NCRS strategies. In general, both strategies with network coding (CodeOn-NC and NCRS-NC) outperform the non-coding strategies (CodeOn and NCRS), which proves the advantage of network coding.
The advantage of the proposed NCRS-NC over the CodeOn-NC derives from the non-collision transmissions and the IDNC. On one hand, for the CodeOn-NC, the utility of data relayed by each vehicle is almost the same with that of its neighborhood. It directly results in a negative effect that the waiting time before data relaying at each vehicle and its neighborhood can be very close, which leads to the frequent transmission collisions and thus degrades the performance. On the contrary, the proposed relay selection algorithm avoids the collision issue by computing relaying utility based on multiple parameters and explicitly acknowledge the relay relationships among vehicles. On the other hand, the CodeOn-NC requires the receivers to successfully collect
Conclusion
In this article, we propose a novel data dissemination strategy for VANETs, which combines distributed relay selection and network coding multicast. The relay selection algorithm is designed collision-free, thus decreasing the packet loss ratio and reducing the dissemination delay. A network coding scheme based on IDNC technology is applied during the retransmission process to increase the throughput and to improve the downloading delay. Simulation results show that compared with the CodeOn-NC with RLNC and the non-NC transmission strategies, our proposed strategy performs better in terms of the average downloading progress and the dissemination delay. Moreover, benefitting from the IDNC, the performance advantage of the proposed strategy becomes more distinguishable in the dense traffic scenarios, which indicates the proposed strategy’s good practicability.
