Abstract
Keywords
Introduction
The Internet of Things (IoT) is expected to comprise billions of heterogeneous smart devices that have sensor terminals, which may create an unprecedented access and exchange of information. With this information, it is possible to construct many smart systems, such as the automatic prevention of traffic jams and the development of smart grids, 1 or even a smarter planet. 2 Satellite networks (e.g. BeiDou Navigation Satellite System 3 ) with their global coverage and bidirectional communication capabilities are considered candidates to build the backbone of the IoT. 4 In satellite networks, an ingress satellite collects data from sensor terminals and sends these data to the destination terminals through the egress satellites. Compared with Wi-Fi, Bluetooth, and terrestrial cellular networks, satellite networks have distinct advantages in conducting remote facility monitoring and real-time asset management that spans a broad geographical area. However, with the large-scale deployment of sensing devices, the IoT also presents a set of data storage and processing challenges. Cloud computing infrastructure is considered to have the capability of compensating the IoT and accelerate its development and deployment by providing on-demand “unlimited” computing and storage resources in an efficient and cost-saving manner.5,6 By using satellite networks, sensing devices in remote outlying regions that have no stable, terrestrial broadband infrastructure can easily access the cloud, and cloud operators can also easily deliver new configurations and tasks to distributed IoT devices at anywhere and anytime. 7 Additionally, satellite networks can benefit IoT and cloud computing infrastructure with multicast capability for branch locations and offloading nonessential background traffic from the primary network connection.8,9
To effectively transmit IoT data on satellite networks, an efficient routing scheme is critical. However, from a global perspective, the IoT traffic requirements for satellite networks are unbalanced. The satellites that cover developed areas generally have higher traffic loads than the satellites that cover underdeveloped areas, oceans, or mountains. 10 In addition, when traditional shortest path routing algorithms are adopted, irregular constellation structures and varying lengths of inter-satellite links (ISLs) may cause traffic accumulation in specific areas, such as the minimum horizontal ring. 11 Because the bandwidth resources provided by satellites are often limited, the above factors may quickly lead to severe traffic congestion and significant packet drops in certain satellites while leaving other satellites poorly utilized.
Researchers have proposed load balancing routing schemes12–25 to improve this situation. Among these schemes, two strategies are commonly used: global strategy and local strategy. Global strategy typically involves gathering global traffic state information and making routing decision based on the current global view. It alleviates global congestion and guarantees better traffic engineering but suffers from slow reaction times and large communication and storage overheads. Local strategy allows satellites to independently make routing decisions based on local traffic state information. It provides rapid traffic adaptability and introduces less overhead but is easily trapped in local optima. However, to the best of our knowledge, most load balancing routing schemes embrace the extremes of the design spectrum: they are based on either a global strategy12–19 or a local strategy.20–25 As either strategy has certain merits and flaws, none of these schemes offers the ideal performance.
Another significant defect in load balancing routing schemes is that they passively perform load balancing to mitigate observed congestion but neglect to proactively disperse traffic to prevent congestion. The root cause of this behavior may be that these schemes are designed on the assumption that traffic requirements are un-known and stochastic. Thus, they only monitor the current traffic state and continually eliminate building congestion. However, unlike wireless sensor networks (WSN),26,27 Low Earth Orbit (LEO) satellite networks are highly deterministic systems. Their satellites rotate on fixed orbits. Their traffic requirement, which is closely related to geographical distribution of IoT devices, the daily evolution of traffic intensity, and the coverage relationship between satellites and geographical zones show strong periodical and predictable features.28–30 This suggests that the amount of traffic that needs to be routed by each satellite at a particular time can be approximately estimated. Henceforth, load balancing routing schemes should take this into account and take measures in advance to prevent traffic concentration as opposed to reacting to the onset of congestion.
In this article, we propose a hybrid global-local (HGL) load balancing routing scheme that aims to eliminate congestion and optimally transmit IoT traffic through LEO satellite networks. Given the predictive nature of LEO satellite networks, the inter-satellite traffic demand is decomposed into a predictable long-range baseline and unpredictable short-range fluctuations. The long-range baseline, which is mainly derived from deterministic satellite movement and stable changes in IoT traffic density and intensity, represents the majority of ongoing changes in inter-satellite traffic demand. It can be statistically modeled and considered a crude estimation of the actual traffic demand. Short-range fluctuations, which are typically caused by bursts in IoT traffic requirements and network anomalies, represent a minor and stochastic portion of inter-satellite traffic demand. Traffic decomposition makes it possible to separate out major and predictable components of traffic demand and use them to perform preliminary traffic allocation. As a result, the majority of traffic volumes can be proactively dispersed from the entire network, which effectively prevents traffic concentration and congestion. In addition, if congestion still occurs after preliminary traffic allocation, only incremental portions of traffic need to be managed rather than all of it.
Based on this idea, HGL is implemented in two stages. First, it employs a global strategy to accomplish preliminary optimal traffic allocation based on long-range baseline traffic demands of all satellite pairs. Load balancing routing in this stage is mathematically formulated as a multi-commodity flow optimization problem seeking to minimize the total flow cost of the entire network. Next, HGL employs a local strategy to conduct real-time route adjustments based on the unpredictable fluctuations. In this stage, a satellite explicitly informs multiple upstream satellites to reroute a portion of traffic once it finds a burst of traffic resulting in queuing beyond a certain threshold.
HGL not only uses the global view of global strategy to proactively disperse the majority of traffic, which in turn significantly reduces the need for subsequent adjustments, but also allows nodes to use local strategy to promptly adapt to locally changing conditions without incurring a large overhead for global updates. In addition, because the baseline traffic demands are slow to change, HGL relaxes the burden of frequently updating the global strategy. We list the main contributions of our work as follows:
The remainder of this article is structured as follows. In section “Related works,” we summarize the related works and note their limitations. In section “IoT traffic demand of LEO satellite networks,” we introduce the decomposition of traffic demand time series and provide the methodology for baseline modeling. In section “Technical design of HGL,” we highlight the technical design of the proposed HGL and make a detailed analysis for parameter setting. In section “Performance evaluation,” we evaluate and analyze the performance of HGL using simulations. Finally, in section “Conclusion,” we conclude this article with a summary.
Related works
Nowadays, satellite networks are becoming increasingly important for broadband communication, navigation, and remote sensing. Although a lot of matters are important for satellite networks, such as data encryption,33–39 image encryption,40–46 and data coding,47,48 to transmit data, routing should be solved first. While traditional routing algorithms in satellite networks mainly consider propagation delay, load balancing routing algorithms also take queuing delay and congestion status into consideration when selecting paths for packet transmission. HC Yan et al. 12 proposed a LEO satellite network link state routing (SLSR) scheme, which considers using the summation of propagation delay and expected queuing delay as the link cost for load balancing. In SLSR, satellites must first periodically flood the expected queuing delay to gather the global link state. Then, they choose the paths with minimum link cost as the optimal paths for data transmission. Using the same link cost metric, C Chen and Ekici 13 introduced a satellite grouping and routing protocol (SGRP) for Multi-Layered Satellite Networks. In this scheme, LEO satellites are divided into groups that are managed by Medium Earth Orbit (MEO) satellites, and MEO satellites are responsible for collecting global link state information and calculating routing tables. In addition to queuing delay, load balancing routing protocol based on mobile agent (LBRP-MA), which was proposed in Rao et al., 14 also takes into account the geographical position when setting the link cost. It aims to migrate heavy traffic from the Northern Hemisphere to the Southern Hemisphere and employ mobile agents to gather global link states and calculate routing tables. SLSR, SGRP, and LBRP-MA provide traffic adaptability by taking into account queuing delay or geographical factors in the estimation of the shortest paths. However, only one outgoing path is provided for each traffic demand. As a result, it is difficult to distribute heavy traffic through multiple satellites. To solve this problem, JJ Bai et al. 15 proposed a compact explicit multi-path routing (CEMR) algorithm which improves upon the above schemes by allowing each satellite to split traffic among the K-shortest paths. In addition, CEMR adopts a plane speaker mechanism to more effectively collect global traffic information. However, CEMR simply splits the traffic equably between the K-shortest paths. As optimal traffic allocation typically involves multiple paths with different traffic shares, the outcome of CEMR may deviate substantially from the ideal. HS Chang et al. 16 proposed a finite-state-automation (FSA)-based routing algorithm that realizes optimal traffic allocation through heuristics. It divides the system period into equal-length intervals and regards the topology in each interval as fixed. Then, load balancing routing is formulated as a set of optimization problems in fixed topology networks. Unfortunately, FSA is an offline scheme that calculates the optimal link assignments based on the fixed propagation delays in each interval. Thus, it is not adaptable to real traffic conditions. Kucukates and Ersoy 17 employed the Routing Set concept and proposed an online load balancing routing algorithm called the maximum-flow minimum-residual (MFMR) algorithm. This approach assumes that ISLs below the 50th latitude are of equal length and that each traffic source and destination can be placed on the diagonal position of a Routing Set. To reduce the large overhead caused by state messaging to the whole network, MFMR only periodically floods traffic state information in the Routing Set. Then, the maximum flow is minimized over the available paths in the Routing Set (not all the network) to balance the load in this area. MFMR provides a feasible solution for realizing optimal traffic allocation based on real-time traffic information. However, MFMR does not address traffic requirements, which are necessary for optimization. Moreover, traffic that is outside the 50th latitude and passes through the Polar Regions cannot be optimized by MFMR. Bertaux et al. 18 and Bao et al. 19 introduced two schemes for solving load balancing routing by applying software-defined networking (SDN) principle. However, the complexity and cost still need to be evaluated before making such a solution a reality.
As can be seen, although the above schemes have many differences, they all use a global strategy (we refer to them as global schemes). That is, the routing decisions in these schemes are made based on the traffic information gathered from the whole network (or the Routing Set). A key advantage of this type of scheme is global optimality. However, the need to create and maintain a global view of the traffic state often results in a significant amount of communication and storage overhead. Moreover, it is difficult for these schemes to manage congestion in real-time, as the information that is gathered can be outdated due to large propagation delays.
Faced with these shortcomings, researchers began to pursue new schemes that enable nodes to independently perform load balancing routing. Taleb et al. 20 proposed an ELB routing protocol that exclusively uses traffic load information from neighboring satellites. In ELB, a satellite explicitly informs its neighboring satellites to reroute a portion of traffic via alternative paths when it suffers heavy traffic loads that exceed a pre-defined threshold. ELB alleviates heavy traffic loads essentially in real-time fashion. It reacts rapidly to changes in traffic compared with global schemes. However, ELB only transmits the congestion notification to one-hop neighbors; thus, heavy traffic cannot be promptly dispersed by multiple upstream satellites. Moreover, the limited range within which traffic is rerouted may be a source of cascading congestion on neighboring satellites. Ö Korçak et al. 21 proposed a priority-based adaptive routing (PAR) technique in which a routing decision is made at each hop by a priority mechanism depending on the past utilization and buffering information about the links. When sending packets, it tends to choose the links that are less utilized. PAR does not collect traffic information from other satellites, thus minimizing overhead. However, as the outgoing link state may not reflect the real state of the next hop node, PAR may sometimes send packets to satellites plagued by severe congestion. GH Song et al. 22 proposed a traffic-light-based intelligent routing strategy (TLR) that performs load balancing routing considering the congestion status at both the current outgoing link and the next hop node. A set of traffic lights is used to indicate the congestion status at the current outgoing link and the next hop node. Packets are transmitted along dynamic routes according to the real-time color of traffic lights at the current outgoing link and the next hop node. However, due to the absence of a global view, TLR can only migrate aggregated traffic to neighboring light-loaded satellites and cannot disperse them over the entire network. Moreover, TLR still bears the risk of cascaded congestion as ELB does. In addition to these schemes, Nishiyama and colleagues23–25 introduced some load balancing routing schemes for Multi-Layer Satellite Networks. The purpose of these schemes is to enable each satellite to transfer packets with fewer hops via the LEO layer and to transfer packets with more hops via the MEO layer or Geosynchronous Earth Orbit (GEO) layer. However, these schemes successfully utilize network capacity between layers but fail to distribute traffic in each layer.
As can be seen, the above schemes typically allow satellites to make routing decisions based on local traffic information. In other words, they use a local strategy (we refer to them as local schemes). Unlike global schemes, local schemes only gather the necessary information from neighboring satellites, thus generating less communication and storage overhead. In addition, as they can change routes in real-time, local schemes provide strong traffic adaptability. However, because there is no global view, routing decisions in local schemes occasionally cannot guarantee global optimality. Moreover, faced with highly dynamic traffic, local schemes may endure persistent feedback signals and frequent route oscillations.
Overall, the abovementioned schemes represent the typical load balancing routing solutions for satellite networks. However, as described, these schemes are based on either a global strategy or a local strategy. In addition, to the best of our knowledge, nearly all load balancing routing solutions are also based on a single strategy, as in the case of these typical schemes. Because either strategy has certain merits and flaws, no scheme can achieve ideal performance.
Another defect in existing load balancing routing schemes is load balancing routing is performed only when congestion is imminent; the predictability of satellite network is not used to proactively prevent congestion. Therefore, to overcome these weaknesses, our hybrid global-local scheme is proposed.
IoT traffic demand of LEO satellite networks
Exploring the characteristics of IoT traffic demand in LEO satellite networks is important to understand traffic distribution and to design better routing schemes. This section provides a detailed analysis of the global IoT traffic feature; it demonstrates how the inter-satellite traffic demand can be decomposed into a predictable long-range baseline and unpredictable short-range fluctuations, which form the basis of the development of our hybrid scheme. Moreover, this section provides the methodology for baseline modeling.
Predictable long-range baseline
In LEO satellite networks, each satellite provides access to services for all IoT devices within its coverage. The traffic demand between the two serving satellites depends on the potential traffic demand between the two coverage areas. Because satellite coverage varies geographically with the movement of constellations and because the traffic intensity in the coverage areas changes temporally, inter-satellite traffic demand is time varying. However, because the geographical variation of the satellite’s coverage area is cyclical and deterministic and because the temporal variation in aggregate IoT traffic intensity between two certain areas exhibits an evident daily pattern,28,29 the evolution of inter-satellite traffic demand presents a clear deterministic and periodic trend. In addition, the period of this trend can be obtained by calculating the smallest common integer multiple of the satellite orbital period, the Earth’s rotation period, and the IoT traffic intensity period.
We now explain how to model the baseline of this trend. The ideal constellation is assumed to be the Iridium constellation, 30 which has a period of baseline traffic variation that is approximately equal to 24 h. Because the Iridium constellation has 66 satellites equally distributed in six planes, we correspondingly divide the Earth into 6 × 11 geographical zones as shown in Figure 1. To hide the complex repercussions caused by the mobility of satellites when modeling the baseline, we adopt the virtual satellite mechanism.49–51 A total of 66 virtual satellites are fixed above the center of the geographical zones. At any time, each virtual satellite is represented by the physical satellite closest to it. When the physical satellite moves away and another moves within the range of the virtual satellite, the traffic demand and routing table information are switched to the successor satellite. Because the physical satellite periodically moves above the surface of the Earth, the mapping relationship between virtual satellites and physical satellites is cyclical and can be calculated in advance. Thus, if the baseline traffic demand between the two virtual satellites can be obtained, the baseline between the two physical satellites can also be derived. To simplify the analysis, we assume that IoT traffic demand in each geographical zone is mapped onto the virtual satellite that resides above the center of the zone and that each satellite is only responsible for one zone. Moreover, we follow21,52 and assume that the potential traffic requirement for satellite networks from each geographical zone is proportional to the number of IoT devices in each zone. The statistics from the Organisation for Economic Co-operation and Development (OECD) in 2015 10 for the IoT devices for each country are utilized to estimate the number of IoT devices in each zone. The results are depicted in Figure 1.

Earth zone division and IoT device density.
We choose virtual satellite
In addition to the geographical distribution of IoT traffic, the temporal variation of IoT traffic intensity in each geographical zone is also considered for baseline modeling. In Shafiq et al.
28
and Romirer-Maierhofer et al.,
29
the authors noted that the temporal variation of aggregate IoT traffic intensity between two certain areas exhibits an evident daily cycle. In addition, these authors identified that this variation is closely coupled with human working hours, and the reason for this phenomenon may be because that a majority of IoT devices are employed for business use. A daily profile,
21
which is shown in Figure 2 and defines the aggregated traffic volume of each hour as a percentage of the total traffic volume within a single day, can demonstrate this daily variation in IoT traffic intensity. Therefore, if we assume that the daily evolution of IoT traffic intensity is the same for all areas worldwide and the local time of each geographical zone is equal to the solar time of the respective zone’s center longitude. Then, the hourly traffic demand (packets/second) between satellites
where

Daily variation of IoT traffic intensity.
Unpredictable short-range fluctuations
The above method models the long-range baseline in inter-satellite (virtual satellite) traffic demand variation. Generally, the actual variation in traffic demand conforms to the baseline. However, in certain periods, traffic demand can significantly deviate from the long-range baseline. Two major factors may contribute to this phenomenon. One factor is the unpredictable traffic fluctuations caused by terrestrial IoT applications. The other factor involves anomalies in the satellite network topology.
As Shafiq et al. 28 and Wu et al. 53 stated, although most IoT sensors are based on a time-triggered mechanism, several event-driven IoT applications can still result in severe bursts of traffic. In addition, under some special circumstances, such as natural disasters and holidays, IoT traffic volume can also change dramatically from the usual traffic. Because satellite networks mainly carry traffic from terrestrial IoT applications, the short-range unpredictable characteristics of terrestrial IoT traffic will inevitably be translated to the satellite segment and finally lead to unpredictable short-range fluctuations in inter-satellite traffic.
Anomalies in the satellite network topology refer to the topology changes and network failures that are caused by configuration mistakes, equipment damage, and other factors. These anomalies often cause traffic shifts among ISLs, which may lead to traffic fluctuations or even severe bursts of congestion in the satellite networks. Unfortunately, because of the harsh environment of space and the limited life span of satellite equipment, these anomalies are quite frequent and further perpetuate the short-range unpredictable characteristics of satellite traffic.
Technical design of HGL
Overview of HGL
Based on the idea presented in section “IoT traffic demand of LEO satellite networks,” the inter-satellite IoT traffic demand of LEO satellite networks can be decomposed into a combination of long-range baseline and short-range fluctuations (also referred to as deviations to the baseline) as illustrated on the left side of Figure 3. The long-range baseline extracts the deterministic dynamics in traffic demand. It represents the majority of traffic demand variation and can be considered a crude estimation of the actual traffic demand. Short-range fluctuations refer to all unpredictable changes apart from the baseline. They occur stochastically and temporarily and reflect the deviations between the real traffic volume and the estimated average traffic volume.

An overview of the HGL scheme. The data used in this graph have been gathered from a nation-wide cellular operator in the United States during one complete week in August 2010. 28 The actual data reflect the aggregate uplink machine-to-machine (M2M) traffic intensity in the core networks on Monday, and the long-range baseline is calculated by averaging the traffic data from the complete week.
From the network view, long-range baseline traffic demands between all satellite pairs reflect the regular changes of all network traffic; they are preferably managed by global strategies. However, large short-range fluctuations tend to appear on particular satellites and are therefore more suitable for local strategies. Thus, we implement the HGL scheme in two stages, as shown in Figure 3. First, it concentrates on the baseline traffic demand variations and employs a global strategy to periodically conduct preliminary optimal traffic allocations. Load balancing routing at this stage is mathematically formulated as a set of multi-commodity flow optimizations that seek to minimize the total flow costs of the entire network. After implementing the calculated optimal routing tables, the network can disperse the network traffic in a balanced and effective manner. However, because the optimal routing tables are calculated based on the long-range baselines, they are not suitable for handling bursts of congestion caused by unpredictable traffic fluctuations. Continually updating the optimal routing table to eliminate bursts of congestion is costly and often too late. Therefore, a local strategy is employed to conduct real-time route adjustments during the global update intervals. In this stage, a satellite can explicitly inform its multi-hop upstream satellites to reroute a portion of traffic when its link occupancy rate exceeds a pre-defined threshold. As a consequence, the affected intermediate satellites can rapidly resolve the burst congestion instead of waiting for the next whole network update. Finally, by continuously combining global planning with local real-time adjustments, congestion can be effectively eliminated, and traffic can acquire a near-optimal distribution over the constellation. In the following section, we describe the two steps in detail.
Preliminary optimal traffic allocation
In the preliminary optimal traffic allocation phase, HGL periodically collects the global link state and utilizes the long-range baseline traffic demands to compute the optimal routing table every
Consider the case of the Iridium constellation in Figure 4(a). Most satellites maintain four ISLs (two intra-plane ISLs and two inter-plane ISLs) with neighboring satellites, except those at high latitudes whose inter-plane ISLs are turned off. The intra-plane ISLs between neighboring satellites in the same plane are maintained at all times, and the propagation delay on the intra-plane ISLs is always fixed. In contrast, the inter-plane ISLs between neighboring satellites in different planes are shut down in the Polar Regions and re-established outside of the Polar Regions, and the propagation delay on the inter-plane ISLs becomes shorter when the end satellites move toward the polar regions. All satellites move in the same circular direction within the same plane. As a consequence, any satellite that is observed from the Earth moving from South to North will be observed to start moving from North to South when it crosses the North pole. Hence, between the first and last planes, there exists a seam called cross-seam. The opposite movement of the neighboring satellites along the cross-seam results in frequent handovers in the cross-seam ISLs.

Network topology of the Iridium constellation: (a) 3D graphics and (b) 2D graphics.
Due to these frequent link switches and handovers, the topology of the satellite network becomes extremely dynamic, which makes routing problems complicated. To hide the mobility of satellites and consequent handover events from the routing protocol, we continue to adopt the virtual satellite mechanism, 49 which has been quoted in the section “IoT traffic demand of LEO satellite networks.” A total of 66 virtual satellites are supposed to evenly set above the surface of the earth and form a fixed mesh topology. The connectivity in this mesh topology is invariant. Each physical satellite represents a virtual satellite for a fixed period of time. When it leaves the area of the virtual satellite, it automatically transfers the routing table, traffic demand, and other relevant information to the successive satellite and execute a link handover. By using such mechanism, routing can be performed in the stable virtual topology, and the global routing update interval can be independent of link handover examination. Note that the delay between inter-plane virtual satellites is time varying due to the varying distance between physical satellites and the changing queuing delay. To effectively collect the global link state, we assign a plane speaker satellite (virtual satellite) for each plane. 15 At the beginning of each global update interval, plane speakers are responsible for collecting the link state and queuing delay information within their planes and exchange the collected information with other plane speakers to build the global routing information base (GRIB). Non-plane speaker satellites can only broadcast their state information to their intra-plane satellites. Moreover, we request that the plane speakers at adjacent planes have a direct link as shown in Figure 4(a).
In GRIB, each link has a link cost
where
where
where
Note that in
This objective achieves load balancing by reserving maximum residual bandwidth to all the links. However, it incurs longer delays because packets may be subjected to more detours to maintain low link utilization. This is inefficient, in particular in situations characterized by light loads. Therefore, in HGL, we suggest the optimization objective of minimizing the total flow cost, 32 which not only balances the traffic but also optimizes the overall delay performance
and the objective is subjected to
where
As can be seen, traffic allocation with the objective of minimizing the total flow cost and its constraints allows flows to preferentially use the lower-cost links but without exceeding the link capacities. In this way, HGL can find the cheapest way to send traffic based on the premise of load balancing.
If we regard the flow sent from
Having obtained the flow matrix
Routing table of satellite S40 before traffic detour.
Local route adjustment
So far, the optimal routing table has been calculated and can be distributed to each satellite. Traffic will be proactively distributed to multiple paths in a balanced manner rather than concentrating on the shortest paths. However, as stated previously, the optimal routing table is inadequate for addressing unpredictable traffic. Therefore, we design the local route adjustment to complement preliminary traffic allocation and address this burst traffic.
In the local route adjustment phase, each satellite periodically detects the buffer queue and the traffic sending rate of each outgoing ISL every
As Figure 4(b) shows, when a given satellite
Routing table of satellite S40 after traffic detour.
In addition to modifying the routing table, neighboring satellites have one additional task, that is, to check whether it is necessary to ask upstream satellites to further detour the traffic. If the neighboring satellites find that the detoured traffic exceeds the link capacity of the alternate paths, they will immediately send BSAs to their upstream neighboring satellites to request them to further reroute the excess traffic to alternate paths. As shown in Figure 4(b), once the neighboring satellite
To promptly and effectively eliminate burst congestion, the rational setting thresholds
The setting of
and
The key strategy behind an optimum setting of
where
Taking the special case
Combining the special case
As shown in equations (13) and (15), when a WTR
The setting of detour ratio
The setting of detour ratio
In the bottleneck satellite
In response to burst congestion, the bottleneck satellite generates BSAs and sends them to neighboring satellites. In the bottleneck satellite, let
To ensure a prompt recovery for the bottleneck satellite—one that lasts for at most
Therefore, we get
Thus, the detour ratio can be accordingly computed as
In the upstream satellites
As stated previously, when a neighboring satellite or multi-hop upstream satellite receives the BSA from the congested node, it first checks whether it needs to further transmit BSAs to its upstream satellites. Let
Therefore, we obtain
Conversely, if
After the queue occupancy ratio is brought back to
Loop-free design
To avoid traffic loops, HGL requires the packet to record the passed hops in its head as it travels in the satellite network. When the next hop appears in the packet head, the packet is sent in another feasible direction (randomly selected). If both directions appear in the head, the packet is sent back to the former node to find other feasible directions. In our model, the satellite number is 66 (6 * 11), and packets at most traverse
Performance evaluation
Simulation setup
In this section, we evaluate the performance of HGL using NS2 (network simulator). The experiments are conducted on the aforementioned Iridium constellation with LEO satellites residing at an altitude of 780 km. The Polar Regions are defined as the regions between latitudes
In HGL, plane speakers are configured to collect the link state and expected queue delay every 30 s (
In the simulations, we conduct two kinds of HGL with different optimization objectives, one for the case of minimizing the maximum flow (HGL-MMF) and another for our proposed case of minimizing the total flow cost (HGL-MFC). For comparison, we also conduct DSP, MFMR, and TLR protocols. DSP is based on the gathered global routing metric of “propagation delay + expected queuing delay,” and the routing metric and routing table are updated every 30 s to remain identical to that in HGL. MFMR represents the class of global load balancing routing schemes that only periodically minimize the maximum flow without adapting to unpredictable traffic changes. The refresh time of the optimal routing table is also set to 30 s. TLR is implemented over the DSP algorithm. It represents the benchmark of the local load balancing routing scheme, which only considers local route adjustment. Based on these comparisons, we can evaluate how much the performance can be improved by our hybrid scheme.
For traffic generation, we set 66 ground terminals in the center of each geographical zone as shown in Figure 1. To better model the IoT traffic with both long-range predictable and short-range unpredictable characteristics, we control each node to generate on–off flow that obeys a Pareto distribution with the shape equal to 1.2. The average burst and idle time are set to 200 ms, and the mean value of the sending rate between each pair of nodes is set according to equation (2); thus, increasing the total offered traffic A in equation (2) will increase the average flow rates and in turn increases the average load of each satellite. Note that as the focus of the simulations is to evaluate how much performance can be improved by the hybrid scheme compared with single-strategy schemes based on the existence of the baseline, we are not greatly concerned about the effect of the deviation between the realistic baseline and the modeled baseline. Therefore, the mean values of the sending rate are directly set according to the modeled long-range baselines. Finally, all scenarios are run 10 times for 24 h (a system cycle), and we consider the average values the final results.
Simulation results
Packet loss rate
First, we evaluate the performance of HGL in terms of the packet loss rate. Figure 5(a) graphs the total packet loss rate experienced by each scheme under different settings of total offered traffic. HGL (including HGL-MMF and HGL-MFC) has the lowest packet loss rate. This proves that the hybrid scheme is effective and successfully reduces the packet loss rate compared with single-strategy schemes. Note that HGL-MMF achieves a smaller packet loss rate than that of HGL-MFC because minimizing the maximum flow leaves more space for future traffic growth. Therefore, when unpredictable traffic is coming, few packets tend to overflow the capacity. However, based on the diagram, we find that this difference is not significant (the mean difference is 3%). This is because a multi-hop traffic detour by local route adjustment also promptly accommodates the unpredictable traffic, which prevents packet losses from another perspective. For all the traffic volumes, DSP sustains the largest packet loss rate because it only tries to route the traffic to the shortest paths. Aggregated traffic may overflow the selected paths and finally lead to a large packet loss rate. MFMR uses multi-paths to spread the traffic over the entire network, which causes a significant decrease in the packet loss rate compared with that of DSP. However, due to its inability to address unpredictable traffic, the packet loss rate is still larger than that of HGL. TLR introduces traffic lights to indicate the congestion status and dynamically reroutes traffic to avoid packet drops when congestion is encountered. It performs well in conditions with light loads but deteriorates rapidly with increases in the traffic load, as TLR is the shortest path based and cannot actively spread traffic. When too much traffic has accumulated in the shortest paths, local adjustment is actually too late to stop the packet drops. Moreover, due to the lack of a global view and the limited detour range, severe traffic aggregation is difficult to overcome among neighboring satellites.

Simulation results: (a) packet loss rate, (b) average end-to-end delay, (c) average queue occupancy for each satellite (A = 3.2 Tbit), (d) traffic distribution index, (e) number of route changes, and (f) communication overhead.
Average end-to-end delay
Second, we evaluate the performance of HGL in terms of average end-to-end delay. The average end-to-end delay of each scheme is computed by averaging the end-to-end delays of all source–destination pairs. As shown in Figure 5(b), MFMR and HGL-MMF show higher average end-to-end delay (worse performance) than others because they reserve maximum residual bandwidth to achieve the objective of minimizing the maximum flow. As a result, packets may suffer more detours to keep link utilization low. DSP always routes traffic to the single shortest paths. Its average end-to-end delay degrades substantially with heavy loads because long queuing delays begin to appear at the selected paths. TLR dynamically detours traffic to alternative paths in response to congestion. Although traffic detours may add additional delay in situations characterized by light loads, it is still more beneficial than having to bear severe queuing delays in conditions characterized by heavy loads, as the TLR curve is substantially below the DSP curve after the total offered traffic increased to 1.9 Tbit. HGL-MFC achieves the lowest average end-to-end delay in all five schemes. This should be attributed to the fact that HGL-MFC eliminates congestion and queuing delays to the greatest extent by combining proactive traffic dispersing with local real-time route adjustment. It can also be attributed to the fact that HGL-MFC prefers the shortest paths to carry traffic.
To verify that HGL can alleviate congestion and reduce queuing delays, we calculate the average queue occupancy (averaged value over the simulation launch time) of each satellite in Figure 5(c). The simulations are performed for cases where the total offered traffic A is set to 3.2 Tbit per day.
As can be seen, HGL (including HGL-MFC and HGL-MMF) achieves the lowest average queue occupancy. This indicates that congestion is successfully alleviated throughout the network by global traffic dispersing and local real-time adjustment in two steps. DSP exhibits the highest results for some satellites because it considers neither global traffic dispersing nor local adjustment. Traffic in DSP tends to highly concentrate on a few shortest paths in high-load areas, and satellites that often go through these shortest paths may have larger traffic loads and higher average queue occupancies. TLR can detour a portion of traffic from shortest paths to alternative paths when the shortest paths are congested. Thus, the average queue occupancies of these satellites are lower than DSP. MFMR always attempts to distribute traffic to all the paths in the constellation. Therefore, the average queue occupancy of each satellite in MFMR is more evenly distributed. However, due to its inability to address frequent burst traffic congestion, the queue occupancies of some satellites are still higher than HGL.
Traffic distribution index
Third, we evaluate the traffic distribution in each scheme. To explore how well the traffic is distributed across the network, the traffic distribution index (TDI) 20 is introduced
where n is the number of ISLs and
Route oscillation
Because HGL uses the global view to proactively disperse baseline traffic demands, it should be able to reduce the need for subsequent route adjustments, which alleviates route oscillation problems. To verify this idea and investigate how many improvements can be obtained, we select HGL and TLR and compare the number of route changes during the system cycle.
Figure 5(e) indicates that HGL truly decreases the number of route changes compared to TLR. HGL-MFC reduces the amount by nearly half, and HGL-MMF reduces the amount by nearly three-fourth. Therefore, we can conclude that the use of long-range baseline traffic demands to conduct preliminary traffic allocation can significantly reduce the need for latter adjustment and effectively alleviate route oscillation problems.
Communication overhead
Finally, we compare the communication overhead of each scheme. Remember that HGL and DSP schemes are realized on the same basis: plane speakers periodically collect and exchange link states to establish the GRIB. MFMR collects link states based on the flooding mechanism in the routing sets. Figure 5(f) presents the result in a system cycle (24 h). As can be seen, DSP and MFMR exhibit a relatively lower and constant value compared with other schemes because they only periodically update the routing tables without adapting to unpredictable traffic fluctuations. The communication overhead is mainly derived from the packets for GRIB updating and the delivery of routing tables. In contrast, TLR, HGL-MFC, and HGL-MMF represent relatively higher values because they not only periodically update the global routing tables but also dynamically adjust their routes according to real-time traffic fluctuations. Warning packets and BSAs are produced on a large scale when intermediate satellites encounter unexpected congestion. Moreover, as HGL-MFC and HGL-MMF reduce the need for route adjustment, they generate less communication overhead than TLR.
Conclusion
In this article, we propose a novel load balancing routing scheme, HGL, for the IoT through satellite networks. It not only combines global strategy and local strategy to optimally allocate IoT traffic flows from the whole and part but also utilizes the regularity and predictability of LEO satellite networks to proactively disperse traffic loads to prevent congestion. Compared with single-strategy schemes, the simulation results indicate that HGL effectively reduces the occurrence of congestion and achieves better performance in terms of the packet loss rate, average queuing delay, traffic distribution, and route oscillation with a reasonable communication overhead. Meanwhile, HGL, whose objective is minimizing the total flow cost (HGL-MFC), can reduce the average end-to-end delay to a much greater degree than the traditional approach of minimizing the maximum flow. Therefore, for delay-sensitive IoT applications, HGL-MFC may be a better choice.
Although HGL is proved to work well in network layer, there is still much room for improvement to integrate HGL into higher layer protocols. For example, multi-path routing concept of HGL may sometimes lead to packet reordering. In case of connection-oriented transport protocols (e.g. TCP), this phenomenon may result in the transmission of duplicate acknowledgments and unnecessary halves of the congestion window. Fortunately, a number of TCP-enhanced technologies, such as TCP-PR, 58 FLARE, 59 and ALBAM, 60 can resolve this problem. Therefore, in the works that follow, we hope to integrate proper TCP-enhanced technology in the design of HGL and conduct further evaluation. Second, the baselines of the IoT traffic demands of LEO satellite networks may need to be revised using realistic traffic traces and appropriate prediction algorithms. Big data,61–63 wavelet analysis, 64 and support vector regression65–69 technologies would be helpful for analyzing traffic characteristic and designing better traffic model. Therefore, in future work, we hope to address more real-world projects, collect realistic traces, and find appropriate traffic prediction algorithms suitable for IoT and satellite environments.
