Abstract

Security of cyber-physical systems against cyber attacks is an important yet challenging problem. Cyber-physical systems are prone to information leakage from the physical domain. Analog emissions, such as magnetic and power emissions, can become side channels that reveal valuable data, even the cryptographic key of the system. Template attack is a popular type of side-channel analysis based on machine learning. Malicious attackers can use a template attack to profile the analog emissions and then recover the secret key of the system. However, a conventional template attack requires that the adversary has access to an identical experimental device that can be programmed at will. This study proposes a novel side-channel analysis for physical-domain security in cyber-physical systems. Our contributions are the following three points: (1) A major peak region method for finding points of interest correctly is proposed. (2) A method for establishing templates on the basis of those points of interest without requiring knowledge of the key is proposed, together with several techniques to improve the quality of the templates. (3) A method for choosing attacking traces is proposed to significantly improve the attacking efficiency. Our experiments on three devices show that the proposed method is significantly more effective than the conventional template attack. In doing so, we highlight the importance of performing similar analysis at design time to secure the cyber-physical system.

Keywords

Cyber-physical systems, security and privacy, side-channel analysis, template attack, points of interest

Introduction

The term cyber-physical systems (CPSs) refers to a new generation of systems with integrated computational and physical capabilities that can interact with humans through many new modalities. Figure 1 shows the feature of CPS. To realize the automatic flow of data, there are four steps: state perception, real-time analysis, scientific decision-making, and precise execution. A large number of implicit data contained in physical space are transformed into explicit data through state perception, and then can be analyzed by a computational method in information space, and the explicit data can be transformed into valuable information. The information of different systems is processed centrally to form scientific decision-making of external changes, which further transforms information into knowledge. Finally, the more optimized data are applied to the physical space to form a closed-loop flow of data.

Figure 1.

The feature of CPS.

Figure 2 shows a CPS application scenario. In CPSs, computational and communication cores, governed by cyber processes, interact with physical-domain sensors and actuators. This interaction opens up possibilities for unique vulnerabilities¹ posing serious challenges to the system security. For example, as the information flow in the cyber-domain manifests physically in the form of energy flows in the physical domain, it may be leaked. These energy flows may be observed in the form of various analog emissions such as vibration, acoustic, magnetic, and power.

Figure 2.

CPS application scenario.

These media, through which unintentional leakage of information occurs, are also known as side channels. Side channels pose a serious threat to the confidentiality of the system as they may indirectly reveal the cyber-domain data. Side-channel attacks^2,3 have been shown to be extremely effective as a practical means for attacking implementations of cryptographic algorithms. Adversaries can obtain sensitive information from side channels such as timing of operations,² power consumption,³ and electromagnetic emanations.^4–6 These attacks are used to break cryptographic protocols: rather than using brute force or attacking theoretical weaknesses of the algorithms, attackers use side channels to infer the various system states in the cyber domain. Since Paul Kocher’s original paper,³ a number of devastating attacks, such as simple power analysis (SPA), differential power analysis (DPA),^3,7,8 and correlation power analysis (CPA),^9–11 have been reported on a wide variety of cryptographic implementations.^12–16

Template attack (TA) was first proposed by Chari et al.¹⁷ at the Conference on Cryptographic Hardware and Embedded Systems 2002 (CHES 2002). Since then, it has been widely used in side-channel attack research. The basic idea is to establish noise templates for specific information leaked during the execution of cryptographic algorithms. Noise templates reflect the energy consumption characteristics of leaked information and are used to identify information leaked by attack traces. TAs¹⁷ consist of two phases: the profiling phase and the extraction phase. In the profiling phase, one captures some actual power traces from a reference device identical or similar to the targeted device and builds templates for each key-dependent operation with the actual power traces. In the extraction phase, one can exploit a small number of actual power traces measured from the targeted device and the templates obtained from the profiling phase to classify the correct (sub)key. However, a key requirement for the TA¹⁷ is that the adversary has an identical experimental device which can be programmed. By changing the secret key of the experimental device, the adversary can cross-compare the power consumption measured in the encryption process with different secret keys and find possible positions of information leakage on the power traces. These positions are called points of interest (POIs).¹⁷ The knowledge^18,19 of POIs allows the adversary to profile the power distribution of specific operations (usually with a multivariate Gaussian model) and build the corresponding templates for extracting the secret key of a new device. Many different methods of choosing POIs have been analyzed in the literature: the difference of means (DOM)-based method,¹⁷ the sum of squared differences (SOSD)-based method,²⁰ the CPA-based method,²¹ the sum of squared pairwise T-differences (SOST)-based method,²⁰ the signal-to-noise ratio (SNR)-based method,²¹ the variance (VAR)-based method,²² the mutual information analysis (MIA)-based method,²³ and the principal component analysis (PCA)-based method.²⁴

The limitation of TA is that the adversary must obtain full control of an identical experimental device, which usually means being able to change its secret key arbitrarily. This limitation has been relaxed to knowing only three keys of similar devices by the method proposed by Lerman et al.²⁵ and has been further relaxed to requiring only one secret key of an identical device by the concept of equal images under different subkeys (EIS) proposed by Gierlichs et al.²⁰ and Schindler et al.²⁶ The EIS property states that the domain of the intermediate value of an operation is always the same regardless of whether the plaintext and subkeys involved in the operation are random. Under the assumption of EIS, an adversary can profile a type of device with only one secret key. Wu et al.²⁷ have proposed a clustering-based TA method that can accurately fit the probability model of the leaked information even when no secret key is known and can determine the location of information leakage. However, its way of locating the POIs is not straightforward.

As mentioned earlier, POIs are the basis of building templates in TA. How to choose the POIs correctly is significantly important. Our contributions are the following three points:

A new major peak region (MPR) method is proposed for finding POIs correctly when the adversary cannot straightforwardly locate the positions of the SBOX input and output.

A method for establishing templates on the basis of those POIs still without requiring knowledge of the key is proposed. Several techniques are proposed to improve the quality of the templates as well.

A method for choosing attacking traces is proposed to significantly improve the attacking efficiency.

In this article, related works are introduced in section “Related works.” Then, a novel TA method is proposed. How to find POIs correctly? How to build templates effectively? How to extract the correct key? These three questions are answered in section “Novel TA on SM4.” In order to verify the novel method, experiments are shown in section “Verification and comparison.” Finally, the conclusion is drawn.

Related works

In this section, we briefly review the different methods of choosing interesting points, TAs and the block cipher algorithm SM4.

The methods of choosing interesting points

As mentioned earlier, many different methods of choosing interesting points were analyzed. They are DOM, SOSD, SOST, and PCA.

Captured traces normally consist of a very large number of samples, usually on the order of 10⁵. Most of these samples are irrelevant to the targeted information leakage of $O_{k}$ . The POIs, that is, the points on the traces where the power consumption leaks the most information, should therefore be determined. We compute the signal-strength estimate $F (t)$ for every point $t$ corresponding to the targeted intermediate value in actual power traces using the different methods of choosing interesting points. Then, the interesting points are chosen based on the value of the signal-strength estimate $F (t)$ . The first method, DOM, was proposed in the study by Chari et al.,¹⁷ in which samples with the largest differences among the group means were chosen as POIs

$F (t) = \sum_{i < j} (μ_{i} (t) - μ_{j} (t))$ (1)

where $μ_{i} (t)$ represents the mean power of group $i$ at location $t$ of the traces.

The SOSD method was proposed to improve the quality of selecting POIs in the study by Gierlichs et al.²⁰

$F (t) = \sum_{i < j} {(μ_{i} (t) - μ_{j} (t))}^{2}$ (2)

where $μ_{i} (t)$ represents the mean power of group $i$ at location $t$ of the traces, and $μ_{j} (t)$ represents the mean power of group $j$ at location $t$ of the traces.

The SOST method is based on the t-test, which is a standard statistical tool to meet the challenge of distinguishing noisy signals. The SOST method was proposed in the study by Gierlichs et al.²⁰

$F (t) = \sum_{i < j} \frac{{(μ_{i} (t) - μ_{j} (t))}^{2}}{(\frac{{σ_{i}}^{2} (t)}{| N_{i} |} + \frac{{σ_{j}}^{2} (t)}{| N_{j} |})}$ (3)

where ${σ_{i}}^{2} (t)$ represents the VAR of group $i$ at location $t$ of the traces. $| N_{i} |$ denotes the cardinality of the set.
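As a minimal sketch of how such signal-strength curves could be computed, the following Python code evaluates SOSD (the sum of squared pairwise differences of group means) and SOST as in Formula (3) for traces that have already been split into groups. The function names, the list-of-lists trace layout, and the zero-variance guard are assumptions made for illustration and are not taken from the original study.

```python
from itertools import combinations
from statistics import mean, variance


def group_stats(groups):
    """Per-group (means, variances, size) at every sample point.

    groups: list of trace groups; each group is a list of equal-length traces.
    Each group needs at least two traces so the sample variance is defined.
    """
    stats = []
    for traces in groups:
        columns = list(zip(*traces))  # all samples recorded at each point t
        stats.append(([mean(c) for c in columns],
                      [variance(c) for c in columns],
                      len(traces)))
    return stats


def sosd(groups):
    """Sum of squared pairwise differences of the group means."""
    stats = group_stats(groups)
    points = range(len(stats[0][0]))
    return [sum((a[0][t] - b[0][t]) ** 2 for a, b in combinations(stats, 2))
            for t in points]


def sost(groups):
    """Sum of squared pairwise t-differences, as in Formula (3)."""
    stats = group_stats(groups)
    curve = []
    for t in range(len(stats[0][0])):
        total = 0.0
        for (mu_i, var_i, n_i), (mu_j, var_j, n_j) in combinations(stats, 2):
            denom = var_i[t] / n_i + var_j[t] / n_j
            if denom > 0:  # assumption: skip degenerate pairs with zero variance
                total += (mu_i[t] - mu_j[t]) ** 2 / denom
        curve.append(total)
    return curve
```

The point with the largest value of either curve is then a natural POI candidate. SOST additionally normalizes each squared difference by the estimated noise, so noisy points are ranked lower than they would be under SOSD.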

PCA has also been applied as a dimension reduction technique in TA.²⁴ In practice, the effect of applying PCA is limited because the principal dimensions contain a large proportion of VAR on locations irrelevant to the targeted operation. PCA is practical only when applied to a large number of POIs (e.g. thousands) that are selected on the basis of SOSD or SOST.²⁰

TA

TA consists of two phases. The first phase is the profiling phase, and the second phase is the extraction phase.

Profiling phase

In the profiling phase, traces are grouped on the basis of the intermediate values produced by the targeted operations in the encryption process. These groups of traces will be used to build the templates, which are usually modeled with multivariate Gaussian distribution representing the mean power consumption and the characteristic of Gaussian noise of the leaked information. Suppose the trace group of operation $O_{i}$ is ${ϑ_{i}}$ , then its multivariate Gaussian distribution will be

$p (υ | T_{i}) = \frac{1}{\sqrt{{(2 π)}^{N} | Σ_{i} |}} \exp (- \frac{1}{2} {(υ - μ_{i})}^{T} Σ_{i}^{- 1} (υ - μ_{i}))$ (4)

where $υ$ is the power vector of a trace at the POIs, $Σ_{i}$ represents the covariance matrix, and the symbol $Σ_{i}^{- 1}$ denotes its inverse. $μ_{i}$ represents the mean power vector. $N$ is the number of information leakage positions on traces. $T_{i}$ is the template that represents the mean power consumption and noise distribution of operation $O_{i}$ .

Extraction phase

After the templates are built, one or a few traces will be used to extract the secret key on the basis of their probabilities determined by the templates

$\hat{k} = \underset{k_{i}}{argmax} \prod_{j = 1}^{t} p (υ_{j} | T_{ϕ (k_{i}, x_{j})}), i = 0, 1, \dots, K - 1$ (5)

where $\hat{k}$ is considered to be the correct subkey, $k_{i}$ ranges over the $K$ candidate subkeys, and $t$ is the number of attacking traces. $T_{ϕ (k_{i}, x_{j})}$ is the template corresponding to the intermediate value $ϕ (k_{i}, x_{j})$ . For example, when the output of the first S-box in the first round of SM4 is chosen as the targeted intermediate value, one builds templates for each output of the SBOX. In this case, $ϕ (k_{i}, x_{j}) = Sbox (k_{i} \oplus x_{j})$ , where $x_{j}$ is the input plaintext corresponding to the power trace $υ_{j}$ .
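The extraction rule can be illustrated with a deliberately simplified sketch. The code below assumes a single POI, so each template is a univariate Gaussian rather than the multivariate model of Formula (4), and it uses a hypothetical 4-bit toy S-box instead of the SM4 S-box. It sums log-probabilities instead of multiplying probabilities, which yields the same maximizing subkey while avoiding numerical underflow. All names are illustrative assumptions.

```python
import math

# Hypothetical 4-bit permutation used only for illustration (not the SM4 S-box).
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(value):
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


def log_gauss(x, mu, var):
    """Log-density of a univariate Gaussian with mean mu and variance var."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def extract_subkey(samples, plaintexts, templates):
    """Return the subkey that maximizes the joint log-likelihood.

    samples:    one power sample per trace, taken at the single POI
    plaintexts: the 4-bit plaintext nibble of each trace
    templates:  dict mapping hamming weight -> (mean, variance)
    """
    best_key, best_score = None, -math.inf
    for key in range(16):
        score = 0.0
        for x, v in zip(plaintexts, samples):
            mu, var = templates[hamming_weight(TOY_SBOX[key ^ x])]
            score += log_gauss(v, mu, var)
        if score > best_score:
            best_key, best_score = key, score
    return best_key
```

With hamming weight templates, several traces with different plaintexts are generally needed before a single subkey stands out, because different subkeys can map one plaintext to the same hamming weight.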

SM4 algorithm

SM4 is a block cipher algorithm, formerly known as SMS4, and is the first official commercial cryptographic algorithm of China. Figure 3 presents the framework of the round function of SM4. The plaintext, ciphertext, and key of SM4 are all 128 bits. SM4 has 32 rounds, and the key of each round is 32 bits. The S-box of SM4 has an 8-bit input and an 8-bit output. SM4 is similar to AES-128 in block size and key size, but the two differ in structure: SM4 is an unbalanced Feistel network, whereas AES (Rijndael) is a substitution-permutation network.

Figure 3.

Framework round function in the SM4 cryptographic algorithm.

Novel TA on SM4

Since the SM4 algorithm was published in 2006, researchers in China have paid increasing attention to its security against side-channel attacks.

Various power analysis attacks against the SM4 algorithm were proposed in previous studies,^28–32 including SPA, DPA, and CPA attacks. Some TA methods on the SM4 algorithm were proposed by Zhang³³ and Ma and Ding.³⁴ These methods are based on traditional TAs.

In this section, a novel TA method is proposed. As introduced earlier, TA has two phases: profiling phase and extraction phase. In the profiling phase, POIs are the basis of building templates in TA. How to choose the POIs correctly is significantly important. The quality of the template is directly related to the correctness of obtaining the key in the second phase.

The steps for the new TA proposed in this section are as follows:

Step 1. X-grouping process. Calculate X of the first round with the plaintexts of each trace and then group the traces by X;

Step 2. Finding POIs process. Find the leaking positions of SBOX input and output in the current round by calculating SOSD or SOST of the mean power of each X-group and then choosing POIs;

Step 3. Establishing template process. Merge X-groups into hamming weight groups and calculate the parameters of each template;

Step 4. Extracting round key process. Choose vectors of traces near template centers to extract the subkeys of the current round;

Step 5. Calculate the round output for each candidate round key and the value for the next round. Go to step 2 and repeat until all round keys necessary to recover the secret key are extracted.

Finding POIs

In any block cipher, the non-linear substitution (SBOX) is the key component because it ensures the security of the cipher, and it is thus the major target in side-channel attacks. The SBOX input in each round is the data produced by XORing the round key $R K_{i}$ with data $X_{i}$ , where $X_{i}$ is derived from the output of the preceding round (or from the plaintext in the first round). The input value $Y_{i}$ can be represented as

$Y_{i} = X_{i} \oplus R K_{i}$ (6)

Definition 1

Hamming weight groups are groups of traces that are formed according to the hamming weight of $Y_{i}$ or $SBOX (Y_{i})$ . Traces with the same hamming weight are placed in one group.

Definition 2

X-groups are groups of traces that are formed according to the value of $X_{i}$ . Traces with the same value are placed in one X-group.

Traditionally, traces need to be grouped into hamming weight groups of $Y_{i}$ to find the position of the ith SBOX’s input. The sum of differences in the means of the groups, evaluated by Formula (1), (2), or (3), reveals where the hamming weight of the ith SBOX’s input leaks in the power traces. Similarly, traces can be grouped into hamming weight groups of $SBOX (Y_{i})$ to find the position of the ith SBOX’s output. Obviously, the round key $RK$ must be available to find the position of SBOX’s input or output on the basis of the hamming weight groups.

We propose that the positions of SBOX’s input and output can be found on the basis of X-groups, which can be obtained without the round key according to its definition. Suppose $X_{i}$ in Formula (6) is known, then traces can be grouped by the same value of $X_{i}$ into X-groups. The value of $X_{i}$ of each X-group is the same, and the round key $RK$ is fixed. Thus, the value of $Y_{i}$ and the hamming weight of each X-group will be the same. Although the hamming weights of different X-groups may or may not be equal, the sum of difference in the mean of X-groups can still reveal the degree of hamming weight leakage. The reason is that, at the leakage position, the difference in the mean of X-groups with different hamming weights will be maximal. Accordingly, the position of SBOX’s input can be located by choosing the position where the value of DOM, SOSD, or SOST reaches the highest value.

Given that the SBOX’s input of each X-group is the same, the SBOX’s output of each X-group will be the same as well. Therefore, the sum of difference in the mean of X-groups will reveal the leakage positions of SBOX’s input and output.

We take SM4 as an example. In SM4, each round function has four SBOXes, and the input of each SBOX is 8 bits of $X_{i}$ XORed with 8 bits of $R K_{i}$ , byte by byte. Figure 4 represents the operation process of each SBOX at the first round. The input $Y_{j}$ of each SBOX at the first round will be

$Y_{j} = {(T_{1} \oplus T_{2} \oplus T_{3})}_{j} \oplus R K_{j}, j = 1, \dots, 4$ (7)

where $T_{1}$ , $T_{2}$ , and $T_{3}$ are three 32-bit words of the plaintext, ${(T_{1} \oplus T_{2} \oplus T_{3})}_{j}$ is 8 bits, and $R K_{j}$ is 8 bits.

Figure 4.

The operation process of each SBOX at the first round.

In accordance with Formula (6), $X_{j}$ will be

$X_{j} = {(T_{1} \oplus T_{2} \oplus T_{3})}_{j}, j = 1, \dots, 4$ (8)

In other words, $X_{j}$ of each trace can be calculated with the plaintexts only. Traces can then be grouped into X-groups with different $X_{j}$ values. By calculating the sum of mean power differences of these X-groups, the positions of the jth SBOX’s input and output can be found without knowing the corresponding subkey $R K_{j}$ .
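A minimal sketch of the first-round X-grouping of Formula (8) is given below. It assumes that each plaintext is a 16-byte block split into four big-endian 32-bit words, that $T_{1}$, $T_{2}$, and $T_{3}$ are the last three of these words, and that byte $j = 1$ is the most significant byte of a word. These layout choices and the function names are assumptions for illustration and should be adapted to the actual implementation under test.

```python
from collections import defaultdict


def first_round_x(plaintext: bytes, j: int) -> int:
    """Return X_j = (T1 xor T2 xor T3)_j for byte index j in 1..4.

    Assumed layout: 16-byte block = words T0 | T1 | T2 | T3,
    with byte 1 being the most significant byte of each word.
    """
    if len(plaintext) != 16 or not 1 <= j <= 4:
        raise ValueError("expected a 16-byte plaintext and j in 1..4")
    t1, t2, t3 = plaintext[4:8], plaintext[8:12], plaintext[12:16]
    return t1[j - 1] ^ t2[j - 1] ^ t3[j - 1]


def x_groups(plaintexts, j):
    """Group trace indices by X_j; no round key is needed."""
    groups = defaultdict(list)
    for index, plaintext in enumerate(plaintexts):
        groups[first_round_x(plaintext, j)].append(index)
    return dict(groups)
```

The resulting dictionary maps each of the up to 256 values of $X_{j}$ to the indices of the traces in that X-group, which can then be passed to an SOSD or SOST computation.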

Figure 5 shows the peaks of SOST of SBOX’s input and output calculated on the basis of X-groups, which are nearly in the exact same positions found in hamming weight groups with the known round key. X-axis represents the POIs positions, and Y-axis represents the SOST value.

Figure 5.

SOST curve of X-groups for device A (section “Verification and comparison”). The highest peak is the leak position of SBOX input, and the second highest peak is the leak position of SBOX output.

After the positions of SBOX’s input and output are located, several POIs can be chosen around one of them.

However, locating the positions of SBOX’s input and output is not always straightforward. Figure 6 shows the SOST curve of another device on which traces are measured with a significantly higher sampling frequency. As in Figure 5, the X-axis represents the POI positions and the Y-axis represents the SOST value, and the left and right highest peaks in Figure 6 are the leakage positions of SBOX’s input and output. Unlike the peaks in Figure 5, the peaks in Figure 6 are much wider and possess several subpeaks. Some subpeaks around the right peak are even higher than those around the left peak. Therefore, simply selecting the two highest points on the curve does not yield the correct positions. To choose the correct positions, we propose the following method.

Figure 6.

SOST curve of X-groups for device B (section “Verification and comparison”). Left and right peaks are surrounded by several subpeaks that bring difficulty in locating the two peaks automatically.

Definition 3

Major peak region (MPR) is a horizontal range in a line diagram in which the horizontal distance between any neighboring subpeaks is less than or equal to a maximum distance called the width parameter. An MPR is composed of a left endpoint, a right endpoint, and a peak location, which is the location of the highest peak in that region. The left and right endpoints represent the current boundary of the region.

As shown in Figure 6, before locating the positions of SBOX’s input and output, we should recognize two MPRs of the curve.

MPR is used to represent the inverse V-shaped regions of the major peaks by collecting the neighboring subpeaks in that region. When creating an MPR given an initial location, the location will be the initial left and right endpoints. In determining whether a new location belongs to a region, the location will be compared with the left and right endpoints and will result in three cases: if the new location is between the two endpoints, then it belongs to that region; if the new location is outside of the region but within the maximum distance to the left or right endpoint, then it expands that region by replacing the left or right endpoint with itself; otherwise, the new location does not belong to that region. On this basis, the steps of the method for locating the two major peaks of SOSD or SOST curve are described as follows:

Step 1. Sort the locations on trace by the SOSD or SOST value in descending order.

Step 2. Create the first MPR with the location of the highest peak as the peak location, as well as the left and right endpoints of that region.

Step 3. Get the next location in the sorted queue and try to add it into every existing MPR. If the location satisfies the maximum distance criteria for a region, then it belongs to that region. If it does not belong to any existing region, then a new MPR with that location shall be created.

Step 4. Repeat step 3 until all locations are checked.

Thereafter, each created MPR will represent an inverse V-shaped peak region. This method can find all the peak regions on the curve, so we can easily choose the two highest major peaks. The method uses the width parameter to smooth the SOST curve. If the width parameter is set too small, then some subpeaks will be created as individual MPRs and the correct second highest major peak may be missed. If the width parameter is set too large, then the two inverse V-shaped regions may be merged into one MPR and the second highest major peak will also be missed. Empirically, our experimental results suggest that 10 times the instruction width is a good choice. Figure 7 shows the result of the proposed method: the positions of SBOX’s input and output are located correctly.

Figure 7.

Red lines are located at the two major peaks by the use of the MPR method with the width parameter set to 10 times the instruction width.
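The four MPR steps can be sketched in Python as follows. The dictionary representation of a region, the function name, and the choice to assign a location to the first matching region are assumptions made for illustration. Because regions are created in descending order of curve value, the first two regions in the returned list correspond to the two highest major peaks.

```python
def find_major_peak_regions(curve, width):
    """Group the points of an SOSD/SOST curve into major peak regions.

    curve: list of signal-strength values indexed by sample position
    width: maximum distance (in samples) between a new location and the
           current endpoints of a region (the width parameter)
    Returns regions in creation order, i.e. by descending peak height.
    """
    # Step 1: visit locations from the highest to the lowest curve value.
    order = sorted(range(len(curve)), key=lambda t: curve[t], reverse=True)
    regions = []
    for t in order:  # steps 3 and 4 (step 2 is the first loop iteration)
        for region in regions:
            if region["left"] - width <= t <= region["right"] + width:
                # Inside the region or close enough to extend an endpoint.
                region["left"] = min(region["left"], t)
                region["right"] = max(region["right"], t)
                break
        else:
            # The location fits no existing region: it starts a new MPR.
            regions.append({"left": t, "right": t, "peak": t})
    return regions
```

In this sketch the peak location of a region is fixed when the region is created, while its endpoints keep growing as lower samples are absorbed. Only the peak locations of the first two regions are therefore meaningful as the estimated leakage positions.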

Building templates on the basis of X-groups

POIs of the SBOX’s input or output at the first round can be used to establish templates and then extract the round key of the first round. The X value of the second round can then be calculated with the extracted round key of the first round. Then, the POIs of the SBOX’s input or output at the second round can be found by the X-groups of the second round, and the round key of the second round can be extracted too. This procedure (i.e. X-grouping, finding the POIs of the SBOX’s input or output, establishing templates, and extracting the round key) is repeated until all round keys necessary to recover the secret key are extracted, as shown in Figure 8.

Figure 8.

The procedure of the novel TA.

After the POIs are located, the next task is to build templates for the power consumption model of SBOX’s input or output. Unfortunately, because the round key, and hence the hamming weights of the SBOX’s input or output of the X-groups, are unknown, templates cannot be established directly on X-groups. In this article, we propose a novel method for determining the hamming weight of the SBOX’s input or output of each X-group.

According to the hamming weight model, the mean power consumption of each X-group should be proportional or inversely proportional to its hamming weight. Thus, under the proportional assumption, an X-group with a larger mean power consumption is assigned a larger hamming weight. Suppose the plaintexts are uniformly distributed; then the X values will be uniformly distributed as well, and so will the $Y$ values (Figure 9). In other words, the hamming weight of $Y$ or $SBOX (Y)$ will be binomially distributed. This condition means that the number of X-groups of a specific hamming weight equals the binomial coefficient: if the input of SBOX is $n$ bits, the number of X-groups with hamming weight $h$ will be $n! / ((n - h)! h!)$ . On the basis of these two facts, the hamming weights of X-groups sorted by their mean power consumption can be assigned sequentially according to the binomial coefficients. For an 8-bit SBOX, 256 X-groups are present and the numbers of X-groups with hamming weights 0 to 8 will be $(1, 8, 28, 56, 70, 56, 28, 8, 1)$ . The remaining difficulty is that the correct order of assignment is still unknown, because the power consumption can be either proportional or inversely proportional to the hamming weight. The hamming weight of each X-group can be set by first assuming that the power consumption is proportional to the hamming weight. If this assumption is incorrect, then the problem will manifest in the filtering process before attacking the second round key (section “Novel TA on SM4”) and can be fixed by reversing the assumption and starting over the hamming weight assignment for X-groups at the first round.

Figure 9.

Member distributions: (a) for device A and (b) for device B.
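The sequential assignment of hamming weights by binomial coefficients can be sketched as follows. The function name, the dictionary input (X value mapped to the mean power of its X-group at a chosen POI), and the `proportional` flag are assumptions introduced for illustration.

```python
from math import comb


def assign_hamming_weights(group_means, n_bits=8, proportional=True):
    """Assign a hamming weight to every X-group from its mean power.

    group_means:  dict mapping X value -> mean power of that X-group
    proportional: True if power is assumed to grow with hamming weight;
                  set to False to reverse the assumption
    Returns a dict mapping X value -> assigned hamming weight.
    """
    if len(group_means) != 2 ** n_bits:
        raise ValueError("expected one mean per possible X value")
    # Sort so that the group assumed to have hamming weight 0 comes first.
    ordered = sorted(group_means, key=group_means.get,
                     reverse=not proportional)
    assignment, position = {}, 0
    for weight in range(n_bits + 1):
        count = comb(n_bits, weight)  # 1, 8, 28, 56, 70, ... for 8 bits
        for x in ordered[position:position + count]:
            assignment[x] = weight
        position += count
    return assignment
```

If the later filtering step shows no clear peaks, calling the function again with `proportional=False` corresponds to reversing the assumption and restarting the assignment.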

After the hamming weight of each X-group is assigned, X-groups with the same hamming weight will be merged into one hamming weight group. The power vectors of traces in each hamming weight group will then be used to establish the corresponding multivariate Gaussian distribution template by calculating their mean and covariance matrix, as in Formula (4).

We intuitively validate the method for establishing hamming weight templates on the basis of X-groups. For this purpose, two sets of templates, obtained by the proposed approach and by conventional TA profiling, are drawn in Figure 9 in different colors. The $X$ and $Y$ axes represent the power consumption at two POIs. The red contours represent templates obtained on the basis of X-groups, and the green ones represent the templates obtained by conventional TA. For the sake of clarity, each template is represented by only one contour at 95% of its highest probability density. As shown in Figure 9, the two sets of Gaussian distributions overlap very well in (a) and relatively well in (b) for most hamming weight templates. This finding means that the method for establishing hamming weight templates on the basis of X-groups is feasible. The non-overlapping templates are those with extreme hamming weights, such as 0 and 8, and thus should exert only a minor influence on attacking effectiveness because the chance that an attacking trace has such an extreme hamming weight is relatively low. Not all non-overlapping templates are of poor quality. For example, in Figure 9(b), the template of hamming weight 0 obtained by X-groups (the left-bottom red contour) is better than that obtained by conventional TA profiling (the highly vertical green contour embedded in the middle of templates with hamming weights of 1 and 2 located at the left-bottom corner).

Arbitrarily assigning the hamming weight for each X-group may be incorrect as the number of traces in each X-group is relatively small. For example, suppose 3840 traces are available and the input of SBOX is 8 bits, then 256 X-groups will be generated, and each of them will contain only 15 traces on average. Under the noise environment, their average power may drift into the power range of nearby X-groups. This condition can cause certain inaccuracy of the estimated templates. Two methods are proposed to optimize the estimated templates.

Improvement 1: projecting the center of each template to the major orientation

According to the linear hamming weight power consumption model, the center of each member distribution should be located on a line along a specific orientation. Several experiments have verified this fact. As some X-groups may be assigned with an incorrect hamming weight, the center of member distribution may drift from that orientation. Projecting each template center onto the orientation line can be helpful.

The parameters of the major orientation line can be calculated using linear regression on the centers of the estimated templates. Given that the groups with hamming weight 0 or 8 each contain the traces of only one X-group, their templates are the least reliable and should be discarded in the linear regression. Assuming $p$ POIs are available, the center vector of a template is $μ \in R^{p}$ . If $p \geq 2$ , then the orientation line is a hyper plane in p-dimensional space and can be expressed as

$x_{p} = a_{1} x_{1} + \dots + a_{p - 1} x_{p - 1} + b$ (9)

where $x_{i}$ is a column vector composed of the $i^{th}$ dimension element of the mean vector $μ$ of each template.

Assuming $X = [x_{1}, \dots, x_{p - 1}]$ and $y = x_{p}$ , then the linear regression based on the least squares criterion (LSE) will be

$\hat{a} = {(X^{T} X)}^{- 1} X^{T} y$ (10)

After $\hat{a}$ is solved, $b$ can be calculated by substituting $\hat{a}$ into Formula (9). The problem of projecting the template center onto the orientation line is an affine projection problem. Assuming the hyper plane is

$a_{1} X_{1} + \dots + a_{p - 1} X_{p - 1} + a_{p} X_{p} + b = 0$ (11)

The affine projection matrix will be an $(n + 1) \times (n + 1)$ matrix as follows

$P = [\begin{matrix} \frac{{a_{1}}^{2}}{\sum_{k = 1}^{n} {a_{k}}^{2}} & \dots & \frac{a_{1} a_{n}}{\sum_{k = 1}^{n} {a_{k}}^{2}} & \frac{a_{1} b}{\sum_{k = 1}^{n} {a_{k}}^{2}} \\ \dots & \frac{a_{i} a_{j}}{\sum_{k = 1}^{n} {a_{k}}^{2}} & \dots & \dots \\ \frac{a_{n} a_{1}}{\sum_{k = 1}^{n} {a_{k}}^{2}} & \dots & \frac{{a_{n}}^{2}}{\sum_{k = 1}^{n} {a_{k}}^{2}} & \frac{a_{n} b}{\sum_{k = 1}^{n} {a_{k}}^{2}} \\ 0 & \dots & 0 & 1 \end{matrix}]$ (12)

where the top-left $n \times n$ submatrix is simply the outer product of the unit normal vector of the hyper plane with itself, and the $(n + 1)$th column contains the shift factor for each dimension. By using this projection matrix, the center of a template can be projected onto the hyper plane by

${\tilde{μ}}_{i}^{'} = P \times {\tilde{μ}}_{i}, i \in [1, n]$ (13)

where ${\tilde{μ}}_{i} = [{μ_{i}}^{T}, 1]^{T}, i \in [1, n]$ .
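For intuition, the two-POI case can be sketched directly: fit a straight line through the template centers by ordinary least squares, leaving out the two extreme hamming weights, and then move every center to its orthogonal projection on that line. The sketch uses the textbook orthogonal projection $μ' = μ - \frac{a^{T} μ + b}{a^{T} a} a$ written out for a line, which is an assumption about the intended geometry rather than a transcription of Formula (12); the function names are illustrative as well.

```python
def fit_line(points):
    """Ordinary least-squares fit of x2 = a * x1 + b to 2-D points."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def project_onto_line(point, a, b):
    """Orthogonal projection of a point onto the line a*x1 - x2 + b = 0."""
    x1, x2 = point
    d = (a * x1 - x2 + b) / (a * a + 1.0)  # signed offset along the normal
    return (x1 - a * d, x2 + d)


def project_centers(centers):
    """centers: dict hamming weight -> (x1, x2) template center.

    The smallest and largest hamming weights are excluded from the fit
    because their templates are built from a single X-group.
    """
    extremes = {min(centers), max(centers)}
    a, b = fit_line([c for h, c in centers.items() if h not in extremes])
    projected = {h: project_onto_line(c, a, b) for h, c in centers.items()}
    return a, b, projected
```

All centers, including those of the extreme hamming weights, are projected, so a badly estimated extreme template is pulled back onto the orientation defined by the more reliable ones.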

Improvement 2: using the same covariance matrix for each template

As the VAR of power is mainly determined by the type of instruction, the VAR of every X-group at a specific position on traces should be the same. Similarly, the correlation of the power at two positions on traces should not differ from one X-group to another. Ideally, each X-group should possess the same covariance matrix. As the template of the medium hamming weight is the best qualified because it is built on the basis of the largest number of traces, the covariance matrices of the other templates can be replaced by its covariance matrix

$Σ_{i} = Σ_{medium}, i \in [1, n]$ (14)

Extracting the key

In this section, the attacking method will be introduced first and then the entire attacking framework will be described, including a filtering process that can significantly reduce the number of combinations of candidate round keys used to attack the next round key.

The attacking method is the same as that of conventional TA. In particular, given a set of attacking traces, the joint probability of each possible subkey can be calculated from the probability distribution of each template. Then, the top $n$ subkeys with the highest probability are chosen as the candidates. The difference is that only one device is used to profile and attack in the current scenario, so the profiling and attacking traces come from the same set of traces. Using all the traces in attacking would be inefficient. Randomly choosing some traces to attack can work, but a method that chooses traces so as to improve the chance of success is preferable.

Trace vectors in a hamming weight group disperse from the mean power vector of that hamming weight because of the Gaussian noise. If such traces located at the peripheral area of a distribution are used in attacking, then they can exert a negative effect on the attacking quality. On the contrary, traces located near the center of the distribution are a better choice. As the templates with medium hamming weights are qualified statistically because they are built on the basis of a large number of traces, they should contribute a large number of attacking traces. The number of attacking traces near the center of a template can be calculated as

$m_{i} = p_{i} \times m, i \in [1, n]$ (15)

where $m$ is the total number of attacking traces; $m_{i}$ is the number of attacking traces to be chosen near the center of the template with hamming weight $i$; and $p_{i}$ is the prior of hamming weight $i$, which can be calculated as follows

$p_{i} = \frac{1}{2^{n}} \times \frac{n!}{i! (n - i)!}$ (16)

where the input of SBOX is $n$ bits. Figure 10 shows the chosen traces near the template centers. X and Y axes represent the power consumption at two POIs.
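As a worked example of equations (15) and (16), the sketch below computes the priors and the per-template trace counts for an 8-bit SBOX input and 20 attacking traces. Because $p_{i} \times m$ is generally not an integer, some rounding rule is needed; the largest-remainder rounding used here is an assumption made for illustration, since the rounding rule is not specified in the text. Hamming weights are indexed from 0 to $n$ in the code.

```python
from math import comb


def hamming_weight_priors(n_bits):
    """Equation (16): prior probability of each hamming weight 0..n_bits."""
    return [comb(n_bits, i) / 2 ** n_bits for i in range(n_bits + 1)]


def traces_per_template(m, n_bits):
    """Equation (15): split m attacking traces across the templates.

    Assumption: fractional shares are resolved with largest-remainder
    rounding so that the counts add up to exactly m.
    """
    exact = [p * m for p in hamming_weight_priors(n_bits)]
    counts = [int(x) for x in exact]  # floor of each share
    by_remainder = sorted(
        range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True
    )
    for i in by_remainder[: m - sum(counts)]:
        counts[i] += 1
    return counts


counts = traces_per_template(20, 8)
print(counts)  # [0, 1, 2, 4, 6, 4, 2, 1, 0]
```

With these inputs, the templates for hamming weights 0 and 8 receive no traces, and the template for hamming weight 4 receives the most.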

Figure 10.

Choosing 20 attacking traces near template centers. Templates with hamming weight 0 or 8 do not contribute to the attacking traces.
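The selection of traces near the template centers can be sketched as below. The text does not state which distance measure is used, so Euclidean distance to the group mean is assumed here; `groups` and `counts` are hypothetical inputs, where `groups` maps each hamming weight to its trace vectors at the POIs and `counts` gives the number of traces to take from each template.

```python
import math


def group_center(traces):
    """Mean power vector of one hamming weight group."""
    dim = len(traces[0])
    return [sum(t[d] for t in traces) / len(traces) for d in range(dim)]


def nearest_to_center(traces, count):
    """Return the `count` trace vectors closest to the group mean.

    Assumption: closeness is measured with Euclidean distance.
    """
    center = group_center(traces)
    return sorted(traces, key=lambda t: math.dist(t, center))[:count]


def choose_attack_traces(groups, counts):
    """Pick counts[hw] traces near the center of each hamming weight group."""
    chosen = []
    for hw, traces in sorted(groups.items()):
        if traces and counts.get(hw, 0) > 0:
            chosen.extend(nearest_to_center(traces, counts[hw]))
    return chosen
```

Traces at the periphery of a group, which are dominated by noise, are left out of the attacking set by construction.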

Unlike conventional TA, in which templates for all rounds are built in the profiling phase, this method can initially build templates only for the first round. Templates for the next round are built on the basis of the output of the previous round.

Building templates for the next round with each candidate round key of the previous round during extraction seems inefficient at first. However, this step is an advantage in the sense that the candidate set of round keys can be compressed by adding a filtering process in step 2. The size of the candidate set is a key factor in the efficiency of an attacking scheme. Suppose that four subkeys are retained as candidates for each SBOX, that four SBOXes are present in each round, and that at least four round keys are needed to recover the main key, as in SM4. Then 256 ($4^{4}$) candidate round keys will be generated in each round, thereby producing 256 sub-branches for each branch of the previous round, and a total of 4,294,967,296 ($256^{4}$) branches will have to be tested. If the candidate set can be compressed, then the attacking efficiency will be improved exponentially. In step 2, the SOSD or SOST is calculated on the basis of the candidate round key of the previous round. An incorrect round key from the previous round will not produce any remarkable peak on the SOSD or SOST curve because it produces incorrect X values for the next round. Figure 11 shows the SOST calculated on the basis of X-groups of the second round with the correct first round key and with an incorrect first round key, namely the second most probable combination of the candidate subkeys of the first round.
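The branch counts quoted above follow from simple exponentiation, which the short sketch below reproduces. The function name and parameters are introduced here for illustration only.

```python
def unfiltered_branches(candidates_per_sbox, sboxes_per_round, rounds_needed):
    """Number of round-key branches to test when nothing is filtered out."""
    # Every combination of per-SBOX candidates is one candidate round key.
    round_keys_per_round = candidates_per_sbox ** sboxes_per_round
    # Each candidate round key spawns a full set of candidates in the next round.
    return round_keys_per_round, round_keys_per_round ** rounds_needed


per_round, total = unfiltered_branches(4, 4, 4)
print(per_round)  # 256
print(total)      # 4294967296
```

Any reduction in the number of round keys kept per round is raised to the power of the number of rounds, which is why filtering pays off so strongly.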

Figure 11.

SOST for the second round: (a) on the basis of the correct first round key and (b) on the basis of the incorrect first round key.

The filtering process can be implemented by comparing the height of the highest SOST peak with a minimum height, or threshold, which is set to five times the average SOST in our experiments. If a candidate round key produces an SOST curve whose highest peak exceeds the threshold, then the round key is retained. Otherwise, it is filtered out.

Another function of the filtering process is to determine the power proportional model of a device. As mentioned in section “Novel TA on SM4,” when attacking the first round, hamming weights are assigned to X-groups on the basis of an arbitrary assumption that the power consumption is proportional to the hamming weight. If this assumption is incorrect, then the candidate keys of the first round will be completely incorrect, which in turn will not produce any significant SOST peak for the second round. Whether the assumed proportional power model is correct can therefore be determined using the same filtering criterion. If the model is incorrect, then the proportional model is reversed and step 3 of the first round is repeated.
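The threshold test described above can be sketched in a few lines. The function name `passes_sost_filter` is hypothetical, and `sost` is assumed to be the list of SOST values computed along the trace for one candidate round key; the factor of five is the value reported for the experiments.

```python
def passes_sost_filter(sost, factor=5.0):
    """Keep a candidate round key only if its SOST curve has a clear peak.

    The threshold is `factor` times the average SOST value.
    """
    threshold = factor * (sum(sost) / len(sost))
    return max(sost) > threshold


# A curve with one pronounced peak is kept; a flat curve is filtered out.
print(passes_sost_filter([1.0] * 99 + [50.0]))   # True
print(passes_sost_filter([1.0, 2.0, 1.0, 2.0]))  # False
```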

Verification and comparison

We experiment on three different cryptographic devices with the SM4 crypto algorithm.

Device A is a 32-bit smartcard designed as a positive logic circuit with a software implementation of SM4.

Device B is an embedded system running SM4 on an 8-bit microprocessor AT90S8515 with the CPU frequency of 28 MHz.

Device C is also a 32-bit smartcard designed as a negative logic circuit.

Three trace sets are captured from devices A, B, and C with different sampling rates. Each contains 10,000 traces, which are divided into profiling and attacking sets in the ratio of 80% to 20%. Dividing the trace sets into profiling and attacking sets is not necessary for the proposed algorithm but is done in the experiments to compare the proposed attacking scheme with conventional TA. A total of 20 traces are used in the extraction phase for all experiments.

N-order success ratio and guessing entropy (GE) proposed in the study by Standaert et al.³⁵ are used to measure the effectiveness of attacking. N-order success ratio refers to the frequency of the correct key located in the top N candidates in multiple attacks. GE is the mean position of the correct key in the candidate sets and can reflect the effectiveness of an attacking scheme precisely.
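Both metrics can be computed from the rank of the correct key in each repeated attack, as the sketch below shows. The function names are introduced here for illustration, and ranks are assumed to be 1-based, so a rank of 1 means that the correct key was the top candidate.

```python
def success_ratio(ranks, order):
    """N-order success ratio: share of attacks with the correct key in the top N."""
    return sum(1 for r in ranks if r <= order) / len(ranks)


def guessing_entropy(ranks):
    """Guessing entropy: mean position of the correct key in the candidate list."""
    return sum(ranks) / len(ranks)


ranks = [1, 1, 2, 5]  # rank of the correct key in four repeated attacks
print(success_ratio(ranks, 1))  # 0.5
print(success_ratio(ranks, 2))  # 0.75
print(guessing_entropy(ranks))  # 2.25
```

A GE of 1 therefore means that the correct key was ranked first in every attack.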

Three types of experiments are conducted as follows.

Experiment 1 is designed to verify the correctness of locating the positions of SBOX’s input and output on the basis of X-groups.

Experiment 2 is conducted to compare the effectiveness of the proposed attacking method (i.e. proposed TA) with conventional TA (i.e. TA), in which the attacking traces are taken from a standalone attacking set different from the profiling set.

Experiment 3 tests the ultimate effectiveness of proposed TA, in which only one trace set is used in the profiling and extraction phases, and the attacking traces are taken near the centers of the templates (section “Novel TA on SM4”).

Table 1 shows the result of experiment 1, in which the positions of SBOX’s input and output obtained by the two methods (X-group and Hamming weight group) are nearly identical for all devices: they differ only by one sample for the input position on device A.

Table 1.

Input and output positions of sbox1 of the first round.

Grouping method	SBOX input/output	Device A	Device B	Device C
X-group	Input position	12093	509	660
Hamming weight group	Input position	12094	509	660
X-group	Output position	12282	554	683
Hamming weight group	Output position	12282	554	683

Experiment 2 is designed to compare the effectiveness of proposed TA with TA and the effectiveness of four variants of profiling methods in the proposed TA.

Method 1. The original profiling method merges X-groups to hamming weight groups to build the templates directly (named OrgProf).

Method 2. The second profiling method is an improvement that projects the centers of the hamming weight templates onto the major orientation hyper plane (named ProjProf).

Method 3. The third method is another improvement that uses the covariance matrix of the template with hamming weight 4 (i.e. the medium hamming weight for an 8-bit SBOX) for the rest of the templates (named SgmProf).

Method 4. The last profiling method combines the previous two improvement techniques (named ProjSgm). The attacking target is the SBOX’s output of the first round.

For the sake of comparison with TA, the attacking traces used in this experiment are all selected from an attacking set different from the profiling set.

In Table 2, the highest success ratio and the minimum GE are highlighted in bold and italic font. For devices A and B, the proposed TA and its variants are noticeably more effective than TA. For device C, their effectiveness is comparable to, if not better than, that of TA.

Table 2.

Comparing the effectiveness of variant profiling methods for sbox output.

Device	Method	First success (%)	Second success (%)	Third success (%)	Fourth success (%)	GE
A	TA	60	70	80	85	5.45
	OrgProf	75	85	85	85	2.10
	ProjProf	75	85	85	85	2.10
	SgmProf	80	95	100	100	1.25
	ProjSgm	80	95	95	100	1.30
B	TA	75	75	75	85	8.45
	OrgProf	80	85	90	90	9.35
	ProjProf	80	85	90	90	9.40
	SgmProf	80	85	85	90	6.60
	ProjSgm	80	85	85	85	6.55
C	TA	60	80	85	85	3.05
	OrgProf	65	80	80	85	2.90
	ProjProf	75	80	85	85	2.70
	SgmProf	70	80	80	85	3.35
	ProjSgm	65	80	80	85	3.45

GE: guessing entropy; TA: template attack.

Therefore, the quality of templates based on X-groups is better than that of templates based on hamming weight groups. The reason for this result is still unclear, because the templates based on X-groups contain traces with incorrectly assigned hamming weights. One explanation is that the templates of TA overfit the profiling set. In Figure 10, the center of the TA template of hamming weight 0 is misplaced into the area of other templates. In other words, the probabilistic distribution of hamming weight 0 is inaccurate because too few profiling traces are available to calculate the parameters of its distribution.

In contrast, the variants of the proposed TA show no significant superiority over one another. For device A, the SgmProf method achieves the highest success ratio and the lowest GE. Meanwhile, for device C, the ProjProf method achieves the highest attacking quality. The situation for device B is mixed: the OrgProf and ProjProf methods achieve the highest success ratio, while the SgmProf and ProjSgm methods obtain the lowest GE. In summary, the variants of the proposed TA have no consistent effect and are therefore optional.

Experiment 3 checks the effectiveness of the proposed method by selecting the attacking traces from the same trace set as used for profiling and near the template centers, as introduced in section “Novel TA on SM4.” The results are shown in Table 3. The first-order success ratio reaches 100% for all devices, with a GE of 1.

Table 3.

Effectiveness of using traces near template centers in attacking SBOX’s output.

Device	Method	First success (%)	Second success (%)	Third success (%)	Fourth success (%)	GE
A	TA	25	40	60	65	8.25
	proposed TA	100	100	100	100	1
B	TA	75	75	75	85	8.45
	proposed TA	100	100	100	100	1
C	TA	60	80	85	85	3.05
	proposed TA	100	100	100	100	1

GE: guessing entropy; TA: template attack.

Conclusion

This article provides an insight into how to extract the crypto key of the cipher algorithm in a smartcard, which is the most important cyber-domain information, from the side-channel leakage emitted by the cyber-physical system. As a case study, an embedded system and smartcards running the SM4 crypto algorithm are selected, and their analog power emission is analyzed. We propose a novel TA on the SM4 algorithm. POIs are the basis of building templates in TA. Because the adversary sometimes cannot locate the positions of the SBOX’s input and output straightforwardly, a new MPR method for finding POIs correctly is proposed. A method for establishing templates on the basis of those POIs without requiring knowledge of the key is proposed. Several techniques are also proposed to improve the quality of the templates. A method for choosing attacking traces is proposed to significantly improve the attacking efficiency. The experimental results show that the proposed novel TA achieves much better attacking quality than conventional TA does. The method for locating the leakage positions of the SBOX’s input and output without knowing the key may be applied to many other block cipher algorithms. This method can also be studied further in scenarios that consider countermeasures. This work serves as a proof of concept of the necessity of exploring the different analog emissions of CPSs that are capable of leaking information and weakening the confidentiality of the system. The proposed novel TA can effectively overcome the limitations of the traditional TA, but there are other new attack ideas, such as the SBOX input–output joint attack and the voting attack, whose further development can be considered in follow-up studies.

Footnotes

The authors would like to thank reviewers for their detailed reviews and constructive comments, which helped improve the quality of the article.

Handling Editor: Marcin Wozniak

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by 13th Five-Year Plan National Cryptographic Development Fund under grant number MMJJ20180244 and National Key R&D Program Funding under grant numbers 2018YFB0904900 and 2018YFB0904901.

ORCID iD

Min Wang

References

1. Faruque, Regazzoni, Pajic. Design methodologies for securing cyber-physical systems. In: Proceedings of the 2015 international conference on hardware/software codesign and system synthesis (CODES+ISSS), Amsterdam, 4–9 October 2015. New York: IEEE.

2. Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In: CRYPTO ’96 Proceedings of the 16th annual international cryptology conference on advances in cryptology, London, 18–22 August 1996. New York: ACM.

3. Kocher, Jaffe, Jun. Differential power analysis. In: Advances in cryptology—CRYPTO’ 99 (ed. Wiener), Santa Barbara, CA, 15–19 August 1999, pp.388–397. Berlin: Springer.

4. Gandolfi, Mourtel, Olivier. Electromagnetic analysis: concrete results. In: Cryptographic hardware and embedded systems—CHES 2001 (eds Koç ÇK, Naccache, Paar), Paris, 14–16 May 2001, pp.251–261. Berlin: Springer.

5. Quisquater, Samyde. Electromagnetic analysis (EMA): measures and counter-measures for smart cards. In: Proceedings of the international conference on research in smart cards: smart card programming & security, London, 19–21 September 2001. New York: ACM.

6. Agrawal, Archambeault, Rao, et al. The EM side-channel(s). In: Proceedings of the revised papers from the international workshop on cryptographic hardware & embedded systems, Redwood Shores, CA, 13–15 August 200.

7. Kocher. Introduction to differential power analysis and related attacks, 1998, http://www.cryptography.com/dpa/technical/index.html; https://ci.nii.ac.jp/naid/10017583487/en/

8. Prouff, Rivain, Bevan. Statistical analysis of second order differential power analysis. IEEE Trans Comput 2009; 58(6): 799–811.

9. Brier, Clavier, Olivier. Correlation power analysis with a leakage model. CHES 2004; 37(22): 16–29.

10. Moradi, Mischke, Eisenbarth. Correlation-enhanced power analysis collision attack, 2010, https://eprint.iacr.org/2010/297.pdf

11. Socha, Miskovsky, Kubatova, et al. Correlation power analysis distinguisher based on the correlation trace derivative, 2018, https://ieeexplore.ieee.org/document/8491869

12. Chari, Jutla, Rao, et al. Towards sound approaches to counteract power-analysis attacks. In: CRYPTO ’99 Proceedings of the 19th annual international cryptology conference on advances in cryptology, Berlin, 15–19 August 1999. New York: ACM.

13. Ors, Gurkaynak, Oswald, et al. Power-analysis attack on an ASIC AES implementation. In: Proceedings of the international conference on information technology: coding and computing, Las Vegas, NV, 5–7 April 2004. New York: IEEE.

14. Yen, Lien, Moon, et al. Power analysis by exploiting chosen message and internal collisions – vulnerability of checking mechanism for RSA-decryption. In: Mycrypt’05 Proceedings of the 1st international conference on progress in cryptology, Kuala Lumpur, Malaysia, 28–30 September 2005. New York: ACM.

15. Mcevoy, Tunstall, Murphy, et al. Differential power analysis of HMAC based on SHA-2, and countermeasures. In: WISA’07 Proceedings of the 8th international conference on information security applications, Jeju Island, Korea, 27–29 August 2007. New York: ACM.

16. Standaert F-X, Örs, Quisquater, et al. Power analysis attacks against FPGA implementations of the DES. In: Proceedings of the 14th international conference, field-programmable logic and applications, Leuven, 30 August–1 September 2004. New York: Springer.

17. Chari, Rao, Rohatgi. Template attacks. In: Proceedings of the CHES ’02 revised papers from the 4th international workshop on cryptographic hardware & embedded systems, London, 13–15 August 2002. New York: ACM.

18. Zhang, Zhou. How many interesting points should be used in a template attack? J Syst Softw 2016; 120: 105–113.

19. Fan, Zhou, Zhang, et al. How to choose interesting points for template attacks more effectively? In: INTRUST 2014 Revised selected papers of the 6th international conference on trusted systems, Beijing, China, 16–17 December 2014. New York: ACM.

20. Gierlichs, Lemke-Rust, Paar. Templates vs. stochastic methods, 2006, https://www.iacr.org/archive/ches2006/02/02.pdf

21. Oswald. Power analysis attacks: revealing the secrets of smart cards. New York: Springer, 2007.

22. Mather, Oswald, Bandenburg, et al. Does my device leak information? An a priori statistical power analysis of leakage detection tests. In: Proceedings of the Part I of the Proceedings of the 19th international conference on advances in cryptology—ASIACRYPT 2013, vol. 8269, Berlin, 1–5 December 2013. New York: ACM.

23. Gierlichs, Batina, Tuyls, et al. Mutual information analysis. In: Proceedings of the international workshop on cryptographic hardware & embedded systems, Washington, DC, 10–13 August 2008. New York: Springer.

24. Archambeau, Peeters, Standaert, et al. Template attacks in principal subspaces, 2006, https://www.iacr.org/archive/ches2006/01/01.pdf

25. Lerman, Medeiros, Veshchikov, et al. Semi-supervised template attack. In: Proceedings of the international conference on constructive side-channel analysis & secure design, Paris, 6–8 March 2013. New York: ACM.

26. Schindler, Lemke, Paar. A stochastic model for differential side channel cryptanalysis. In: Proceedings of the International conference on cryptographic hardware & embedded systems, Edinburgh, 29 August–1 September 2005. New York: Springer.

27. Wang, et al. Template attack of crypto chip based on clustering. J Commun 2018; 39: 83–93.

28. Qiu, Bai. Power analysis of a FPGA implementation of SM4. In: Proceedings of the 5th international conference on computing, communications and networking technologies (ICCCNT), Hefei, China, 11–13 July 2014, pp.1–6. New York: IEEE.

29. Tang, Zhang, et al. A novel method of correlation power analysis on SM4 hardware implementation. In: Proceedings of the international conference on computational intelligence & security, Wuxi, China, 16–19 December 2017. New York: IEEE.

30. Chen, Zheng. Design and implementation of power analysis-immune SMS4 algorithm, 2016, https://ieeexplore.ieee.org/document/7589403

31. Shan, Wang, et al. A chosen-plaintext method of CPA on SM4 block cipher. In: Proceedings of the 2014 10th international conference on computational intelligence and security, Kunming, China, 15–16 November 2014, pp.363–366. New York: IEEE.

32. Chen, Hexin, Wang. Improved chosen-plaintext DPA on block cipher SM4. J Tsinghua Univ, 2017, http://jst.tsinghuajournals.com/EN/abstract/abstract152093.shtml

33. Zhang. Study and application of power analysis on SM4 and SM2 algorithm. Tsinghua University, 2015.

34. Ding. Research on machine learning attack method of SM4 mask scheme. J Eng Coll Arm Pol Force 2018; 35: 63–67.

35. Standaert, Malkin, Yung. A unified framework for the analysis of side-channel key recovery attacks. In: Advances in cryptology—EUROCRYPT 2009 (ed. Joux), Cologne, 26–30 April 2009, pp.443–461. Berlin: Springer.