Abstract
Introduction
As the world economy grows, global energy consumption continues to increase. In the face of challenges to economic development and mounting environmental problems, sustainable economic development is gaining increasingly broad recognition. Conserving resources, improving energy efficiency, and developing new energy sources have become the central themes of global energy utilization (Ozgur and Kose, 2006).
Wind energy is the kinetic energy of moving air. Uneven heating of the ground by solar radiation produces an uneven distribution of atmospheric pressure, which drives the air horizontally. Wind power is a clean, pollution-free, renewable resource that does not have to be mined or transported, and the earth has huge wind energy resources. It is estimated that the available wind energy on the earth is 10 times the available water energy, and that only 1% of this wind energy could meet the global energy demand. Compared with other forms of power generation, wind power has many advantages, such as a short construction cycle and low operation and maintenance costs. Owing to the shortage of oil, natural gas, and other non-renewable energy sources, and to increasingly serious environmental problems such as air pollution, acid rain, and the greenhouse effect, wind power generation has attracted increasing attention.
China’s economy has shifted from high-speed growth to high-quality development, with more attention paid to the relationship between economic development and environmental protection. China has abundant wind resources of great development value, concentrated in two major wind belts. The first is the Three North area (Northeastern China, North China, and Northwestern China), including Heilongjiang, Jilin, Liaoning, Hebei, Inner Mongolia, Gansu, Qinghai, Tibet, and Xinjiang. This region accounts for 80% of the country’s available wind power reserves and is known as China’s largest contiguous wind resource region. Its flat terrain, easy access, and absence of destructive wind speeds make it conducive to the large-scale development of wind farms. The second is the east coast. The southeast coast is affected by the Taiwan Strait: when cold air moving southward reaches the strait, the wind speed increases owing to the channeling effect of the strait. Cold air in winter and spring and typhoons in summer and autumn affect the coastal areas and islands, making this the best area for wind energy in China. In addition to these two wind belts, some inland areas are also rich in wind energy owing to the influence of lakes, mountains, and special topography (Zhou et al., 2012; Zhou and Hu, 2018). China was a late starter in wind power generation, but the wind power industry is gaining momentum as the country continues to introduce encouraging and supporting policies. China now ranks first in the world in both newly installed and cumulative installed wind power capacity (Li and Kong, 2019).
As the frequency of data sampling increases, the functional nature of the data becomes more apparent. Data with functional characteristics are called “functional data,” and functional data analysis (FDA) is the general name for the methods used to analyze them (Ramsay, 1982). High frequency data may suffer from high dimensionality, noise, missing values, outliers, and other problems that make analysis difficult. The FDA method treats the process that generates the data as a function, which can overcome these problems. More effective results can therefore be obtained by analyzing the data from a functional perspective. FDA methods have developed considerably over several decades. Two representative works are “Functional Data Analysis” and “Applied Functional Data Analysis” by Ramsay and Silverman. The former focuses on theoretical analysis, while the latter focuses on applied research. At present, many traditional statistical methods have been improved and extended to the FDA framework.
Cluster analysis is an important method in statistical research. It simplifies the data structure through unsupervised classification, using only the characteristics of the data and no prior knowledge. With so much data being collected, methods that identify groups of similar objects in the data are increasingly needed. The purpose of cluster analysis is to identify homogeneous data groups without using any prior knowledge of group labels. Functional clustering analysis combines FDA with clustering analysis. It divides functional objects into multiple classes, so that objects within a class have similar curve patterns and objects in different classes have different curve patterns. The method is used to explore the potential class structure in a functional data set. In recent years, functional clustering analysis has developed rapidly. Abraham et al. (2003) used a B-spline basis to reconstruct functional data and then performed k-means clustering on the basis coefficients. James and Sugar (2003) constructed a mixture model based on mixed effects, which is suitable for clustering sparse functional data. Compared with traditional clustering methods, functional clustering can analyze high-dimensional data, uses more of the information in the data, and reduces information loss, so the clustering results are more reliable.
This paper uses functional clustering analysis to study wind power generation in different Chinese provinces and to classify the provinces by their wind power generation. FDA can analyze high frequency data directly, which reduces information loss compared with the traditional approach of analyzing aggregated data. In addition, FDA views the data as a function, so the derivative function can be calculated, which allows us to analyze the rate of change of the data and extract more useful information. For these reasons, the FDA method can analyze the wind power generation of Chinese provinces more effectively and provide suggestions for the development and utilization of wind power resources in China.
Methods
Functional principal component analysis (FPCA)
By selecting an appropriate basis function system and smoothing coefficient, the discrete points are transformed into a functional data object. When analyzing a functional data object, FPCA should be considered first to reduce its infinite dimensionality. FPCA transforms the infinite-dimensional functional data into a finite set of functional principal component scores while retaining as much information as possible.
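We view a random curve $X(t)$, $t \in \mathcal{T}$, as a square-integrable stochastic process with mean function $\mu(t) = E[X(t)]$ and covariance function $G(s,t) = \mathrm{Cov}(X(s), X(t))$.
According to the Mercer lemma, the covariance function can be decomposed into
$$G(s,t) = \sum_{k=1}^{\infty} \lambda_k \phi_k(s) \phi_k(t),$$
where $\lambda_1 \ge \lambda_2 \ge \cdots \ge 0$ are the eigenvalues and $\phi_k(t)$ are the corresponding orthonormal eigenfunctions (the principal component functions).
Based on the Karhunen-Loève theorem, the stochastic process can be decomposed into the mean function and the summation of products between the principal component function and the principal component score
$$X_i(t) = \mu(t) + \sum_{k=1}^{\infty} \xi_{ik} \phi_k(t), \qquad \xi_{ik} = \int_{\mathcal{T}} \bigl(X_i(t) - \mu(t)\bigr) \phi_k(t) \, dt.$$
Use the first $K$ principal components to approximate the curve:
$$X_i(t) \approx \mu(t) + \sum_{k=1}^{K} \xi_{ik} \phi_k(t).$$
In this paper, the cumulative contribution rate is adopted to select the best principal component number $K$:
$$\mathrm{CCR}(K) = \frac{\sum_{k=1}^{K} \lambda_k}{\sum_{k=1}^{\infty} \lambda_k},$$
with $K$ taken as the smallest value for which this rate exceeds a chosen threshold.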
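The FPCA steps described above can be sketched numerically as follows. This is a minimal illustration rather than the procedure used in the paper: it assumes the curves have already been evaluated on a common, equally spaced grid, approximates integrals by Riemann sums, and uses NumPy. The function name `fpca` and the parameters `dt` (grid spacing) and `threshold` (target cumulative contribution rate) are hypothetical names introduced for this example.

```python
import numpy as np


def fpca(curves, dt, threshold=0.95):
    """Discretized FPCA sketch (assumption: common, equally spaced grid).

    curves    : array of shape (n_curves, n_points), one curve per row
    dt        : spacing of the evaluation grid
    threshold : cumulative contribution rate used to choose K
    """
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)                 # estimate of the mean function
    centered = curves - mean

    # Sample covariance surface G(s, t) evaluated on the grid.
    cov = centered.T @ centered / curves.shape[0]

    # Eigenpairs of the covariance operator; the integral is approximated
    # by a Riemann sum, hence the factor dt.
    eigvals, eigvecs = np.linalg.eigh(cov * dt)
    order = np.argsort(eigvals)[::-1]          # sort in decreasing order
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigfuns = eigvecs[:, order] / np.sqrt(dt)  # so that sum(phi**2) * dt == 1

    # Smallest K whose cumulative contribution rate reaches the threshold.
    ratio = np.cumsum(eigvals) / eigvals.sum()
    K = int(np.searchsorted(ratio, threshold) + 1)

    # Principal component scores: inner product of centered curve and phi_k.
    scores = centered @ eigfuns[:, :K] * dt
    return mean, eigvals[:K], eigfuns[:, :K], scores, ratio[:K]
```

Each row of `scores` is the finite-dimensional representation of one curve, and `mean + scores @ eigfuns.T` gives the truncated reconstruction of the curves on the grid.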
Functional clustering method
Based on FPCA, the infinite-dimensional functional data are truncated to a finite number of functional principal component scores, which achieves dimension reduction. The functional clustering method then applies a traditional clustering analysis to these scores.
Cluster analysis is an important data mining technology. It can classify data according to their characteristics without knowing the classification in advance. The basic idea is to construct a matrix of variables, assign individuals with similar properties to the same category, and assign individuals with very different properties to different categories. After classification, individuals within a category are highly homogeneous and individuals in different categories are highly heterogeneous. At present, there are many mature clustering methods, including density-based, hierarchy-based, partition-based, grid-based, and model-based methods.
This paper uses the typical k-means approach. The k-means algorithm is a clustering algorithm that is solved through iteration. The specific steps are as follows:
(i) Input the initial data set and specify the number of clusters k;
(ii) Arbitrarily select k data objects as the initial cluster centers;
(iii) Assign each data object to the most similar cluster, that is, the cluster whose mean is nearest;
(iv) Update the mean of each cluster;
(v) Calculate the clustering criterion function SSE;
(vi) Repeat steps (iii)–(v) until the SSE value of the criterion function no longer changes;
(vii) Output the k clusters that satisfy the convergence of the squared-error criterion function.
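There are several ways of defining distance. In this paper, Euclidean distance is selected to measure the distance, and the sum of squared errors (SSE) is used as the objective function of clustering:
$$SSE = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - c_j \rVert^2,$$
where $C_j$ is the $j$th cluster and $c_j$ is its center.
By minimizing the objective function SSE with respect to $c_j$, we get
$$\frac{\partial SSE}{\partial c_j} = -2 \sum_{x \in C_j} (x - c_j) = 0 \quad \Longrightarrow \quad c_j = \frac{1}{|C_j|} \sum_{x \in C_j} x.$$
To sum up, the center that minimizes SSE is the cluster mean.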
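The k-means iteration and the SSE criterion described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `kmeans`, the parameters `n_iter`, `tol`, and `seed`, and the rule of leaving an empty cluster's center unchanged are all choices made for this example. In functional clustering, the rows of `points` would be the functional principal component scores.

```python
import numpy as np


def kmeans(points, k, n_iter=100, tol=1e-12, seed=0):
    """Plain k-means with Euclidean distance and the SSE criterion."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)

    # Arbitrarily pick k distinct data objects as the initial centers.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    prev_sse = np.inf

    for _ in range(n_iter):
        # Assign each object to the cluster with the nearest center.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        # Update each center to the mean of its members
        # (an empty cluster keeps its previous center).
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

        # Criterion function: sum of squared errors to the assigned centers.
        sse = ((points - centers[labels]) ** 2).sum()
        if abs(prev_sse - sse) <= tol:   # SSE no longer changes
            break
        prev_sse = sse

    return labels, centers, sse
```

Because the result depends on the arbitrary initial centers, it is common to run the procedure from several seeds and keep the solution with the smallest SSE.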
Empirical study
Cluster analysis helps in understanding the characteristics of variables and identifying their types, and thus in extracting more useful information. This paper uses province-level monthly wind power generation data for China from March 2013 to October 2019 and applies functional cluster analysis to identify the main characteristics of wind power generation in China.
The cumulative annual wind power generation of each Chinese province from 2014 to 2018 is shown in Table 1, and the national total is shown in Figure 1. In terms of the total amount, China’s wind power generation is growing rapidly year by year. The total wind power generation in 2017 is about twice that of 2014. Generation in Sichuan province increases most rapidly: its generation in 2018 is nearly 22 times that of 2014.
Table 1. The cumulative annual wind power generation in each Chinese province from 2014 to 2018.

Figure 1. China’s annual wind power generation from 2014 to 2018.
The distribution maps of wind power generation in each Chinese province in 2014, 2016, and 2018 are shown in Figures 2 to 4, respectively. They show that wind power generation in China has the following characteristics. First, wind power generation in Inner Mongolia has always been the largest in China. Its unique geographical features make it the most important wind power generation region. Second, Ningxia, Hebei, Gansu, Xinjiang, Heilongjiang, Jilin, Liaoning, and other provinces generate large amounts of wind power, and these provinces are geographically concentrated in northern China. Third, the east coast is another significant wind power generation region, but compared with the rapid development in other provinces, generation on the east coast has not increased noticeably. In addition, Yunnan, which is located in southwest China, has seen a rapid increase in wind power generation in recent years, making it the largest wind power generating province in southern China.

Figure 2. 2014 wind power generation distribution map.

Figure 3. 2016 wind power generation distribution map.

Figure 4. 2018 wind power generation distribution map.
To better analyze monthly wind power generation and to reduce the information loss caused by aggregating monthly data into annual data, this paper analyzes the monthly wind power generation data of 30 Chinese provinces from March 2013 to October 2019 from the perspective of FDA.
First, the monthly wind power generation data are fitted with a B-spline basis function system, which transforms the discrete monthly data points into a functional data object. The resulting functional data object, which contains 30 curves, is shown in Figure 5. Taking the first derivative of the functional data object gives the first derivative functional data object, which is illustrated in Figure 6. When the data are analyzed from a functional point of view, the process that generates the data is regarded as a function, and differentiation can be used to analyze its rate of change and extract more information. Figures 5 and 6 support the following observations. First, although the amount of wind power generation differs among provinces, generation shows the same cyclical pattern in all of them. Wind is a climate element linked to the seasons, so it varies on an annual cycle. Second, apart from a few provinces with large wind power generation, generation in most provinces is at a similar level (the generation curves are concentrated at the bottom of the figure). Third, over time, wind power generation in all provinces shows a sustained but fluctuating growth trend, and the growth rate is accelerating. This indicates that the utilization of wind energy resources is increasing in all Chinese provinces.
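The fitting and differentiation step can be illustrated with a short Python sketch. It is written under several assumptions that are not taken from the paper: it uses SciPy's `make_lsq_spline` for an unpenalized least-squares B-spline fit (so no smoothing coefficient is involved), places interior knots uniformly, and the function name `fit_bspline` and the parameters `n_interior_knots` and `degree` are hypothetical.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline


def fit_bspline(x, y, n_interior_knots=10, degree=3):
    """Least-squares B-spline fit of one series and its first derivative.

    x : strictly increasing sample positions (e.g. month index 0, 1, 2, ...)
    y : observed values at those positions
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Uniform interior knots, with the boundary knots repeated degree + 1
    # times as required for a clamped B-spline basis.
    interior = np.linspace(x[0], x[-1], n_interior_knots + 2)[1:-1]
    knots = np.concatenate((
        np.full(degree + 1, x[0]),
        interior,
        np.full(degree + 1, x[-1]),
    ))

    spline = make_lsq_spline(x, y, knots, k=degree)  # functional data object
    return spline, spline.derivative(1)              # curve and its derivative
```

Both returned objects are callable, so `spline(x)` evaluates the fitted curve and the second object evaluates its rate of change at any point in the observation window. Applying the function to each province's series would give one curve and one derivative curve per province.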

Figure 5. China’s 30 provinces monthly wind power generation functional data.

Figure 6. China’s 30 provinces monthly wind power generation first derivative function data.
For further analysis, the dimension of the functional data object must be reduced, and the main tool for this is FPCA. FPCA was performed on the monthly wind power generation functional data object. Two functional principal component functions are selected here, which explain 94.38% and 3.16% of the variation, respectively (cumulatively more than 95%). The two functional principal component functions are shown in Figure 7. The first functional principal component function represents the overall trend of wind power generation: the curve fluctuates constantly and shows an overall upward trend. The second functional principal component function represents local adjustments in wind power generation and shows an oscillating downward trend. The scores of the first two functional principal components for each province were then calculated, as shown in Table 2.

Figure 7. The first two functional principal component functions.
Table 2. The first two functional principal component scores.
Functional clustering analysis applies clustering analysis to the functional principal component scores. The results are shown in Table 3. According to the results, only Inner Mongolia belongs to category 1. The provinces in category 2 are Hebei, Shanxi, Liaoning, Jiangsu, Shandong, Yunnan, Gansu, Xinjiang, and Ningxia. The remaining provinces fall into category 3. These results are similar to the pattern shown in the wind power generation distribution maps. Monthly wind power generation in category 1 is greater than in category 2, and generation in category 2 is greater than in category 3. The provinces in the first two categories are the main wind power generation regions in China and are mainly located in northern China. Topographically, these provinces are mainly distributed in plateau regions, which are rich in wind resources and have great potential for wind power development. Based on the FDA, monthly wind power generation is periodic, which indicates that it is related to climate and weather changes. As wind power generation has increased in recent years, the periodic changes have become more pronounced. This shows that in months with abundant wind, more of the wind power generation potential is being exploited.
Table 3. Functional clustering analysis results.
For comparison, k-means clustering analysis was performed on the annual data. The results are shown in Table 4. Comparing the results of the functional clustering analysis with those of the traditional clustering analysis of annual data shows that the two are consistent, which indicates that the results of the functional clustering analysis are reliable. Functional clustering analysis can use multi-dimensional data. Compared with analyzing accumulated annual data, using functional clustering analysis on the monthly data has several advantages: information loss is reduced, and the results are more intuitive and easier to understand.
Table 4. Traditional clustering analysis results.
Conclusion
Energy plays an important role in a country’s economic development. China is one of the world’s major energy consumers, so its energy reserves are of great significance to its economic development. Wind energy is an important renewable energy source; it can effectively alleviate energy shortages, reduce damage to the environment, and contribute to sustainable economic development. China is shifting from high-speed growth to high-quality development, paying greater attention to the quality of growth and the balance between economy and environment. As an important new energy source, wind power can reduce excessive dependence on fossil fuels and improve the ecological environment. It is of profound significance to vigorously develop clean energy such as wind energy and to adopt advanced science and technology to improve energy production and energy efficiency.
In this paper, the monthly wind power generation data of 30 Chinese provinces from March 2013 to October 2019 were analyzed by functional cluster analysis. The conclusions are as follows. First, the overall growth of wind power generation in China is accelerating. Second, geographically, wind power generation in northern China is greater than in southern China, although in recent years generation in southern provinces such as Yunnan has soared. Third, the monthly data show that wind power generation is cyclical and related to seasonal climate. In the future development of wind energy resources, regional and climatic factors should be taken into account, and advanced technology should be adopted to increase the output of wind power generation. Fourth, the results of the functional clustering analysis of the monthly data and of the traditional clustering analysis of the annual data are consistent. Compared with the traditional cluster analysis method, functional cluster analysis can be used to analyze high frequency data, which reduces information loss and makes the results more intuitive and easier to understand. At the same time, when the data are analyzed from a functional perspective, the process that generates the data is regarded as a function, whose derivative can be calculated to extract more information. For example, the derivative function shows that monthly wind power generation is both accelerating and cyclical. If the monthly data were simply summed into yearly data, the information about changes between months would be lost and this result could not be obtained. The functional clustering analysis uses more information, and its results are more reliable.
Based on the empirical results of this paper, we offer some suggestions for the future development of provincial wind power generation in China. First, northern China and the plateau regions have abundant wind power resources, and further developing the resources of these regions can increase wind power generation. Second, provincial monthly wind power generation is periodic and related to climate and weather changes, so taking weather changes into account can help realize more of the potential generation.
