Abstract
Introduction
Numerous scholars have proposed solutions to the problem of seismic data reconstruction. These methods can be broadly grouped into prediction-filter-based methods, sparse-transform-based methods, low-rank-matrix-based methods, wave-equation-based methods, and data-driven machine-learning algorithms. Prediction-filter-based reconstruction offers high accuracy and fast computation, and is therefore widely used in industrial practice. Claerbout implemented an interpolation method based on prediction-error filtering in the T-X (time-space) domain, but its accuracy is limited by the number of iterations, resulting in a large computational effort.1 Spitz proposed an F-X (frequency-space) prediction-filter interpolation algorithm that exploits the linear predictability of data in the F-X domain, extracting autoregressive coefficients for different frequency components to achieve two-dimensional reconstruction of seismic data.2 Sparse-transform-based reconstruction applies a mathematical transform to the data and exploits the sparsity of the data in the transform domain. Common transforms are the Fourier transform,3–5 Radon transform,6,7 Curvelet transform,8 and Seislet transform.9,10 Methods based on sparse transforms can reconstruct missing data with high accuracy and require no other geophysical prior information, but they are less robust to spatial aliasing. Low-rank-matrix-based reconstruction assumes that the original complete seismic data are of low rank and that undersampling increases the matrix rank. The common solution is to reconstruct the data through low-rank matrix factorization or nuclear-norm minimization. Oropeza et al.11 completed data reconstruction by building Hankel matrices and performing rank reduction with singular value decomposition and a projection-onto-convex-sets algorithm. Emmanuel et al.12 reconstructed irregular data by recasting matrix rank reduction as a nuclear-norm minimization problem. Liu et al.13 accelerated the rank-reduction process with an accelerated proximal gradient descent method, improving the computational speed of the algorithm. Zhang et al.14 proposed a hybrid mathematical model of rank reduction and sparsity promotion and used it to reconstruct 3D seismic data. Wang et al.15 proposed a joint sparsity-promotion method to recover irregular data sampled on non-uniform grids, with good results.
With the development of machine learning in recent years, deep network models have achieved good results in seismic data processing, imaging, and inversion.16–22 Some scholars have applied deep learning to seismic data reconstruction; the main idea is to use a network to construct a nonlinear mapping from missing data to complete data. Wang et al.23 used a deep residual network (ResNet) to learn the mapping between regularly missing and complete data, which solved the regular-data interpolation problem but was not effective for irregular gaps. Meng et al.24 proposed a seismic data reconstruction method based on an unsupervised ResNet; it needs neither a complete seismic data set nor a training set, which solves the labeling problem, and is applicable to regular-data interpolation. The generative adversarial network (GAN)25 has shown a strong ability in data reconstruction, and Alwon et al.26 achieved good results using generative adversarial models for trace interpolation of regularly sampled data. Kaur et al.27 further proposed using a generative adversarial network to reconstruct 3D seismic data. Because GAN training is unstable and prone to mode collapse, Alexey Dosovitskiy et al.28 added the traditional L1/L2 reconstruction loss to the original GAN error function to construct a hybrid objective for optimization, in which the task of the discriminator remains unchanged: the generator must not only fool the discriminator but also ensure that the generated data are as similar as possible to the original data. The GAN error compensates for the blurriness and lack of high-frequency information of the L1/L2 reconstruction loss, while the L1/L2 reconstruction loss provides the training stability required for the GAN error to converge. Phillip Isola et al.29 discussed the difference between adding L1 and L2 norm reconstruction losses to the original GAN error function and showed that the L1 norm introduces less blurring. In addition, Marano G. Carlo et al.30 presented a recent review of the application of GANs in earthquake-related geophysical research, covering recent work on GAN-synthesized earthquake events based on artificial intelligence and emphasizing the advantages and limitations of current adversarial-learning research applied to seismology.
In fact, GANs essentially solve an inverse problem with multiple admissible solutions, so both training stability and multi-modal output should be considered. Although jointly optimizing the GAN loss and a reconstruction loss has been shown to stabilize the network, it has the side effect of preventing the network from learning the full, multi-modal data distribution. In this article, we adopt the moment reconstruction loss proposed by Soochan Lee31 as an alternative to the conventional reconstruction loss. This error function ensures stable network training without degrading the fidelity or multi-modality of the data reconstruction. The core idea is to use a maximum-likelihood loss to predict the sample moments of the true data distribution; GAN training is then aided by minimizing the distance between the sample moments of the generated distribution and the predicted sample moments, rather than by directly penalizing amplitude differences between input and output data with an L1/L2 loss, as most algorithms do.
As far as we know, GANs constrained by a moment reconstruction error have not yet been applied in geophysical research. The rest of this article is organized as follows. In the second section, the basic principle of GANs is introduced and the details of the moment reconstruction loss-constrained GAN (MR-GAN) interpolation method are described. In the third section, interpolation results on Marmousi2 synthetic data and field data are given and compared with traditional transform-domain methods and the conventional GAN method. Finally, the fourth section concludes.
Theory and method
Principle of GANs
The GAN was proposed by Goodfellow et al. in 2014. The main idea is to build two neural networks that learn against each other. The generator G fits the sample distribution: G tries to produce samples close to the true distribution in order to fool the discriminator D. The discriminator D distinguishes whether data are sampled from the true distribution or from the distribution predicted by G. The performance of both improves over the iterations until the discriminator can no longer distinguish whether the data are sampled from the true distribution or from the distribution predicted by the generator, at which point the generator is optimal.
The objective function of the GAN is defined as the minimization of a distance measure between the true data distribution and the predicted data distribution. To obtain the optimal generator, the min–max optimization scheme of equation (1) is used to train the GAN.
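Equation (1) is not reproduced in this extract; for reference, the standard GAN min–max objective of Goodfellow et al. that it refers to takes the form:

```latex
\min_G \max_D V(D, G)
= \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here the discriminator D is trained to maximize V while the generator G is trained to minimize it.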
Compared with other generative models, the optimization approach of the GAN requires no pre-modeling and no prior knowledge of what distribution the data obey. Instead, it samples directly from an unknown distribution, so in theory it can approximate the true data distribution completely, which is the biggest advantage of the GAN. However, the drawback of dispensing with pre-modeling is that the model is too unconstrained, which makes the training process very unstable. During optimization, the generated data often differ greatly from the desired results or even bear no relation to them at all.
To improve the quality of the samples generated by the GAN model, a common optimization strategy is to add the error function of a conventional generative network as a regularization term to the adversarial loss, combining the advantages of both to optimize the model.
The objective function of a conventional generative network is generally defined as the pointwise distance between the generated data and the true data; in seismic data interpolation, this corresponds to the amplitude difference at each sampling point. Minimizing this objective penalizes sampling points in the generated data that deviate excessively from the original data, improving the quality of GAN-generated samples and accelerating convergence. The amplitude reconstruction loss is typically taken as the L1 or L2 distance between the generated and true amplitudes.
The final hybrid error function is the weighted sum of the adversarial loss and the amplitude reconstruction loss:
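The hybrid objective referenced here (equation (4)) is not reproduced in this extract; in the standard pix2pix-style formulation, with a weight λ on the reconstruction term, it takes the following form (sketched for the L2 case):

```latex
G^{*} = \arg\min_G \max_D \;
\mathcal{L}_{\mathrm{GAN}}(G, D) + \lambda \, \mathcal{L}_{L2}(G),
\qquad
\mathcal{L}_{L2}(G) = \mathbb{E}_{x, y, z}\big[\lVert y - G(x, z) \rVert_2^2\big]
```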
Mismatch between amplitude reconstruction loss and adversarial loss
Next, we show why the existing hybrid error functions do not constitute a joint optimization. Taking the L2 loss as an example, according to the bias–variance decomposition,32 the L2 error function can be decomposed into three terms, as shown in equation (5):
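Equation (5) is not reproduced in this extract. A standard form of the decomposition, for a target y and a prediction ŷ that are independent, splits the expected L2 error into a squared difference of means, the variance of the prediction, and the variance of the data; driving the L2 loss down therefore forces the prediction variance toward zero, which is what suppresses multi-modal output:

```latex
\mathbb{E}\big[(y - \hat{y})^2\big]
= \big(\mathbb{E}[y] - \mathbb{E}[\hat{y}]\big)^2
+ \mathrm{Var}(\hat{y})
+ \mathrm{Var}(y)
```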
As shown in equation (6), when the hybrid error (equation (4)) is optimal, we denote
Sample moment reconstruction loss
To make the definition of the reconstruction loss consistent with the maximum-likelihood interpretation of the GAN loss, without losing the multi-modality of the model output, the sample moment reconstruction (MR) loss is used in this article instead of the conventional amplitude reconstruction loss: maximum likelihood is applied to the sample moments rather than directly to the amplitude differences of the reconstructed data. Under a prior constraint on the data distribution (assuming the data obey a Gaussian distribution), the sample moments correspond to the mean and variance. As shown in Figure 1, the overall architecture of MR-GAN follows that of the GAN, but the difference is that the generator of MR-GAN generates multiple samples per input, whose sample moments are matched against the moments predicted from the true data distribution.

Figure 1. (a) Traditional GAN. (b) MR-GAN.
The MR loss is defined as equation (8); it replaces the amplitude reconstruction loss in equation (4) with the sample moment reconstruction loss, and the final objective is the weighted sum of the GAN loss and the MR loss.
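The exact form of the MR loss is given in the cited work.31 As a minimal sketch under the Gaussian assumption above (the function names `gaussian_nll` and `moment_reconstruction_loss` are hypothetical, not the authors' code), the idea is: a moment predictor is trained with a Gaussian negative log-likelihood on the true data, and the generator is penalized by the distance between the sample moments of its K outputs and the predicted moments.

```python
import numpy as np

def gaussian_nll(y, mu, var, eps=1e-6):
    """Negative Gaussian log-likelihood (up to an additive constant),
    used to train the moment predictor on the true data."""
    var = var + eps
    return 0.5 * np.mean(np.log(var) + (y - mu) ** 2 / var)

def moment_reconstruction_loss(samples, mu_pred, var_pred):
    """Penalize the distance between the sample moments of K generated
    outputs and the moments predicted from the true distribution.
    `samples` has shape (K, ...): K generated samples per input."""
    mu_gen = samples.mean(axis=0)
    var_gen = samples.var(axis=0)
    return (np.mean((mu_gen - mu_pred) ** 2)
            + np.mean((var_gen - var_pred) ** 2))

# Toy check: samples drawn around the predicted moments give a small
# loss; shifted samples give a large one.
rng = np.random.default_rng(0)
mu, var = np.zeros(128), np.ones(128)
samples = rng.normal(mu, np.sqrt(var), size=(64, 128))
loss_close = moment_reconstruction_loss(samples, mu, var)
loss_far = moment_reconstruction_loss(samples + 5.0, mu, var)
```

In a full training loop these terms would stand in for the L2 reconstruction penalty in the hybrid objective, while the adversarial term is unchanged.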
Moment reconstruction loss-constrained GAN (MR-GAN) data interpolation processing
To verify the practicality of the aforementioned method, a new error function was constructed to implement it. The network model is divided into two parts, the generative model and the discriminative model. The generator performs the data reconstruction, using a network architecture based on the U-net structure,33 which consists of 12 layers. The discriminator evaluates the reconstructed data and consists of five layers. The size of the convolution kernel of each layer is
Since data-driven deep learning algorithms are sensitive to the features of the data samples, the reconstruction quality may degrade when the training data and test data differ substantially. To keep the features of the data to be interpolated consistent with those of the training data, we construct the training samples by undersampling the data to be interpolated a second time. Before the seismic data are input, the full section is cut into 128 × 128 blocks along the time and spatial directions. Adjacent windows overlap by 32 samples, and at the output the amplitudes are averaged over the overlapped areas to eliminate boundary differences. All data blocks are normalized before entering the network, reconstructed, and then de-normalized back to the original amplitude range.
Experiment
In the following, experiments are conducted for both regularly and randomly missing data. We compare the improved MR-GAN method against the L2+GAN method, the F-K domain POCS (projection onto convex sets) method, and the curvelet domain POCS method.
To evaluate the interpolation performance of the proposed method on seismic data, three metrics are used: mean square error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM). The three metrics are calculated as shown in equation (10), where
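Equation (10) is not reproduced in this extract; the three metrics are standard, and a minimal sketch is given below (with a simplified, single-window SSIM rather than the usual locally windowed version, and the peak for PSNR taken as the maximum absolute amplitude of the reference section; conventions vary):

```python
import numpy as np

def mse(x, y):
    """Mean square error between reference x and reconstruction y."""
    return np.mean((x - y) ** 2)

def psnr(x, y):
    """Peak signal-to-noise ratio in dB."""
    peak = np.max(np.abs(x))
    return 10.0 * np.log10(peak ** 2 / mse(x, y))

def ssim_global(x, y):
    """Global (single-window) SSIM with the standard constants
    C1 = (0.01 L)^2, C2 = (0.03 L)^2, where L is the dynamic range."""
    L = x.max() - x.min()
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Lower MSE, higher PSNR, and SSIM closer to 1 all indicate a reconstruction closer to the original data.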
Regularly missing seismic data interpolation
The regularly undersampled data are interpolated first; only the case of one missing trace between every two traces is considered here. The data were generated by finite-difference forward modeling of the acoustic wave equation on the classical Marmousi2 model (after resampling, shown in Figure 2). The forward-modeling parameters were a 50 Hz Ricker wavelet and a total of 801 geophones with a trace spacing of 10 m and a shot spacing of 10 m. There were 561 shots in total, each recorded by 241 traces, with a sampling interval of 2 ms and 501 samples per trace. A 50% regular decimation is applied to the total number of geophones and hence to the recording traces of each shot. Since the direct wave does not need to be reconstructed and its strong near-offset amplitudes would affect the data preprocessing, the direct wave is muted before the data are input into the trained MR-GAN network. The data reconstruction results are shown in Figure 3.

Figure 2. Marmousi2 model after resampling.

Figure 3. MR-GAN reconstruction results for 50% regularly missing data. (a) Original data, (b) 50% regularly missing data, (c) MR-GAN reconstruction results, (d) reconstruction residuals.
The results show that after MR-GAN interpolation the data are consistent with the original data: the events are smooth and continuous and correlate well with adjacent traces, and the reconstructed PSNR is 22.171 dB. In the residual profile the residual energy is very small, with only a slight error at events of large curvature.
Randomly missing seismic data interpolation
The interpolation results for 50% randomly missing Marmousi2 data are shown in Figure 4. The MR-GAN network can also reconstruct randomly sampled data, but the residual plots show that some useful signal remains in the residuals; the final reconstruction PSNR is 18.464 dB.

Figure 4. MR-GAN reconstruction results for 50% randomly missing data. (a) Original data, (b) 50% randomly missing data, (c) MR-GAN reconstruction results, (d) reconstruction residuals.
It can also be seen that the seismic traces in the middle of a missing region are recovered with lower correlation than those near the missing boundary. The reconstruction of regions with many consecutive missing traces shows that the more neighboring sampled traces are available, the easier it is for the network to interpolate the missing data, while the difference in correlation between traces near the missing boundary and traces in the middle of the gap grows with the complexity of the interpolated region. Overall, the MR-GAN network performs better on the 50% regularly missing data than on the 50% randomly missing data. Consecutive missing traces give the algorithm less prior information, and the randomness of the gaps makes the data distribution harder to learn, whereas under regular sampling the algorithm captures the distribution characteristics of the data more easily.
Comparison of interpolation effects of different methods
In the following, the interpolation results of the MR-GAN network are compared with those of the network optimized with the L2+GAN error function, as well as with two conventional interpolation methods, the F-K domain POCS method and the curvelet domain POCS method. 50% of the traces of the original data are randomly removed, which simulates a realistic non-uniform acquisition scenario; the reconstruction results of the four methods are shown in Figure 5.

Figure 5. Comparison of the effects of different interpolation methods. (a) Original data, (b) 50% randomly missing data, (c) F-K domain POCS reconstruction results, (d) curvelet domain POCS reconstruction results, (e) GAN+L2 reconstruction results, (f) GAN+MR reconstruction results, (g) F-K domain POCS reconstruction residuals, (h) curvelet domain POCS reconstruction residuals, (i) GAN+L2 reconstruction residuals, (j) GAN+MR reconstruction residuals.
From the four sets of reconstruction results and residual plots, it can be seen that the F-K domain POCS method has a large reconstruction error: much valid signal remains in the residuals, and considerable random noise is generated. The curvelet domain POCS method performs better; although its MSE and PSNR are not outstanding among the methods, the hyperbolic events of the reconstructed data differ little from the original single shot, and the overall reconstruction preserves amplitudes well. The seismic traces reconstructed by the GAN+L2-optimized network differ from the adjacent known amplitudes, there are local discontinuities in the hyperbolas of adjacent traces, boundary differences are visible at the gaps on the grayscale plot, and individual traces are not recovered. In contrast, for the GAN+MR method used in this article, the reconstructed results have smaller residuals from the original data and show no regular leakage of valid signal; moreover, because the trained network parameters are saved, reconstruction amounts to evaluating a known mapping, so the processing time is short. To show the differences in amplitude preservation more clearly, trace 141, one of the missing traces, is extracted and its amplitudes are compared in Figure 6. In addition, the MSE, PSNR, and SSIM values are listed in Table 1, confirming the overall superior performance of the proposed method. Table 1 also gives the training time of the different methods and the time needed to process this single-shot gather. The training time is the time to train on the training set constructed as described above; no additional training was performed for the interpolation of these data.
Although the GAN+MR method is more expensive to train than the GAN+L2 method, this does not affect the processing time for real data interpolation. For regions of consecutive missing traces the MR-GAN approach shows a slight error in the residual profiles, but it still produces data closer to the true seismic data without generating additional random noise or blurred interpolated traces. The numerical experiments were performed with PyTorch on a PC running Windows 10 with an NVIDIA GeForce GTX 1080 Ti. The size of a single input block is 128 × 128, the training set contains 5000 samples, and the network was trained for 50 epochs with the Adam optimizer, using the hyperparameters batch_size = 8, initial learning rate LR = 1e-3, and decay_weight = 1e-5.

Figure 6. Comparison plots of trace 141 for different interpolation methods. (a) F-K domain POCS method, (b) curvelet domain POCS method, (c) GAN+L2 method, (d) GAN+MR method.
Table 1. Evaluation indexes of different interpolation methods.
Actual data interpolation
To verify the applicability of the proposed method to post-stack data, an inline section of a post-stack volume was decimated with 50% randomly missing traces and processed with the proposed method; the reconstructed results are shown in Figure 7.

Figure 7. Actual post-stack data interpolation. (a) Original data, (b) 50% randomly missing data, (c) reconstruction results of the GAN+MR method, (d) reconstruction residuals of the GAN+MR method.
It can be seen that the slope reflections and parallel strata reconstructed by the MR-GAN network are almost indistinguishable from the original data in the shallow part of the profile, but in the deeper part, where the structure is more complex, cluttered reflections and some random-noise interference make interpolation of the steep deep reflections difficult. In addition, there is sometimes a resolution difference between the original and interpolated data: the traces generated by the network show lower resolution and some non-geological artifacts, for example brighter and darker streaks at the missing boundaries. In general, the interpolated data of the MR-GAN network correlate well with the original data, the residuals consist essentially of irregular noise, and the method still performs well on post-stack data.
Conclusions
In this article, we demonstrated the incoherence of the conventional GAN loss combined with an L2 amplitude reconstruction loss. To prevent mode collapse during training, we train the generative adversarial network with a maximum-likelihood constraint on the sample moments of the distribution; the model can then approximate the true data distribution more closely, because the moment reconstruction loss has the same mathematical meaning as the maximum-likelihood estimate defined by the GAN loss.
Experimental tests on seismic data reconstruction under both regular and random missing patterns, comparisons with other interpolation methods, and the various reconstruction quality metrics all reflect the advantages of the MR-GAN network and verify its effectiveness for seismic data interpolation. Although much current deep-learning-based seismic interpolation relies on deep prior networks for unsupervised learning, we note that the method in this article only modifies the error function in the generator optimization, so it can also be extended to deep prior GANs for unsupervised learning. Moreover, this moment reconstruction loss-constrained GAN applies not only to data interpolation but to any conditional generation task, and it can easily be transferred to most GANs because it changes only the error function, not the network structure.
Finally, although the improved method achieves good interpolation results, the algorithm still needs further improvement in some cases. For example, there is room for improvement when too many consecutive traces are missing. In addition, when the data to be reconstructed differ markedly from the training data, the reconstruction error increases. How to generate a large amount of training data related to the actual data is also an open problem in practical processing. In the future, one could build dedicated training sets according to the data characteristics of different survey areas, instead of applying a single training set to all data. There are many research directions for generative adversarial networks, and this article discusses only some of them; other aspects, such as network stability, optimization algorithms, and network architecture, are left for follow-up research.
Highlights
The method is based on a moment reconstruction error constraint. Because it adopts a data-driven deep learning approach, it requires no prior assumptions and handles both regularly and irregularly missing data well.
Compared with conventional GAN and other traditional methods, the proposed method has higher reconstruction accuracy and better amplitude preservation.
