Abstract
Keywords
Introduction
A complex reservoir is one in which hydrocarbons are derived from multiple source rocks and have undergone multiple charging events in a superimposed petroliferous basin (Jackson, 2005; Liu, 2007; Wang et al., 2019). It is characterized by several sets of source-reservoir-cap assemblages and multicycle petroleum accumulations. Oil distribution varies regionally and is often mixed due to multistage oil charging from different sources (Pang et al., 2012). Oil sources and accumulation patterns in complex oil reservoirs of superimposed basins have attracted significant attention in petroleum geology and geochemistry.
The Junggar Basin is a typical large petroliferous superimposed basin in northwestern China. Currently, two large oil provinces with an area of hundreds of square kilometers have been found in the Mahu Sag, a hydrocarbon-rich structure on the northwest margin of the basin (Jun et al., 2020; Li et al., 2020; Zhi et al., 2018, 2021). The Mahu Sag has four sets of source rocks and multiple sets of reservoirs of various types. The Upper Wuerhe Formation is a sandy conglomerate reservoir widely distributed in the south of the Mahu Sag, with a total of 6.1 billion tons of proved oil reserves and a potential oil-bearing area of approximately 1500 km2, indicating great resource prospects and economic significance (Kuang et al., 2022). This oil reservoir is confirmed to have resulted from multistage hydrocarbon charging from various source rocks (Cao et al., 2020).
Despite the discovery of significant hydrocarbons in the Upper Wuerhe Formation in the southern Mahu Sag, their sources remain controversial, and the spatial distribution of genetically different oils is poorly understood. Opinions differ regarding the sources of these oils. One view is that the oils are single sourced and mainly come from the Fengcheng Formation (P1f) (Huang et al., 2016, 2022; Kuang et al., 2022; Tang et al., 2022), while another single-source view holds that they are mainly from the Lower Wuerhe Formation (P2w) (Bian et al., 2019). Other studies indicate that the oils are of mixed origin, with contributions from the Fengcheng Formation and the Lower Wuerhe Formation (Wang, 2001; Wang and Kang, 1999; Yu et al., 2017; Zhang et al., 2022; Zou et al., 2021). Mixed contributions from the Fengcheng Formation (P1f), the Carboniferous (C), and the Jiamuhe Formation (P1j) have also been proposed (Ma et al., 2015). However, all four sets of potential source rocks in the study area have reached the mature hydrocarbon-generating stage in geological history, and all are considered to have contributed to oil accumulation in the study area based on comprehensive assessments of the source rocks and geochemical analysis of crude oil in the reservoir (Cao et al., 2005, 2006; Tao et al., 2019). The current poor understanding of the oil sources arises mainly because the oils analyzed in previous studies came from different confined areas and do not represent the actual distribution of crude oil in the study area. In addition, an adequate oil-source correlation could not be established because few source rocks were penetrated during the early conventional petroleum exploration programs.
Extensive exploitation of the oils in the Upper Wuerhe Formation now provides many crude oil samples covering the study area, and increasing unconventional petroleum exploration activity has allowed an adequate geochemical investigation of many different source rocks. Here we conduct a comprehensive oil-source correlation in the study area. Our aim is not only to resolve the sources of crude oils in the Upper Wuerhe Formation, but also to examine the spatial distribution of differently sourced oils, which can reflect their generation, migration, and accumulation characteristics. This, in turn, can provide a new perspective on hydrocarbon enrichment behavior in a typical superimposed basin.
In this study, we (a) correlate and infer the provenance of Upper Wuerhe Formation crude oils in the southern Mahu Sag based on representative biomarker parameters and (b) characterize the spatial distribution of differently sourced oils. In particular, we investigate the stratigraphic distribution of the various source rocks to examine whether there is a positive oil-source correlation in the geological framework and whether the oils are likely to have accumulated near their sources. Finally, we analyze the factors controlling the different spatial occurrences of oils and establish accumulation models. We hope this study will provide a new perspective on the complex hydrocarbon accumulation behavior in superimposed basins with multiple sets of source rocks.
Geological setting
The Junggar Basin is a large, prolific superimposed oil basin in northwestern China, with an area of approximately 1.3 × 10^5 km2 (Cao et al., 2005). It developed on a Precambrian crystalline basement (before 800 Ma) and a slightly metamorphosed Paleozoic basement and underwent four stages of tectonic evolution: (1) a foreland oceanic basin stage (Carboniferous-Middle Permian); (2) a foreland continental basin stage (Late Permian); (3) an intracontinental depression basin stage (Triassic-Cretaceous); and (4) a rejuvenated foreland basin stage (Paleogene-Quaternary) (Cao et al., 2005; Chen et al., 2002; Kuang et al., 2022; Zhang et al., 1999). Based on the shape of the basement and the superimposed influence of later tectonic evolution, the basin can be divided into six primary structural units: the Central Depression, Luliang Uplift, Ulungu Depression, Eastern Uplift, Southern Margin Thrust Belt, and Western Uplift (Figure 1(a)) (Kuang et al., 2022).

Location of the study area and distribution of structural units. (a) The site of the study area and the main structural units within the Junggar Basin; the blue line shows the location of the study area, the white areas indicate the present-day uplifts, and the gray areas show the depressions. (b) The structural units of the northwestern Junggar Basin and hydrocarbon accumulation in the Upper Wuerhe Formation. (c) The cross-section AA’, whose location is shown in (b), shows the super-Carboniferous stratigraphic units of Figure 2, based on the seismic reflection profile and stratigraphic correlation between wells; the faults in red were formed during the Hercynian to Indosinian, the faults in green were formed during the Indosinian to Himalayan, and the faults in pink were formed during the Indosinian to Yanshanian. C = Carboniferous; P1j = Lower Permian Jiamuhe Formation; P1f = Lower Permian Fengcheng Formation; P2x = Middle Permian Xiazijie Formation; P2w = Middle Permian Lower Wuerhe Formation; P3w = Upper Permian Upper Wuerhe Formation; T1b = Lower Triassic Baikouquan Formation; T2k = Middle Triassic Karamay Formation; T3b = Upper Triassic Baijiantan Formation; J = Jurassic; K = Cretaceous.
The Mahu Sag is part of the Central Depression and is one of the most hydrocarbon-rich sags in the basin. Two prolific oil provinces with an area of more than 100 km2 have been discovered in the interior of the Mahu Sag and its northern fault zone (Lei et al., 2017; Tang et al., 2019; Xia et al., 2022). Our study area is located in the southern part of the Mahu Sag and includes four secondary structural units: (1) the Hongche fault zone, (2) the Ke-Bai fault zone, (3) the Zhongguai uplift, and (4) the southern Mahu Sag (Figure 1(b)).
Four sets of potential source rocks were deposited in the Mahu Sag. These are, from bottom to top, the Carboniferous units (C), Lower Permian Jiamuhe Formation (P1j), Lower Permian Fengcheng Formation (P1f), and Middle Permian Lower Wuerhe Formation (P2w) (Figures 1(c) and 2) (Cao et al., 2005; Hu et al., 2020). The Carboniferous units (C) and Jiamuhe Formation (P1j) can be combined into the same source rock unit (C/P1j) because of their similar stratigraphic distribution and depositional environments in the study area (Hu et al., 2020; Tao et al., 2019). Tuffaceous mudstones dominate the C/P1j source rocks with moderate to low TOC (total organic carbon) and moderate S1 + S2 (petroleum generation potential) (Cao et al., 2005; Tao et al., 2019). Most of the kerogen within the C and P1j source rocks is dominated by benthic algae and higher plant-related components, indicating both oil and gas potentials (Qin et al., 2022; Tao et al., 2019). The P1f source rocks are mainly composed of mudstone, dolomitic mudstone, and tuffaceous mudstone with moderate to high TOC and high S1 + S2 values, and kerogen types belong to types I and II, which consist primarily of bacteria and algae but few higher plants (Tao et al., 2019). The P1f source rocks are now mature to over-mature, suitable for oil and gas generation, and have the most significant oil generation potential (Cao et al., 2005, 2006; Hu et al., 2020). The P2w source rocks are dominated by mudstone and have low TOC and S1 + S2 values, while the kerogen is mainly of type III, containing more higher plant components such as sporopollenin and woody debris, which suggests that the P2w source rocks have limited potential for the generation of liquid hydrocarbons but may be favorable for gas generation (Cao et al., 2005; Tao et al., 2019).

Summary of composite stratigraphic formations of the northwest margin (Mahu Sag) of the Junggar Basin. The lithology and the thickness of the formations are based on Cao et al. (2005) and Kuang et al. (2022). The four sets of source rocks combined with reservoirs and cap rocks are shown in the right column.
The Upper Wuerhe Formation is an important oil reservoir in the basin; it is a large-scale shallow retrogradation fan-delta deposit under the background of a gentle slope (0.84–2.86°) in the depression basin stage, with a thickness of 37 to 311 m in the Mahu Sag (Tang et al., 2018; Wang et al., 2021; Zheng et al., 2019). Fan-deltas can extend for tens to hundreds of kilometers due to the gentle paleomorphology and the large amounts of coarse-grained sediments provided by continuous uplift of the orogenic belt around the basin (Tang et al., 2018). There is high primary porosity in the subaqueous channel sandy conglomerate microfacies developed in the fan-delta front, which are favorable lithofacies for oil and gas exploration in the Upper Wuerhe Formation (Kuang et al., 2022; Tang et al., 2018). The Upper Wuerhe Formation was formed during the gradual rise of the lake level. The sediments gradually became finer upward and could be divided into three members according to lithological assessment from bottom to top: the first member is mainly composed of conglomerate; the second member is primarily composed of conglomerate, sandstone, and thin mudstone; and the third member mainly consists of thick shallow lacustrine mudstone (Figure 2). The thin mudstones deposited in the inter-fan setting and the thick mudstones of the shallow lacustrine environment in the third member constitute local and regional caprocks, respectively.
In addition to the source-reservoir-caprock system, three types of faults developed in different tectonic stages in the Mahu Sag: (1) NE-SW-trending reverse faults developed from the Hercynian to the Indosinian, (2) strike-slip faults developed from the Indosinian to the Himalayan, and (3) normal faults developed from the Indosinian to the Yanshanian (Figure 1(c)) (Chen et al., 2018; Tao et al., 2006). The faults that developed from the Hercynian to the Indosinian provide channels for hydrocarbons to migrate vertically and accumulate in the Upper Wuerhe Formation, because their steep fault planes and large displacements connect the sandy conglomerate reservoirs with multiple source rocks from the Carboniferous to the Middle-Late Triassic. Furthermore, the unconformity at the bottom of the Upper Wuerhe Formation is an efficient channel for the lateral migration of hydrocarbons and, together with the various faults, constitutes a three-dimensional migration system. In particular, the Large-Zhuluogou fault, a NW-trending strike-slip fault located in the middle of the Mahu Sag, plays an important role in controlling hydrocarbon accumulation in the Mahu Sag (Bian et al., 2019; Lei et al., 2017). Previous studies have revealed differences in hydrocarbon accumulation between the two sides of the fault. In the north, the Fengcheng Formation (P1f) petroleum system dominates, with most of the oils coming from P1f source rocks (Yu et al., 2017; Zhi et al., 2021). In the south, hydrocarbon accumulation is more complex, with oils from multiple sources forming what is known as the Carboniferous to Permian (C-P) petroleum system (Figure 1(b)) (Bian et al., 2019; Tao et al., 2021; Yu et al., 2017).
Samples and methods
Forty-six crude oil samples were collected from the Upper Wuerhe Formation at depths of 2557 to 5110 m from 41 hydrocarbon production wells distributed across the Zhongguai uplift and Mahu Sag. These crude oil samples cover the study area: 33 samples are from the Zhongguai uplift, and 13 samples are from the Mahu Sag.
The crude oils were fractionated using silica gel column chromatography with n-hexane, n-hexane and dichloromethane (70:30 v/v), and dichloromethane and methanol (50:50 v/v) to yield the saturated, aromatic, and polar + asphaltene fractions, respectively. The stable carbon isotope (δ13C) composition of the crude oil was determined using a Finnigan MAT-253 mass spectrometer. Carbon isotope ratios were standardized to Vienna Peedee Belemnite (V-PDB).
Whole oil and saturated hydrocarbon gas chromatography (GC) was carried out on a Hewlett Packard 6890 II GC instrument. A 50 m × 0.25 mm × 0.25 μm dimethylpolysiloxane capillary column was used to separate light hydrocarbons and higher molecular weight hydrocarbons employing nitrogen as a carrier gas; the components were detected using a flame ionization detector at a temperature of 320°C. The biomarkers were determined by GC (Hewlett Packard 6890 II GC) coupled to a Quattro II mass selective detector. The instrument was fitted with an HP-1 30 m × 0.25 mm inner-diameter fused silica column that used nitrogen as a carrier gas. The GC oven was initially set at 50°C for 2 minutes, then increased to 100°C at 2°C min−1, and then to 310°C at 3°C min−1, with a final hold time of 15 min.
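As a quick check on the oven program described above, the total run time per injection can be computed from the stated hold times and ramp rates. The following is a minimal sketch; the function name `oven_program_minutes` and its argument layout are illustrative assumptions and are not part of any instrument software.

```python
def oven_program_minutes(start_temp, initial_hold, ramps, final_hold):
    """Total run time (min) of a multi-ramp GC oven program.

    ramps is a list of (target_temp_C, rate_C_per_min) tuples applied in order.
    """
    total = initial_hold
    temp = start_temp
    for target, rate in ramps:
        # Time spent on each ramp = temperature span / heating rate
        total += (target - temp) / rate
        temp = target
    return total + final_hold


# Program from the text: 50 °C (2 min hold), 2 °C/min to 100 °C,
# 3 °C/min to 310 °C, final hold of 15 min.
run_time = oven_program_minutes(50, 2, [(100, 2), (310, 3)], 15)
print(run_time)
```

With the stated settings the program sums to 2 + 25 + 70 + 15 = 112 min.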
Results
Bulk properties of crude oil
In this study, we have classified the 46 crude oil samples into four types based on their representative molecular geochemical parameters (to be discussed below). The results show that most oils are enriched in saturated hydrocarbons relative to aromatics and nonhydrocarbons (polars + asphaltenes) (Figure 3(a) and Table 1), but there are also clear differences in fractional composition between the crude oil types. Type I oils have lower saturated hydrocarbon contents than type II oils, with average values of 71.73% and 80.53%, respectively (Figure 3(a)). Only one oil sample belongs to type III, which has the highest proportion of aromatic hydrocarbons of all the types (Figure 3(a)). Type IV oils (a mixture of type I and type II, which will be discussed later) exhibit fractional composition characteristics of both type I and type II oils (Figure 3(a)).

Bulk properties of deep oils in the south Mahu Sag of northwestern Junggar Basin. (a) Ternary diagram comparing the fractional compositions of oils in the study area and (b) variations in wax contents of crude oils in the study area.
Bulk properties of crude oils from the Upper Wuerhe Formation in the study area.
The wax contents of these crude oils show that type I oils have a lower wax content, with an average of 4.74%, than type II oils, which average 8.55% (Figure 3(b) and Table 1). The wax content of type IV oils is intermediate between those of types I and II, with an average value of 6.73%.
When the oil samples are plotted by location within the study area, the Mahu Sag oils are seen to be relatively rich in saturated hydrocarbons, ranging from 47.47% to 87.03% (mean of 71.05%), and to contain less nonhydrocarbon material (mean of 7.82%). In contrast, the Zhongguai uplift oils have lower saturated hydrocarbon contents, ranging from 41.12% to 88.46% (mean of 64.09%), and higher nonhydrocarbon contents (mean of 14.1%) (Figure 4 and Table 1).

The distribution of fractional compositions of crude oil samples in the study area.
Similar to the spatial variation in the saturated hydrocarbon content of the crude oils, wax contents are highest in the Mahu Sag oil samples, with an average of 7.21%, whereas the oil samples from the Zhongguai uplift have lower wax contents, with an average of 4.72% (Figure 5 and Table 1). Because both saturated hydrocarbon and wax contents are high in the Mahu Sag and relatively low in the Zhongguai uplift, we attribute this spatial variation to thermal maturity or possibly to different provenances of the crude oils in the Upper Wuerhe Formation. Further details are discussed in the section on sources of the crude oils and their distribution.

The distribution of wax content of crude oil samples in the study area.
Carbon isotopes of crude oil
The δ13C values of the crude oils range from −31.67‰ to −28.54‰. Group II oils are generally depleted in 13C relative to group I oils, with average values of −31.35‰ and −30.22‰, respectively (Figure 6(f)). Group III consists of only one oil sample, for which the carbon isotope ratio was not obtained. Group IV oils (referred to here as geochemically hybrid and inferred to be mixtures of group I and group II oils) have a wide range of δ13C values, from −31.46‰ to −28.54‰, with an average of −30.07‰. The variability of δ13C values observed among the oil groups may be attributed to variations in the sources of the crude oils. These values can be used in conjunction with other molecular geochemical parameters to confirm the distinct origins of these crude oils, which will be discussed further in the section on the sources of crude oils and their distribution.

Representative biomarker parameters and carbon isotope data for crude oil samples in the Upper Wuerhe Formation. (a) Cross plot of pristane/phytane (Pr/Ph) versus β-carotane/n-Ci (maximum of n-alkanes); (b) cross plot of pristane/phytane (Pr/Ph) versus gammacerane/C30 hopane; (c) cross plot of tricyclic terpane (C19+C20)/C21 ratio versus C21/C23; (d) cross plot of pristane (Pr)/n-C17 versus phytane/n-C18 ratios; (e) relative proportion of C27, C28, and C29 regular steranes (5α(H), 14α(H), and 17α(H)20R); and (f) plot of pristane/phytane (Pr/Ph) ratio versus stable carbon isotope (whole oil).
Molecular geochemistry of crude oil
Typical molecular geochemical parameters that carry source and depositional environment information are used to infer the provenance of crude oil in the Upper Wuerhe Formation. These are listed in Table 2 and are graphically presented in Figure 6. Four genetic oil groups were recognized based mainly on the biomarker parameters of the 46 crude oil samples. The following paragraphs describe the molecular geochemical characteristics of the different oil groups.
Representative molecular geochemical parameters of crude oils recovered from the Upper Wuerhe Formation.
β-Car./n-Ci=β-carotane/maximum concentration of n-alkanes; C27(%)=C27ααα20R/(C27ααα20R + C28ααα20R + C29ααα20R); G = gammacerane; ID = sample identifier; J = Jin; JL = JinLong; JT = JinTan; K = Ke; MH = MaHu; Ph = phytane; Pr = pristane; n-C17 = normal alkane with 17 carbon atoms; n-C18 = normal alkane with 18 carbon atoms; TT = tricyclic terpane.
Discussion
Identification signatures of source rocks
Four possible source rock units developed in the northwestern Junggar Basin. These are, from bottom to top, the Carboniferous (C) unit, the Lower Permian Jiamuhe Formation (P1j), the Lower Permian Fengcheng Formation (P1f), and the Middle Permian Lower Wuerhe Formation (P2w). In particular, the C and P1j source rocks can be considered a single source unit because they are both dominated by tuffaceous mudstones and have similar geographic distributions (Cao et al., 2005; Tao et al., 2019).
The P1f source rocks are rich in gammacerane and β-carotane and have a low Pr/Ph ratio (Huang et al., 2016; Ma et al., 2015; Tao et al., 2019; Wang, 2001; Wang and Kang, 1999; Zhang et al., 2022). The gammacerane index (gammacerane/C30 hopane) ranges from 0.18 to 0.84 (mean of 0.53), and the β-carotane/n-Ci ratio ranges from 0.5 to 3.7 (summarized in Table 3 of this study) (Tao et al., 2019). The Pr/Ph ratio is less than or only slightly greater than 1, as shown in many previous studies (Ma et al., 2015; Tao et al., 2019; Zhang et al., 2000). In terms of tricyclic terpanes, the P1f source rocks have a relatively high abundance of C23 tricyclic terpane, followed by C21, C20, and C19 tricyclic terpanes, giving an ascending distribution pattern from C20 through C21 to C23 (Table 3) (Huang et al., 2016; Ma et al., 2015; Tao et al., 2019). Based on these biomarker data, the P1f source rocks were interpreted to have been deposited in a reducing to strongly reducing alkaline lacustrine setting (Tao et al., 2019). This setting is also supported by the widespread distribution of alkaline minerals such as dolomite (CaMg(CO3)2), nahcolite (NaHCO3), and shortite (Na2Ca2(CO3)3) in the P1f (Cao et al., 2015; Tang et al., 2022; Zhang et al., 2018).
Representative molecular geochemical signatures of four possible source rocks in the Junggar Basin.
Ph = phytane; Pr = pristane.
The C and P1j source rock unit has a moderate Pr/Ph ratio (mean of 1.30) and a low to moderate gammacerane index (mean of 0.20); the β-carotane/n-Ci values do not exceed 0.3 (Huang et al., 2016; Ma et al., 2015; Tao et al., 2019; Zhang et al., 2000). The relative abundance of tricyclic terpanes differs from that of the P1f source rocks: it is dominated by C21, followed by C20 and C23 tricyclic terpanes, giving a mountain-peak pattern (C20 < C21 > C23 tricyclic terpanes) (Table 3). The C and P1j source rocks were deposited in a weakly oxidizing to reducing environment in marginal coastal to transitional facies (Tao et al., 2019).
The P2w source rocks have a high Pr/Ph ratio, with an average value of 2.60 (Tao et al., 2019), and other studies have also shown that the Pr/Ph ratio is >1, ranging from 1.23 to 2.89 (Table 3) (Ma et al., 2015; Zhang et al., 2000). The relative concentrations of gammacerane and β-carotane are very low, with an average gammacerane index of 0.10 and a β-carotane/n-Ci ratio that is mostly <0.20 (Bian et al., 2019; Ma et al., 2015). The tricyclic terpanes in the P2w source rocks are dominated by C20, followed by C21 and C23 tricyclic terpanes, showing a noticeable descending distribution (C20 > C21 > C23) (Table 3). Based on these biomarker data, the P2w source rocks were interpreted to have been deposited in a relatively oxidizing environment with the greatest higher plant input (Tao et al., 2021).
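The three tricyclic terpane patterns described above (ascending, mountain peak, and descending) can be expressed as a simple rule on the relative abundances of the C20, C21, and C23 tricyclic terpanes. The sketch below is only a screening aid under stated assumptions: the function name, the label strings, and the example abundances are hypothetical, and any distribution that fits none of the three patterns (including ties) is left unclassified.

```python
def terpane_pattern(c20, c21, c23):
    """Label the C20-C21-C23 tricyclic terpane pattern.

    Inputs are relative abundances (e.g., m/z 191 peak areas) in any
    consistent unit. Labels follow the source rock signatures in the text.
    """
    if c20 < c21 and c21 < c23:
        return "ascending (P1f-like)"
    if c20 < c21 and c21 > c23:
        return "mountain peak (C/P1j-like)"
    if c20 > c21 and c21 > c23:
        return "descending (P2w-like)"
    return "unclassified"


# Hypothetical abundances, for illustration only
print(terpane_pattern(10, 20, 30))  # ascending (P1f-like)
print(terpane_pattern(15, 30, 20))  # mountain peak (C/P1j-like)
print(terpane_pattern(30, 20, 10))  # descending (P2w-like)
```

A pattern label of this kind is not diagnostic on its own and would normally be read together with Pr/Ph, the gammacerane index, and β-carotane/n-Ci.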
Source identification of crude oils
The Mahu Sag contains four source units representing depositional settings that range from reducing alkaline lacustrine to relatively oxidizing. According to Wang and Kang (1999), Wang et al. (2001), and Tao et al. (2019), the tricyclic terpane distribution is an effective parameter for separating different source rock types. Diterpanes, which include tricyclic terpanes, have been proposed as marker compounds for higher plant materials in sediments and crude oils, especially the C19 tricyclic terpanes (e.g., 19-norisopimarane) and C20 tricyclic terpanes (e.g., isopimarane), which are abundant in higher plants and are the primary constituents of conifer resins (Noble et al., 1986). The plot of C21/C23 versus (C19 + C20)/C23 tricyclic terpanes shows a positive linear relationship between the two ratios. The P1f source rocks have the lowest relative concentrations of C19, C20, and C21 tricyclic terpanes, with C21/C23 and (C19 + C20)/C23 ratios <1.1 and <2.0, respectively. In contrast, the C and P1j source rocks and the P2w source rocks generally have C21/C23 ratios >1.1, and the P2w source rocks have higher C21/C23 ratios (3.0 to 9.0) than the C and P1j source unit (0.9 to 5.0) (Tao et al., 2019, 2021). The parameters introduced above for distinguishing the source rock units are then used to classify the oils into four source-distinctive groups.
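The two tricyclic terpane ratios used in this screening are straightforward to compute from peak areas. The following sketch applies the P1f cutoffs quoted above (C21/C23 < 1.1 and (C19 + C20)/C23 < 2.0); the function names and the example peak areas are hypothetical, and the overlapping C/P1j and P2w ranges are deliberately not separated here because the text gives no single cutoff between them.

```python
def tricyclic_ratios(c19, c20, c21, c23):
    """Return (C21/C23, (C19 + C20)/C23) from tricyclic terpane peak areas."""
    if c23 <= 0:
        raise ValueError("C23 tricyclic terpane abundance must be positive")
    return c21 / c23, (c19 + c20) / c23


def is_p1f_like(c19, c20, c21, c23):
    """True when both ratios fall below the P1f cutoffs cited in the text."""
    r21_23, r1920_23 = tricyclic_ratios(c19, c20, c21, c23)
    return r21_23 < 1.1 and r1920_23 < 2.0


# Hypothetical peak areas, for illustration only
print(tricyclic_ratios(5, 20, 40, 50))   # (0.8, 0.5)
print(is_p1f_like(5, 20, 40, 50))        # True
print(is_p1f_like(30, 60, 80, 40))       # False
```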
Source of group I oils (P1f)
Group I oils are characterized by low relative concentrations of C19, C20, and C21 tricyclic terpanes, with C21/C23 and (C19 + C20)/C21 tricyclic terpane ratios no greater than 1.1 and 1.5, respectively, indicating limited terrestrial organic matter input (Figures 6(c) and 7(a)) (Noble et al., 1986; Tao et al., 2019). The lack of terrestrial input to the P1f source rocks is also reflected by other molecular parameters, including the ratios of β-carotane/n-Ci and gammacerane/C30 hopane. β-Carotane is believed to be derived from β-carotene, a compound that is sensitive to redox conditions and easily oxidized, so it can only be preserved under strictly reducing conditions (Ma et al., 2004). β-Carotane is formed by hydrogenation (reduction) of β-carotene, and its significant occurrence indicates a saline or hypersaline anoxic depositional environment (Irwin and Meyer, 1990; Ma et al., 2004; Peters et al., 2007). Gammacerane is often found in sediments deposited under hypersaline conditions but is not exclusive to such environments (Mello et al., 1988a; Moldowan et al., 1985). Gammacerane is an indicator of water column stratification, which is commonly associated with hypersalinity; bacterivorous ciliates living below the chemocline biosynthesize tetrahymanol, which is believed to be the precursor of gammacerane (Damsté et al., 1995). Among the four oil groups, group I oils have the highest relative concentrations of β-carotane and gammacerane (Figures 6(a), (b) and 7(a)), indicating that group I oils originate from sediments deposited under hypersaline conditions with water column stratification, which is consistent with the mineralogy and molecular geochemical behavior of the P1f source rock reported in previous studies (Cao et al., 2006, 2015; Tang et al., 2022; Tao et al., 2019; Xia et al., 2022; Zhang et al., 2018, 2000, 2022).
In addition, group I oils show the lowest Pr/Ph ratio of all the oil groups (Figures 6(a), (b) and 7(a)). Phytane (Ph) predominance arises from the reduction of phytol residues or bacterial cell walls under anoxic conditions, so a lower Pr/Ph ratio also indicates a reducing environment (Moldowan et al., 1985).

Representative chromatograms for the different genetic oil groups of the Upper Wuerhe Formation in the south Mahu Sag. From left to right: GC-MS TIC of whole oil showing the alkane distribution, m/z 191 mass fragmentogram showing the distribution of terpanes, and m/z 217 mass fragmentogram showing the distribution of steranes and diasteranes. Detailed information about the oil samples and peak identification is shown on the figure. GC-MS = gas chromatography-mass spectrometry; TIC = total ion chromatogram.
Pr/n-C17 and Ph/n-C18 (acyclic isoprenoids and their corresponding n-alkanes) data for oils from the Upper Wuerhe Formation indicate that a portion of the group I oil samples is derived from marine or saline lacustrine organic matter deposited in a strongly reducing environment, whereas the remainder plot in the weakly reducing field (Figure 6(d)). This difference in the molecular geochemistry of group I oils is interpreted to be caused by facies variations in the P1f source rock: the depocenter of the P1f is dominated by muddy dolomites or dolomitic mudstones formed under highly reducing conditions with lower Pr/Ph, whereas the off-depocenter area is dominated by mudstones deposited in a weakly reducing environment with relatively high Pr/Ph; the average Pr/Ph values of the two are 0.81 and 1.02, respectively (Tao et al., 2019). Group I oils have δ13C values in a relatively low and narrow range of −30.94‰ to −29.55‰ (mean of −30.22‰), which correlates well with the district I oils of Yu et al. (2017) and the group I oils of Zhang et al. (2022); those oils are interpreted as P1f derived, and their δ13C values are mainly distributed between −30‰ and −28‰ and are rarely lower than −30‰ (Yu et al., 2017; Zhang et al., 2022).
Source of group II oils (C/P1j)
Group II oils are mainly characterized by higher C21/C23 tricyclic terpane ratios, all of which are >1.1, and the accompanying (C19 + C20)/C23 ratio also increases, with an average of 1.13 compared to an average of 0.93 for group I oils (Figures 6(c) and 7(b)). This indicates that group II oils differ from those in group I: the increased relative concentrations of the C19, C20, and C21 tricyclic terpanes suggest a relatively large higher plant input to group II oils, which is consistent with an origin from the C and P1j source rock unit. This inference is also supported by the lower relative concentrations of β-carotane and gammacerane (Figures 6(a), (b) and 7(b)), reflecting that the source of group II oils formed in a typical marine or lacustrine setting with lower salinity and weak water column stratification. Similar features are also shown in the plot of Pr/n-C17 versus Ph/n-C18, which is used as a proxy for the organic matter types and redox conditions of a given source. All the group II oil samples plot in the field of mixed sources deposited under weakly oxidizing to reducing environments, which corresponds to the multiple organic matter types in the C and P1j source rocks, primarily benthic algae and higher plant-related components (Tao et al., 2019).
The distributions of C27, C28, and C29 regular steranes (Figure 6(e)) and the carbon isotope ratios of the whole oils (Figure 6(f)) may also support an origin from C and P1j source rocks with multiple types of organic matter; however, these two parameters are only supplementary rather than discriminative. The regular sterane 20R epimers of the 5α(H), 14α(H), and 17α(H) forms of C27 to C30 steranes are inherited directly from higher plants and algae without variation during maturation. The relative proportions of C27, C28, and C29 regular steranes in oils are controlled by the types of contributing organisms and are often used for facies interpretation (Waples and Machihara, 1990). A predominance of C27 steranes is almost always associated with a contribution from marine organisms. The precursor of C29 sterane is closely related to nonmarine organic matter. The C28 sterane may become more abundant through geological time, with an increasing C28 to C29 ratio in marine systems (Grantham and Wakefield, 1988). In group II oils, the relative content of C27 sterane increases against a background of C29 sterane predominance (Figure 6(e)), which may be caused by a contribution from marine or lacustrine organisms in addition to the significant contribution of terrigenous organic matter. Isotopically, a portion of the oil samples fall into the range of group I, while the remainder are isotopically lighter than the other oil groups (Figure 6(f)). The distribution of C27 to C29 regular steranes and the δ13C of the crude oils therefore supplement the inference that group II oils are derived from the C and P1j source unit, with mixed organic matter (algae and higher plants) deposited in a nearshore marine or transitional environment (Tao et al., 2019).
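The relative sterane proportions plotted on a ternary diagram such as Figure 6(e) follow the normalization given in the abbreviations to Table 2, C27(%) = C27ααα20R/(C27ααα20R + C28ααα20R + C29ααα20R). A minimal sketch of that normalization is shown here; the function name and the example peak areas are hypothetical.

```python
def regular_sterane_percent(c27, c28, c29):
    """Normalize C27, C28, and C29 ααα20R sterane abundances to percentages."""
    total = c27 + c28 + c29
    if total <= 0:
        raise ValueError("sum of sterane abundances must be positive")
    return tuple(100.0 * x / total for x in (c27, c28, c29))


# Hypothetical m/z 217 peak areas, for illustration only
c27_pct, c28_pct, c29_pct = regular_sterane_percent(30, 20, 50)
print(c27_pct, c28_pct, c29_pct)  # 30.0 20.0 50.0
```

Because the three percentages sum to 100, an increase in C27 necessarily lowers the combined share of C28 and C29, which is why the steranes are read as relative proportions rather than absolute concentrations.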
Source of group III oils (P2w)
Group III oils stand out from the rest with relatively high concentrations of plant-derived tricyclic diterpanes such as the C19 and C20 tricyclic terpanes (Figure 6(c)), negligible β-carotane and gammacerane concentrations, and the highest Pr/Ph ratio (3.02) (Figures 6(a), (b) and 7(c)). In addition, the plot of pristane (Pr)/n-C17 versus phytane (Ph)/n-C18 suggests that group III oils originated from terrigenous organic matter (Figure 6(d)). Such a combination of biomarker distributions is characteristic of deposition in a relatively oxidizing proximal freshwater environment with high terrigenous organic matter input. Significant terrestrial input is consistent with the P2w source rocks, which were deposited in relatively oxidizing limnetic facies dominated by organic matter derived from higher plants (Tao et al., 2019). In addition, group III oils show a strong predominance of terrestrially derived C29 sterane with only low to moderate C27 and C28 regular steranes (Figure 6(e)), similar to the oils reservoired in the lignite siltstones of the nonmarine Elko Formation of Nevada, which are dominated by terrestrial plant material (Palmer, 1984).
Compared with group I oils, groups II and III oils have biomarker signatures indicative of oxidizing environments affected by higher plant inputs. Whereas group III oils showed highly oxidative signatures compared to group II oils as reflected by the comprehensive biomarker parameters (Figure 6). Tao et al. (2019) have examined the organic petrography and molecular geochemistry of C/P1j and P2w source rocks in the northwestern Junggar Basin. The C and P1j kitchens contain organic matters derived from multiple sources including benthic algae and higher plant components, and they were interpreted as deposited in a weakly oxidizing to reducing environmen. In contrast, the P2w kitchens have distinctly different organic matter compositions and corresponding biomarker compositions; they were dominated by higher plant components and generally had high Pr/Ph values (mean of 2.6), indicating a relatively oxidizing environment (Qin et al., 2022; Tao et al., 2019).
Group III oils in this study is a high-wax oil with a high Pr/Ph ratio >3.0 and a strong odd-carbon preference in the region from n-C17 to n-C25 (Figure 7(c), left picture). These features are characteristic of a relatively oxidizing freshwater lacustrine depositional setting (Peters et al., 2000; Robinson, 1987). These oils contain little tricyclic terpane, but the low region of the m/z 191 chromatograms is dominated by C20 tricyclic terpane and C24 tetracyclic terpanes (Figure 7(c), middle picture), which are typical characteristics of oils derived from terrigenous organic matter deposited in a freshwater fluviodeatic and lacustrine depositional environment (Noble et al., 1986; Peters et al., 2007; Robinson, 1987; Waples and Machihara, 1990). A similar distribution of terpane has also been reported in Jurassic and Cretaceous oils in the Chinese Junggar Basin (Wang and Kang, 1999), Permian Karmona oils in the South Australian Cooper Basin (Philp and Gilbert, 1986), and Lower Neocomian oils in Brazilian Offshore Basins (Mello et al., 1988b), and their associated source rocks are coal and carbonaceous mudstone. Group III oil is interpreted as derived from P2w source rocks as indicated by associated molecular geochemistry behaviors, especially the peak-to-peak similarity of terpane distribution. This shows that the oil is very similar to their interpreted P2w source rock (Figure 7(d)), which has been demonstrated to be deposited in a relatively oxidizing setting containing mainly terrestrial organic matter, such as sporopollens and land–plant fragments (Tao et al., 2019).
Source of group IV oils (C/P1j mixed P1f)
Group IV oils are interpreted to represent mixed oils derived from C/P1j and P1f source rocks. Evidence for this is provided, for example, by a C21/C23 tricyclic terpane ratio significantly above one combined with moderate relative concentrations of β-carotane and gammacerane (Figures 6(a) to (c) and 7(e)). The high C21/C23 ratio of group IV oils closely resembles that of group II oils derived from C and P1j source units. However, the moderate abundance of β-carotane and gammacerane suggests a saline lacustrine setting with definite water column stratification, a characteristic shared with group I oils derived from P1f source rocks.
A C21/C23 tricyclic terpane ratio greater than one is a characteristic geochemical signature of both the C/P1j source unit and the P2w source rocks (Cao et al., 2005; Ma et al., 2015; Tao et al., 2019; Zhang et al., 2000). However, the influence of P2w source rocks is ruled out by the low to moderate Pr/Ph and (C19 + C20)/C21 tricyclic terpane ratios of group IV oils (Figure 6(b) and (c)). All the oil samples plot in the area indicating a reducing to weakly oxidizing environment with mixed sources (Figure 6(d)). The distribution of C27 to C29 regular steranes in group IV indicates input from multiple sources, including aquatic algae and higher plants (Figure 6(e)). Isotopically, the δ13C values of group IV oils range widely from −31.46‰ to −28.54‰, and this 2.92‰ difference is difficult to explain for oils from a single source (Figure 6(f)). The above characteristics indicate both a hypersaline, reducing depositional setting with significant algae-related input and a weakly oxidizing environment influenced by terrigenous organic matter. Therefore, group IV oils are believed to be mixtures of oils derived from C/P1j and P1f source rocks.
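The isotopic spread quoted above follows directly from the two end-member δ13C values. The short check below reproduces the arithmetic; the function name is an illustrative assumption, and only the two end-member values come from the text.

```python
def isotope_spread(d13c_values):
    """Return the max-minus-min spread of δ13C values, in per mil."""
    return round(max(d13c_values) - min(d13c_values), 2)


# End-member δ13C values (per mil) quoted in the text for group IV oils.
group_iv_end_members = [-31.46, -28.54]
spread = isotope_spread(group_iv_end_members)
print(f"Group IV δ13C spread: {spread:.2f} per mil")
```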
Spatial distribution of the different crude oil types
The crude oil reservoired in the Upper Wuerhe Formation is of mixed origin because it is derived from multiple kitchens, including C/P1j, P1f, and P2w source rocks. However, when we examine the spatial distribution of these supposedly mixed-source oils, we observe distinct regions of accumulation rather than pervasive mixing. A relatively clear feature is that the P1f source rocks make the most significant contribution to the study area, and almost all areas contain crude oil from this kitchen (Figure 8); the oils belonging to group I (24 samples, P1f-derived) are primarily located in the Zhongguai uplift. In contrast, the C/P1j kitchen has a limited influence on the Zhongguai uplift, and group II oils (7 samples) are all located in the transition zone (Zhongguai slope) between the Zhongguai uplift and the Mahu Sag. The P2w source rocks make only a minor contribution to the crude oils in the study area; only one sample geochemically similar to the P2w kitchen is documented in the Zhongguai uplift (Figure 8). Group IV oils, a geochemical hybrid group from mixed P1f and C/P1j kitchens, occur mainly in the Mahu Sag rather than in the Zhongguai uplift (Figure 8).

The spatial distribution of different genetic group oils. Black dots represent oils derived from P1f source rock; red dots represent oils derived from C and P1j source unit; green dots represent oils derived from P2w source rock; blue dots represent mixed oils derived from P1f and C/P1j source rocks. The main parameters used for crude oil classification are shown to the right of the well location dot: (a) C21/C23 tricyclic terpane ratio; (b) β-carotane/n-C
Compared with the lower-wax crude oils in the Zhongguai uplift, the waxy or high-wax crude oils in the study area are mainly distributed in the Zhongguai slope and the Mahu Sag (Figure 5), which is consistent with the oil-source correlation result that the oils in this region are mainly derived from C and P1j source rocks with significantly higher terrigenous plant input. According to the classification diagram of Thompson (1983), groups II and IV oils in this study fall into the highly mature region with higher heptane and isoheptane values, whereas group I oils plot in the mature to highly mature region, and only one oil sample (group III) falls into the low-maturity region. Although these maturity parameters can be modified by secondary effects such as biodegradation and evaporative fractionation, their reliability in reflecting changes in maturity is supported by the fact that the highly mature crude oils (groups II and IV) in Figure 9(a) correspondingly contain a higher relative content of saturated hydrocarbons than the mature oils (group I) (Figure 4).

Variation of light hydrocarbon compositions and the wax content within the crude oil in the study area. (a) Variations in heptane and isoheptane index values indicate the variation in the maturity of different oils in the study area (cross plot is modified from Thompson, 1983); (b) cross plot of wax content for different sourced oils versus depth. Heptane index = heptanes/isopentane ×100; isoheptane index = (2-methylhexane + 3-methylhexane)/[(1c3 + 1t3 + 1t2)-dimethylcyclopentane].
However, the high degree of maturity of groups II and IV oils may also contribute to their higher wax content; this possibility is tested by comparing the wax content of the different genetic oils with their current burial depth (Figure 9(b)). The results show that wax content is not directly related to burial depth and even decreases in deeply buried areas. Instead, it has a clear relationship with oil type, as groups II and IV oils generally have higher wax contents than group I oils, even at the same depth. Therefore, the higher wax content of these oil samples is mainly a result of source influence rather than thermal maturity, which supports the positive oil-source correlation discussed above: the waxy oils of groups II and IV are derived from C and P1j source rocks with significant terrigenous organic matter input.
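The isoheptane index defined in the caption above is a simple ratio of C7 light hydrocarbon abundances. A minimal sketch of that calculation is given here; the function and argument names are hypothetical, and the input values are illustrative rather than measured data.

```python
def isoheptane_index(mh2, mh3, dmcp_1c3, dmcp_1t3, dmcp_1t2):
    """Isoheptane index as stated in the caption:
    (2-methylhexane + 3-methylhexane) /
    (1c3- + 1t3- + 1t2-dimethylcyclopentane).

    All inputs are abundances (for example GC peak areas) on one scale.
    """
    denominator = dmcp_1c3 + dmcp_1t3 + dmcp_1t2
    if denominator <= 0:
        raise ValueError("dimethylcyclopentane abundances must sum to > 0")
    return (mh2 + mh3) / denominator


# Hypothetical peak areas, for illustration only (not values from the study).
example_index = isoheptane_index(2.0, 2.0, 1.0, 1.0, 2.0)
print(f"isoheptane index = {example_index:.2f}")
```

Higher values of this ratio plot toward the more mature fields of the Thompson (1983) cross plot used in Figure 9(a).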
Constraining factors and accumulation restoration
Source rock distribution controlled oil near-source accumulation
C and P1j are widely distributed throughout the study area and have similar stratigraphic distributions, with a common depocenter in the southern part of the Mahu Sag, where their thickness can exceed 1000 m (Figure 10(a) and (b)). The thickness gradually decreases toward the Zhongguai slope and uplift areas, and the residual thickness in the Zhongguai uplift becomes zero. The P1f is also widely distributed in the study area, and its depocenter is located in the center of the Mahu Sag, where the thickness of the P1f is up to 1200 m (Figure 10(c)). In contrast, the P2w has a different stratigraphic distribution from the C/P1j and P1f source rocks; its depositional center is far from the study area and is located in the northeastern part of the Mahu Sag (Figure 10(d)).

Maps showing the stratigraphic distribution and residual thickness of four sets of source rocks in the northwestern Junggar Basin. (a) Carboniferous (C); (b) Lower Permian Jiamuhe Formation (P1j); (c) Lower Permian Fengcheng Formation (P1f); and (d) Middle Permian Lower Wuerhe Formation (P2w). The distribution of different genetic group oils is embedded in this map to better analyze the stratigraphic relationship between them and the source rocks.
By combining the oil-source correlation conclusions with the stratigraphic distributions of the four sets of source rocks, it is suggested that the hydrocarbons from these kitchens accumulated near their sources, particularly those from C/P1j source rocks. The evidence for near-source accumulation is two-fold. First, the oils sourced from C/P1j kitchens are mainly found in the Mahu Sag area, which is located near the depocenter of the C/P1j source rocks (Figure 10(a) and (b)). Second, few oil samples correlate with P2w kitchens. Only one sample is found, in well K841 in the slope area, and this is consistent with the geological fact that the depocenter of the P2w is far from the study area. Although the depocenter of the P1f is not in the study area, most of the oils are found to be related to this kitchen, especially in the Zhongguai uplift, where all oil samples were interpreted as derived from P1f source rocks. Geochemically similar oils related to the P1f source rocks were also discovered in the Dabasong and Xiayan uplifts in the southern Mahu Sag, both of which are far from the depocenter of the P1f source rocks. The widespread hydrocarbons derived from P1f source rocks may be related to the wide distribution of the P1f in the northwestern Junggar Basin. They probably also reflect a higher hydrocarbon generation capacity, which could be accompanied by more significant hydrocarbon expulsion and earlier hydrocarbon charging from the P1f source rocks (Cao et al., 2006; Tao et al., 2019).
Burial-thermal evolution history controlled oil charging and mixing
To better reveal the deeper factors that influence hydrocarbon charging and mixing in the complex oil reservoirs of the Upper Wuerhe Formation, the burial history is reconstructed for two representative areas: the shallower buried Zhongguai slope zone (Figure 11(a)) and the deeper buried Mahu Sag zone (Figure 11(b)). The results show that source rocks in the different structural zones have different burial and thermal histories and, therefore, that the two regions have different oil generation and charging histories. Three oil-charging events have been revealed based on the homogenization temperatures of hydrocarbon-bearing inclusions in the northwestern Junggar Basin, which occurred in the Middle-Late Permian, Late Triassic to Early Jurassic, and Early Cretaceous, respectively (Cao et al., 2005, 2006). In the Middle-Late Permian, the P1f and C/P1j source rocks had both reached the mature stage in the Mahu Sag, with Ro ranges of 0.7% to 1.0% and 1.0% to 1.3%, respectively. However, in the Zhongguai slope area, the C/P1j source units were only in the low-maturity stage, and the P1f source rocks had not entered the mature stage (Ro < 0.5%). Thus, the oil charging histories of the slope and sag zones had already diverged by this first hydrocarbon charging event.
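The maturity stages referred to in this section can be read off vitrinite reflectance values with a simple lookup. The sketch below uses assumed Ro boundaries of 0.5%, 0.7%, 1.3%, and 2.0%, a commonly used convention that is broadly compatible with the ranges quoted here but is not stated by this study; the names are hypothetical.

```python
# Assumed upper Ro (%) bounds for each stage. This is a common convention
# and an assumption of this sketch, NOT a set of thresholds from the study.
ASSUMED_STAGES = [
    (0.5, "immature"),
    (0.7, "low-mature"),
    (1.3, "mature"),
    (2.0, "high-mature"),
]


def maturity_stage(ro_percent):
    """Map a vitrinite reflectance value (Ro, %) to an assumed stage name."""
    if ro_percent < 0:
        raise ValueError("Ro must be non-negative")
    for upper_bound, stage in ASSUMED_STAGES:
        if ro_percent < upper_bound:
            return stage
    return "overmature"


# Ro values chosen inside the ranges quoted in the text, for illustration.
for ro in (0.4, 0.85, 1.15, 1.6):
    print(f"Ro = {ro:.2f}% -> {maturity_stage(ro)}")
```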

Diagram showing the burial and thermal evolution of two representative wells in the study area. (a) Zhongguai slope zone, well K80; (b) Mahu Sag zone, well MH3. These two representative simulation wells are marked by green five-pointed stars in Figure 1(b). The basic tectonic and geothermal evolution data are from Qiu et al. The red arrows represent the timing of hydrocarbon charging and are from Cao et al. (2006).
The Late Triassic to Early Jurassic is recognized as the main phase of petroleum generation and migration (Cao et al., 2006). In the Mahu Sag area, the P1f kitchens were at the mature to high-mature stage (∼1.3% Ro), and the C/P1j source units had entered the high- to overmature stage (∼1.3–2.0% Ro) during this primary charging period (Figure 9(b)). Both the P1f and C/P1j source rocks were mature enough to generate and expel significant hydrocarbons, especially the P1f source rocks, which could generate enough petroleum to charge the whole study area owing to their high hydrocarbon generation capacity, higher total organic carbon content, and better organic matter types. The C/P1j source units have moderate hydrocarbon generation potential, and the oils derived from them in this period mainly accumulated near their source in the Mahu Sag area.
In contrast, the C/P1j source rocks in the slope area had only just entered the mature stage (∼1.0% Ro) (Figure 9(a)), and the volume of oil generated from C/P1j kitchens in the shallower Zhongguai uplift is believed to be smaller than that in the deeply buried Mahu Sag zone. This is why almost all of the oils related to the C/P1j source rocks are found in the Mahu Sag, where they are intensely mixed with P1f oils, while only a small amount occurs in the Zhongguai uplift.
A comprehensive oil accumulation model
In the complex P3w reservoir, oils from different sources accumulate in their own favorable zones rather than mixing throughout the area. The different stratigraphic distributions of the four sets of source rocks and their distinct thermal histories in different areas contribute to the differential oil charging and mixing. In addition, multistage tectonic activity in the basin created many fracture sets, which include (1) Hercynian to Indosinian thrust faults; (2) Indosinian to Himalayan strike-slip faults; and (3) Indosinian to Yanshanian normal faults (Figure 1(c)) (Kuang et al., 2022; Shao et al., 2011; Tao et al., 2006; Zhang et al., 2010). These fractures serve as critical vertical migration pathways for hydrocarbons from multiple source rocks; the first two types in particular were active during the primary hydrocarbon generation period of the source rocks, which favored the charging and mixing of hydrocarbons in the reservoirs.
The P1f source rocks reached the mature stage in the Mahu Sag in the Middle-Late Permian, and in the Late Triassic to Early Jurassic they are considered to have been at the mature to high-mature stage (∼1.3% Ro) (Cao et al., 2006). The P1f source rocks experienced two main phases of hydrocarbon charging and are believed to have generated and expelled enough petroleum to charge the whole study area, which is consistent with the distribution of P1f oils in the P3w reservoirs. The P1f oils (group I) occur as the sole oil type in the Zhongguai uplift and are also mixed with the C/P1j oils (hybrid group IV oils) in the Mahu Sag (Figure 12).

Cross-section showing the general model for oil accumulation and mixing in the Upper Wuerhe Formation. The location of cross-section BB’ is shown in Figure 1(b). The differently sourced oil reservoirs are displayed in different colors. Red indicates a single genetic group derived from P1f source rocks; blue mixed with pink indicates a mixed reservoir derived from C/P1j and P1f source rocks. C = Carboniferous; P1j = Lower Permian Jiamuhe Formation; P1f = Lower Permian Fengcheng Formation.
In the Mahu Sag, significant C/P1j oils generated during the main hydrocarbon charging stage (Late Triassic to Cretaceous) migrated vertically into the reservoir along the faults, gradually displacing and mixing with the previously charged P1f oils. The areas near the faults would develop a single-sourced group II oil zone dominated by later-charged C/P1j oils (Figure 12). As the charging capacity of the C/P1j oils gradually decreased away from the fault center, they increasingly mixed with the earlier-charged P1f oils and took on mixed C/P1j and P1f characteristics, forming a hybrid group IV oil zone near the Mahu Sag (Figure 12). In the Zhongguai uplift, there is little charging and mixing of C/P1j oils because it is far from the depositional center of the C/P1j source unit. Additionally, compared to the P1f kitchen, the lower hydrocarbon-generating potential of the C/P1j makes it difficult to deliver large amounts of hydrocarbons to areas far from their hydrocarbon generation center. Therefore, the oils derived from the C/P1j source unit mainly accumulated near their source and mixed in the Mahu Sag.
Significance
The oil-source correlation in this study indicates that the crude oils come from source rocks formed in different depositional settings, which also implies a multistage evolution of the superimposed basin. Geochemically, the C/P1j source rocks were deposited in a nearshore marine or transitional environment with mixed organic matter, including algae and significant higher plant input. This source rock is the result of sedimentation in a remnant ocean basin that developed in the Late Carboniferous, and river inputs influenced it because of the small area of the remnant basin (Carroll et al., 1990; Zhang et al., 2007). The P1f was deposited in a hypersaline lacustrine environment, as indicated by its abundant β-carotane and gammacerane. The limited input of higher plants reflects the rapid increase in lake area and water depth during this period. Water salinity and reducing conditions were enhanced by strong evaporation and water column stratification; consequently, a large amount of organic matter was preserved in this strongly reducing environment (Cao et al., 2015; Tao et al., 2019; Xia et al., 2021). During the deposition of the P2w source rocks, the water body was less saline than during the depositional period of the P1f, and the P2w source rocks were deposited in a relatively oxidizing proximal freshwater environment dominated by organic matter derived from terrestrial plant material. Therefore, we can infer that from the Carboniferous to the Early Permian, and finally to the Middle Permian, the study area successively experienced remnant marine basins characterized by transitional facies, reducing hypersaline lacustrine basins, and relatively oxidizing freshwater lacustrine basins.
The distribution of hydrocarbons in superimposed basins is complex and difficult to predict owing to multistage hydrocarbon generation and accumulation. This study provides an example of how hydrocarbons accumulate in superimposed basins in the presence of multiple source rocks and multistage hydrocarbon charging events. Although the mixing of oils from different source rocks is common, the intensity of mixing varies between regions, which is mainly controlled by the different stratigraphic distributions of the source rocks and their different burial histories in different areas. The significant accumulation of C/P1j oils and their mixing with the earlier-charged P1f oils only in the Mahu Sag reflects (a) near-source accumulation around the source rock depocenter, and (b) the greater burial depth and correspondingly more advanced thermal evolution in the Mahu Sag. Reactivated Hercynian faults and newly formed Indosinian fracture systems acted as the main vertical pathways for oil migration in the primary phase of the oil charging events (Late Triassic). Therefore, although there are multiple sets of source rocks with different depocenters in petroliferous superimposed basins, the hydrocarbons generated by them tend to accumulate near their respective largest hydrocarbon-generating areas.
Conclusions
A comprehensive study of oils in the Upper Wuerhe Formation of the northwestern Junggar Basin indicates that they are derived from mixed sources. Forty-six oil samples were classified into four genetic groups using representative molecular and isotopic parameters. The genetic group I oils were generated from P1f source rocks with limited higher plant input, and they are widely distributed throughout the study area, from the Zhongguai uplift to the Mahu Sag. The genetic group II oils were generated from the source rocks of the Carboniferous and Jiamuhe Formation (C/P1j), as indicated by their increased terrigenous organic matter input, and their distribution is limited to the Mahu Sag area, coinciding with the depocenter of the C/P1j source rocks. Group III oils were generated from P2w mudstone deposited in a highly oxidizing environment with predominantly higher plant input; only one such sample was found, in the slope area. The genetic group IV oils are interpreted as a mixed genetic group derived from P1f and C/P1j source rocks; their distribution is mainly limited to the Mahu Sag, with little in the slope and uplift areas.
The different occurrences and mixing of the various genetic oils are mainly controlled by the different stratigraphic distributions and the burial-thermal evolution history of the source rocks in the study area. The significant accumulation of groups II and IV oils related to C/P1j source rocks in the Mahu Sag results from its proximity to the depocenter of the Carboniferous and Jiamuhe Formation (C/P1j) and from the higher degree of thermal evolution of these source rocks there. Our oil-source correlation indicates that the hydrocarbons were generated from different kitchens in the superimposed basin and tend to accumulate near the corresponding depocenters.
