Gene networks in biological systems are not only nonlinear but also stochastic due to noise corruption. How to accurately estimate the internal states of the noisy gene networks is an attractive issue to researchers. However, the internal states of biological systems are mostly inaccessible by direct measurement. This paper intends to develop a robust extended Kalman filter for state and parameter estimation of a class of gene network systems with uncertain process noises. Quantitative analysis of the estimation performance is conducted and some representative examples are provided for demonstration.
Biochemical networks1,2 involving metabolic networks, gene regulatory networks and signal transduction networks in biological systems play important roles in diagnosis of disease such as cancer and auto-immunity. Systems biology2–4 is quite different from traditional biology; it has been developed recently to understand biological systems from system level. Researchers have been devoted to design and construct of biological models by engineering methods and molecular biology techniques. The microarray technology using high-throughput method to measure a large number of gene expression states has also been attempted in the recent decade. According to the measured data it is possible to reconstruct the structure of biochemical networks, quantitatively analyze and systematically design and simulate the system behavior in silico.
In the literature,1,5,6 some mathematical models suitable for describing behaviors of the biological systems have been proposed. That kind of models can be classified into two major categories: logical model in the discrete-time domain and differential equation set in the continuous-time domain. Different from the deterministic case, biochemical networks of the real biological systems are generally non-ideal and are invariably noisy. For modeling accuracy, the influence resulting from noise contamination should not be ignored. In general, molecular noises involve intrinsic noises resulting from molecular birth and death and extrinsic noises due to environmental perturbations.7 The stochastic model was developed to characterize the biological systems with intrinsic and extrinsic molecular noises. In,8 the authors have presented a method for measuring performance robustness and presented two mechanisms to cope with the noise and process uncertainties.
With regard to control of biological systems, nonlinear feedback control strategies were applied to regulate the steady state of biological systems in.9–11 Other issues received increasing attention for noisy biological systems are development of control strategies while ensuring robust stability and filtering ability. In,12,13 a robust filtering circuit design has been developed by regulating parameters for gene networks to reduce intrinsic and extrinsic molecular noises. An adaptive control design method was also proposed in.14
Before performing feedback control, the system states should all be available. However, the internal states of most biological systems can only be observed partially. Under the situation, a state estimator is appropriate to reconstruct the full state in the noisy environment. The Kalman filter (KF)15 is the one for the purpose in the filed of engineering which has been well applied from system and control to signal processing for decades. However, applications of the extended KF (EKF) in state estimation of biochemical networks are rarely found.16 Until recently, the EKF has been attempted to estimate parameters of the gene regulatory networks.17 Moreover, a state observer was actually established using the EKF, based on the fluorescence probe model, a dynamic state model of the plant cell bioreactor, and online green fluorescent protein fluorescence measurement.18
While there were a few papers dealing with the issue of state estimation for biological networks, most of the approaches were based on the traditional Kalman filtering theory which assumed that the noise covariances including the process noise and measurement noise have been precisely known as a priori.16–18 The KF is shown to be the optimal state estimator against noise with Gaussian distributions. However, in the biological systems, the noise distribution may not be Gaussian, its autocorrelations may not be known exactly, or even uneasily to be precisely modeled.19 In this paper, an EKF for robustly filtering the states and parameters of the noisy gene networks is introduced. Quantitative error analysis for that kind of systems is presented in details. On the basis of the results obtained one would be able to identify effect of the filtering gain and the sizes of noise uncertainties to estimation performance. Two numerical examples have been conducted to verify the proposed design.
To clarify the notation, throughout this paper, let the vector norm of the real vector x ∊ℝn, denoted by ∥x∥, be defined as with E(·) denoting the operation of expectation.
Lemma 1: The induced matrix norm ∥A∥ corresponding to the vector norm of with x ∊ ℝn is given by
where σ1 is the maximum singular value of A, ie, .
Proof: By definition, the induced norm of the operator A is defined by
Let the singular value decomposition of A is given by A = UΣVT where Σ = diag(σ1, σ2, σn) with σ1 ≥ σ2≥ … ≥ 0, the unitary matrices U = [u1 … un] and V = [ν1 … νn]. Then
That is
Alternately, if one chooses x′ = e1 = [1 0 … 0]T which obviously satisfies ∥x′∥ = 1, then
From (1) and (2), it can be concluded that ∥A∥ = σ1.
Problem Formulation
In real biological systems, biochemical networks such as signal transduction networks, gene regulatory networks, and metabolic networks are invariably noisy. As in engineering area of research, the system dynamic behavior can be mathematically described by stochastic models, which could be further used as a basis for the purpose of analysis and control.
System Modeling
S-system is a type of power-law formalism and is based on a particular type of ordinary differential equations in which the component processes are characterized by power-law functions:1,5,6
where xj (j = 1, …,ñ) are dependent variables such as intermediate metabolites and products, xj are independent variables such as substrates and enzymes, αi ≥ 0 and βi ≥ 0 are rate constants which denote, respectively, production and degradation effects for dependent and independent variables; gij and hij are kinetic orders. Gene j will active gene i when the values of kinetic orders are positive and gene j will inhibitive gene i when the values of kinetic orders are negative. Zero values of kinetic orders represent gene j won't affect gene i.
The system model can be expressed in the following generalized nonlinear biochemical dynamics with the stoichiometric equation described by
where x(t) ∊ℝn is a state vector which denotes the concentration of metabolite, mRNA or protein, p = p(αi, βi,gij, hij)∊ ℝmis the parameter vector which include rate constants and kinetic orders, S denotes the stoichiometric matrix, and V(·) is a nonlinear function of the reaction rate. It can be expressed in a more general form as
where f(·) ∊ ℝn is a generalized nonlinear function vector.
For biochemical reactions, suppose that there are M intrinsic noise sources and an extrinsic noise, the nonlinear stochastic dynamical system can be described by13
where x(t) ∊ ℝn is the state which may denote a vector of protein concentrations of n genes, f(·) and g(·) are nonlinear function vectors, ni(t) is an intrinsic noise source which is the white noise with zero mean and standard deviation σi, w(t) is an extrinsic noise source with zero mean and standard deviation σwEquation (6) implies that the system is suffered from intrinsic noise corruption due to M kinetic parameter fluctuations. In gene regulatory networks, the external and internal noises occur independently and they appear randomly.19 More precisely, it has been proposed that the pattern of protein concentration growth is stochastic, exhibiting short bursts of variable numbers of proteins at varying time intervals.
Remark: The system with intrinsic noises in (6) can also be rewritten as the following Ito stochastic equation:
where Ni and W are standard Wiener processes or Brownian motions with dNi(t) = ni(t)dt and dW(t) = w(t)dt with the property and . The formulation is widely applicable to the general nonlinear gene network with n genes.
After the stochastic differential system in (6) or (7) is modeled to mimic the realistic behaviors of the object, some design specifications can be proposed for robust design of the system so that the desired behaviors or products can be achieved in spite of intrinsic parameter fluctuations and environmental disturbances.
Estimator Design
In practice, the internal states of biological systems may not be directly accessible. Biochemical process for gene regulatory networks is DNA to mRNA by transcription, mRNA to protein by translation and the generated protein regulates the gene. Not all protein concentrations are directly measurable,18 however, one could access the status of individual protein by utilizing an appropriate state estimator. It is proposed here an EKF to estimate the internal states as well as the parameters of concern.
Fundamental Kalman Filter
The KF uses measurements for a dynamic system observed over time that contain noise, and produce estimated values that tend to be closer to the true values of the measurements.
A stochastic linear time-invariant system with measurement can be described by
and
where x(t) is the state vector, y(t) is the output, A and C are, respectively, system and output matrices, and wx(t) and ν(t) are, respectively, process and measurement noises which are zero mean Gaussian noises with covariance matrices Q and R, respectively.
With regard to the system (8)−(9), which x(t) is to be estimated, the corresponding KF is given by15
where is the estimated state and K is the optimal Kalman gain given by
where P = PT > 0 satisfies the following Riccati matrix equation:
with the estimation error state .
Extended Kalman Filter
For estimation of the stochastic nonlinear systems, the commonly applied filtering mechanism is EKF which evaluates the partial derivatives at the estimated state value and uses nonlinear functions on the estimate itself.
Suppose that the nonlinear system with measurement is given by
and
where f(·)and h(·) are nonlinear function vectors with appropriate dimensions.
In more general, we consider a generalized representation of the stochastic nonlinear dynamical model described by
where Px0 = E[(x(0) − x0)(x(0) − x0)T], Pp0 = E[(p(0) − p0) (p(0) − p0)T], p(t) ℝm denotes the aggregated parameter vector, y(t) ∊ ℝr is the measurement output, f(·) and h(·) are nonlinear function vectors. Suppose that the uncorrelated extrinsic noise wx(t), parameter noise wp(t) and measurement noise ν(t) are white noises, uncorrelated and satisfy the following properties:
where the noise uncertainties satisfy
where and R = RT > 0 with Q0, Qp0 and R0 being their corresponding nominal parts, and ε1, ε3 and ε2 are finite constants characterizing the respective upper bound of the noise covariance.
For compactness, the matrix format of (18) can next be written in the state-space representation as follows
where . The idea of the EKF is that it operates by propagating the mean and error covariance of the state through time. The EKF for (18) in the matrix form can be represented as
where is the estimated state, is the estimated parameter, ŷ(t) is the estimator output, and K is the estimator gain.
The maximal covariances of the extrinsic, parameter and measurement noises can be specified by Q0 + ε1In, Qp0 + ε3Im and R0 + ε2I respectively Define the estimation error vector and the error covariance matrix . It can be proved that the error covariance matrix P and Kalman gain K satisfy, respectively,20
where G and H are, respectively, the linearized system and output matrices given by
and
The linearization is performed around the estimated state and parameter respectively.
As it can be observed from (26) that the magnitude of the Kalman gain K is closely related to the amount of measurement noise reflected by the size of R0 and the extent of the uncertain noise covariance specified by ε2. The term L accounts for the increase of extrinsic noise and parameter noise and the term −PHT(R0+ ε2In+m)–1HP reflects the decrease of uncertainty as a result of measurement.
When there is in the absence of extrinsic noise and a priori information in the biological system, by using the following matrix identity
the continuous Riccati matrix equation (26) can be written in the linear equation in P−1 as
Clearly, either large measurement error (large R0) or large uncertainty of the measurement covariance (large ε2) cause the error covariance P to increase considerably whenever a measurement is utilized. This also results in a smaller Kalman gain for small state estimate errors and ease off the updating speed of state.
Performance Analysis
It follows from (23) and (24) that the estimation error dynamics is given by
where ΔG(z,ẑ) = Δg(ẑ) − Δg(z) and ΔH(z,ẑ) = Δh(ẑ) − Δh(z) with Δg(·) = g(·) − G and Δh(·) = h(·) − H. It is reasonable to have
where ρ1 and ρ2 are finite positive constants and
with tr(·) denoting the trace operation.
It can be observed from the estimation error dynamics (29) that its solution is
where Φ(t) = e(G−KH)t. Or equivalently,
where .
On the basis of the Kalman filtering theory, the matrix G − KH will be asymptotically stable if (G, H) is detectable around all estimated states ǩ(t). Therefore, there exist positive constants m, m1 and β such that
and
with the induced norms specified by Lemma 1, where β can be chosen to be where
K = K(∊1,∊1,∊1).
To proceed, by taking norms on both sides of (31) gives
Or equivalently
where
Applying the Bellman-Gronwall inequality further yields
Or
Clearly, when
then
That implies that wouldn't be diverged with its upper bound specified by
Equations (32) and (35) specifies a crucial condition for determining convergence of the noisily corrupted system which is determined by the Kalman gain K and the values of ρ1 and ρ2. The condition is significantly affected by the amount of linearization and noise covariances. Inspection of the equations describing the behavior of the error covariance matrix reveals several observations which confirm engineers' intuition about the operation of the KF. As it can be observed from (36) that the estimation error is closely related to the upper bounds of the unreduced certain process and parameter noises and uncertain measurement noise. The larger the statistical parameters of the disturbances as reflected in the sizes of Q and Qp, and the more pronounced effect of the disturbances as reflected in the size of R, the more rapidly the error covariance increases.
Larger Kalman gains will expedite the convergence of the estimation error. However, the estimation error increases considerably whenever there are larger noise uncertainties specified by larger ∊ i, i = 1, 2, 3 and the linearization errors characterized by larger ρ1 and ρ2.
When there are process and measurement noises and uncertainties of noise covariances, from the observation of (26), (28) and (36), one should increase the magnitude of K for a larger stability margin β so as to assure convergence of the estimation error. However, as it was shown in (28), large measurement errors and the small error convariance P result in a small K. Thus, there is always a compromise between the optimal state estimation and stability robustness while designing the state estimator.
As for the implementation issue, possibility of the practical implementation of the estimator can be referred, for example, to,18 which utilized the green fluorescent protein (GFP) as a reporter for real-time bioprocess sensing and GFP concentration and other important states in bioreactor culture of transgenic tobacco cells were successfully estimated. Application of the idea to the current estimator design deserves more attention and is worthy of further investigation.
Simulation Study
For demonstration, two examples for a class of noisy gene regulatory networks are illustrated.
Example 1: Consider first a two-order system model for a real gene regulatory network given as follows:6
and
where wx(t) ~ (0, 0.1), wp(t) ~ (0, 0.1) and ν(t) ~ (0, 0.2). States and parameters are both estimated. The parameters including 4 rate constants and 8 kinetic orders are treated as the states. Thus the state variables are extended from 2 to 14. In this network, for the first term of the first differential equation (the rate of change of x1), ie, X0.2681 (t)x2−2.26 (t) with unit rate constant α1, shows accumulation of gene product 1. Since the variable x2 is raised to the power of the kinetic parameter −2.26 which reveals gene 2 will inhibit product of gene 1. On the other hand, for the second differential equation, the first term x12.739(t) x20.155(t) with unit rate constant α2 reflects accumulation of gene product 2. Since the variable x1 is raised to the power of 2.739 which reveals that gene product 1 will activate gene 2. The second terms −x10.469 (t)x20.359 (t) and −x10.197 (t)x20.281 (t) with unit rate constants β1 and β2 in the first and second differential equations reflecting degradation effect on gene products 1 and 2 respectively. Figure 1 illustrates the branch pathway of the two-dimensional S-system network.
The linearized system matrix ∂f/∂xT based on the state estimation is
and
The gene regulatory network for Example 1.
For the initial error covariance P(0) = I14, the results of dynamic simulation of the noise-free and estimated states and parameters for the noisy gene regulatory network using the proposed robust EKF given by (26) and (27) are shown as in Figure 2.
Dynamic simulation of the noise-free and estimated gene states and parameters with P(0) = I14; A) gene states; y-axis is concentration, x-axis is time,B–D) parameters.
Consider next the existence of uncertainties of the extrinsic noise and measurement noise with ∊1= 0.05, ∊3= 0.05 and ∊2= 0.1. The state and parameter responses are shown in Figure 3. The root mean square error (RMSE) was used to quantify the filtering performance with
where T1 ≤ t ≤ T2, (t) is the i-th estimation state and xfi(t) is the corresponding noise-free state. The RMSE values for the noise-free and estimated states and parameters in 0 ≤ t ≤ 5 are listed in Table 1 which shows that the estimator is able to filter the extrinsic and measurement noises to retrieve the real state and parameter values.
Dynamic simulation of the noise-free and estimated gene states and parameters with noise uncertainties (∊ = 0.05, ∊3 = 0.05 and ∊2 = 0.1);A) gene states; y-axis is concentration, x-axis is time,B–D) parameters.
Comparison of RMSE values for the system of Example 1 with and without noise uncertainties.
Example 2: Consider a nonlinear gene regulatory of four genes described and shown in Figure 4:13
where wx ~ (0, 0.2) and wp ~ (0, 0.2). The state variables are extended from 4 to 22 when 18 parameters including rate constants and kinetic orders are all treated as the state variables. The measurement model is given as
where y(t) is the measurement output and the measurement noise ν(t) ~ N(0, 0.5). For this gene regulatory network, gene product 1 activates gene 3, gene product 2 activates genes 1 and 3, gene product 3 represses genes 2 and 4, and gene product 4 activates gene 2.
Another example of the gene regulatory network.
The linearized system matrix ∂f/∂xT based on the state estimation for the EKF design can be obtained as
and
For the initial covariance matrix P(0) = I22, the results of dynamic simulation of the noise-free gene states and the estimated states of the noisy gene network are shown as in Figure 5. As it can be seen that the estimator tracked the noise-free case well while there were extrinsic noise and measurement noise.
Dynamic simulation of the noise-free and estimated gene states and parameters with P(0) = I22; A) gene states, y-concentration, x-axis is time,B–E) parameters.
Consider next the existence of uncertainties of the extrinsic noise and measurement noise with ∊1 = 0.1, ∊3 = 0.1 and ∊2 = 0.25. The gene responses are shown in Figure 6. As in the previous example, the results exhibits larger estimation errors due to added noise uncertainties, however, the deviation is not significant while compared with magnitudes of the nominal state or parameter responses.
Dynamic simulation of the noise-free and estimated gene states and parameters with noise uncertainties (∊1 = 0.1, ∊3 = 0.1 and ∊2 = 0.25);A) gene states, y-axis is concentration, x-axis is time,B–E) parameters.
Conclusions
This paper proposes a continuous EKF to estimate internal states and parameters of a class of gene networks while there are extrinsic and intrinsic noises and parametric fluctuations. Quantitative performance analysis for state estimation of the EKF is presented. Numerical simulations have confirmed possibility of the proposed method in designing robust EKFs. This shows potential of the presented design method in bridging the engineering approach to solve for the estimation problem in biological systems.
Footnotes
Acknowledgement
This research was sponsored by National Science Council,Taiwan,ROC under the Grant NSC-98-2221-E-005–087.MY3 and partly sponsored by Chung-Shan Institute of Science and Technology,Taiwan,ROC under the Grant XB99086.
Disclosures
This manuscript has been read and approved by all authors. This paper is unique and is not under consideration by any other publication and has not been published elsewhere. The authors and peer reviewers of this paper report no conflicts of interest. The authors confirm that they have permission to reproduce any copyrighted material.
References
1.
VoitE.O.Computational Analysis of Biochemical Systems: A Practical Guide for Biochemists and Molecular Biologists.Cambridge University Press, New York; 2000.
2.
KlippE., HerwigR., KowaldA., WierlingC., LehrachH.Systems Biology in Practice: Concepts. Implementation and Application.Wiley, Berlin; 2005.
3.
PalssonB.Q.Systems Biology: Properties of Reconstructed Networks.Cambridge University Press, New York; 2006.
4.
KitanoH.Systems biology: a brief overview. Science.2002; 295: 1662–4.
5.
KarlebachG., ShamirR.Modelling and analysis of gene regulatory networks. Nature Reviews Molecular Cell Biology.2008; 9: 770–80.
6.
WangH., QianL., DoughertyQ.Inference of gene regulatory networks using S-system: a unified approach, IET Systems Biology.2010; 4: 145–56.
7.
PaulssonJ.Summing up the noise in gene networks. Nature.2004; 427: 415–8.
8.
ChenB.S., WangY.C., WuW.S., LiW.H.A new measure of the robustness of biochemical networks. Bioinformatics.2005; 21: 2698–705.
9.
Ervadi-RadhakrishnanA., VoitE.O.Controllability of nonlinear biochemical systems. Mathematical Biosciences.2005; 196: 99–123.
10.
RaoC.V., WolfD.M., ArkinA.P.Control, exploitation and tolerance of intracellular noise. Nature.2002; 420: 231–6.
11.
LinC.L., LiuY.W., ChuangC.H.Control design for signal transduction networks. Bioinformatics and Biology Insights.2009; 3: 1–14.
12.
ChenB.S., WuW.S., WangY.C., LiW.H.On the robust circuit design schemes of biochemical networks: steady-state approach. IEEE Transactions on Biomedical Circuits and Systems.2007; 1: 91–104.
13.
ChenB.S., WuW.S.Robust filtering circuit design for stochastic gene networks under intrinsic and extrinsic molecular noises. Mathematical Biosciences.2007; 211: 342–55.
14.
JinY., LindseyM.Stability analysis of genetic regulatory network with additive noises. BMC Genomics.2008; 9: S21.
15.
GrewalM.S., AndrewsA.P.Kalman Filtering: Theory and Practice Using MATLAB.Wiley, New York; 2001.
16.
LillacciG., ValigiP.State estimation for a model of gene expression.Proceedings of IEEE International Symposium on Circuits and Systems. Seattle 2008: 2046–9.
17.
WangZ., LiuX., LiuY., LiangJ., VinciottiV.An extended Kalman filtering approach to modeling nonlinear dynamic gene regulatory networks via short gene expression time series. IEEE/ACM Transactions on Computational Biology and Bioinformatics.2009; 6: 410–9.
18.
SuW.W., LiuB., LuW.B., XuN.S., DuG.C., TanJ.L.Observer-based online compensation of inner filter effect in monitoring fluorescence of GFP-expressing plant cell cultures. Biotechnology and Bioengineering.2005; 91: 213–26.
19.
McAdamsH.H., ArkinA.P.Stochastic mechanisms in gene expression. Proceedings of the National Academy of Sciences of the USA.1997: 814–9.
20.
PoorV., LoozeD.P.Minimax state estimation for linear stochastic systems with noise uncertainty. IEEE Transactions on Automatic Control.1981; 26: 902–6.