On the use of derivatives in polynomial chaos based global sensitivity and uncertainty analysis applied to distributed parameter models

Sensitivity Analysis (SA) and Uncertainty Quantification (UQ) are important components of modern numerical analysis. However, solving the relevant tasks involving large-scale fluid flow models currently remains a serious challenge. The difficulties are associated with the computational cost of running such models and with the large dimensions of the discretized model input. The common approach is to construct a metamodel: an inexpensive approximation to the original model suitable for Monte Carlo simulations. The polynomial chaos (PC) expansion is the most popular approach for constructing metamodels. Some fluid flow models of interest are involved in the process of variational estimation/data assimilation. This implies that the tangent linear and adjoint counterparts of such models are available and, therefore, computing the gradient (first derivative) and the Hessian (second derivative) of a given function of the state variables is possible. New techniques for SA and UQ which benefit from using the derivatives are presented in this paper. Gradient-enhanced regression methods for computing the PC expansion have been developed recently. An intrinsic step of the regression method is a minimization process. It is often assumed that generating 'data' for the regression problem is significantly more expensive than solving the regression problem itself. This depends, however, on the size of the importance set, which is a subset of 'influential' inputs. A distinguishing feature of distributed parameter models is that the number of such inputs could still be large, which means that solving the regression problem becomes increasingly expensive. In this paper we propose a derivative-enhanced projection method, where no minimization is required. The method is based on the explicit relationships between the PC coefficients and the derivatives, complemented with a relatively inexpensive filtering procedure. The method is currently limited to the PC expansion of the third order.
Besides, we suggest an improved derivative-based global sensitivity measure. The numerical tests have been performed for the Burgers' model. The results confirm that the low-order PC expansion obtained by our method represents a useful tool for modeling the non-gaussian behavior of the chosen quantity of interest (QoI).


Introduction
Sensitivity Analysis (SA) and Uncertainty Quantification (UQ) are important components of modern numerical analysis [8]. In SA one studies the relative importance of different inputs for the model output, which is usually represented by the quantities of interest (QoI). The main purpose of SA is the design of optimal (and feasible) control, UQ and data assimilation (DA) algorithms. In UQ one focuses on quantifying the uncertainty in the QoI, given the uncertainty in the model inputs. The most straightforward UQ method is the direct Monte Carlo method, where the model is run in a loop with the perturbed inputs; then the sample of QoI is processed to obtain its statistical characterization (skill scores, for example). The basic Monte Carlo method has a number of more sophisticated variants, e.g. using importance sampling [46], quasi-random sequences [43], etc. However, for computationally expensive fluid flow models such as those common in geophysics, these methods are not particularly suitable. While in the absence of a viable alternative the Monte Carlo method is presently used in weather and ocean forecasting (ensemble forecasting, [27]), the need for development of more efficient methods is obvious.
One possible approach is to use the 'local' sensitivities. In particular, the corresponding SA and UQ methods have been recently applied in the field of computational fluid dynamics [22,38,3,15]. Here, the adjoint (local) sensitivities are used to evaluate the variance of the uncertainty in QoI (or the covariance for a vector of QoI). The uncertainty in the input vector can be either prior or posterior, i.e. before and after DA, respectively. The posterior covariance of the input uncertainty is approximated by the inverse Hessian of the auxiliary data assimilation cost function [16]. The adjoint approach is computationally very efficient; however, its accuracy for nonlinear models depends on the validity of the tangent linear approximation for the error dynamics (case of weak nonlinearity and/or small uncertainties). This is a key assumption in ensemble forecasting [27], since the method relies on computing the singular vectors of the propagator considered at the optimal solution point. Besides, in the case of strongly nonlinear dynamics, both the posterior input uncertainty and the uncertainty in QoI may have non-gaussian probability distributions, which means that knowing the variance/covariance is not sufficient. Thus, the global sensitivity based methods have to be considered.
An alternative option is to develop the reduced-order dynamical models, either using the proper orthogonal decomposition (POD) [48,28] or the dynamic mode decomposition (DMD) [39,35]. In the POD method, the distributed state variables are represented by linear combinations of the spatial modes. These representations are substituted into the model equations and the Galerkin projection is applied. In this process the original model governed by nonlinear partial differential equations is transformed into a system of nonlinear ordinary differential equations for the time dependent weights associated with the POD modes. The resulting 'surrogate' model is much less expensive than the original one, and it is generally suitable for sampling. The method is intrusive in the sense that the surrogate model has to actually be created, which means an additional development effort. This is not always easy, although several successful examples have been reported. In the DMD method, one looks for a (linear) Koopman operator (modes, eigenvalues and eigenfunctions), which defines the evolution model explicitly. Thus, the DMD method is non-intrusive, but relies on a linear approximation of the original propagator. It is not obvious that using this type of linearized model for UQ is better than just using the local sensitivities.
Another approach is to use a phenomenological model, i.e. a model which describes the relationship between the model inputs and outputs without attempting to explain why the variables interact the way they do. Such a model can be based on the polynomial chaos (PC) expansion (for a concise review of the PC methods the reader may refer to [26]).
In the intrusive version of the PC-based UQ technique, called the 'direct nonsampling method', see [30], the uncertainty-loaded model variables are expressed using the PC expansions. These are substituted into the model equations and the Galerkin projection is applied. Technically this approach is somewhat similar to the one used with POD; however, the resulting model has to be solved just once (no sampling required).
In our paper we focus on a non-intrusive approach, where the PC expansion describes the relationship between the model inputs and outputs directly. The task of constructing the continuous response surface, given a finite sample of model responses, is solved by two main approaches: regression and projection. In the regression methods one has to minimize an appropriate cost function. A few types of regression methods are used: the least-squares regression [5,11,21], l1-minimization [33], least angle regression [7], and adaptive methods [1]. In the projection methods one benefits from the orthogonality of the polynomial bases involved, which allows the PC coefficients to be computed as multidimensional integrals [49,14,17,18]. In its original form, i.e. involving a full-tensor product quadrature, this method is not feasible for high-dimensional problems. The major upgrades of this method have been aimed at improving the integration efficiency. They include, for example, the use of sparse grid sampling techniques [40], nested quadratures (Clenshaw-Curtis), quasi-random sequences [43] or Latin Hypercube sampling [9]. In all cases, however, the integrals over the model response function (QoI) are considered. As opposed to this classical approach, in the present paper the integrals over the derivatives of the model response function are considered, which gives some essential benefits.
Variational data assimilation (DA) is a method based on the optimal control theory [25,29]. Presently, this method is preferred for weather and ocean forecasting in major operational centers around the globe, particularly in the form of the incremental 4D-Var [13], and in the form of the ensemble 4D-Var [12]. In general, the variational DA method belongs to the family of variational estimation methods, which are widely used in a variety of applications, including geophysics, astrophysics, bio-medical, mechanical and aerospace engineering. One useful bonus associated with the numerical models involved in variational estimation is the availability of the adjoint model, which allows the gradient (first derivative) of the standard data mismatch cost function to be calculated in a single model run. This adjoint model can be easily adapted to compute the gradient of any other QoI defined as a function of the state variables. Moreover, computing the Hessian (second derivative) of QoI is also possible, though more expensive, with a modification of the existing adjoint model into the second-order adjoint model [47]. Thus, at least the two first derivatives can be available in the discussed framework for the specified group of models.
The idea of the gradient-enhanced methods for computing the PC expansion has been introduced and implemented almost simultaneously by several authors [2,34,32] in the regression framework. In this paper we propose a derivative-enhanced projection method. It has been mentioned that the major drawback of the projection methods is a slow convergence of the corresponding integrals. However, the use of the derivatives accelerates this convergence. For example, if the gradients are used, the convergence of the corresponding integrals for the first- and higher-order PC coefficients is accelerated by an order of magnitude. Besides, we develop a filtering procedure which allows the sampling error to be further reduced. Another benefit of using the gradients is an easy way of accessing the 'total effect' global sensitivity indices [42]. These are used for ranking and defining the importance set, i.e. a presumably small subset of the full input including the most influential components. The proposed method is currently limited to the case of the third-order PC expansion. For the distributed parameter systems a large dimension of the full model input is expected. Moreover, the dimension of the importance set can still be large, in which case considering the PC expansion of the fourth and higher orders does not look feasible anyway. While the method is devised for the gaussian inputs (i.e. using Hermite polynomials), it can be extended to different types of inputs using other polynomials of the Askey scheme.
The paper is organized as follows. In §1 we define QoI as a function of the model input vector (called 'design function' hereafter), and its PC expansion using the 'probabilistic' Hermite polynomials. In §2 we describe the regression approach, its gradient-enhanced version and the proposed Hessian-enhanced version. The derivative-enhanced projection method introduced in this paper is based on the explicit expressions for the PC expansion coefficients via the derivatives of arbitrary order, presented in §3 (see Theorem 1, proof in Appendix 1). In §4 the relationship between the known derivative-based sensitivity measure [24] and the global sensitivity indices is analyzed (see Theorem 2, proof in Appendices 2 and 3), and an improved derivative-based sensitivity estimate is suggested (see Corollary 2). The implementation details, including the description of the filtering algorithms, are presented in §5. Taking into account that the PC coefficients are estimated and represented in the limited-memory form, the appropriate expressions for the metamodels are derived in §6. The proposed method is validated using the Burgers' model: the model equation, the adjoint equation and the Hessian-vector product operator are defined in §7. The description of numerical experiments and the results of simulations are presented in §8. The main outcomes of the paper are summarized in the Conclusions, and major mathematical proofs and derivations are collected in the Appendices.

Problem setup
Let us consider the numerical model describing the behavior of a dynamical system in terms of its state variables X ∈ X. Given a full set of the model inputs U ∈ U, the model can be considered as a 'control-to-state' mapping M : U → X:

X = M(U), (1)

where X and U are the state and control spaces, correspondingly.
For modeling the system behavior the true input set Ū must be specified. Under the 'perfect model' assumption the following can be postulated: X = M(Ū). In reality, some components of Ū contain uncertainties ξ̃ ∈ U. Thus, instead of Ū we use its best available approximation, either prior (background) or posterior (estimate):

U* = Ū + ξ̃. (2)

Because of the presence of ξ̃, the predicted state X|U*, that is, X evaluated (or conditioned) on U*, also contains an error δX = X|U* − X|Ū. In many practical situations the quantities of interest (QoI) are represented by a functional of the state D(X) ∈ G, where G is the 'design' space and D : X → G is a linear or nonlinear mapping. Thus, one can define a generalized 'control-to-design' mapping G : U → G, such that

G(U) = D(M(U)). (3)

The uncertainty ξ̃ results in the error

ε = G(U*) − G(Ū). (4)

From now on we shall call ε the design function. Quantification of ε is an important task in a variety of applications.
Below we suppose that the input space is finite-dimensional, i.e. U = R^N, and ξ̃ = (ξ̃_1, ..., ξ̃_N)^T. Consider ξ̃ in the form

ξ̃ = R(ξ), (5)

where ξ = (ξ_1, ..., ξ_N)^T with ξ_i ∼ N(0, 1) being i.i.d. gaussian random variables with unit variance, and R is a transformation from the standard normal to a desired probability distribution. For example, if R(ξ) = B^{1/2} ξ, then one obtains ξ̃ ∼ N(0, B); if R(ξ) = μ + σ exp(ξ), one gets the log-normally distributed ξ̃. Thus, the design function can finally be defined in the form

ε = ε(ξ). (6)

Let us consider the polynomial expansion of ε(ξ) using 'probabilistic' Hermite polynomials:

ε(ξ) = b_0 + Σ_{i1=1}^{N} b_{i1} H_{i1}(ξ) + Σ_{i1=1}^{N} Σ_{i2=i1}^{N} b_{i1 i2} H_{i1 i2}(ξ) + ... + Σ_{i1 ≤ ... ≤ ip} b_{i1...ip} H_{i1...ip}(ξ), (7)

where

H_{i1...ik}(ξ) = Π_{m=1}^{N} He_{n_m}(ξ_m), (8)

n_m being the number of occurrences of index m in (i1, ..., ik), and He_n the probabilists' Hermite polynomial of degree n. The convergence of the series (7) for the gaussian entries ξ is proved in [10]. The total number of coefficients b = (b_0, {b_{i1}}, ..., {b_{i1...ip}})^T is

M = (N + p)! / (N! p!), (9)

where p is the chosen order of polynomials. Representation (7) is often referred to as the polynomial chaos (PC) expansion.
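As a small illustration of the 'probabilistic' (probabilists') Hermite polynomials used above, the following sketch (Python with NumPy; not part of the original method) evaluates He_n by the three-term recurrence and checks the orthogonality relation E[He_m(ξ) He_n(ξ)] = n! δ_mn under N(0, 1) using Gauss-Hermite quadrature:

```python
import numpy as np

def hermite_e(n, x):
    """Probabilists' Hermite polynomial He_n(x) via the recurrence
    He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x), with He_0 = 1, He_1 = x."""
    x = np.asarray(x, dtype=float)
    h_prev, h = np.ones_like(x), x.copy()
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

# Expectation w.r.t. N(0,1) via Gauss-Hermite_e quadrature: the nodes/weights
# integrate against exp(-x^2/2), so we normalize by sqrt(2*pi).
nodes, weights = np.polynomial.hermite_e.hermegauss(30)

def expect(f):
    return (weights * f(nodes)).sum() / np.sqrt(2.0 * np.pi)

# Gram matrix of the first four polynomials: diag(0!, 1!, 2!, 3!), zero elsewhere.
gram = np.array([[expect(lambda x, m=m, n=n: hermite_e(m, x) * hermite_e(n, x))
                  for n in range(4)] for m in range(4)])
```

The recurrence form is preferred here because the explicit monomial formula loses accuracy for moderate degrees.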

Implicit approach (least squares, regression)
According to (9), computing the PC expansion is feasible only for small N and p. Thus, one should define the inputs U′ (and, correspondingly, ξ′) having the most impact on ε(ξ). These inputs constitute the importance set of size N′, which is expected to be much smaller than N. For the importance set, the vector of PC coefficients is b′. Below we deal with the importance set, keeping however the original notations (i.e. without ′) for the sake of simplicity. Let us represent (7) in the dot product form

ε(ξ) = H^T(ξ) b,

where H(ξ) is a vector of weights (i.e. the values of the corresponding Hermite polynomials). In the least-squares regression method one looks for the minimum of the cost function

J(b) = Σ_{l=1}^{L} (H^T(ξ_l) b − ε*(ξ_l))² + Γ(b), (10)

with

H^T(ξ_l) b = ε*(ξ_l), l = 1, ..., L, (11)

where ε*(ξ_l) is the computed value of the design function (6) at point ξ_l, and Γ(b) is a regularization term. The latter should be used if the system is under-determined and other types of regularization are not involved. In practice, in order to get a sensible estimate of b one needs an over-determined system, L = kM, where k > 2 at least. Let us mention that the least-squares method is not the only suitable option for finding the vector b. For example, some authors prefer l1-minimization (see [32,33] and the references presented there).
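The least-squares step can be sketched for a single input (N = 1) and p = 3; the toy design function, sample size and basis ordering below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def he(n, x):
    # Probabilists' Hermite polynomials He_0..He_3.
    return [np.ones_like(x), x, x**2 - 1.0, x**3 - 3.0 * x][n]

def design_fn(xi):
    # Toy design function with a known expansion:
    # eps = 1 + 2*He_1 + 0.5*He_2 + 0.1*He_3
    return 1.0 + 2.0 * xi + 0.5 * (xi**2 - 1.0) + 0.1 * (xi**3 - 3.0 * xi)

p = 3
L = 40                                    # over-determined: L > k*M, M = p + 1
xi = rng.standard_normal(L)
H = np.column_stack([he(n, xi) for n in range(p + 1)])   # L x M design matrix
eps = design_fn(xi)
b, *_ = np.linalg.lstsq(H, eps, rcond=None)              # minimizes ||H b - eps||^2
```

Since the toy function lies exactly in the span of the basis, the recovered coefficients match the prescribed ones to machine precision; with a real model the residual reflects the truncation error of the expansion.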
The idea of using the first derivative (gradient) of ε(ξ) in the UQ context comes from the optimal control theory [29], where the concept of 'adjoint equations' is introduced. The adjoint model allows the gradient of any function of the state variables to be computed at a cost comparable to the cost of computing the function itself. The gradient of (7) is a vector of size N with the elements

∂ε(ξ)/∂ξ_i = Σ_{i1} b_{i1} ∂H_{i1}(ξ)/∂ξ_i + ... + Σ_{i1 ≤ ... ≤ ip} b_{i1...ip} ∂H_{i1...ip}(ξ)/∂ξ_i, i = 1, ..., N. (12)
The above expression can be re-written in the dot product form

∂ε(ξ)/∂ξ_i = (∂H(ξ)/∂ξ_i)^T b_1, i = 1, ..., N, (13)

where b_1 = ({b_{i1}}, {b_{i1 i2}}, ..., {b_{i1...ip}})^T is a vector which includes all the PC coefficients except b_0. Now, the generalized cost function looks as follows:

J_1(b) = J(b) + Σ_{l=1}^{L} Σ_{i=1}^{N} ((∂H(ξ_l)/∂ξ_i)^T b_1 − ∂ε*(ξ_l)/∂ξ_i)²,

with

(∂H(ξ_l)/∂ξ_i)^T b_1 = ∂ε*(ξ_l)/∂ξ_i, i = 1, ..., N, l = 1, ..., L, (14)

where ∂ε*(ξ_l)/∂ξ is the gradient of the design function (6) computed at point ξ_l. Thus, one run of the adjoint model simultaneously provides 'data' for N constraints. The earliest reports on the gradient-enhanced PC expansion method can be found in [2,34].
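A sketch of the gradient-enhanced variant for the same toy one-dimensional setting: each sample contributes one function row and one gradient row (using He_n′ = n He_{n−1}), so far fewer model runs suffice. The toy functions stand in for the model and its adjoint:

```python
import numpy as np

rng = np.random.default_rng(1)

def he_row(x):       # [He_0, He_1, He_2, He_3](x)
    return np.array([1.0, x, x**2 - 1.0, x**3 - 3.0 * x])

def dhe_row(x):      # derivatives: d/dx He_n = n * He_{n-1}
    return np.array([0.0, 1.0, 2.0 * x, 3.0 * (x**2 - 1.0)])

def design_fn(x):    # toy design function (stand-in for the forward model)
    return 1.0 + 2.0 * x + 0.5 * (x**2 - 1.0) + 0.1 * (x**3 - 3.0 * x)

def design_grad(x):  # its derivative (stand-in for the adjoint model)
    return 2.0 + 0.5 * 2.0 * x + 0.1 * 3.0 * (x**2 - 1.0)

xi = rng.standard_normal(3)          # only 3 'model runs'
rows = [he_row(x) for x in xi] + [dhe_row(x) for x in xi]     # 6 equations
data = [design_fn(x) for x in xi] + [design_grad(x) for x in xi]
b, *_ = np.linalg.lstsq(np.array(rows), np.array(data), rcond=None)
```

Three runs yield six equations for four unknowns, while the plain regression above would need at least four (in practice many more) runs.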
A natural extension of this method would be to consider the second derivative (Hessian) of (6). In particular, we know that the Hessian-vector product is given by the second-order adjoint model [47]. Using the Hessian-vector product, the eigenvalue decomposition {λ_s, U_s}, s = 1, ..., S of the Hessian can be obtained using the Lanczos method. Then, the Hessian can be presented via its eigenpairs in the limited-memory form as follows:

∂²ε(ξ)/∂ξ² ≈ Σ_{s=1}^{S} λ_s U_s U_s^T. (15)

The second derivative of (7) is a symmetric N × N matrix with the elements

∂²ε(ξ)/(∂ξ_i ∂ξ_j) = Σ_{i1 ≤ i2} b_{i1 i2} ∂²H_{i1 i2}(ξ)/(∂ξ_i ∂ξ_j) + ... + Σ_{i1 ≤ ... ≤ ip} b_{i1...ip} ∂²H_{i1...ip}(ξ)/(∂ξ_i ∂ξ_j). (16)

The above expression can be re-written in the dot product form

∂²ε(ξ)/(∂ξ_i ∂ξ_j) = (∂²H(ξ)/(∂ξ_i ∂ξ_j))^T b_2, (17)

where b_2 = ({b_{i1 i2}}, ..., {b_{i1...ip}})^T. Now, the generalized cost function looks as follows:

J_2(b) = J_1(b) + Σ_{l=1}^{L} Σ_{i,j=1}^{N} ((∂²H(ξ_l)/(∂ξ_i ∂ξ_j))^T b_2 − ∂²ε*(ξ_l)/(∂ξ_i ∂ξ_j))²,

with

(∂²H(ξ_l)/(∂ξ_i ∂ξ_j))^T b_2 = ∂²ε*(ξ_l)/(∂ξ_i ∂ξ_j), i, j = 1, ..., N, l = 1, ..., L, (18)

where ∂²ε*(ξ_l)/∂ξ² is the Hessian of (6) computed at point ξ_l and presented in the form (15).
While the idea of using the Hessian computations in the context of the PC expansion looks attractive, its practicality depends on the circumstances. The Hessian is a far more expensive object to compute than the gradient. For each Hessian-vector product the tangent linear and adjoint models have to be run in a sequence (see §7.2). The number of Lanczos iterations depends on the eigenvalue spectrum of the Hessian. If this spectrum contains a well-separated subset of 'leading' eigenvalues, the number of Lanczos iterations is nearly equivalent to the size of this subset; it could be significantly larger otherwise. On the other hand, one computed Hessian simultaneously adds N(N + 1)/2 constraints for each l.
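The limited-memory construction (15) can be sketched with a matrix-free Lanczos-type eigensolver: only Hessian-vector products are exposed, here mocked by a synthetic matrix standing in for the second-order adjoint model, and SciPy's `eigsh` is used in place of a hand-written Lanczos process:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

rng = np.random.default_rng(2)
N, S = 50, 3

# Synthetic Hessian with a well-separated leading spectrum (illustrative).
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
lam = np.concatenate([[100.0, 50.0, 25.0], np.linspace(1.0, 0.1, N - 3)])
H_true = (Q * lam) @ Q.T

def hess_vec(v):
    # Stand-in for the second-order adjoint model: one H*v product per call.
    return H_true @ v

op = LinearOperator((N, N), matvec=hess_vec, dtype=np.float64)
vals, vecs = eigsh(op, k=S, which='LM')       # S leading eigenpairs
H_lm = (vecs * vals) @ vecs.T                 # limited-memory form (15)
```

With a well-separated leading spectrum, the number of matvec calls (i.e. second-order adjoint runs) stays close to S, which is the point made in the text.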
There is one important aspect of the regression method which is often overlooked. It is usually assumed that running the model for data is much more expensive than solving the minimization problem for (10). In fact, this depends on N (the size of the importance set). Let us consider, for example, the third-order PC expansion, p = 3. In this case, computing H^T(ξ_l) b for each l requires N + N(N + 1) + N(N + 1)(N + 2)/2 multiplications. However, taking into account the sparsity of the derivatives of H, computing (∂H(ξ_l)/∂ξ_i)^T b_1 requires N + N(N + 1) multiplications, and computing (∂²H(ξ_l)/(∂ξ_i ∂ξ_j))^T b_2 requires just N multiplications, for each l. Therefore, each equation from (14) and (18) is about N and N² times less expensive than the one from (11), respectively. So, the question here is what is more expensive: producing the 'data', i.e. the sample of ∂ε*/∂ξ, ∂²ε*/∂ξ², or solving the subsequent minimization problem. Let us note that for distributed systems, the size of the importance set N could still be quite large.
While the Hessian-based enhancement of the regression methods can be useful in certain circumstances, one needs more efficient tools allowing for larger N. Below we present an alternative method in which the derivatives are used, but minimization is not required. This method, designed for the 3rd-order PC expansion, i.e. p ≤ 3, is related to the quasi-regression or projection methods [4,3,17,18]. The key elements of this method are the 2D and 3D filtering algorithms which allow the sampling error to be efficiently removed.

Explicit expressions for the PC coefficients
Let us denote by E[•] the expectation with respect to the standard normal distribution N(0, 1). The PC expansion coefficients b can be expressed via expectations of ε(ξ) and its derivatives. For the high-dimensional distributed models in mind we limit ourselves to considering the third-order polynomials. The main results are summarized in the following theorem:

Theorem 1. For the PC expansion (7), the following expressions hold:

b_0 = E[ε(ξ)], (19)

b_i = E[ε(ξ) ξ_i] = E[∂ε(ξ)/∂ξ_i], (20)

b_ij = β_ij E[ε(ξ) H_ij(ξ)] = β_ij E[ξ_j ∂ε(ξ)/∂ξ_i] = β_ij E[∂²ε(ξ)/(∂ξ_i ∂ξ_j)], (21)

b_ijk = β_ijk E[ε(ξ) H_ijk(ξ)] = β_ijk E[ξ_j ξ_k ∂ε(ξ)/∂ξ_i] = β_ijk E[ξ_k ∂²ε(ξ)/(∂ξ_i ∂ξ_j)] = β_ijk E[∂³ε(ξ)/(∂ξ_i ∂ξ_j ∂ξ_k)], (22)

where the weights β are the inverse squared norms of the corresponding Hermite products; in particular β_ij = 1/(1 + δ_ij), and in general β equals the inverse of the product of factorials of the multiplicities of the repeated indices.

Proof of this theorem is presented in Appendix 1. Due to the super-symmetry, the order of indexing in (21) and (22) can be changed. The expressions for b via ε(ξ) (the first equalities in the chains (20)-(22)) are known in the 'projection' or 'quasi-regression' methods. However, the expressions for b via the derivatives of ε(ξ) presented here have not been previously reported in the literature. These expressions can be used as a basis for explicit evaluation of the PC coefficients using the derivatives of ε(ξ).
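The gradient form of the first-order coefficients, E[ε(ξ) ξ_i] = E[∂ε(ξ)/∂ξ_i], is an instance of Stein's identity for gaussian variables and is easy to check by Monte Carlo; the exponential test function below is an illustrative choice. The sketch also records the per-component sampling error of both estimators, which shows why the derivative-based one converges faster:

```python
import numpy as np

rng = np.random.default_rng(3)
N, L = 4, 200_000
a = np.array([0.5, -0.3, 0.2, 0.1])

def eps(xi):                  # test design function: eps = exp(a . xi)
    return np.exp(xi @ a)

def grad_eps(xi):             # its gradient: a * exp(a . xi)
    return eps(xi)[:, None] * a

xi = rng.standard_normal((L, N))
b1_proj = (eps(xi)[:, None] * xi).mean(axis=0)    # b_i = E[eps * xi_i]
b1_grad = grad_eps(xi).mean(axis=0)               # b_i = E[d eps / d xi_i]
b1_exact = a * np.exp(0.5 * (a @ a))              # analytic value

# Standard errors of the two estimators (same sample, same cost per draw).
sd_proj = (eps(xi)[:, None] * xi).std(axis=0) / np.sqrt(L)
sd_grad = grad_eps(xi).std(axis=0) / np.sqrt(L)
```

Both estimators are unbiased, but the integrand of the derivative-based one has a much smaller variance, consistent with the convergence acceleration claimed for the projection integrals.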

Proposition 1. One can prove by induction that

b_{i1...ip} = β_{i1...ip} E[∂^p ε(ξ)/(∂ξ_{i1} ... ∂ξ_{ip})].
This implies that the chains of equalities (21) and (22) can be generalized to the polynomials of arbitrary degree p, with the corresponding weights β. Let us note, however, that for high-dimensional distributed models only computing the first two derivatives (gradient and Hessian) may be feasible.
Corollary 1. The PC representation (7) can be re-written via the derivatives of the design function ε(ξ) as follows:

ε(ξ) = E[ε(ξ)] + Σ_i E[∂ε(ξ)/∂ξ_i] H_i(ξ) + Σ_{i ≤ j} β_ij E[∂²ε(ξ)/(∂ξ_i ∂ξ_j)] H_ij(ξ) + Σ_{i ≤ j ≤ k} β_ijk E[∂³ε(ξ)/(∂ξ_i ∂ξ_j ∂ξ_k)] H_ijk(ξ) + ...,

or the equivalent forms obtained by using the mixed equalities from (20)-(22). These expressions provide different forms of what can be referred to as the probabilistic Taylor expansion.

Global sensitivity analysis
Global sensitivity analysis (GSA) aims to quantify the relative impact of the inputs ξ on the design function ε. The commonly used GSA method is the variance-based method of Sobol' sensitivity indices [41], also known as the ANOVA decomposition. In this paper we limit ourselves to considering the first-order 'main effect' indices S_i and the 'total effect' indices S_Ti, defined in [37] as follows:

S_i = Var(E[ε(ξ)|ξ_i]) / Var(ε(ξ)), S_Ti = E[Var(ε(ξ)|ξ_{-i})] / Var(ε(ξ)),

where ε(ξ)|ξ_i means the random value of ε with the input component ξ_i being fixed at its 'generic' value, ξ_{-i} stands for 'all, but i', and Var(ε(ξ)) is the total variance of ε(ξ).
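For reference, the 'main effect' and 'total effect' indices defined above can be estimated directly by standard pick-freeze sampling (Saltelli/Jansen estimators; this is a generic technique, not the paper's method). The two-input toy QoI below has the known values S_i = 1/3 and S_Ti = 2/3 for both inputs:

```python
import numpy as np

rng = np.random.default_rng(4)

def f(x):                       # toy QoI: f = x1 + x2 + x1*x2
    return x[:, 0] + x[:, 1] + x[:, 0] * x[:, 1]

N, L = 2, 100_000
A = rng.standard_normal((L, N))
B = rng.standard_normal((L, N))
fA, fB = f(A), f(B)
var = np.var(np.concatenate([fA, fB]))

S, ST = np.empty(N), np.empty(N)
for i in range(N):
    ABi = A.copy()
    ABi[:, i] = B[:, i]         # 'pick-freeze': column i taken from B
    fABi = f(ABi)
    S[i] = np.mean(fB * (fABi - fA)) / var          # main effect (Saltelli 2010)
    ST[i] = 0.5 * np.mean((fA - fABi) ** 2) / var   # total effect (Jansen 1999)
```

The gap between ST and S measures the interaction contribution (here the x1*x2 term); such estimators need (N + 2) model samples per draw, which is what the PC-based shortcut of the next paragraphs avoids.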
The unique relationship between the global sensitivity indices (GSI) and the PC coefficients has been derived in [6]. For the PC expansion in the form (7), which is different from the one used in [6], we can write (see Appendix 3 for detail):

S_i = (1/Var(ε(ξ))) Σ_{α ∈ A_i} b_α² / β_α, (31)

S_Ti = (1/Var(ε(ξ))) Σ_{α ∈ A_Ti} b_α² / β_α, (32)

where α runs over the multi-indices of the expansion (7), A_i collects those involving the variable ξ_i only, and A_Ti collects all those involving ξ_i. The derivative-based global sensitivity measures have been introduced in [42]. It has been proved that, for ξ being a vector of i.i.d. normal random variables ξ_i ∼ N(0, 1),

S_Ti ≤ E[(∂ε(ξ)/∂ξ_i)²] / Var(ε(ξ)). (33)

The above formula gives a convenient tool for computing the upper bound for S_Ti, to be used as a basis for ranking the inputs and, subsequently, for defining the importance set. If the final goal is to construct the probability density function (pdf) of ε(ξ), using the upper bound guarantees that all important variables are accounted for; however, the corresponding importance set may contain some less important variables.
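The derivative-based upper bound can be illustrated on a small example where it is strict: for ε = a ξ1 + d(ξ1² − 1) + c ξ2 one has S_T1 = (a² + 2d²)/Var(ε), while E[(∂ε/∂ξ1)²] = a² + 4d², so the bound overestimates exactly through the second-order term. A Monte Carlo sketch (the constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
a, d, c = 1.0, 0.5, 1.0

def grad_eps(xi):   # gradient of eps = a*xi1 + d*(xi1^2 - 1) + c*xi2
    g = np.empty_like(xi)
    g[:, 0] = a + 2.0 * d * xi[:, 0]
    g[:, 1] = c
    return g

L = 100_000
xi = rng.standard_normal((L, 2))
eps = a * xi[:, 0] + d * (xi[:, 0] ** 2 - 1.0) + c * xi[:, 1]
var = eps.var()

nu = (grad_eps(xi) ** 2).mean(axis=0)   # E[(d eps / d xi_i)^2]
ub = nu / var                           # derivative-based upper bound for S_Ti
# Exact total-effect indices for this eps: Var = a^2 + 2 d^2 + c^2.
st_exact = np.array([a**2 + 2 * d**2, c**2]) / (a**2 + 2 * d**2 + c**2)
```

With these constants, S_T1 = 0.6 while the bound gives 0.8 (strict overestimation from the He_2 term), and for the purely linear input the bound is tight: S_T2 = ub_2 = 0.4.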
The upper-bound estimate can be improved if some of the PC coefficients are known. This follows from the theorem below:

Theorem 2. The expectation of the squared gradient is given by the series

E[(∂ε(ξ)/∂ξ_i)²] = Σ_{α ∈ A_Ti} β̄_α b_α², with β̄_α ≥ 1/β_α, (34)

where the sum runs over all the terms of (7) involving ξ_i.

Proof of this theorem is presented in Appendix 2. The Theorem essentially replicates some of the results from [44], but represents them in a more transparent and convenient form.

Corollary 2. For any given order of polynomials, the improved derivative-based upper bound for the 'total effect' GSI is given by the expression

S_Ti ≤ (1/Var(ε(ξ))) [ E[(∂ε(ξ)/∂ξ_i)²] − Σ_{α ∈ A_Ti, |α| ≤ p} (β̄_α − 1/β_α) b_α² ]. (38)
Proof of Corollary 2 is fairly trivial. Comparing (32) and (34) we notice that

S_Ti Var(ε(ξ)) ≤ E[(∂ε(ξ)/∂ξ_i)²] − Σ_{α ∈ A_Ti, |α| ≤ p} (β̄_α − 1/β_α) b_α².

For the infinite series the above expression becomes an equality. Since β̄_α ≥ 1/β_α, all terms in the series to the right are non-negative. Therefore, the more terms in the series are truncated, the stronger the inequality becomes. Formula (33) corresponds to the case when all terms are truncated. That is why (38) must (at least in theory) provide a less overestimated upper bound for S_Ti than (33). Taking into account the expressions for b_ij, b_ijk, ... from (20)-(22) we obtain (38). It can be easily seen that, for the truncated series, formula (32) gives a lower bound for S_Ti, which may not be a reliable basis for defining the importance set.
In practice, given a sample of functions and gradients {ε*(ξ_l), ∂ε*(ξ_l)/∂ξ}, l = 1, ..., L, the improved upper bound is evaluated as

S_Ti ≲ (1/Var*(ε)) [ (1/L) Σ_{l=1}^{L} (∂ε*(ξ_l)/∂ξ_i)² − Σ_{α ∈ A_Ti, |α| ≤ p} (β̄_α − 1/β_α) b_α² ], (39)

where

Var*(ε) = (1/(L − 1)) Σ_{l=1}^{L} (ε*(ξ_l) − ε̄*)² (40)

is the sample variance and b_ij, b_ijk, ... are the estimated PC expansion coefficients based on this sample. Of course, for small samples there is no guarantee that the upper bound evaluated from (39) is better than the one without the correction terms. Let us finally note (considering (20) and (31)) that

(E[∂ε(ξ)/∂ξ_i])² = b_i² ≤ S_i Var(ε(ξ)),

i.e. the squared expectation of the gradient gives a lower bound for the first-order 'main effect' indices.
As discussed above, given the sample of gradients one can evaluate the upper bound for the 'total effect' sensitivity indices S_Ti, i = 1, ..., N by (33). Then, the inputs U can be ranked accordingly, which means that the mapping i = I(i′) between the original and the ranked sequences of elements in U, as well as its inverse i′ = I^{-1}(i), are defined. The importance set can be chosen by considering N′ < N elements (having the largest S_Ti), where N′ satisfies the condition

Σ_{i′=N′+1}^{N} S_T,I(i′) / Σ_{i′=1}^{N} S_T,I(i′) ≤ ǫ, (41)

where ǫ is a given threshold. Below we deal with the importance set of size N′, which means that all the indexes involved are i′. For the sake of simplicity, however, we keep the original notations, i.e. we use N instead of N′, i instead of i′, etc.
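The ranking and truncation step can be sketched as follows; the helper name, the threshold rule and the sample bounds are illustrative:

```python
import numpy as np

def importance_set(st_bounds, eps_threshold):
    """Rank inputs by their (estimated) total-effect upper bounds and keep the
    smallest leading subset whose discarded share is below the threshold."""
    order = np.argsort(st_bounds)[::-1]        # ranking: mapping i = I(i')
    ranked = np.asarray(st_bounds, dtype=float)[order]
    total = ranked.sum()
    tail = total - np.cumsum(ranked)           # mass left out after N' inputs
    n_keep = int(np.argmax(tail <= eps_threshold * total) + 1)
    return order[:n_keep], order

# Toy example: five inputs, two clearly influential, one moderate.
bounds = np.array([0.02, 1.5, 0.3, 0.01, 0.8])
keep, order = importance_set(bounds, 0.05)
```

Here the discarded inputs carry about 1% of the total bound mass, well below the 5% threshold, so three of the five inputs form the importance set.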

Zero-and first-order coefficients
It has already been mentioned that the explicit ('projection' or 'quasi-regression') methods for estimating the PC coefficients have been previously reported, though without the derivatives of the design function being used. These methods are useful if computing a sufficiently large sample of model responses is feasible.
Note that equations (19)-(22) represent the statistical moments of a multivariate random function ε*(ξ) and its derivatives. In theory, the uncertainty (variance) in the sample mean, variance, skewness and kurtosis is proportional to 1/L, 2/L, 6/L and 24/L, respectively, where L is the sample size [23]. These values are significantly larger in practice.
Let us assume that the following estimates of the PC coefficients have a sufficient level of statistical conditioning with the sample size L*:

b_0 ≈ (1/L*) Σ_{l=1}^{L*} ε*(ξ_l), b_i ≈ (1/L*) Σ_{l=1}^{L*} ε*(ξ_l) ξ_i^{(l)} or b_i ≈ (1/L*) Σ_{l=1}^{L*} ∂ε*(ξ_l)/∂ξ_i. (42)

For the estimates

b_ij ≈ (β_ij/L) Σ_{l=1}^{L} ε*(ξ_l) H_ij(ξ_l) or b_ij ≈ (β_ij/L) Σ_{l=1}^{L} ξ_j^{(l)} ∂ε*(ξ_l)/∂ξ_i, (43)

the equivalent level of statistical conditioning can be achieved with the sample size L > 2 × L*, and for

b_ijk ≈ (β_ijk/L) Σ_{l=1}^{L} ε*(ξ_l) H_ijk(ξ_l) or b_ijk ≈ (β_ijk/L) Σ_{l=1}^{L} ξ_j^{(l)} ξ_k^{(l)} ∂ε*(ξ_l)/∂ξ_i, (44)

with the sample size L > 6 × L*.
Since the computational costs of the function and of its gradient are often comparable, computing a sample of the gradients of size L* is feasible. Therefore, the second formula in (42) can be used for evaluating b_i. However, the computational cost of the Hessian is significantly (50-100 times) larger, which means that a sample of Hessians of size L* may not be available. Therefore, b_ij have to be computed using the second formula from (43). This requires either a larger sample size to be consistent with the estimates of b_0 and b_i, or a special treatment to reduce the sampling error. The filtering procedures for computing b_ij and b_ijk are introduced below.

Second-order coefficients with filtering
The second-order coefficients b_ij can be computed via the gradient of ε(ξ) using the second formula in (43). Let A be an N × N matrix with the elements

A_ij = (1/L) Σ_{l=1}^{L} ξ_i^{(l)} ∂ε*(ξ_l)/∂ξ_j, (45)

where N is the importance set size. According to the last equality in (21), matrix A must be symmetric in the limit L → ∞.
For a finite L, however, it is not symmetric due to the sampling error, see Fig. 3 (left). From this figure one can also notice that the presented 2D field is smooth in the horizontal direction (index j), and oscillatory in the vertical direction (index i). Indeed, as it follows from (45), each row of A is a transposed weighted average of the gradient. The latter is often a sufficiently smooth function of the spatial coordinates (for distributed parameter systems).
The purpose of filtering is to retrieve the symmetric part from a noisy 2D 'image'. Let us define the matrix-vector product w = A^T v, where the summation is performed along the oscillatory direction i, as follows:

w_j = Σ_{i=1}^{N} A_ij v_i, j = 1, ..., N. (46)

This matrix-vector product is used in the Lanczos algorithm to evaluate a number S_1 of eigenpairs (the largest magnitude eigenvalues λ_s and the corresponding eigenvectors U_s) of matrix A. Then, the limited-memory symmetric representation of A is given by the formula

Â = Σ_{s=1}^{S_1} λ_s U_s U_s^T, (47)

and for the second-order coefficients we finally obtain

b_ij = β_ij Â_ij. (48)

Two versions of this algorithm are possible. In the 'full-memory' version matrix A is explicitly computed (LN² multiplications) and stored in memory (N² numbers) before the eigenvalue analysis step. In this case computing the matrix-vector product costs N² multiplications. In the 'limited-memory' version formula (46) is used for computing the matrix-vector product in the course of the Lanczos process. Here, the matrix-vector product costs L(N + 1)N multiplications, which is significantly more expensive in terms of CPU time. However, in this case no additional storage is required.
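A simplified numerical sketch of the 2D filtering idea on a synthetic low-rank field: here the symmetric part of the noisy matrix is formed explicitly and a dense eigensolver replaces the Lanczos process, which reproduces the limited-memory symmetric representation. The rank, noise level and field are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)
N, S1 = 60, 4

# 'True' symmetric low-rank field and its noisy non-symmetric sample estimate.
V = np.linalg.qr(rng.standard_normal((N, N)))[0][:, :S1]
lam_true = np.array([5.0, -3.0, 2.0, 1.0])
A_true = (V * lam_true) @ V.T
A_noisy = A_true + 0.05 * rng.standard_normal((N, N))     # sampling error

# Filtering: eigendecompose the symmetric part, keep the S1 leading
# (largest-magnitude) eigenpairs -> limited-memory symmetric representation.
sym = 0.5 * (A_noisy + A_noisy.T)
w, U = np.linalg.eigh(sym)
idx = np.argsort(np.abs(w))[::-1][:S1]
A_filt = (U[:, idx] * w[idx]) @ U[:, idx].T

err_noisy = np.linalg.norm(A_noisy - A_true)
err_filt = np.linalg.norm(A_filt - A_true)
```

The low-rank truncation discards the bulk of the (full-rank) noise while keeping the signal, which is the mechanism behind the sampling-error removal described above.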
Remark 1. Taking into account that the described procedure results in the limited-memory representation (48) for b_ij, we note that the suggested second-order analysis might be affordable for the full input set, i.e. before defining the importance set. This means that the ranking can be performed on the basis of the improved measure (39) involving the second-order correction term. However, the third-order analysis is likely to be feasible only for the importance set.

Third-order coefficients with filtering
The symmetrizer based on the eigenvalue decomposition represents a very efficient tool for removing the sampling error in the two-dimensional case (see Fig. 3 (right) and §8.2 for details). Unfortunately, the extension of this method to higher dimensions is not straightforward, since no direct analogue of the Lanczos algorithm for tensors is presently known. Here we suggest a simplified approach.
The third-order PC coefficients b_ijk can be evaluated via the gradient or the Hessian of ε(ξ) using one of the equalities from (22). Let us define the tensor A by formula (49). By definition, b_ijk, i, j, k = 1, ..., N must form a super-symmetric cube (see Fig. 1); however, the tensor A computed by (49) is not symmetric for a finite L due to the sampling error. Again, the gradient is a smooth function. Thus, the tensor A has one smooth dimension k and two oscillatory dimensions i and j. As in the 2D case, the goal is to retrieve the symmetric part from a noisy three-dimensional 'image'.
First, consider the matrix A^(j), the slice of the cube by the plane (i, k) for each j, and define the matrix-vector product w = A^(j) v, where the summation is performed along the oscillatory direction i. This matrix-vector product is used in the Lanczos algorithm to evaluate a certain number of eigenpairs {λ_s}, s = 1, ..., S_2, of the matrices A^(j), j = 1, ..., N. Then the limited-memory symmetric representation of A^(j) is given by formula (51). Now, instead of the original 'unfiltered' cube A, represented by the set of vectors ξ_l and ∂ε(ξ_l)/∂ξ_k, we consider a set of matrices Â^(j), j = 1, ..., N, each compactly represented by its eigenpairs. These matrices form a partially filtered cube Â, which has two smooth directions i, k and one oscillatory direction j. The vectors ξ_l and ∂ε(ξ_l)/∂ξ_k are no longer needed after that and can be removed from memory if necessary. As in the 2D case, both the full-memory and the limited-memory implementations are possible at this stage.
Secondly, consider the matrix B^(k), the slice of the cube Â by the plane (i, j) for each k. We notice that B^(k) can be expressed via the eigenpairs of Â and, therefore, define the matrix-vector product w = B^(k) v, where the summation is performed along the oscillatory direction j, as in (52). This matrix-vector product is used to evaluate a certain number of eigenpairs of B^(k). Then the limited-memory symmetric representation B̂^(k) of B^(k) is given by a formula similar to (51). Subsequently, instead of Â we consider the set of matrices B̂^(k), k = 1, ..., N, which form the cube B̂. Each matrix B̂^(k) is compactly represented by its own eigenpairs, which replace the eigenpairs of Â in memory.
Finally, consider the matrix C^(i), the slice of the cube B̂ by the plane (j, k) for each i. We notice that C^(i) can be expressed via the eigenpairs of B̂ and, therefore, define the matrix-vector product w = C^(i) v, where the summation is performed along the direction k. This matrix-vector product is used to evaluate a certain number of eigenpairs of C^(i). Then the limited-memory symmetric representation Ĉ^(i) of C^(i) is given by a formula similar to (51). Let us note that even though the direction k was originally the smooth one, it could have been perturbed during the first two stages of the filtering procedure. Thus, the last (third) step is important to remove these possible perturbations. Finally, for the third-order coefficients we obtain (54).

Remark 2. If a sample of Hessians is available, the third-order PC coefficients can be computed accordingly; then a modified filtering procedure must be applied. Since the Hessian is given in the limited-memory form via a certain number of its eigenpairs, it is a smooth two-dimensional field. Thus, the original cube has two smooth directions j and k, and one oscillatory direction i. A filtering procedure similar to the one presented above can easily be built. In practice, computing the Hessian could be worthwhile if running the original nonlinear model is significantly more expensive than running the tangent linear and adjoint models (both are linear). This could be the case when an implicit time scheme is utilized, which implies that a few nonlinear iterations (fixed-point or Newton) have to be made at each time step.
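The three-stage procedure can be sketched in a 'full-memory' flavor, where each slice is symmetrized and truncated to its s largest-magnitude eigenpairs by a dense eigensolver. This is an illustration only: the function names are hypothetical, and the paper's limited-memory Lanczos machinery is replaced by `numpy.linalg.eigh`.

```python
import numpy as np

def sym_lowrank(M, s):
    """Symmetric part of a noisy matrix M, truncated to its s
    largest-magnitude eigenpairs (full-memory flavor)."""
    lam, U = np.linalg.eigh(0.5 * (M + M.T))
    keep = np.argsort(np.abs(lam))[-s:]
    return (U[:, keep] * lam[keep]) @ U[:, keep].T

def filter_cube(A, s):
    """Three-stage symmetrizing filter for a noisy third-order tensor A.

    Stage 1 symmetrizes the (i,k)-slices for every j, stage 2 the
    (i,j)-slices for every k, and stage 3 the (j,k)-slices for every i,
    removing perturbations the first two stages may have introduced."""
    B = np.empty_like(A)
    for j in range(A.shape[1]):
        B[:, j, :] = sym_lowrank(A[:, j, :], s)
    C = np.empty_like(B)
    for k in range(B.shape[2]):
        C[:, :, k] = sym_lowrank(B[:, :, k], s)
    D = np.empty_like(C)
    for i in range(C.shape[0]):
        D[i, :, :] = sym_lowrank(C[i, :, :], s)
    return D
```

On a super-symmetric low-rank cube perturbed by sampling noise, the slice-wise projections recover a cube much closer to the truth than the raw one.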

Adaptation of metamodels
One can benefit from considering the PC coefficients in the forms (48) and (54) when constructing the metamodel. In particular, the second-order term ε_2(ξ) in (7) can be rewritten by using formula (48) to represent b_ij and taking into account that the vectors U_s are orthonormal. It is easy to see that computing ε_2(ξ) by the resulting formula requires 2S_1 N instead of N(N + 1) multiplications by the original formula.
Similarly, the third-order term ε_3(ξ) in (7) can be rewritten by using formula (54) to represent b_ijk and, again, taking into account the orthonormality of the eigenvectors; the resulting expression involves the precomputed auxiliary contractions of the eigenvectors with ξ. It is easy to see that computing ε_3(ξ) by this formula requires approximately 3S_2 N^2 /2 instead of N(N + 1)(N + 2)/2 multiplications by the original formula.
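The cost saving can be illustrated on the quadratic part alone. In the sketch below the second-order term is treated schematically as a quadratic form ξ^T b ξ (the Hermite-basis details of (7) are omitted), with b stored via its S_1 eigenpairs as in (48); both function names are illustrative.

```python
import numpy as np

def eps2_full(b, xi):
    """Quadratic term via the full N x N coefficient matrix: O(N^2) work."""
    return xi @ b @ xi

def eps2_lowrank(lam, U, xi):
    """Same term via the S1 eigenpairs of b: O(S1 * N) work, since
    xi^T b xi = sum_s lambda_s (U_s . xi)^2 for b = sum_s lambda_s U_s U_s^T."""
    return np.sum(lam * (U.T @ xi) ** 2)
```

When b is assembled from its eigenpairs, the two evaluations agree to rounding error while the low-rank form touches only S_1 N entries.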

Forward model
As the forward model we use the 1D generalized Burgers' equation (58). For solving (58), its conservative representation and an implicit time discretization are used, as in (59), where j is the time-integration index and h_t = T/n_t is the time step. The spatial differential operator is discretized on a uniform grid (h_x is the spatial discretization step, i = 1, ..., n_x is the node number, n_x is the total number of grid nodes) using the first-order 'power-law' scheme described in [31]. For each time step we perform nonlinear iterations on ϕ, starting from ϕ = ϕ^{j-1} and iterating until (59) is approximately satisfied (i.e. the norm of the left-hand side in (59) becomes smaller than the threshold 10^{-12} n_x^{1/2}).
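A minimal sketch of one such implicit step is given below. It assumes the simplest viscous Burgers' form ϕ_t + ϕ ϕ_x = ν ϕ_xx with centered differences and first-order Neumann ends, instead of the paper's conservative 'power-law' scheme with the additional a(x), c(x) coefficients and forcing; the function name and parameters are hypothetical.

```python
import numpy as np

def burgers_step(phi_old, h_t, h_x, nu, tol=1e-12, max_iter=100):
    """One implicit time step for phi_t + phi*phi_x = nu*phi_xx with
    zero-gradient (Neumann) ends. Fixed-point (Picard) iterations: the
    advecting velocity is frozen at the current iterate, the resulting
    linear system is solved, and iterations stop when successive iterates
    settle below tol*sqrt(n), mimicking the paper's stopping rule."""
    n = phi_old.size
    phi = phi_old.copy()
    for _ in range(max_iter):
        A = np.zeros((n, n))
        for i in range(n):
            im, ip = max(i - 1, 0), min(i + 1, n - 1)  # 1st-order Neumann ends
            A[i, i] += 1.0 / h_t + 2.0 * nu / h_x**2
            A[i, im] += -nu / h_x**2 - phi[i] / (2.0 * h_x)
            A[i, ip] += -nu / h_x**2 + phi[i] / (2.0 * h_x)
        phi_new = np.linalg.solve(A, phi_old / h_t)
        done = np.linalg.norm(phi_new - phi) < tol * np.sqrt(n)
        phi = phi_new
        if done:
            break
    return phi
```

A spatially constant state is a steady solution of this scheme, and with sufficient viscosity the step respects a discrete maximum principle.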
The design function is defined using the value of the state variable ϕ at a given point x^*, integrated over a certain time period, as in (60), and the full input set is given by (61). Thus, G(U) = D(ϕ(U)), and the design function ε(ξ) is defined accordingly, where Ū = (ū(x), ā(x), c̄(x), d̄(x))^T is the unperturbed ('true') input vector, B is the covariance matrix of the primal input uncertainty ξ = (ξ_u(x), ξ_a(x), ξ_c(x), ξ_d(x))^T, and ξ_i ∼ N(0, 1). The true initial state is prescribed explicitly, and the true values of the coefficients are ā(x) = 0, c̄(x) = 0 and d̄(x) = 0.002. In the computations we use the following parameters: simulation period T = 0.24, discretization steps h_t = 0.004 and h_x = 0.005, and number of spatial discretization nodes n_x = 201. Thus, the full size of the input set is N = 804.
Fig. 4(left) shows the initial state u(x) = ϕ(x, 0) and the evolved state at times t = T/2 = 0.12 and t = T = 0.24. A general property of Burgers' solutions is that a smooth initial state evolves into a state characterized by areas of severe gradients (or even shocks in the inviscid case). The level of nonlinearity is, therefore, strongest around the shock area at x ≈ 0.5. This is where the most noticeable deviation from gaussianity (for the pdf of ε(ξ)) is to be expected. Thus, we choose to consider (60) defined at the points x^* = {0.4, 0.45, 0.48, 0.5, 0.52, 0.55, 0.60}. The covariance matrix B is generated by the method described in [16]. The correlation function for each variable in (61) is the same (see Fig. 4(right)). The standard deviations are uniform in space (apart from near the boundaries), with the following values: σ_u = 0.2, σ_a = 0.2, σ_c = 1.0, and σ_d = 0.00033.

Derivatives
Consider equation (58) with the input U = Ū + ξ. Then the gradient of ε(ξ) (first derivative) with respect to ξ is given by an adjoint-based formula, where ϕ^* is the solution of the adjoint model (64). In turn, the Hessian (second derivative)-vector product is given by (65), where v = (δu, δa, δc, δd)^T, and ψ and φ^* are the solutions of the sequence of the tangent linear model (66) and the second-order adjoint model (67). Equation (67) is often referred to as the 'second-order adjoint' [47]. One can see that it is the usual adjoint equation (64) with a distributed source term (the last term in (67)). Having a solver for the discretized equation (58) at hand, it is relatively easy to adjust it for solving problems (64), (66) and (67) by manipulating the coefficients and source terms. The resulting tangent linear and adjoint models are called 'inconsistent' models, since they are obtained using the 'optimize-then-discretize' approach. In some realistic large-scale applications the accuracy of the gradients produced by such models is considered unsatisfactory. Alternatively, one can follow the 'discretize-then-optimize' approach, developing the models either manually or using Automatic Differentiation (AD). The resulting adjoint model provides a precise gradient; however, the model itself is often computationally inefficient and may suffer from instability. Since expectations of the derivatives are sought in the UQ framework, the accuracy of the gradients here is less important than in optimization. Thus, the inconsistent models (if available) should be preferred, because they could be far more efficient computationally. In this paper, however, we use the tangent linear and adjoint models obtained by means of the AD engine TAPENADE [20], applied to the Fortran routine implementing (59). This has been done to simplify the development of the routines computing the derivatives.
Let us denote by forward(Ū, ξ, ε(ξ)) the subroutine which computes the design function. By differentiating this subroutine in the 'forward' and 'backward' modes (scalar ε(ξ) with respect to the vector ξ) we obtain the tangent linear and adjoint models, respectively. Let us note that the development of the adjoint for a more complex model using AD may not necessarily be easier than the development of the 'inconsistent' versions (for a number of reasons).
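An AD-generated Hessian-vector code can be spot-checked against central differences of the gradient routine. The generic helper below is such a check, not part of the paper's TAPENADE setup; the function name and step size are illustrative.

```python
import numpy as np

def hessian_vector_fd(grad, xi, v, eps=1e-6):
    """Approximate the Hessian-vector product H(xi) @ v by central
    differences of the gradient routine grad: a cheap consistency
    check for adjoint / second-order-adjoint derivative codes."""
    return (grad(xi + eps * v) - grad(xi - eps * v)) / (2.0 * eps)
```

For a quadratic test function with gradient M @ x, the finite-difference product reproduces M @ v to rounding error, which is the kind of sanity test routinely applied to AD output.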

Global sensitivity indices (GSI)
Here we present the GSA results using the derivative-based measures associated with (33) and (41). First, we evaluate the reference values of the 'main-effect' (29) and 'total-effect' (30) GSI using direct Monte Carlo simulations. The numerical implementation of (29) and (30) for finite samples replicates the procedure presented in [37]. Next, these indices are compared to the derivative-based estimates.
The derivative-based measures are computed using a sample of gradients ∂ε(ξ_l)/∂ξ, where ξ_l is an element of the sample of random vectors ξ. Here, a relatively small sample size (L_1 = 400) is used.
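Up to the normalization adopted in the paper, the two derivative-based quantities reduce to elementwise sample statistics of the gradient. The sketch below (with an illustrative function name) computes them from a sample arranged with one gradient per row.

```python
import numpy as np

def derivative_measures(grads):
    """Derivative-based sensitivity measures from a gradient sample
    (rows = sample members, columns = inputs), up to normalization:

    upper  - mean squared gradient, an upper bound on the total-effect GSI;
    lower  - squared mean gradient, a lower bound on the main-effect GSI."""
    upper = np.mean(grads ** 2, axis=0)
    lower = np.mean(grads, axis=0) ** 2
    return upper, lower
```

For a linear design function the gradient is constant, so the two measures coincide; in general the mean of squares dominates the square of the mean, consistent with the bound ordering quoted below.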
Finally, the scaled local-sensitivity-based measure (70) is introduced; it is computed at the reference (unperturbed) value of the control vector Ū. The scaling enables the shapes of Ŝ0_i and Ŝ_T^i to be compared. The GSA results are presented in Fig. 5 and Fig. 6. The panels in Fig. 5 show the 'main-effect' and 'total-effect' GSI in dotted lines, and the derivative-based measures in solid lines. The left, central and right panels show the indices associated with the initial condition u(x), the convection coefficient a(x) and the reaction coefficient c(x), respectively. In the vertical direction, each panel corresponds to a different value of x^* used in (60).
As expected, the numerical results conform to the theoretical predictions from [42] (presented in §4), asserting that: a) the mean squared gradient is an upper bound for the 'total-effect' GSI; b) the squared mean gradient is a lower bound for the 'main-effect' GSI. While the possibility of using the adjoint model is mentioned in [24], no relevant example for a distributed-parameter high-dimensional model has been reported so far. Besides, from our numerical tests we can draw some interesting conclusions.
First, the original derivative-based measure S_T^i in (69) approximates the 'total-effect' GSI exceptionally well and, therefore, can be used directly for ranking. As a result, we could not find an example where the benefit of using the improved measure (39) is clearly demonstrated. This does not deny the importance of measure (39), which might be useful in different circumstances. It is interesting to note that such quality of approximation has been achieved despite the strong nonlinearity of the Burgers' model, which manifests itself in other ways (see e.g. Fig. 6 or the results in §8.3).
Second, the 'strong-impact' variables may cover significant areas of the model spatial domain, depending on the transport mechanisms supported by the model and on the simulation period. Thus, the importance set may constitute a significant part of the full input. This is different from the practice of sensitivity analysis involving lumped-parameter systems, where the existence of a small number of most influential inputs is anticipated. These conclusions can likely be extrapolated to more complex CFD models, such as those described by the Shallow Water equations or the Primitive Equations of the Ocean, which play an important role in geophysical applications.
In addition, in Fig. 6 we also compare S_T^i with the local-sensitivity-based indices S0_i defined by (70). Since the gradient can be obtained by a single run of the adjoint model, the local-sensitivity-based approach is naturally attractive for the high-dimensional geophysical models involved in variational estimation. The considered case corresponds to the first row of images in Fig. 5. One can observe the essential difference between S_T^i and S0_i, which means that the ranking based on S0_i could be wrong. This example underlines the importance of GSA. Let us note that for a linear design function the global and local sensitivities are equivalent. Thus, the difference between the global and local sensitivities indicates the presence of strong nonlinearity in the model equations.

Filtering procedure
The explicit expressions for computing the PC expansion using the first and second derivatives of ε(ξ) (gradients and Hessians) have been introduced in §3. The key components of the method are the filtering procedures described in §5. Given a relatively small sample of gradients, these procedures allow the expected second and third derivatives (a basis for evaluating the second- and third-order PC coefficients, respectively) to be estimated with an acceptable level of sampling error.
In order to illustrate the effect of filtering, we check formula (21) numerically. First, a sample of Hessians of size L = 400 is computed by the Lanczos method, using the exact definition of the Hessian-vector product (65). Each Hessian is represented in the limited-memory form (15), involving S = 20 eigenpairs, and the sample mean of the Hessians is then evaluated. Second, we compute the matrix A defined by formula (45). Given A·v in the form (46), the eigenvalue decomposition of A reveals its symmetric part Â in the form (47), involving S_1 = 10 eigenpairs. The numerical results, which illustrate the performance of the filtering procedure, are presented in Fig. 2 and Fig. 3; they correspond to the case x^* = 0.50. In particular, Fig. 2(left) shows the sample-mean Hessian, Fig. 3(left) shows the matrices A (i.e. before filtering) for different L, whereas Fig. 3(right) shows the corresponding matrices Â, i.e. symmetrized and filtered. These results confirm that the suggested filtering procedure removes the sampling error very efficiently, which has a crucial effect on the histogram estimates (see the discussion in the next section). Since the Hessians have been available in this experiment, we also present in Fig. 2(right) the total interaction indices (TII), which are important measures considered in the second-order interaction screening [36].

Fig. 5. First-order 'main-effect' and 'total-effect' GSI by direct Monte Carlo and using ∂ε/∂ξ.
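Averaging Hessians kept in the limited-memory form (15) amounts, in a sketch, to accumulating their truncated spectral reconstructions. The helper below assumes each sample member is stored as an (eigenvalues, eigenvectors) pair; the function name is illustrative.

```python
import numpy as np

def mean_limited_memory_hessians(eigenpairs):
    """Sample mean of Hessians, each given in limited-memory form as a
    pair (lam, U) with S eigenpairs, i.e. H_l = U_l diag(lam_l) U_l^T."""
    n = eigenpairs[0][1].shape[0]
    H = np.zeros((n, n))
    for lam, U in eigenpairs:
        H += (U * lam) @ U.T  # truncated spectral reconstruction of H_l
    return H / len(eigenpairs)
```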

Probability density via histogram
The major outcome of GSA is the ranking of inputs, which reveals the importance set. For this (relatively small) set the metamodels are constructed in the form (7), also taking into account expressions (56) and (57). By running Monte Carlo simulations involving the metamodels, the histograms are built. The histogram approximates the probability density function (pdf) of ε(ξ), providing its most exhaustive stochastic description. For comparison, the reference histogram is built by running direct Monte Carlo simulations involving the original model (58). A large sample size L_0 = 10^5 is used in all these simulations. Given ε(ξ_l), l = 1, ..., L, the sample variance Var(ε(ξ)) can be estimated according to its definition (40). Alternatively, the total variance can be expressed as a sum of partial variances associated with the PC coefficients of each order (e.g. [7]). The first-order metamodel includes the first two terms of (7), involving the coefficients b_0 and b_{i1}. We introduce the scaling factor α_1 = Var(ε(ξ))/D_1 and use α_1^{1/2} b_{i1} in the metamodel instead of b_{i1}. This guarantees that the variance of the pdf built with the first-order metamodel is exactly Var(ε(ξ)). Thus, the first-order metamodel simply replicates the gaussian distribution with mean b_0 and variance Var(ε(ξ)). The second-order metamodel includes three terms of (7), involving b_0, b_{i1} and b_{i1 i2}. Now we introduce the scaling factor α_2 and use α_2^{1/2} b_{i1} and α_2^{1/2} b_{i1 i2} instead of b_{i1} and b_{i1 i2}, respectively. This also guarantees that the variance of the pdf built with the second-order metamodel is exactly Var(ε(ξ)). However, as opposed to the first-order metamodel, the shape of the pdf is not bound to be gaussian. A similar scaling is used for the third-order metamodel. Thus, all three metamodels provide pdfs having the same mean and variance.
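The variance-matching rescaling can be sketched for the first-order case, assuming an orthonormal basis so that the partial variance is D_1 = Σ_i b_i^2; the function name is illustrative.

```python
import numpy as np

def scale_first_order(b1, var_target):
    """Rescale the first-order PC coefficients so that the metamodel
    variance matches var_target = Var(eps): alpha_1 = var_target / D_1
    with D_1 = sum_i b_i^2 (orthonormal basis assumed)."""
    alpha1 = var_target / np.sum(b1 ** 2)
    return np.sqrt(alpha1) * b1
```

After rescaling, the sum of squared coefficients (and hence the metamodel variance) equals the target sample variance by construction.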
The results are presented in Fig. 7. Each panel corresponds to one of the values x^* used in (60) and contains four graphs: the reference histogram (wide pale line), the histogram using the first-order metamodel (p = 1, dotted line), the second-order metamodel (p = 2, dashed line) and the third-order metamodel (p = 3, solid line). The corresponding first four distribution moments (mean, variance, skewness and kurtosis) are presented in Table 1.
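For reference, the four moments reported in Table 1 can be computed from a sample as the mean, the variance, and the third and fourth standardized moments (the latter without subtracting 3); the helper name is illustrative.

```python
import numpy as np

def moments(x):
    """Mean, variance, skewness and kurtosis (fourth standardized
    moment, not excess) of a sample, as listed in Table 1."""
    m = x.mean()
    v = x.var()
    skew = ((x - m) ** 3).mean() / v ** 1.5
    kurt = ((x - m) ** 4).mean() / v ** 2
    return m, v, skew, kurt
```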
First, we notice that for x^* = 0.4 and x^* = 0.45 the pdf remains close to gaussian, despite the strong nonlinear effects demonstrated, for example, in Fig. 6. In these cases all three metamodels give close results. In the cases x^* = {0.50, 0.52, 0.55, 0.6} the second-order metamodel shows an impressive performance, capturing the asymmetry of the sought pdf and the true location of its mode. Standing apart is the case x^* = 0.48, where the pdf has a strong kurtosis. Here the second-order metamodel does not improve on the results of the first-order metamodel, while using the third-order metamodel proves useful. The case x^* = 0.50 is another one where the impact of the third-order coefficients is essential. From these examples we conclude that the second-order metamodel allows the asymmetry (usually associated with the skewness) to be described, whereas the third-order metamodel helps to describe the tailedness, associated with the kurtosis. Of course, the third-order metamodel is far more expensive both to generate and to run, and it may not be feasible if the importance set is large.
In the numerical experiments the PC coefficients for the metamodels have been generated both by the proposed projection method and by the least-squares method described in §2. Naturally, the projection method is less expensive (both in terms of memory and CPU time), since no minimization is required. However, it is interesting to compare the accuracy of the histograms obtained by the two methods. As 'data' we have used the samples {ε^*(ξ_l), ∂ε^*(ξ_l)/∂ξ}, l = 1, ..., L, where L = 400. In the least-squares method the cost function (13) without the regularization term is considered, and its gradient has been generated by AD. For the second-order PC expansion the least-squares problem is overdetermined for L = 400; thus the minimization process converges as far as the gradient accuracy allows. For the third-order PC expansion the same least-squares problem is underdetermined; thus the minimization process is stopped by the L-curve criterion [19]. Analyzing the results, we found that histograms of a better quality are obtained by the projection method. In particular, the accuracy of the second-order metamodel is noticeably better in the case shown in Fig. 8(left). The performance of the third-order metamodel generated with the least-squares method is unsatisfactory, since it is unable to capture the kurtosis anomaly of the histogram in Fig. 8(right). At the same time, the third-order metamodel generated with the projection method performs satisfactorily in this case.
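The gradient-enhanced regression idea of §2 can be illustrated in a first-order toy form, where value equations and gradient equations are stacked into one overdetermined least-squares system. This is a sketch of the principle only, not the paper's cost function (13); names and shapes are illustrative.

```python
import numpy as np

def pc1_least_squares(xi, eps, grads):
    """Gradient-enhanced least squares for a first-order PC expansion
    eps(xi) ~ b0 + sum_i b_i xi_i.

    Each sample member contributes one value equation and N gradient
    equations (the gradient of the expansion w.r.t. xi_i is simply b_i)."""
    L, N = xi.shape
    A_val = np.hstack([np.ones((L, 1)), xi])                     # value rows
    A_grad = np.tile(np.hstack([np.zeros((N, 1)), np.eye(N)]),   # gradient rows
                     (L, 1))
    A = np.vstack([A_val, A_grad])
    rhs = np.concatenate([eps, grads.ravel()])
    coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return coef[0], coef[1:]
```

On consistent synthetic data the stacked system recovers the coefficients exactly, which is the regression counterpart of what the projection method obtains without any minimization.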
The impact of filtering on the estimated histograms is demonstrated in Fig. 9(left), where the histograms obtained without and with filtering are presented alongside the reference one. In order to understand the mechanism of this impact, let us compare the magnitudes of the metamodel terms: the third-order term obtained without filtering totally dominates the behavior of the metamodel, resulting in the erroneous histogram. Filtering reduces the norms of the second- and third-order terms by removing the sampling error; thus, the corresponding histograms look far more sensible. Finally, Fig. 9(right) demonstrates how the accuracy of the projection method depends on the sample size. We observe that the histograms approach the reference one with increasing L, which is a sign of consistency in the statistical computations.

CPU-time estimates
Here we present some estimates of the CPU-time budget (single-processor implementation) required for the different key computational steps; attention should be paid to the time ratios. Recall that the size of the importance set involved is N = 210 and the sample size is L = 400.

a) Computing the second-order coefficients b_ij (Section 8.2): 'full-memory'/'limited-memory' versions: 0.16 s / 2.9 s;
b) Computing the third-order coefficients b_ijk (Section 8.3): 'full-memory'/'limited-memory' versions: 61.2 s / 725.2 s;
c) One iteration of the gradient-enhanced regression method (Section 2): 1.25 s for the second-order PC expansion, 118.4 s for the third-order PC expansion;
d) One run of the metamodel: second-order 1.51 × 10^-3 s, third-order 2.05 × 10^-2 s;
e) One run of the adapted metamodel (Section 6): second-order 1.42 × 10^-4 s, third-order 2.93 × 10^-3 s;
f) One run of the forward model (Section 7.1): 2.7 × 10^-2 s;
g) One run of the forward-adjoint model (Section 7.2): 4.9 × 10^-2 s;
h) Computing the limited-memory Hessian (10 eigenpairs): 0.95 s.

The presented CPU-time budget confirms that the projection method provides a far more efficient tool for processing the computed samples {ε^*(ξ_l), ∂ε^*(ξ_l)/∂ξ}. Indeed, the second-order metamodel is generated in 0.16 s, whereas just one iteration of the regression method requires 1.25 s, and a few tens of such iterations could be necessary. Besides, the adapted metamodel generated by the projection method is about ten times faster than the metamodel obtained by the regression method. The third-order metamodel is generated in 61.2 s, while one iteration of the regression method requires 118.4 s; the number of iterations required to solve the regression problem of size M = 1587986 amounts to a few hundred.

Conclusions
Performing Sensitivity Analysis and Uncertainty Quantification involving large-scale fluid flow models currently remains a serious challenge. One difficulty is the computational cost of running such models, which limits the direct use of Monte Carlo methods. Another difficulty is the high-dimensional discretized model input, which may not always be reducible to an importance set of feasible size. At the same time, some of the models under consideration are involved in the process of variational estimation (data assimilation). This implies that the tangent linear and adjoint counterparts of such models are available and, therefore, one can compute the gradient (first derivative) and the Hessian (second derivative) of any design function (QoI) defined over the state variables. These derivatives can be used both for SA and for UQ.
The use of the polynomial chaos expansion is considered a promising way of reducing the computational complexity of SA and UQ. Relatively recent developments in this direction are the gradient-based global sensitivity measures and the gradient-enhanced methods for computing the PC expansion; the latter have been implemented in the framework of the regression approach. In this paper we propose a general derivative-enhanced projection method for computing the PC expansion, and an improved derivative-based sensitivity measure, both suitable for high-dimensional and computationally expensive models. By 'derivative-enhanced' we mean that both the gradient and the Hessian information can be utilized. The method allows a high level of parallelization; e.g. the PC coefficients of different orders can be computed independently. Besides being computationally more efficient than the gradient-enhanced least-squares regression methods (explicit filtering instead of minimization), the proposed method has demonstrated better accuracy on the same sample data.
Regarding SA, the numerical tests have demonstrated that the gradient-based global sensitivity measure approximates the total-effect GSI exceptionally well. Thus, it should be preferred to the adjoint (local) sensitivity-based measures, the use of which may result in a seriously wrong ranking and an irrelevant importance set. The shape of the total-effect sensitivity indices confirms that the impact of the spatially distributed input variables decays gradually as a function of distance. Thus, one should not expect the design function to depend on a 'few most significant inputs', which is often a basic assumption underlying the use of the PC expansion. For such models, the third-order PC expansion therefore looks like a practical limit.
Regarding UQ, the results show that the second-order PC expansion gives a noticeable improvement over the first-order PC expansion in approximating the pdf of the design function; in particular, it allows the asymmetry and the position of the mode to be revealed. Let us note that, with our method, computing the second-order PC expansion is feasible for the full input of almost any size. Next, given the second-order PC coefficients, one can compute the improved sensitivity measure, which can be used for ranking and for defining the importance set. The third-order PC coefficients can be computed for the importance set only. The third-order PC expansion is far more expensive than the second-order one; however, its contribution to the estimated pdf proved important only in one case (x^* = 0.48). The main conclusion here is that the second-order analysis is probably the sought-for tradeoff: it is feasible for the models in mind and efficient in revealing some important non-gaussian features of the design function. In this respect, it is important to develop a criterion which would predict the usefulness of computing the next-order PC coefficients beforehand. Another important issue is the upgrade of the third-order filtering procedure, which requires the Lanczos algorithm to be generalized (at least) to the cube.
The paper shows how the Hessian information can be utilized; however, the conducted numerical tests have not provided a definite answer on its usefulness in practical terms. On the one hand, the Hessian can be a far more expensive object than the gradient. On the other hand, while one gradient provides N constraints, one Hessian provides N(N + 1)/2 constraints. Thus, computing the Hessian is practical if the tangent linear and adjoint models are much less expensive than the forward model. This could be the case when an implicit time-integration scheme is used and one must perform a significant number of nonlinear iterations (fixed-point or Newton) at each time step.
Taking into account the properties of K_{α_l}(ξ_l), we obtain (92)-(94). Summarizing (92)-(94) and substituting the result in (91), we obtain the assertion of Theorem 2.

Fig. 2. Left: 'reference' expected Hessian (second derivative) using E[∂²ε/∂ξ²], for sample size L = 400. Right: the total interaction indices (TII). (For interpretation of the colors in the figure(s), the reader is referred to the web version of this article.)

Fig. 3. Left: matrices A before filtering, for different L. Right: the corresponding symmetrized and filtered matrices Â.

Equation (58) includes a forcing term, the initial condition ϕ(x, 0) = u(x), and the Neumann boundary conditions (dϕ/dx)|_{x=0} = (dϕ/dx)|_{x=1} = 0. Equation (58) is sometimes used in the data assimilation context as a simple model describing elements of atmospheric motion.

Fig. 4. Field evolution and correlation function for the input uncertainties.

Fig. 8. Histogram of ε(ξ) by the projection method and by the regression method (standard least squares).

Table 1
Auxiliary information for the histograms in Fig. 7.