ONE-STEP ESTIMATION FOR THE FRACTIONAL GAUSSIAN NOISE AT HIGH-FREQUENCY

The present paper concerns the parametric estimation for the fractional Gaussian noise in a high-frequency observation scheme. The sequence of Le Cam's one-step maximum likelihood estimators (OSMLE) is studied. This sequence is defined by an initial sequence of quadratic generalized variations-based estimators (QGV) and a single Fisher scoring step. The sequence of OSMLE is proved to be asymptotically efficient, as is the sequence of maximum likelihood estimators, but is much less computationally demanding. It is also advantageous with respect to the QGV, which is not variance efficient. The performance of the estimators on finite-size observation samples is illustrated by means of Monte-Carlo simulations.

The singularity of the Fisher information matrix arising in the joint estimation of $(H, \sigma)$ has been untied in [3] using non-diagonal norming rates, and the asymptotic efficiency of the sequence of MLE has been proved in the high-frequency scheme. This singularity has also been observed in the symmetric stable model [1], when the scale parameter and the stability index are estimated jointly, and has been untied in a similar way [4].
The sequence of Whittle-type estimators is also efficient in the high-frequency fGn setting [11]. Other sequences of fast-computable estimators have been studied, including estimators based on quadratic generalized variations [15] and estimators based on wavelet coefficients [9]. Unfortunately, these sequences of estimators are generally not asymptotically efficient.
Although the sequence of MLE is asymptotically efficient in the high-frequency scheme, it is not expressed in a closed form and its computation is time consuming. The goal of the paper is the construction of a fast-computable sequence of estimators which is asymptotically equivalent to the MLE, the so-called sequence of one-step MLE (see [17, 19]). The latter sequence is based on an initial (rate-efficient but not variance-efficient) sequence of quadratic generalized variations-based estimators (QGV) and a single Fisher scoring step. QGV-based estimators have been defined in [15] and have recently been used in the context of the linear fractional stable motion [20].
The one-step MLE presents certain advantages over the MLE and the QGV in terms of computational cost and asymptotic variance, respectively. On the one hand, it is much less computationally expensive than the MLE while retaining the same rate and the same asymptotic variance. On the other hand, it is optimal in terms of asymptotic variance, which is not the case for the QGV. Here we provide theoretical results establishing the asymptotic equivalence between the MLE and the one-step MLE, and Monte-Carlo simulations are employed to illustrate finite-sample performance.
The remainder of the paper is organized as follows: Section 2 describes the fractional Gaussian noise model observed in the high-frequency scheme and the asymptotic properties of the sequence of MLE. In Section 3, the fast-computable sequence of one-step MLE is described and shown to be asymptotically efficient. The performance of the estimators for samples of medium size is then illustrated by means of Monte-Carlo simulations.

Fractional Gaussian noise model at high-frequency
The model considered in this paper is the scaled fractional Gaussian noise, that is,
$$X_i = \sigma \left( B_H\!\left(t_i^{(n)}\right) - B_H\!\left(t_{i-1}^{(n)}\right) \right), \qquad 1 \le i \le n, \tag{2.1}$$
where $(B_H(t),\ t \ge 0)$ is a standard fractional Brownian motion of Hurst parameter $H \in (0,1)$, $\sigma > 0$ and $t_i^{(n)} = \frac{iT}{n} = i\Delta_n$ for $0 \le i \le n$, with $\Delta_n = \frac{T}{n}$. In the high-frequency setting (infill asymptotics), the time horizon $T$ is fixed and the mesh size $\Delta_n = t_i^{(n)} - t_{i-1}^{(n)} = \frac{T}{n}$ tends to zero as $n \longrightarrow \infty$. The optimal rates for the estimation of the parameter $\vartheta = (H, \sigma)^* \in \Theta \subset (0,1) \times \mathbb{R}_+^*$, where $*$ denotes transposition, have recently been obtained in [3]. For instance, the lower bounds for the risk of estimators are given, for any $C > 0$, by
$$\liminf_{n \to \infty}\ \sup_{|\vartheta - \vartheta_0| \le C} \mathbb{E}_\vartheta\!\left[ c\!\left( \varphi_n(\vartheta_0)^{-1} \left( \vartheta_n - \vartheta \right) \right) \right] \ \ge\ \int_{\mathbb{R}^2} c\!\left( I_\varphi(\vartheta_0)^{-1/2}\, z \right) \phi(z)\, \mathrm{d}z$$
for any sequence of estimators $(\vartheta_n,\ n \ge 1)$, some cost function $c$, some proper rate matrix $\varphi_n$ and an asymptotic Fisher information matrix $I_\varphi(\vartheta_0)$ described in the sequel. Here $\phi$ is the density of the 2-dimensional standard normal distribution.

Preliminaries on the fGn
For a fixed $n \in \mathbb{N}^*$, the distribution of the fractional Gaussian noise $(X_i,\ 1 \le i \le n)$ defined in (2.1) is a centered Gaussian distribution with autocovariance
$$r(k) = \operatorname{cov}\left( X_i, X_{i+k} \right) = \int_{-\pi}^{\pi} \mathrm{e}^{\mathrm{i}kx} f_{\Delta_n}(x)\, \mathrm{d}x$$
for any $i \in \{1, \ldots, n\}$ and $k \in \{0, \ldots, n-i\}$, with $f_{\Delta_n}$ the spectral density of the scaled fractional Gaussian noise.
The spectral density $f_{\Delta_n}(x)$ can be rewritten as
$$f_{\Delta_n}(x) = \frac{\sigma^2 \Delta_n^{2H}}{2\pi}\, \sin(\pi H)\, \Gamma(2H+1)\, \left| \mathrm{e}^{\mathrm{i}x} - 1 \right|^2 \sum_{j \in \mathbb{Z}} |x + 2\pi j|^{-2H-1}.$$
The autocovariance function is explicitly given (see [22, Sect. 7.2.3]) by
$$r(k) = \frac{\sigma^2 \Delta_n^{2H}}{2} \left( |k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H} \right).$$
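To fix ideas, the explicit autocovariance above can be used directly to simulate the model (2.1). The following R sketch (the helper names rho and simulate_fgn are ours, not the paper's) draws one sample path via a Cholesky factorization; an O(n log n) circulant-embedding method would be preferable for large n, but the O(n^3) factorization is sufficient for n = 2^8.

```r
# Minimal sketch (base R): simulate the scaled fGn (2.1) from the explicit
# autocovariance r(k) via a Cholesky factorization of Sigma_n.
rho <- function(k, H) {
  0.5 * (abs(k + 1)^(2 * H) - 2 * abs(k)^(2 * H) + abs(k - 1)^(2 * H))
}
simulate_fgn <- function(n, H, sigma, T_horizon = 1) {
  Delta <- T_horizon / n                                          # mesh size Delta_n = T/n
  Sigma <- sigma^2 * Delta^(2 * H) * toeplitz(rho(0:(n - 1), H))  # (r(i-j))_{i,j}
  L <- chol(Sigma)                                                # Sigma = t(L) %*% L
  as.vector(t(L) %*% rnorm(n))                                    # centered Gaussian sample
}
X <- simulate_fgn(n = 2^8, H = 0.8, sigma = 0.5)
```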
Consequently, for the observation sample $X^{(n)} = (X_1, \ldots, X_n)^*$, the likelihood function in $\vartheta = (H, \sigma)^*$ admits a closed form given by
$$L\left( \vartheta, X^{(n)} \right) = \frac{1}{(2\pi)^{n/2} \sqrt{\det \Sigma_n}} \exp\left( -\frac{1}{2}\, X^{(n)*} \Sigma_n^{-1} X^{(n)} \right),$$
where $\Sigma_n = (r(i-j))_{i,j}$. Let us set $T_n(H) = \sigma^{-2} \Delta_n^{-2H} \Sigma_n$ and denote by $(A_n, B_n)$ the corresponding pair of normalized quadratic statistics introduced in [6]. Then let us notice that, as shown in [6], the pair $(A_n, B_n)$ converges in law under $P^n_{(H,\sigma)}$ to a nondegenerate centered Gaussian random vector $(A, B)$, whose covariance matrix is given in [6]. Here $P^n_{(H,\sigma)}$ stands for the probability measure associated with the fractional Gaussian noise with parameter $(H, \sigma)$.
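For illustration, the closed-form likelihood translates directly into the following R sketch (the function name loglik is ours; rho() is the helper from the previous sketch). The log scale and the factorization of $\Sigma_n$ as $\sigma^2 \Delta_n^{2H} T_n(H)$ follow the decomposition used in the text.

```r
# Minimal sketch: exact Gaussian log-likelihood l_n(theta), written through
# T_n(H) = sigma^{-2} Delta_n^{-2H} Sigma_n as in the text; rho() as above.
loglik <- function(H, sigma, X, T_horizon = 1) {
  n <- length(X)
  Delta <- T_horizon / n
  Tn <- toeplitz(rho(0:(n - 1), H))                              # Toeplitz part T_n(H)
  ld <- as.numeric(determinant(Tn, logarithm = TRUE)$modulus)    # log det T_n(H)
  Q  <- drop(crossprod(X, solve(Tn, X))) / (sigma^2 * Delta^(2 * H))
  -0.5 * n * log(2 * pi) - n * log(sigma) - n * H * log(Delta) - 0.5 * ld - 0.5 * Q
}
```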
The log-likelihood function is further denoted by $\ell_n(\vartheta) = \log L(\vartheta, X^{(n)})$ and the score function is given by
$$\partial_H \ell_n(\vartheta) = -n \log \Delta_n - \frac{1}{2}\, \partial_H \log\det T_n(H) + \frac{\log \Delta_n}{\sigma^2 \Delta_n^{2H}}\, X^{(n)*} T_n(H)^{-1} X^{(n)} - \frac{1}{2\sigma^2 \Delta_n^{2H}}\, X^{(n)*}\, \partial_H \{T_n(H)^{-1}\}\, X^{(n)},$$
$$\partial_\sigma \ell_n(\vartheta) = -\frac{n}{\sigma} + \frac{1}{\sigma^3 \Delta_n^{2H}}\, X^{(n)*} T_n(H)^{-1} X^{(n)}. \tag{2.2}$$
From the last expression, we clearly see that the leading terms of $\partial_H \ell_n(\vartheta)$ and $\partial_\sigma \ell_n(\vartheta)$ are linearly dependent, which is the reason why we obtain a singular Fisher information matrix when considering diagonal rate matrices as in [16]. This singularity can be untied using non-diagonal rate matrices as in [3].
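As a sanity check on the $\sigma$-component of (2.2) as reconstructed above, one can compare the closed-form expression with a numerical derivative of the log-likelihood; the helper names below are ours.

```r
# Minimal sketch: sigma-component of the score, checked against a central
# finite difference of loglik(); rho() and loglik() as in the sketches above.
score_sigma <- function(H, sigma, X, T_horizon = 1) {
  n <- length(X)
  Delta <- T_horizon / n
  Tn <- toeplitz(rho(0:(n - 1), H))
  Q  <- drop(crossprod(X, solve(Tn, X)))
  -n / sigma + Q / (sigma^3 * Delta^(2 * H))
}
eps <- 1e-5   # arbitrary small step for the finite-difference check
c(exact   = score_sigma(0.8, 0.5, X),
  numeric = (loglik(0.8, 0.5 + eps, X) - loglik(0.8, 0.5 - eps, X)) / (2 * eps))
```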

Properties of the sequence of MLE
The sequence of maximum likelihood estimators $(\widehat{\vartheta}_n,\ n \ge 1)$ is defined by
$$\widehat{\vartheta}_n = \arg\max_{\vartheta \in \Theta} L\left( \vartheta, X^{(n)} \right).$$
This sequence has been shown to be asymptotically normal and asymptotically efficient for different rate matrices in [3].
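Numerically, the MLE can be approximated by maximizing the log-likelihood with a constrained quasi-Newton routine, e.g. as in the sketch below (loglik() from the previous sketch; the box bounds are arbitrary safeguards, not the paper's choice).

```r
# Minimal sketch: MLE by direct numerical optimization over Theta.
mle_fgn <- function(X, T_horizon = 1, init = c(0.5, 1)) {
  fit <- optim(init,
               fn = function(p) -loglik(p[1], p[2], X, T_horizon),
               method = "L-BFGS-B",
               lower = c(0.01, 1e-6), upper = c(0.99, 10))
  c(H = fit$par[1], sigma = fit$par[2])
}
mle_fgn(X)
```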
It is worth noting that in all the previous examples the rate matrices are non-diagonal and depend on the parameter σ. Other examples can be found in [3].

Asymptotic efficiency of the one-step MLE
The numerical procedure for the evaluation of the MLE requires the computation of the inverse matrix T n (H) −1 and the use of an iterative numerical optimization algorithm. As a result, the maximum likelihood estimation is a computationally demanding and time-consuming task; therefore it is very important to propose less expensive estimation counterparts enjoying the same appealing asymptotic properties as the MLE. For this purpose we present the sequence of one-step MLE (OSMLE) which is asymptotically efficient and has a smaller computational cost than the MLE [3] and the Whittle estimator [11]. The OSMLE requires a single Fisher scoring iteration and a well-chosen initial sequence of guess estimators.
Let us first consider the matrix $\varphi_n(\vartheta)$ defined in (2.4). The corresponding non-degenerate Fisher information matrix is given by $I_\varphi(\vartheta)$ defined in (2.5). We suppose that there exists a rate-efficient sequence of estimators ("initial guess estimators") $\bar{\vartheta}_n = (\bar{H}_n, \bar{\sigma}_n)^*$ satisfying, as $n \longrightarrow \infty$,
$$\varphi_n(\vartheta)^{-1}\left( \bar{\vartheta}_n - \vartheta \right) \longrightarrow \mathcal{N}\left( 0, \Gamma_\varphi(\vartheta) \right)$$
in law under $P^n_{(H,\sigma)}$, for some (non-efficient) positive definite matrix $\Gamma_\varphi(\vartheta)$. Then, we consider the sequence of estimators $\vartheta_n^\star = (H_n^\star, \sigma_n^\star)^*$ defined by the one-step scoring procedure
$$\vartheta_n^\star = \bar{\vartheta}_n + \varphi_n(\bar{\vartheta}_n)\, I_\varphi(\bar{\vartheta}_n)^{-1}\, \varphi_n(\bar{\vartheta}_n)^{*}\, \nabla \ell_n(\bar{\vartheta}_n). \tag{3.1}$$
Here $A^{-*}$ denotes the transpose of the inverse matrix of $A$, namely $A^{-*} = (A^{-1})^*$.
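In code, the one-step correction (3.1) is a single linear-algebra update. The sketch below assumes user-supplied functions phi_n(), I_phi() and score() (all hypothetical names) returning the rate matrix, the asymptotic Fisher information and the score $\nabla \ell_n$ at a given parameter value.

```r
# Minimal sketch of the one-step scoring update (3.1).
one_step <- function(theta_bar, phi_n, I_phi, score) {
  phi <- phi_n(theta_bar)   # 2 x 2 non-diagonal rate matrix phi_n(theta_bar)
  Iph <- I_phi(theta_bar)   # asymptotic Fisher information I_phi(theta_bar)
  theta_bar + drop(phi %*% solve(Iph, t(phi) %*% score(theta_bar)))
}
```

A single such update applied to the initial guess is what makes the OSMLE cheap: no iteration and no convergence criterion are needed.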
In the next section, we present an initial guess estimator ( ϑ n , n ≥ 1) that could be used in the one-step procedure.

Initial guess estimator via quadratic generalized variations
First, let us denote the quadratic generalized variations
$$\widetilde{V}_{n,2} = \frac{1}{n} \sum_{i=1}^{n-1} \left( X_{i+1} - X_i \right)^2 \qquad \text{and} \qquad \widetilde{V}_{n,2}^{(2)} = \frac{1}{n} \sum_{i=2}^{n-2} \left( X_{i+2} + X_{i+1} - X_i - X_{i-1} \right)^2,$$
associated with the second-order filter $(1, -2, 1)$ applied to the underlying fractional Brownian motion at scales $\Delta_n$ and $2\Delta_n$. Straightforward calculations lead to
$$\mathbb{E}\, \widetilde{V}_{n,2} = \frac{n-1}{n}\, \sigma^2 \Delta_n^{2H} \left( 4 - 2^{2H} \right) \qquad \text{and} \qquad \mathbb{E}\, \widetilde{V}_{n,2}^{(2)} = \frac{n-3}{n}\, \sigma^2 (2\Delta_n)^{2H} \left( 4 - 2^{2H} \right).$$
Then the sequence of initial guess estimators $(\bar{\vartheta}_n,\ n \ge 1)$, with $\bar{\vartheta}_n = (\bar{H}_n, \bar{\sigma}_n)^*$, is defined through these quadratic generalized variations by
$$\bar{H}_n = \frac{1}{2} \log_2 \frac{\widetilde{V}_{n,2}^{(2)}}{\widetilde{V}_{n,2}} \qquad \text{and} \qquad \bar{\sigma}_n = \sqrt{ \frac{\widetilde{V}_{n,2}}{\Delta_n^{2\bar{H}_n} \left( 4 - 2^{2\bar{H}_n} \right)} }.$$
The next result states that the sequence of initial guess estimators $(\bar{\vartheta}_n,\ n \ge 1)$ is rate-efficient.
Proposition 3.1. As $n \longrightarrow \infty$,
$$\varphi_n(\vartheta)^{-1}\left( \bar{\vartheta}_n - \vartheta \right) \longrightarrow \mathcal{N}\left( 0, \Gamma_\varphi(\vartheta) \right)$$
in law under $P^n_{(H,\sigma)}$, where the (non-efficient) covariance matrix $\Gamma_\varphi(\vartheta)$ is given in (3.5).
Proof. First, let us write $\bar{\vartheta}_n = f(\widetilde{V}_{n,2}, \widetilde{V}_{n,2}^{(2)})$ for the smooth function $f$ implicitly defined by the previous display. Taylor's expansion of $f$ around $(u, v) = (\mathbb{E}\, \widetilde{V}_{n,2}, \mathbb{E}\, \widetilde{V}_{n,2}^{(2)})$ gives the linearization of the estimation error $\bar{\vartheta}_n - \vartheta$. Then, for $\varphi_n(\vartheta)$ given by (2.6), the normalized error $\varphi_n(\vartheta)^{-1}(\bar{\vartheta}_n - \vartheta)$ can be expressed through coefficients $a_{n,\vartheta}$, $b_{n,\vartheta}$, $c_{n,\vartheta}$ and $d_{n,\vartheta}$. Due to the properties of the rate matrix $\varphi_n(\vartheta)$, we note that $a_{n,\vartheta}$, $b_{n,\vartheta}$, $c_{n,\vartheta}$ and $d_{n,\vartheta}$ converge, as $n \longrightarrow \infty$, to finite limits.
Together with (3.2) and (3.4), we get the joint asymptotic normality of the variations. Then the application of the delta method (see for instance [24, Thm. 3.1]) leads directly to the desired result, with asymptotic variance $\Gamma_\varphi(\vartheta)$ given in (3.5). Let us mention that in the previous proof the quadratic generalized variations satisfy the central limit theorem for any value $H \in (0,1)$. The more classical quadratic variations statistic $\widetilde{V}_{n,1} = \frac{1}{n} \sum_{i=1}^{n} X_i^2$ satisfies the central limit theorem only for $H < \frac{3}{4}$ (see [15]) and the corresponding estimators would present a different behavior. It is worth emphasizing that this result could be extended to other rate-optimal initial guess sequences of estimators (quadratic generalized variations of sufficiently high order [15], power variations estimators [12] or wavelets-based estimators [9]). For a comparison of such initial sequences, the reader can refer for instance to [5].
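Under the filter convention used in the reconstruction above (which may differ from the paper's exact choice), the QGV initial guess reads in R as follows; qgv_fgn is our name.

```r
# Minimal sketch of the QGV initial guess (second-order filter at two scales).
qgv_fgn <- function(X, T_horizon = 1) {
  n <- length(X)
  Delta <- T_horizon / n
  V1 <- mean(diff(X)^2)                                   # scale Delta_n
  d2 <- X[4:n] + X[3:(n - 1)] - X[2:(n - 2)] - X[1:(n - 3)]
  V2 <- mean(d2^2)                                        # dilated, scale 2*Delta_n
  H_bar <- 0.5 * log2(V2 / V1)                            # since E V2 / E V1 ~ 2^(2H)
  sigma_bar <- sqrt(V1 / (Delta^(2 * H_bar) * (4 - 2^(2 * H_bar))))
  c(H = H_bar, sigma = sigma_bar)
}
qgv_fgn(X)
```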

Properties of the one-step MLE sequence
We now present one of the main results of the paper concerning the sequence of one-step MLE, $\vartheta_n^\star = (H_n^\star, \sigma_n^\star)^*$, defined in (3.1).
Theorem 3.2. The sequence of one-step MLE $(\vartheta_n^\star,\ n \ge 1)$ is asymptotically normal, i.e., as $n \longrightarrow \infty$,
$$\varphi_n(\vartheta)^{-1}\left( \vartheta_n^\star - \vartheta \right) \longrightarrow \mathcal{N}\left( 0, I_\varphi(\vartheta)^{-1} \right)$$
in law under $P^n_{(H,\sigma)}$, and asymptotically efficient (in Fisher's sense, i.e. with the same rate and the same asymptotic variance as the MLE sequence).
Proof. First, from (3.1), we have directly
$$\varphi_n(\vartheta)^{-1}\left( \vartheta_n^\star - \vartheta \right) = \varphi_n(\vartheta)^{-1}\left( \bar{\vartheta}_n - \vartheta \right) + \varphi_n(\vartheta)^{-1} \varphi_n(\bar{\vartheta}_n)\, I_\varphi(\bar{\vartheta}_n)^{-1}\, \varphi_n(\bar{\vartheta}_n)^{*}\, \nabla \ell_n(\bar{\vartheta}_n).$$
Let us denote by $I_2$ the $2 \times 2$ identity matrix. We consider a $\vartheta^* \in B(\vartheta, |\bar{\vartheta}_n - \vartheta|)$, where $B(\vartheta, r)$ is the ball of radius $r$ centered at $\vartheta$. Then Taylor's expansion of $\nabla \ell_n(\bar{\vartheta}_n)$ around $\vartheta$ gives
$$\varphi_n(\vartheta)^{-1}\left( \vartheta_n^\star - \vartheta \right) = \left( I_2 - I_\varphi(\bar{\vartheta}_n)^{-1} I_n(\vartheta^*) \right) \varphi_n(\vartheta)^{-1}\left( \bar{\vartheta}_n - \vartheta \right) + I_\varphi(\bar{\vartheta}_n)^{-1}\, \varphi_n(\vartheta)^{*}\, \nabla \ell_n(\vartheta), \tag{3.6}$$
up to remainder terms controlled by condition C2 below, where $I_n(\vartheta) = -\varphi_n(\vartheta)^{*}\, \nabla^2 \ell_n(\vartheta)\, \varphi_n(\vartheta)$. If the following Sweeting's conditions (see for instance [23] or [4]) hold true, then the first term on the right-hand side of equation (3.6) tends in probability to 0 and the second term tends in law to a centered normal random variable with covariance matrix $I_\varphi(\vartheta)^{-1}$. Sweeting's conditions are:
C1. The uniform convergence
$$I_n(\vartheta) \xrightarrow{u} I_\varphi(\vartheta) \quad \text{in } P^n_{(H,\sigma)}\text{-probability}.$$
Here, we denote by $\xrightarrow{u}$ the ordinary uniform convergence with respect to $\vartheta$ over any compact set contained in $(0,1) \times (0,\infty)$.
C2. Control on the norming matrix $\varphi_n$ in the following sense: for any $c > 0$,
$$\sup_{\vartheta' \in N_n(c;\vartheta)} \left\| \varphi_n(\vartheta)^{-1} \varphi_n(\vartheta') - I_2 \right\| \xrightarrow{u} 0,$$
where $N_n(c;\vartheta) = \left\{ \vartheta' : \left| \varphi_n(\vartheta)^{-1} (\vartheta' - \vartheta) \right| \le c \right\}$ is a shrinking neighborhood of $\vartheta$, and $I_k$ denotes the $k \times k$-identity matrix.
C3. Another kind of uniform convergence of $I_n(\vartheta)$, that is, for any $c > 0$,
$$\sup_{\vartheta' \in N_n(c;\vartheta)} \left| I_n(\vartheta') - I_n(\vartheta) \right| \xrightarrow{u} 0 \quad \text{in } P^n_{(H,\sigma)}\text{-probability}.$$
We go one step further and prove that the aforementioned Sweeting's conditions hold in our setting, which yields
$$\varphi_n(\vartheta)^{-1}\left( \vartheta_n^\star - \vartheta \right) \longrightarrow \mathcal{N}\left( 0, I_\varphi(\vartheta)^{-1} \right)$$
as $n \longrightarrow \infty$ in law under $P^n_{(H,\sigma)}$.

Simulation study
The objective of this section is to compare the performance of the three estimators of interest (QGV, MLE, OSMLE) for samples of medium size by means of Monte-Carlo simulations. The one-step MLE presents certain advantages over the MLE and the QGV, in terms of computational burden and asymptotic variance respectively. First, it is much less computationally expensive than the MLE while it has the same rate and the same asymptotic variance. Indeed, the computation of the MLE necessitates iterative optimization procedures (such as the Fisher scoring described in (3.1)) and a convergence criterion. On the contrary, with a proper initial condition, the OSMLE has the same appealing asymptotic properties as the MLE and, since it requires a single Fisher scoring step, it significantly reduces the computational cost. Second, the OSMLE is asymptotically efficient and outperforms the QGV in terms of asymptotic variance.
In this context, the score function (2.2) cannot be computed in closed form since it relies on the quantities $\partial_H \log\det(T_n(H))$ and $\partial_H \{T_n(H)^{-1}\}$. In the numerical procedure, the following equalities are used:
$$\partial_H \log\det(T_n(H)) = \operatorname{tr}\left( T_n(H)^{-1}\, \partial_H T_n(H) \right) \qquad \text{and} \qquad \partial_H \{T_n(H)^{-1}\} = -T_n(H)^{-1}\, \partial_H T_n(H)\, T_n(H)^{-1}.$$
The non-singular Fisher information matrix is evaluated numerically through the method described in [11].
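These identities are straightforward to implement; in the sketch below (helper names ours), $\partial_H T_n(H)$ is approximated by a central finite difference since the entrywise derivative, while available analytically, is not needed for illustration.

```r
# Minimal sketch: d/dH log det T_n(H) via the trace identity, with the
# derivative of T_n(H) approximated by a central finite difference.
dH_logdet_Tn <- function(H, n, eps = 1e-6) {
  Tn  <- toeplitz(rho(0:(n - 1), H))
  dTn <- (toeplitz(rho(0:(n - 1), H + eps)) -
          toeplitz(rho(0:(n - 1), H - eps))) / (2 * eps)
  sum(diag(solve(Tn, dTn)))                  # tr(T_n^{-1} dT_n/dH)
}
# d/dH {T_n^{-1}} = -T_n^{-1} (dT_n/dH) T_n^{-1} is computed the same way.
```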

Statistical error for H
In order to compute the renormalized statistical errors $\sqrt{n}(\bar{H}_n - H)$ (QGV), $\sqrt{n}(\widehat{H}_n - H)$ (MLE) and $\sqrt{n}(H_n^\star - H)$ (OSMLE), we consider the matrix $\varphi_n(\vartheta)$ defined in (2.6). The corresponding non-degenerate Fisher information matrix is given by (2.3), and the associated efficient asymptotic variance is denoted by (3.7). The asymptotic variance of the QGV given by (3.5) is computed by means of the terms appearing in the proof of Proposition 3.1. A Monte-Carlo procedure (with 10 000 iterations) was followed to explore the behavior of the different estimators (QGV, MLE and OSMLE) of the parameter $H$ at the medium sample size $n = 2^8$. The results are depicted in Figure 1 for $\sigma = 0.5$ and for two different values of $H$: $H = 0.2$ (short memory) and $H = 0.8 > \frac{3}{4}$ (long memory). The computations were implemented in R [21].
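A minimal version of this Monte-Carlo experiment, using the helpers sketched earlier (simulate_fgn, qgv_fgn; the replication count is reduced here for illustration), is:

```r
# Minimal sketch of the Monte-Carlo experiment for the QGV error on H.
set.seed(1)
M <- 1000; n <- 2^8; H <- 0.8; sigma <- 0.5
err_H <- replicate(M, {
  X <- simulate_fgn(n, H, sigma)
  sqrt(n) * (qgv_fgn(X)[["H"]] - H)          # renormalized QGV error for H
})
hist(err_H, breaks = 50, freq = FALSE,
     main = "QGV", xlab = "sqrt(n) * (H_bar - H)")
```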
The QGV estimator is clearly not optimal with respect to the asymptotic variance for both values of $H$. The theoretical asymptotic variance given in (3.5) and the efficient asymptotic variance given in (3.7) are superimposed in Figure 1. On the other hand, the (asymptotically efficient) MLE of $H$ has an appealing asymptotic behavior for the medium sample size $n = 2^8$ and small values of $H$. However, when the data are generated with larger values of $H$, the MLE presents a rather small bias which naturally disappears as the sample size increases.
The bias that characterizes the OSMLE for medium sample sizes comes from the bias of the initial QGV estimate of $\sigma$ (see Sect. 3.3.2) and disappears naturally as the sample size increases. To be more precise, the bias is transferred to both components of the OSMLE due to the mixing property of the non-diagonal rate matrix.

Statistical error for σ
Let us now consider the matrix $\varphi_n(\vartheta)$ defined in (2.7), with the corresponding non-degenerate Fisher information matrix given by (2.3). The asymptotic variance of the QGV can again be computed by means of the terms appearing in the proof of Proposition 3.1.
Then, by means of Monte-Carlo simulations, we generate 10 000 trajectories of the fGn model for $n = 2^8$, $\sigma = 0.5$ and two values of the Hurst parameter $H$: $H = 0.2$ and $H = 0.8$. Figure 1 also displays the distribution of the three estimators of $\sigma$ for the simulated trajectories.
We observe that the QGV estimator of $\sigma$ is not optimal with respect to the asymptotic variance for both values of $H$ and presents a visible bias at the medium sample size $n = 2^8$ (disappearing naturally at large sample sizes). In the short-memory setting, the MLE has appealing properties for the medium sample size $n = 2^8$. Nevertheless, for larger values of $H$, the MLE presents a visible bias, as also observed in [11]. The bias of the OSMLE at medium sample sizes (disappearing naturally at large sample sizes) derives from the bias of the initial QGV estimate of $\sigma$ (see Sect. 3.3.2). Furthermore, it can be noticed that the behavior of the OSMLE is similar to that of the MLE (or the Whittle estimator).

Computation time
The computation times needed for the Monte-Carlo procedure to generate random samples of the estimators were recorded and are displayed in Table 1. For $H = 0.2$ (resp. $H = 0.8$), the Monte-Carlo procedure for the MLE (Col. 2) requires approximately 5.6 (resp. 3.4) times the computation time of the OSMLE (Col. 1). This shows that the OSMLE clearly outperforms the MLE in terms of computation time.
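The timing comparison can be reproduced in spirit with system.time() (helpers from the earlier sketches; the replication counts below are arbitrary):

```r
# Minimal sketch: computation-time comparison between MLE and QGV.
X <- simulate_fgn(2^8, H = 0.8, sigma = 0.5)
system.time(for (k in 1:50) mle_fgn(X))     # iterative optimization: slow
system.time(for (k in 1:50) qgv_fgn(X))     # closed-form variations: fast
```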