Estimating fast mean-reverting jumps in electricity market models

Based on empirical evidence of fast mean-reverting spikes, we model electricity price processes $X+Z^\beta$ as the sum of a continuous It\^o semimartingale $X$ and a a mean-reverting compound Poisson process $Z_t^\beta = \int_0^t \int_{\mathbb{R}} xe^{-\beta(t-s)}\underline{p}(ds,dt)$ where $\underline{p}(ds,dt)$ is Poisson random measure with intensity $\lambda ds\otimes dt$. In a first part, we investigate the estimation of $(\lambda,\beta)$ from discrete observations and establish asymptotic efficiency in various asymptotic settings. In a second part, we discuss the use of our inference results for correcting the value of forward contracts on electricity markets in presence of spikes. We implement our method on real data in the French, Greman and Australian market over 2015 and 2016 and show in particular the effect of spike modelling on the valuation of certain strip options. In particular, we show that some out-of-the-money options have a significant value if we incorporate spikes in our modelling, while having a value close to $0$ otherwise.

1. Introduction 1.1. Motivation. A striking empirical feature of electricity spot prices is the presence of spikes, that can be described by a jump in the price process immediately followed by a fast mean reversion (see Figure 1 showing the behaviour of electricity spot prices in different markets over one year of historical data). These spikes are due to the non-storability of electricity, an abrupt change in the demand or the offer (due to weather conditions, outages and so on) having a direct impact on prices. For risk management purposes, the modelling of these extreme events is essential. And, due to the non-storability of electricity, the modelling of forward contracts (used as hedging products) are also needed. If (S t ) t≥0 denotes the electricity spot price, the forward price f (t, T ) at time t delivering 1 megawatt hour (MWh) at time T can be defined as where F t is the available information up to time t and the expectation is taken under a risk-neutral probability. In this context, one usually faces two major issues: first and prior to data analysis, a stochastic model that captures the main characteristics of spot prices, and especially the presence of fast mean-reverting spikes has to be set, however simple enough to give tractable formulas for the forward prices f (t, T ). Second, the chosen model must be calibrated with efficient statistical procedures to show its adequacy to the data, and to properly quantify risk measures. The main difficulty is the estimation of the characteristics of the spikes. To this end, Cartea and Figueroa [8] extend the popular and tractable approach of Lucia et al. [16] by introducing jumps in the price process, resulting in the model log S t = ρ(t) + Y t , dY t = −βY t dt + σ(t)dW t + log JdN t , t ≥ 0, 1 where ρ(t) and σ(t) are deterministic components, (W t ) is a Wiener process, (N t ) is a Poisson process and J is the jump size drawn proportional to a log-normal distribution. A similar model is proposed in Geman and Roncoroni [12], adding up a threshold parameter that determines the sign of the jumps.  In these approaches, the mean reverting coefficient β > 0 is the same for the continuous component and for the spike component. However, statistical evidence shows that the mean reversion of the spike component is much stronger than the one of the Brownian component, see for instance Benth et al. [6]. The estimated β then underestimates the mean reversion of the spike component and overestimates the one of the continuous component. A similar model slightly more realistic is also proposed by Geman and Roncoroni [12] but this one does not provide explicit formulas for deriving f (t, T ). Yet another approach is undertaken in Benth et al. [5,6] with multi-factor models: for some weights w i , and where (L i t ) t≥0 are independent time-inhomogeneous subordinators ensuring that (S t ) t≥0 remains nonnegative. Benth and collaborators establish in [5,6] that m = 2 is sufficient for modelling purposes, each factor (Y i t ) t≥0 having its own mean reverting parameter, allowing for a fast mean reversion and a slower one. However, the use of subordinators implies that the volatility of the process seems to be underestimated. Finally, multi-factor models with a Brownian component and a spike component are studied in Meyer and Tankov [19], Schmidt [23] and Gonzales et al. [13]. Meyer and Tankov estimate the mean-reverting parameters using spectral methods and the jumps are detected by filtering. In Schmidt [23], the parameters of the model are estimated using maximum likelihood with the EM algorithm, implying an approximation of the process with its Euler scheme. Gonzales and co-authors develop a Bayesian framework and recover the parameters of the model by MCMC. In a more general context than electricity price modelling, Moreno et al. [20] use a method of moments to estimate the parameters of a jump diffusion model when the log-price is the sum of an arithmetic Brownian motion and a mean reverting compound Poisson process.
In this paper, we mostly focus on the estimation of the characteristics of the spike process in the case of a wide and unifying range of models, including the aforementioned ones and allowing us to get analytical formulas of forward prices. More precisely, the goal of the paper is at least threefold: i) Introduce a general electricity spot price model consistent with historical data that encompasses the previous approaches and overcomes their limitations. ii) Develop within this model efficient and robust statistical procedures that estimate the spikes' characteristics. iii) Derive an explicit correction formula for the value f (t, T ) of forward contracts revealing the effect of spikes along maturities. Therefore, our proposed approach can be easily used in practice, solving both estimation and pricing issues.

Main results.
We consider an extended framework that encompasses [19], [23] and [13]. In particular, our approach does not require that the continuous part of the price process is an Ornstein-Uhlenbeck, a necessary condition in the aforementioned models.
A semimartingale model with fast mean-reverting jumps. On a rich enough filtered probability space (Ω, F , (F t ) t≥0 , P) that will accommodate all the considered random quantities, we model the electricity spot price X t = S t or X t = log S t by (2) X t = X c t + Z β t , t ≥ 0, where (X c t ) t≥0 is a continuous Itô semimartingale and (Z β t ) t≥0 is the so-called spike process, governed by a mean-reverting factor β > 0. More specifically, we assume that where (σ t ) t≥0 and (µ t ) t≥0 are two adapted càdlàg processes, (W t ) t≥0 a (F t )-standard Brownian motion and with p a random Poisson measure on [0, ∞) × R independent of (W t ) t≥0 , with intensity for some λ > 0 and a probability measure ν(dx) on R. We thus model the electricity spot price as a classical continuous Itô semimartingale (X c t ) t≥0 allowing for the usual financial fluctuations and usual models (factor models, mean-reverting models and so on) to which we add a perturbation (Z β t ) t≥0 of "spikes" or "jumps", triggered by exogenous physical hazard, at intensity λ and magnitude ν(dx), but with a relaxation period 1/β comparable to λ that accounts for the absorption of such events by the market toward resulting in stable prices at large scales. The term comparable is a bit vague at this stage, and will be assessed precisely in Section 2.1, enabling us to speak of fast mean-reversion. In this setting, model (2)-(3)-(4) is well posed and can reproduce, at least visually, the general shape of electricity spot markets, compare historical data from Figure 1 and sample paths simulations given in Figure 2 and detailed in the simulation Section 3.2.  Statistical setting. We assume that we observe the process (X t ) t≥0 given by (2)-(3)-(4) over the time interval [0, T ] on a regular grid 0 = t 0,n < t 1,n < . . . , t n,n = T, t i,n = i∆ n , for 0 ≤ i ≤ n, with mesh ∆ n . Thus we have n (or rather n + 1) observations (5) X n = (X 0 , X ∆n , . . . , X n∆n = X T ).
In the following, for a given process Y , we use the classical notations ∆ n i Y = Y ti,n − Y ti−1,n and ∆Y s = Y s − Y s − . Asymptotics are taken as n → ∞. We assume that T is constant, and we take T = 1 with no loss of generality. Equivalently, ∆ n = 1/n → 0 as n → ∞. This asymptotic setting is usually referred to as the "high-frequency" framework (for instance the classical textbook [3] by Aït-Sahalia and Jacod), but this terminology is a bit misleading: our framework certainly belongs to statistical finance, but it has no link to high-frequency finance or microstructure modelling of any sort. In practice, we apply our methodology to three markets: the French EPEX, the German EPEX and the Australian electricity spot in Queensland, see Section 3 below. We use data between 2015, Jan. 01 and 2016, Dec. 31. with hourly data (even less in the case of Australian data), so that n = 17064 is considered to be large. Equivalently, one hour is considered to be small in front of 2 years. In our setting, the important fact about the assumption that T is fixed is that we leave out any stationarity or ergodicity of the underlying process. We thus make an implicit statistical robustness assumption, which we believe is of importance when considering recent and changing energy markets over such time horizons.
The parameters of interest are λ, β > 0 that govern our correction formulas (see the application to forward contracts prices f (t, T ) below). In particular we leave out the issue of identifying the continuous semimartingale part (X c t ) t≥0 i.e. the drift (µ t ) t≥0 and the volatility process (σ t ) t≥0 as well as the jump distribution ν(dx).
The mean-reversion factor over the observation increment [t i−1,n , t i,n ] is of size β∆ n , and by requiring β∆ n to be large compared to the order of magnitude √ ∆ n of ∆ n i X c , we may hope to recover β asymptotically. We thus introduce the asymptotic setting β = β n with the requirement if we let λ = λ n → ∞, another necessary condition that enough jumps are available over the observation period [0, T ]. A second crucial assumption is since otherwise, the spikes caused by the jumps of p are absorbed by the Brownian fluctuations of X c due to the fast relaxation period 1/β n and therefore cannot be detected by X n .
Statistical results. Heavily relying on classical techniques in high-frequency finance (for instance [3,Theorem 10.26,p.374]), we estimate in a first step the times and sizes of the jumps which are random quantities, taking into account the interplay between β n , λ n and ∆ n dictated by the asymptotic regime (6)-(7)-(8), see Proposition 4. In a second step, we construct an estimator of β n based on an estimator s n of the right-derivative or instantaneous slope of t → Z β t right after a jump is detected. The estimator s n is based on averaging of instantaneous slope proxies of the form ∆X Tq (1 − e −βn∆n ) that govern the relaxation effect after a jump of size ∆X Tq has occurred at time T q and it enables us to consider as our estimator of β n . Since β n itself varies with n and grows to infinity, the notion of convergence has to be considered carefully. Under suitable assumptions, we prove in Theorem 6 that the relative error in probability as n → ∞. The error E n has two components: a first term of order 1/(β n √ λ n ∆ n ) due to Brownian oscillations, and a second term of order min{λ n /β n , 1/ √ β n } + √ λ n /β n that accounts for the effect of jumps that are still present in the price process despite the relaxation effect. When β n √ ∆ n λ n → ∞ and λ n /β n 1, we have E n converges to 0. If we assume further √ β n /λ n → 0, we obtain a central limit theorem for E n with a Gaussian limit and an explicit rate of convergence that depends on the interplay between λ n , β n and ∆ n . We have an analogous result (although less demanding) for the estimation of the jump intensity λ n detailed in Proposition 5.
Application to pricing forward contracts. We show in Theorem 7 that in the model for the spot price (with X = S) defined by (2)-(3)-(4), the price f (t, T ) of a forward contract is given by .
The term f c (t, T ) corresponds to the price of the forward contract in a continuous case framework.
The computation of this value has been extensively studied for different continuous models and it is known analytically for the most common models, see for instance [5,6] among others. The term f β (t, T ) is a correction that follows from our approach. It is of order λ/β and is usually small for the applications we have in mind, see the practical implementation Sections 3 and 4. On balance, the presence of spikes does not significantly impact the price of forward contracts to within these order of magnitudes. This is consistent with our data, for which spikes are not observed on forward prices. By neglecting the term f β (t, T ), we can calibrate the process X c t to the observed forward prices and the process Z β t to the observed spot prices with the estimation procedure proposed in this paper. The results are similar for the modelling of the logarithm of the spot price (i.e when X = log(S)) by (2)-(3)-(4), see Theorem 8. We implement the prices of the forward contracts and Call options from a model calibrated to historical data on electricity prices in Section 4 and we show the impact of the spike modelling on the valuation of a strip of options with payoff of the form I i=1 S ti,n − K + for different times t i,n . As expected, the value of this option increases if we add significant large spikes. In particular, we show that some out-of-the-money options have a significant value if we incorporate spikes in our modelling, while having a value close to 0 otherwise.

1.3.
Organisation of the paper. Section 2 develops a rigorous mathematical framework for the stochastic model (2)-(3)-(4) and gives the explicit construction of the estimators described in Section 1.2 above together with their asymptotic properties in Propositions 4, 5 and Theorem 6. Section 3 establishes the numerical feasibility and consistency of our statistical estimation results on simulated and real data, based over two years (2015 and 2016) of electricity spot prices in three different markets (French, German and Australian). Section 4 is devoted to the application of our model and statistical results to forward contracts. We establish in Theorem 7 a correction formula to get analytical forward prices and study the valuation of a strip of European Call options. The proofs are given in Section 5.

Statistical results
2.1. Model assumptions. We consider the process (X t ) t≥0 defined by (2)-(3)-(4) in Section 1.2. Following closely the standard notation of Aït-Sahalia and Jacod [3], if ν is a positive measure and f a ν-integrable function, we write f (x) ⋆ ν = R f (x) ν (dx). Remember also that we work over a finite time horizon T = 1.
Since our asymptotic results will be given in distribution (see Theorem 6 below), the conditions on the drift (µ t ) t≥0 and (σ t ) t≥0 can substantially be weakened (in order to accommodate for instance diffusion coefficients of the form σ t = X c t h(X c t ) with a bounded h or even locally integrable) by standard localisation procedures, see for instance [14,Section 4.4.1]. We observe (5) and asymptotics are taken as n → ∞ or equivalently ∆ n = n −1 → 0. We also allow λ = λ n and β = β n to either grow to ∞ with n or remain bounded, compare Equations (6)-(7)-(8) and the accompanying discussion in Section 1.2 above.
The condition λ n β n ensures the stability of X t as n → ∞ since Var(X t ) = Var (X c t ) + Var(Z βn t ) = Var (X c t )+|x| 2 ⋆ν λnt 2βn 1 − e −2βn → ∞ if sup n λ n /β n = ∞. The condition β n ∆ n 1 is necessary to identify spikes (or jumps): otherwise, a spike that occurs in the interval ((i − 1) ∆ n , i∆ n ] will be absorbed by the relaxation effect before we observe X i∆n . Finally, the condition λ n ∆ n → 0 controls the no accumulation of jumps within the rate of observation.
In order to estimate the times and sizes of the jumps, we need the following: Assumption 3 (I) implies β n ∆ n = o (1). Condition (i) ensures that the number of jumps in a interval of size ∆ n is essentially 1. In the case where β n is bounded, (ii) is implied by (i) and we have the usual conditions for the detection of jumps (see Mancini [17]). Condition (ii) controls the size of the mean-reversion. Condition (iii) controls the size of the small jumps that cannot converge too fast to 0. If the jumps are bounded below by some constant as Mancini [17] the condition is automatically satisfied. Assumption 3 (II) (iv) implies that β n ∆ 1−̟ n → ∞ and in particular β n ∆ 1/2 n → ∞, implying that the mean reversion of order β n ∆ n is stronger than the order of magnitude ∆ 1/2 n of Brownian increments. It also allows for the case β n ∆ n ≈ 1. In the setting of Assumption 3 (II), the mean reversion is more difficult to distinguish from the jumps and in the case β n ∆ n ≈ 1, the jumps and the drift have the same size and are not distinguishable from their size solely. Condition (i) states that there is at most one jump in an interval of size k n which is large enough for the spike to vanish. Assumption 3 (II) also implies that λ 2 n /β n → 0. Assumption 3 (II) allows for high values of β n but the number of jumps needs then to be smaller than in case Assumption 3 (I) compared to β n .

2.2.
Estimation of the jumps times and λ n . We first construct estimators of the sequence of the jumps and of their intensity λ n . Assumption 3 is in force. Let denote the number of jumps of (X t ) t≥0 up to time t and let T 1 < T 2 < · · · < T q < · · · denote the random times at which jumps occur. By construction, the sizes of jumps ∆X Tq q≥1 form a sequence of independent and identically distributed random variables independent of (N t ) t≥0 . Let i n,q be the random integer such that Define the increasing sequence I n (1) < ... < I n ( λ n ) of indices i ∈ {1, . . . , n} defined by the realisation of the following successive events: with v n ≍ ∆ −̟ n , following the notations of [3, Ch. ??]. Under Assumption 3 (II), we need the supplementary condition ∆ n i X∆ n i+1 X < 0 for the following reason: whenever a jump occurs, the mean reverting is dominant in the next observation interval and has a direction opposite to the sign of the jump. Furthermore, it enables us to discard the increments caused by the mean reversion that are large enough to be detected as jumps. Indeed, if we detect a false jump due to the mean reversion effect, the next increments will follow the same dynamics and it will share the same sign with first increment. The property that no jump lies within the next observation interval is ensured by the existence of k n . Let However, the presence of a drift term −β n t 0 Z βn s ds that depends on n together with the fact that λ n → ∞ makes the extension not trivial. Proposition 4 also provides us with an estimator λ n of λ n . In the case λ n → ∞, we have the following asymptotic property: Proposition 5. Work under Assumption 1, 2 and 3 and assume that λ n → ∞. We have This result is straightforward and asymptotically optimal: consider indeed the seemingly richer experiment where one continuously observes a Poisson process (P t ) 0≤t≤1 with intensity λ > 0. The variable P 1 is a sufficient statistic and the Cramer-Rao bound tells us that any unbiased estimator λ necessarily satisfies E[( λ − λ) 2 ] ≥ I(λ) −1 , where I(λ) = 1 + λ −1 is the Fisher information associated to the observation of P 1 , i.e. a Poisson random variable with parameter λ. Equivalently E λ−λ λ A natural estimator of the jump sizes is ∆ n In(q) X for q ∈ {1, ..., λ n }, see [3,Theorem 10.21,p.370]. In our case, ∆ n In(q) X is equal to ∆X Tq e −βn(Tq−In(q)∆n) plus a negligible term. If β n ∆ n → 0, ∆ n In(q) X is then equivalent to ∆X Tq but if β n ∆ n ≍ 1, the bias ∆X Tq 1 − e −βn(Tq−In(q)∆n) remains and it is not possible to identify the size of the jump. However, if λ n → ∞, one can infer some statistical properties of the jumps size. Indeed, (T q − I n (q) ∆ n ) 1≤q≤ λn has a known distribution and this error can be averaged. We can easily prove the following result: in probability for every integer m > 0 such that x m ⋆ ν < ∞. The proof for the case m = 1 appears in the proof of Theorem 6. Combining this result with our estimator of β n provided below enables us to have an estimator for the moments of ν.

2.3.
Estimation of β n . We are ready to construct an estimator of β n . Define the sign function as sgn(x) = 1 if x ≥ 0 and −1 otherwise. On the event { λ n > 0}, define β n via , ∆ n and set β n = 0 otherwise. Our main result describes precisely the behaviour of β n under the different asymptotic regimes of interest.
Some remarks on the different error terms: 1) the term of order 1/(β n √ λ n ∆ n ) accounts for the presence of a Brownian motion in the term (X c t ) t≥0 . When λ n is bounded, we need β n √ ∆ n → ∞ or equivalently √ ∆ n = o (β n ∆ n ): the size of the slope of (Z β t ) t≥0 after a jump needs to dominate the Brownian motion part which is of order √ ∆ n . In the case where λ n → ∞, we can average the error due to the Brownian martingale part and then diminish the order of the error. In that case, we do not need the restriction 2) The error terms of order min{ 1 √ βn , λn βn }, √ λn βn and λ n ∆ n account for the jumps that occur before the observation increment used to estimate the slope of the process. (11) is a bias correction that enables us to obtain a consistent estimator in the case λ n /β n ≈ 1.

Practical implementation
3.1. Choice of the threshold v n . The method to detect the jumps is based on the classical use of the threshold which is proportional to ∆ −̟ n . As for the choice of the threshold and ̟, no exact method is provided in the literature. However, the threshold is recommended to be chosen of the form v n = Cσ∆ −̟ n in [2, Section 5.3] and [3, Section 6.2.2, p. 187] where C is a constant andσ is an estimator of the integrated volatility 1 0 σ 2 s ds. A popular rule-of-thumb consists in picking ̟ close to 0. Moreover, [2] suggests to choose C between 3 and 5.
It remains to find an estimator of the integrated volatility. A natural choice is the multipower variation estimator, see [4] and [25] for more details. The order of the multipower variation estimator is set to 20 in the practical applications in this paper, which is high compared to the orders typically chosen in the literature. This choice is justified by the strong mean reversion of the spikes. As for the jumps, spikes have large increments that need to be compensated in the multipower variation estimator and can be present during two or three time steps. We compensate these large increments with a higher order of the multipower variation. Some simulations on simple models show that an order of 20 looks reasonable.

Numerical illustration.
In this section, we study the performances of our estimation procedures on simulated data of the process defined by (2)-(3)-(4). We tested a wide range of values for (λ n , β n ) in order to illustrate the results of Theorem 6. To be consistent with real data, the process is simulated with a step time ∆ n = 10 −4 , which is the order of magnitude corresponding to one year of hourly data observations. We pick (12) dX c t = X c t ( 1 2 2 2 − 100 log(X c t ))dt + 2dW t corresponding to the logarithm of an Ornstein-Uhlenbeck process with the mean reverting parameter equal to 100 and volatility parameter equal to 2. The sizes of the jumps follow the law 0.4 (−E (15)) + 0.6E (10), with E(ρ) denoting the exponential distribution with parameter ρ > 0. Figure 2 illustrates a sample paths of the process for different parameters λ n and β n . We realize 10000 simulations. We use jump detection under Assumption 3 (I) as well as under Assumption 3 (II), the latter corresponding to filtering the previous jumps by keeping only increments that have successive opposite signs. Table 1 shows the different results of estimated ( λ n , β n ) using a threshold equal to 5σ∆ −0.01 n . As expected, for large β n , the estimation on filtered jumps gives satisfactory results of β n , whatever the value of λ n , and seems to slightly underestimate λ n , whereas the estimation under Assumption 3 (I) gives bad results on β n and overestimates λ n . For small values of β n , the estimation under Assumption 3 (I) is more efficient for both λ n and β n . The results for β n = 20 highlights, as expected, the need in Assumption 3 (II) to have a small number of jumps (λ n = 10) to get satisfactory results. Because of the expected β n on real data, we will focus, in the next section, on the estimation procedure on filtered jumps (i.e. under Assumption 3 (II)). This choice is also justified by the fact that an underestimating the numbers of spikes seems to have a lower impact on the quality of β n than an overestimating them.

3.3.
Practical implementation on real data. Electricity spot historical data do exhibit spikes with strong mean reversion, see Figure 1. We expect to obtain relatively high values for β n , a necessary condition in order to apply our estimation procedure, especially under Assumption 3 (II). The goal is then to estimate the parameters λ n and β n of the process Z βn on a time series of spot prices, assuming that the spot price is the sum of a continuous semimartingale and a spike process. We dispose of the following data: (1) French electricity EPEX spot prices between the first of January of 2015 (included) and the first of January 2017 (not included) with data each hour 1 , (2) German electricity EPEX spot prices between the first of January of 2015 (included) and the first of January 2017 (not included) with data each hour 1 , (3) Australian electricity spot prices in Queensland between the first of January of 2015 (included) and the first of January 2017 (not included) with data each 30 minutes 2 . We estimate those parameters using a threshold v n = Cσ∆ −0.01 n , withσ the multi-power variation of order 20 and C a constant set to 3, 4 or 5. Results are presented in Table 2. Figure 1 gives the time series of these three sets of data with jumps time estimated in the case C = 5. As expected, the estimated λ n is sensitive to the value of C, the number of detected jumps decreasing with C. The estimated β n is much less sensitive, although we can observe a slight increase of values with C in the French and German markets. We will see in section 4.3 the sensitivity of these estimators to the value of a strip of Call options.  Table 2. Estimation of (λ n , β n ) for different markets using a threshold of the form v n = Cσ∆ −0.01 n whereσ is the multi-power variation estimator of order 20 and C takes different values.

4.
Derivative pricing in the electricity market 4.1. Forward price formula. Due to the non-storability of electricity, the spot price is not an asset. The modelling of (and then an analytical formula for) the forward prices (i.e. the real assets and hedging products) is essential for risk management purposes. The question of the choice of the risk-neutral probability is addressed in the next section. Here, we consider that the electricity spot price (S t ) t≥0 (respectively the logarithm of the spot price (S t ) t≥0 ) is modelled by S t = X c t + Z β t (respectively log(S t ) = X c t + Z β t ), according to (2)-(3)-(4) under a risk-neutral probability Q.
The forward price f (t, T ) quoted at time t, delivering 1MWh at time T , is defined by: The available contracts in the electricity markets are of the form f (t, T, θ): a contract that delivers 1MWh continuously from T to T + θ. The delivery period θ can be one week, one month, one year and so on. For example, the contract called "one-week-ahead" (1WAH) will deliver 1MWh continuously from the first hour of next Monday to the last hour of the following Sunday; the contract called "one-month-ahead" (1MAH) will deliver 1MWh continuously between the first and the last hour of next month. By classical no arbitrage arguments [5] the price of such a product is defined by: Theorems 7 and 8 give analytic formulae for the forward price f (t, T ) respectively for the modelling of the spot price and the modelling of the logarithm of the spot price by (2)-(3)-(4).
Theorem 7. Suppose that the spot price is modelled by S t = X c t + Z β t , t ≥ 0, according to (2)-(3)-(4) under a risk-neutral probability Q.
(1) We have an explicit representation of f (t, T ) = E Q S T F t given by (2) We also have f (t, The proof is straightforward. Theorem 7 has three major consequences. (1) The proposed model (2) It is then easy to consider the proposed model as an extension of any classical (continuous) model written on the forward prices, allowing to represent spikes in the spot price dynamics. (3) If λ/β is small and β is large, the impact of f β on the forward prices is negligible and the additive spike process has only an impact on the spot prices. This is consistent with the observations in the electricity markets, the forward prices showing no spikes.
These consequences are of significant importance. Especially in the case where λ/β is small and β is large, this means that the spike process can be treated independently, both in parameter estimation and in simulation. Indeed, consider any existing (continuous) model describing (or simulating) f c (t, T ) and calibrated on forward prices, the proposed model then consists in adding (simulations of) the spike process calibrated on spot prices following the estimation procedure previously described.
Theorem 8. Suppose that the logarithm of the spot price is modelled by log(S t ) = X c t +Z β t , t ≥ 0, according to (2)-(3)-(4) under a risk-neutral probability Q. Let assume that R e ux ν (dx) < ∞ for all u ∈ [0, 1]. We have an explicit representation of f (t, T ) = E Q S T F t given by The proof is straightforward and comments for the results of Theorem 7 can be transposed to the ones of Theorem 8. In particular, if λ/β is small and β is large, the term f β (t, T ) is close to 1 and f (t, T ) to f c (t, T ).

4.2.
Specific model and change of measure. We address the problem of choosing the risk neutral probability which we illustrate with a specific and simple model: in the rest of this section, we consider the model defined by with f β (t, T ) defined by (13) and where the continuous part f c (t, T ) follows the dynamics a two-dimensional Brownian motion under the historical probability P with correlation ρ. This dynamics corresponds to a classical two factors model for forward prices of electricity [15] or gas [26]. The forward price is driven by a short term factor with volatility σ s e −α(T −t) and a long term factor with volatility σ l . The short term volatility σ s e −α(T −t) captures the Samuelson effect: the volatility increases when T − t decreases. The spot price is then equal

and the model then falls within the class of models (2)-(3)-(4).
We have seen that the forward products are not impacted by the spikes if λ/β is small and β is large. However, it can have an important impact on options on the spot, for instance on a strip of Call options, with payoff of the form I i=1 S ti,n − K + for prescribed dates t i,n . If we consider an option with payoff having a single component (S t − K) + , the jump process will have a weak impact: the probability to have a jump at time t is equal to 0 and even if there is a jump before, it disappears very quickly. However, the jump process may have a significant impact on the value of options with payoff I i=1 S ti,n − K + because the probability of having spikes on [0, 1] is non zero. (Note that only upward spikes will have an impact on the price of these options.) Unlike spot prices, forward contracts are tradable assets. In the following, we assume absence of arbitrage opportunity. According to the fundamental theorem of asset pricing, there exists a probability measure Q equivalent to the historical measure P such that f (t, T ) is a local martingale under Q 3 . Because of the presence of jumps, the market is incomplete and Q is not unique. According to [22,Theorem 2], there exists a predictable process (γ t ) t≥0 and a predictable process (Y (., t, x)) t≥0,x∈R such that: with c t equal to f c (t, T ) σ 2 l + σ 2 s e −2α(T −t) + 2ρσ l σ s e −α(T −t) 1/2 in our case. Under the equivalent measure, f (t, T ) is an Itô semi-martingale with drift 0, volatility c t and jump measure p * = Y p following for two Brownian motions (W s, * , W l, * ) under the new measure. The volatility does not change unlike the intensity and the law of jump sizes of the Poisson process.
In order to choose the change of martingale measure, one usually consider an optimisation criterion. One of the most common used criterion is the local risk-minimisation introduced by Föllmer and Schweizer (see [24] for details). The variance of the cost of the strategy is minimised locally, infinitesimally at each time. This strategy corresponds to choose the minimal martingale measure defined in [11]. Under certain assumptions, this measure is a true probability measure and the asset is a local martingale under this measure. Furthermore, the intensity changes and depends on the drift µ, which is also true for most common criteria. Since we work on a finite time framework, the drift is not identifiable and it is not possible to estimate it. In the following we choose the historical approach of Merton consisting in picking a change of probability that does not affect the intensity and the jump sizes of the Poisson measure [18]. The equivalent probability measure is defined by where W l,Q M and W s,Q M are two Q M -Brownian motions. Merton chooses this probability considering that the risk associated to the jumps is diversifiable. As noticed in Tankov in [9, Section 10.1], using this strategy leaves one exposed to the risk of the jumps. It only corrects the average effect of jumps (provided that the jump component of the electricity price is independent of the other assets, which is the case here: we understand the electricity spikes caused by physical exogenous events; it can in particular be caused by the production capacity and the demand which are not assets (see the structural model of Aid et al. [1] for instance). Finally, the price of an option with payoff H (S T ) = H (f (T, T )) at time t is given by

4.3.
Application to Call option pricing in the French market. In the following, we focus on the French market and we work on the model of Section 4.2. We dispose of the hourly spot and daily forward prices in 2015 and 2016.
Parameters of f β . We use the parameters found in Table 2 to calibrate Z β to the spot prices. We model the size of the jumps by its empirical distribution, each jump being estimated with ∆ n In(q) X, knowing that a bias (mentioned in the end of section 2.2) is present.
Parameters of f c . We consider the following forward products in the French market: 1 to 4 Week-ahead, 1 to 3 Month-ahead, 1 to 4 Quarter-ahead and 1 and 2 Year-ahead products. As λ n / β n is small and β n is large, we can neglect the jump part on the forward prices and consider that the forward products have only a continuous part. We use the method of Féron and Daboussi [10] to calibrate the parameters of f c to the observed forward prices. We find for the different parameters α = 12.56 y −1 , σ s = 1.03 y − 1 2 , σ l = 0.25 y − 1 2 and ρ = −0.11.
Forward products. In Figure 3, we display a simulation of the spot price, the 1WAH and the 1MAH with and in absence of spikes. The parameters of the spike component are the one of Table  2 with C = 5. We observe that the difference between the trajectory of the forward products with and without spikes is very small but significant for the spot price.  . We give in Table 3 confidence intervals at level 95% for the option price with the different strikes 100, 200 and 300, computed using Monte Carlo method with 10000 simulations. We consider the case where there is no spikes and the cases with spikes using the different threshold of the form v n = Cσ∆ −0.01 n with C = 3, C = 4 and C = 5. Considering spikes leads to higher value for the strip options. Furthermore, options valued at zero have now non negligible values. We notice that the choice of the threshold have a low impact on the price of the option. Indeed, a higher C leads to less jumps, but the empirical distribution contains only the larger jumps which are the ones impacting the price for high strikes.  Table 3. Confidence intervals at level 95% for the price of strip options computed using Monte Carlo method with 10000 simulations for different strikes and different models.

Proofs
In the following proofs, we set the drift (µ t ) t≥0 vanishes identically. Generalizing to the nonnull drift case is done using the usual argument based on Girsanov theorem. Also, we assume for simplicity that (σ t ) t≥0 is a deterministic function, in order to simplify in particular the proofs for central limit theorems.

5.1.
Proof of Proposition 4. The proof follows the path of [3,Theorem 10.26,p.374] and is also close to Mancini [17] in spirit. We will denote by ξ ν a random variable distributed according to ν. Let A n = i ∈ {1, . . . , n}, i = i (n, q) ∀q ≥ 1 be the set of indices i such that the interval ((i − 1)∆ n , i∆ n ] contains no jump.

Proof of Proposition 4 under Assumption 3 (I).
We first need to show (14) P max i∈An |∆ n i X| √ ∆n > v n → 0, by Markov's inequality and [7, Equation (10)] on the expectation of the crossings of shot noise processes. This last term converges to 0 by Assumption 3 (ii) and (17) follows which completes the proof of (14).
We next turn to (16). The left hand side of (16) is equal to P (∪ n i=1 ∆ n i N ≥ 2) λ 2 n ∆ n which converges to 0 if λ n √ ∆ n → 0. With no loss of generality we may (and will) work on the set { max 1≤i≤n ∆ n i N ≤ 1}. In the interval ((i (n, q) − 1) ∆ n , i (n, q) ∆ n ], there is only one jump and we have ∆ n i(n,q) Z β = − 1 − e −βn∆n Z t i(n,q)−1,n + e −βn(i(n,q)∆n−Tq) ∆X Tq for all q ≥ 1, therefore It follows that P min i∈A c n |∆ n i X| √ ∆n ≤ v n is dominated by the sum of the three following terms: The term (18) equals and converges to 0 under the assumption λ n P |ξ ν | ≤ ∆ 1 2 −̟ n → 0. The term (19) is dominated by The left hand side of (21) is equal to (18) and converges to 0. According to [17,Corollary 3.3], for i ∈ {1, ..., n}, The right hand side of (21) is dominated by The term (20) is dominated by The left hand side of (21) is equal to (18) and converges to 0. The right hand side of (21) also converges to 0 with the same argument as for (17).

Proof of Proposition 4 under Assumption 3 (II).
Since therefore, we only need to prove the result on the set B n = { max 0≤i≤n−kn We need to show: We bound (23) above by the sum of P ∃i ∈ A n , |∆ n i X c | √ ∆ n > v n 2 and ∆ n i X∆ n i+1 X < 0 ∩ B n and P ∃i ∈ A n , |∆ n i Z β | √ ∆ n > v n 2 and ∆ n i X∆ n i+1 X < 0 ∩ B n . The first term is bounded by n /8σ 2 and converges to 0. For the second term, we consider two cases: a jump occurs before i∆ n or no such jump occurs, which leads us to further consider For (25) since we work on B n , we have i + 1 ∈ A n hence ∆ n i+1 Z β = −(1 − e −βn∆n )Z β ti,n = e −βn∆n ∆ n i Z β and (25) is dominated by the probability of the event and dominated by Concerning (26), we have, if k n ≤ i + 1, we have that |∆ n i Z β | is equal to The inequality remains true if i+1 < k n as ∆ n i Z β is equal to 0 in this case. Thus, (26) is dominated by using the same argument as for (17) and the convergence (23) follows.
We now turn to (24). It suffices to show that both terms P ∃i ∈ A c n , |∆ n i X| √ ∆ n ≥ v n ∩ B n , and P ∃i ∈ A c n , ∆ n i X∆ n i+1 X ≥ 0 ∩ B n converge to 0. The proof for the first term is similar to the one of (15), the only difference being ∆ n i(n,q) Z is equal to (1 − e −βn∆n )e −βn∆n(kn−2) Z t i(n,q)−kn +1,n + e −βn(t i(n,q),n −Tq) ∆X Tq if k n ≤ i + 1 and e −βn(t i(n,q),n −Tq) ∆X Tq otherwise and that the term P(v n √ ∆ n ≤ 3 max 1≤i≤n β n ∆ n |Z β ti,n |) needs to be replaced by P v n ∆ n ≤ 3 max 1≤i≤n β n ∆ n e −βn∆nkn |Z β ti,n | ≤ λ n P |ξ ν | > e βn∆n(kn−2) vn √ ∆n βn∆n → 0.
For the second term, it is sufficient to prove that ∆ n i(n,q) X has the same sign as ∆X Tq and that ∆ n i(n,q)+1 X has the opposite sign with probability one. We are thus led to show that We have that ∆ n i(n,q) X∆X Tq dominates We thus have that (27) is dominated by the probability of the event that converges to 0 if we use a similar proof than the one of (15). The proof of (28) is similar since no jump occurs in the interval (i (n, q) , i (n, q) + 1] and The term (28) is then dominated by the probability of the event The convergence to 0 is obtained in the same way, except for the extra control of the terms |x|e −βn(t−s) p (ds, dx) > e βn∆n(kn+1) v n ∆ n ≤ λ n P |ξ ν | > e βn∆nkn ∆ 1 2 −̟ n that both converge to 0. The proof of Proposition 4 is complete.

Proof of Theorem 6.
Preparation for the proof. In order to prove Theorem 6, we start by giving an oracle estimator of β n when the jump times and their sizes are known. with E n = q ∈ {1, .., N 1 }, i (n, q) + 1 ∈ A n and i (n, q) < i (n, q + 1) .
i) The following expansion holds on the set {N 1 > 0}: iv) If β n /λ 2 n → 0, the conditions of iii) and |x| 4 ⋆ ν < ∞ hold together, we finally obtain in distribution as n → ∞.
Proposition 9 is the core of Theorem 6.
Proof of Proposition 9.
Step 1). We first need three approximation results.
Lemma 10. We have q / ∈En sgn ∆X Tq ∆ n i(n,q) X λ 2 n ∆ n in probability.
Proof. We plan to use the decomposition The term I is centred and as |I| ≤ N1 q=1 |∆ n i(n,q) X c |1 {Tq+1−Tq<t i(n,q),n −Tq +∆n} , we have In turn I is of order λ n ∆ n hence negligible. For the term II, we have where ∆X j i denotes the jth jump in the interval ((i − 1) ∆ n , i∆ n ] that occurs at time T j i . First, we have that the term N1 q=1 |Z β t i(n,q)−1,n |1 {Tq+1−Tq<t i(n,q),n −Tq+∆n} is dominated by Because of the independence of ∆ n i N and ∆ n i+1 N conditional on F ti−1,n,n , we derive that its expectation is less than |Z β t i(n,q)−1,n |1 {Tq+1−Tq<t i(n,q),n −Tq+∆n} λ 3 n ∆ n β n in probability. In the same way, it is not difficult to see that is of order λ 2 n ∆ n . The result of the lemma follows.
Proof. We plan to use the decomposition N1 q=1 sgn(∆X Tq )∆ i(n,q) X = I + II, with sgn ∆X Tq ∆ n i(n,q) X c , and II = N1 q=1 sgn ∆X Tq ∆ n i(n,q) Z β .
With the notation of Lemma 10, we write and in the same way as for the proof of Lemma 10, using the independence between X c and N , it is not difficult to see that I is centred with variance of order λ n ∆ n . For the second term, we write sgn(∆X Tq )Z β t i(n,q)−1,n .
Proof. If sup n λ n < ∞, then Otherwise, using Step 2). We are ready to prove Proposition 9. Define the oracle slope Using the canonical decomposition X = X c + Z β and the fact that for i ∈ A n , we have We study the convergence of each term separately.
Step 3). The term I. From the proof of Lemma 10, we readily have q / ∈En sgn(∆X Tq )∆ n i(n,q)+1 X c λ n ∆ n in probability, so we shall replace the sum in q over E n by the sum in q over {1, . . . , N 1 } in the following. By Lemma 12, we derive (30) and where both R (1) n and R (2) n are bounded in probability. Next, denoting by ∆X j i the j-th jump in [(i − 1)∆ n , i∆ n ], we have where R n is bounded in probability, using the same argument as in Lemma 11. It is not difficult to see that ∆n σ 2 s ds up to an error of order (λ n ∆ n ) 2 . Therefore, when sup n λ n < ∞, we have that IV is of order √ ∆ n and of order ∆ n /λ n otherwise, using that 1 0 R |x|e −βn((⌊t∆ −1 n ⌋+1)∆n−t) p (dt, dx) is equivalent to λ n ∆ n |x| ⋆ ν 1−e −βn ∆n βn∆n . We thus obtain the decomposition is bounded in probability and V (n) 3 is defined in the statement of Proposition 9. We investigate further the convergence of J (n) 3 . Define Clearly, n i=2 sgn(∆X 1 i−1 )∆ n i X c 1 {∆ n i−1 N =1} is centred and we claim that for every η > 0: as n → ∞. Indeed, applying successively Cauchy-Schwarz's, Markov's and Burckolder-Davis-Gundy's inequality, we obtain n say. Summing up and taking expectation, it follows that by Jensen's inequality. Since V 2 n has a Binomial distribution with parameters (n−1, λ n ∆ n e −λn∆n ), we have that V 2 n → ∞ in probability since λ n → ∞ and is bounded below on {N 1 ≥ 1} which has probability that converges to one, the Lindeberg condition (32) follows by dominated convergence and we further infer in distribution as n → ∞. Observing that in view of (31).
Step 4). The term II. We write, using the proof of Lemma 10 and Lemma 12, n and R (2) n are bounded in probability. By Step 2), we know that N1 q=1 sgn(∆X Tq )∆ n i(n,q) X c where U (n) is bounded in probability and asymptotically normal if λ n → ∞ and V (n) 4 is defined in the statement of Proposition 9. Finally, we have proved is bounded in probability and asymptotically normal if λ n → ∞.
Step 5). The term III. By the proof of Lemma 10 and Lemma 12, we have n and R (2) n are bounded in probability. Indeed, in the same way as for Lemma 10, we have in probability, as follows from (29) and the computations of Lemma 10. When exactly one jump occurs in (i (n, q) − 1, i (n, q)], we have Z β t i(n,q)−1,n = e −βn(t i(n,q)−1,n −Tq) Z β T q − and where the remainder term R n is bounded in probability and accounts for the case where more than one jump occurs in the intervals (i (n, q) − 1, i (n, q)]. By Fubini's theorem, the main term splits into M  By standard yet tedious computations, evaluating centred terms and their variances in terms of asymptotics in n, we can infer from the previous representation that where J (2) n is bounded in probability and g We eventually obtain the decomposition III = β n ∆ n M n + β n ∆ n e −βn∆n V (1) n J (1) n + β n ∆ n e −βn∆n V (2) n J ( n defined in the statement of Proposition 9. Step 5'). Using classical results of [21,Theorem 3], one can further investigate the asymptotic distribution of (J (1) n , J n ) under the additional assumption |x| 3 ⋆ ν < ∞ assuring the existence of 1 0 R g 3 n (t, x) ν (dx) dt and the condition (sgn (x) ⋆ ν) 2 (|x| 2 ⋆ ν) + (x ⋆ ν) 2 − 2(sgn (x) ⋆ ν)(|x| 2 ⋆ ν) = 0, otherwise the term V n becomes negligible. We then have that J (1) n is asymptotically normal, and under the stronger assumption |x| 4 ⋆ ν < ∞, we even have the convergence of (J (1) n , J n ) towards a standard two-dimensional Gaussian distribution. We omit the details.
Step 6). Combining Step 2) and the results of Steps 3), 4) and 5), we obtain s oracle n = 1 − e −βn∆n + β n ∆ n M n + β n ∆ n e −βn∆n V T n J n with the notation introduced in the statement of Proposition 9. Now, let β oracle n = − 1 ∆n log(max{1− s oracle n , ∆ n }). For n large enough, a first-order Taylor's expansion entails β oracle n = β n + β n e βn∆n M n + e −βn∆n V T n J n where the random vector J n has the same asymptotic properties as J n . Setting M oracle n = e βn∆n M n , and checking that all the terms have the right order, we obtained the desired result and Proposition 9 is proved.
Completion of proof of Theorem 6. We start by a simple approximation result.
Thus, P min 1≤q≤N1 ∆ n i(n,q) X ∆XT q ≤ 0 ≤ P max i∈A c n |∆ n i X c | + β∆ n max 1≤i≤n |Z β ti,n | ≥ min 1≤i≤N1 |∆X Ti | and this term converges to 0 in the same way as the terms (19) and (20). The convergence (33) follows and Lemma 13 is proved. The case where Assumption 3 (II) is fulfilled corresponds to showing the convergence (27) and the proof follows likewise.
We are ready to prove Theorem 6.

say.
Step 2). We quickly study each term separately. The term I further splits into I = III + IV , with With the same kind of arguments as developed in the proof of Proposition 9, it is not difficult to see that |IV | ∆ 2 n λ 2 n in probability. For the term III, we use the same kind of arguments and obtain III =M n + β n ∆ n λ n ∆ n + ∆ n β −1 n R (2) n , withM n = 2∆ n λ n sgn(x) ⋆ ν (x ⋆ ν) 1 2 − 1−e −βn ∆n 2βn∆n + 1+e −βn ∆n 2βn − 1−e −βn ∆n β 2 n ∆n (|x| ⋆ ν) 1−e −βn ∆n βn∆n and R (2) n is bounded in probability. We also have |II| ∆ 3/2 n in probability. We omit the details.
Step 3). Finally we obtain the decomposition s n = s oracle n +M n + β n ∆ n λ n ∆ n + ∆ n β −1 n R (2) n and we conclude by applying Proposition 9, replacing s n by s oracle n and studying the order of each term carefully.