Fractal And Multifractal Models Of Volatility Economics Essay

Published: November 21, 2015 Words: 6633

Many patterns and processes have proven to be efficiently described by fractals. In economics, the concept of fractals has been successfully applied to exchange rate fluctuations (Richards 2000), stock markets (Turiel and Pérez-Vicente 2003), Dow Jones fluctuations (Andreadis and Sertelis 2002), stock exchange prices (Ausloos and Ivanova 2002), etc.

However, many patterns and processes are now widely acknowledged to be highly intermittent, that is, characterized by a few hotspots dispersed over a wide range of low-density areas (as in the distribution of microscale fluctuations of a turbulent price index, for example). In particular, there are clear differences between these intermittent patterns and standard fractal processes such as fractional Brownian motion or fractional Gaussian noise, which raises the question of the validity of describing such processes in terms of fractals and scaling behavior.

The variety of fractal and multifractal formalisms, described in Chapters 2 and 3, is directly derived from the theory of nonlinear complex systems and fully developed turbulence. As a direct (and unfortunate) consequence, they are far from comprehensible (and thus usable), at least for economists without a solid mathematical and statistical background.

Recent multifractal studies include characterization of river flows and networks (De Bartolo et al. 2006; Koscielny-Bunde et al. 2006; Livina et al. 2007), asteroid belts (Campo Bagatin et al. 2002), seismicity and volcanic activity (Telesca et al. 2002; Dellino and Liotino 2002), ocean circulation (Chu 2004; Isern-Fontanet et al. 2007), diffusion limited reactions (Chaudhari et al. 2002), pore and particle distributions in soil (Martín and Montero 2002; Bird et al. 2006; Chun et al. 2008), Internet traffic (Masugi and Takuma 2007), rainfall (Labat et al. 2002; Lovejoy and Schertzer 2006), exchange currency markets (Alvarez-Ramirez 2002), stock markets (Turiel and Pérez-Vicente 2003), Dow Jones fluctuations (Andreadis and Sertelis 2002), stock exchange prices (Ausloos and Ivanova 2002), surface properties (Stach et al. 2001; Moktadir et al. 2008), DNA sequences (Tiňo 2002) and vascular branching (Zamir 2001; Grasman et al. 2003).

4.1 What Is Intermittency?

It is not easy to give a succinct definition of intermittency. Different authors use different names for this phenomenon, which can lead to terminological ambiguity. Following Feder (1988), one may distinguish a measure (of probability, or of some physical quantity such as mass, energy, or a number of individuals) from its geometric support, which might or might not have a fractal geometry. Then, if a measure has different fractal dimensions on different parts of the support, the measure is a multifractal.

To make this point clearer, consider a modern city viewed from directly above, from a plane. From this point of view, one may see the city as a set of black and white objects: buildings in black, streets and parks in white (Figure 4.1).

Figure 4.1

The only information one can get is the distribution of the built and the unbuilt areas. This is the so-called geometric support of the city. Now, if one changes the angle of vision by taking a position still in the air but not directly above the city, the view is from the side (Figure 4.1). The black and white city is now a set of buildings of different heights. This is the so-called measure we are interested in. It could also have been the color, the width, or the age of the buildings. It is now possible to estimate the distribution of a wide range of building heights. Each height will (eventually) be characterized by a fractal dimension, thus the concept of multifractal.

The concept of multifractality can implicitly be found in the formulation of self-organized criticality and the related cumulative frequency distributions. It is nevertheless specifically the study of intermittency in the framework of dynamical systems (Grassberger 1983; Grassberger and Procaccia 1983; Hentschel and Procaccia 1983; Halsey et al. 1986) and fully developed turbulence (Schertzer and Lovejoy 1983; Parisi and Frisch 1985) that led to the introduction of multifractality.

The concept of intermittency finds its origin in the early measurements of turbulent velocity fluctuations of Batchelor and Townsend (1949), who recognized that "as the wave number is increased the fluctuations seem to tend to an approximate on-off, or intermittent, variation." Two decades later, Stewart (1969) was more specific and identified that "the non-Gaussian, intermittent character of the small scale structure becomes more marked as the Reynolds number increases." He also acknowledged that while intermittency "seems to be fundamental to the nature of the turbulence cascade…we do not have a fully satisfactory theoretical explanation" (Stewart 1969). This limitation still stands today. Until very recently (Baumert et al. 2005), intermittency has seldom been referred to, or defined precisely, even in monographs devoted to turbulent processes (Tennekes and Lumley 1972; Pond and Pickard 1983; Summerhayes and Thorpe 1996; Bohr 1998; Kantha and Clayson 2000; Pope 2000). Surprisingly, in a 74-page chapter devoted to intermittency, Frisch (1996) only states that a process is intermittent when it "displays activity during only a fraction of the time, which decreases with the scale under consideration." Intermittency has similarly been described as "the active turbulent regions do not fill the whole volume, but only a subvolume in a very irregular way" (Jiménez 1997) and "active regions occupy tiny fractions of the space available" (Seuront et al. 1999).

The definition of intermittency varies greatly from author to author and from field to field, leading to a largely scattered and nonunified framework. For instance, in rainfall and river flow studies, intermittency refers to the episodic nature of the underlying process, often considered on an "on-off" basis, especially in arid environments (Chesson et al. 2004). A similar use of the term intermittency can also be found in energy resources (Asmus 2003; Anderson and Leach 2004). In the field of nonlinear dynamical systems, intermittency has been related to several types of transitions to chaos and classified as type I, II, or III intermittency when the system under consideration is in proximity to a saddle-node, Hopf, or reverse period-doubling bifurcation (Pomeau and Manneville 1980). By analogy to the bifurcation diagrams detailed in Section 6.1.1 (Figure 6.3 and Figure 6.4), in these three types of intermittency the temporal evolution of a system can be divided into ranges of time in which its behavior is almost periodic (that is, laminar phases) and ranges in which it exhibits chaotic bursts.

Chaos-chaos intermittency is due to crisis phenomena occurring in the system (Ott 1993, 2002) and the on-off intermittency is due to a symmetry breaking bifurcation (Pikovsky 1984; Platt et al. 1993). Practically, the identification of the type of the intermittency observed may yield important information about a system by defining the bifurcations possible for its dynamics (see, for example, Zebrowski and Baranowski 2004; Alvarez-Llamoza et al. 2008).

The phenomenon of intermittency has widely been conflated with its statistical consequences, and is thus generally poorly defined even in specialized monographs. The literature, hence, recurrently refers to intermittency through statements such as "the kurtosis is a useful measure of intermittency for signals having a bursty aspect" (Frisch 1996), "the signals tended to become bursty when the order of differentiation is increased" (Frisch 1996), "most of the time the gradients would still be of the order of magnitude of their standard deviation, but occasionally we should expect stronger bursts, more often than in the Gaussian case" (Jiménez 1997), "the discrepancies between the Kolmogorov predictions and the experimental values of the high-order moments" (Pope 2000), and "we occasionally should expect stronger bursts than expected in a non-intermittent, homogeneous turbulence, which accentuate the skewness of a given probability distribution, causing it to deviate from Gaussianity" (Seuront et al. 2001).

The production of turbulence is not a continuous process but usually has an intermittent character, and the turbulence appears as bursts (Svendsen 1997). This intermittency has been acknowledged as "a common phenomenon in many complex systems and a natural consequence of cascades" (Jiménez 2000). Intermittency has also been related to the coherent nature of turbulence and the presence of strong vortices, with diameters on the order of 10 times the Kolmogorov length scale lk = (ν³/ε)^(1/4), where ν is the kinematic viscosity (m² s⁻¹) and ε the turbulent kinetic energy dissipation rate (m² s⁻³) (Siggia 1981; Jiménez et al. 1993; Jiménez and Wray 1994).

The term intermittency has alternatively been coined to describe "the phenomena connected with the local variability of the dissipation" (Jiménez 1998) as well as "instantaneous gradients of scalars such as temperature, salinity or nutrients, greatest at scales similar to the Kolmogorov microscale" (Gargett 1997).

Pope (2000), and more recently Jiménez (2006) in the Encyclopedia of Mathematical Physics, distinguished external from internal intermittencies. External intermittency refers to the coexistence of turbulent and laminar regions in inhomogeneous turbulent flows, such as in boundary layers or in free-shear layers. The interface between laminar irrotational flow and turbulent vortical fluid is typically sharp and corrugated (Jiménez 2006). As a consequence, an observer sitting near the edge of the layer is immersed in turbulent fluid only part of the time and hence experiences an intermittently turbulent flow. In this context, an intermittent flow is characterized by a fluid motion that is "sometimes laminar and sometimes turbulent" (Pope 2000). For the engineering community in fluid mechanics, intermittency is also viewed as a transition between laminar and turbulent flows. Specifically, Wilcox (1998) considers that "approaching the free stream from within the boundary layer, the flow is not always turbulent. Rather, it is sometimes laminar and sometimes turbulent, that is, it is intermittent." Internal intermittency (Pope 2000; Jiménez 2006) is specifically related to the increasingly non-Gaussian properties of velocity fluctuations as spatial separation decreases. This property is responsible for the long ("Black Swan") tails of the probability distributions of the velocity derivatives.

A more intuitive definition that can be applied directly in economics states that "this form of variability reflects heterogeneous distributions with a few dense patches and a wide range of low density patches" (Seuront et al. 2001). Most of the previously published work referred to intermittency in the framework of turbulent flows, including wave turbulence (Biven et al. 2001; Newell et al. 2001; Bouruet-Aubertot et al. 2004), plasma and solar wind turbulence (Sorriso-Valvo et al. 2001; Hidalgo et al. 2006), and economic turbulence. However, a general consensus can be reached considering that a given pattern or process is intermittent in space or in time if: (1) it is characterized by sharp local fluctuations, (2) it is responsible for a skewed probability distribution, and (3) it has a long-term memory signature, perceptible from the power-law form of its autocorrelation function.

4.2 Econometric Models of Volatility (ARCH and GARCH Models)

There has been considerable volatility (and uncertainty) in the past few years in mature and emerging financial markets worldwide. Most investors and financial analysts are concerned about the uncertainty of the returns on their investment assets, caused by the variability in speculative market prices (and market risk) and the instability of business performance (Alexander, 1999). Recent developments in financial econometrics require the use of quantitative models that are able to explain the attitude of investors not only towards expected returns and risks, but towards volatility as well. Hence, market participants should be aware of the need to manage risks associated with volatility. This requires models that are capable of dealing with the volatility of the market (and the series). Due to unexpected events, uncertainties in prices (and returns) and the non-constant variance in the financial markets, financial analysts started to model and explain the behaviour of stock market returns and volatility using time series econometric models.

One of the most prominent tools for capturing such changing variance was the Autoregressive Conditional Heteroskedasticity (ARCH) and Generalized ARCH (GARCH) models developed by Engle (1982), and extended by Bollerslev (1986) and Nelson (1991). Two important characteristics of financial time series, fat tails and volatility clustering (or volatility pooling), can be captured by the GARCH family of models. A series with some periods of low volatility and some periods of high volatility is said to exhibit volatility clustering.

Volatility clustering can be thought of as clustering of the variance of the error term over time: if the regression error has a small variance in one period, its variance tends to be small in the next period, too. In other words, volatility clustering implies that the error exhibits time-varying conditional heteroskedasticity (conditional variances are not constant).

In this section, we capture financial time series characteristics by employing the GARCH(p,q) model and its EGARCH, Threshold GARCH (TGARCH), Asymmetric Component GARCH (AGARCH), Component GARCH (CGARCH) and Power GARCH (PGARCH) extensions. These models have the advantage of permitting investigation of the potentially asymmetric nature of the response to past shocks. In financial markets, the fluctuation of prices (or returns) goes under the name of volatility: how much prices (or returns) change over a given period. Linear models are unable to explain a number of important features common to much financial data, including leptokurtosis, volatility clustering, long memory, the volatility smile and leverage effects. This is because the assumption of homoskedasticity (constant variance) is not appropriate for financial data, and in such instances it is preferable to use models that allow the variance to depend upon its history. Therefore, to model the nonconstant volatility parameter, we consider GARCH-type models. Bollerslev (1986) proposed the GARCH(p,q) process, which can represent a greater degree of inertia in its conditional volatility or risk. Following the literature (Akgiray, 1989; Connolly, 1989; Baillie and DeGennaro, 1990; Bera and Higgins, 1993; Bollerslev et al., 1992; Floros, 2007, among others), a simple GARCH model is parsimonious and gives significant results. GARCH allows the conditional variance of a stock index to depend upon its own previous lags. The GARCH(p,q) model is given by:

Rt = μ + εt,  εt ~ N(0, σ²t)

σ²t = ω + Σ(i=1..q) αi ε²t−i + Σ(j=1..p) βj σ²t−j

where p is the order of the GARCH terms and q the order of the ARCH terms. The error, εt, is assumed to be normally distributed with zero mean and conditional variance σ²t, and Rt are returns, so we expect their mean value (μ) to be positive and small. We also expect the value of the constant ω to be small. All parameters in the variance equation must be positive, and α + β is expected to be less than, but close to, unity, with β > α. News about volatility from the previous period can be measured as the lag of the squared residual from the mean equation (the ARCH term). Also, the estimate of β shows the persistence of volatility to a shock or, alternatively, the impact of old news on volatility.
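As a quick numerical illustration (not part of the estimation literature cited above), the GARCH(1,1) recursion can be simulated to reproduce volatility clustering; all parameter values below are illustrative and chosen so that α + β < 1:

```python
import numpy as np

def simulate_garch11(n, mu=0.0, omega=0.1, alpha=0.1, beta=0.85, seed=0):
    """Simulate R_t = mu + eps_t with GARCH(1,1) conditional variance:
    sigma2_t = omega + alpha*eps2_{t-1} + beta*sigma2_{t-1}.
    Parameter values are illustrative; alpha + beta < 1 ensures stationarity."""
    rng = np.random.default_rng(seed)
    sigma2 = np.empty(n)
    eps = np.empty(n)
    sigma2[0] = omega / (1.0 - alpha - beta)      # unconditional variance
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return mu + eps, sigma2

returns, sigma2 = simulate_garch11(5000)
# Volatility clustering: squared returns are positively autocorrelated
r2 = returns ** 2
acf1 = np.corrcoef(r2[:-1], r2[1:])[0, 1]
print(round(acf1, 3))
```

The positive first-order autocorrelation of squared returns is the signature of volatility clustering that a constant-variance linear model cannot produce.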

Financial theory suggests that an increase in variance results in a higher expected return. To account for this, GARCH-in-Mean (GARCH-M) models are also considered; see Kim and Kon (1994). The standard GARCH-M model is given by:

Rt = β1 + β2 σ²t + εt

where σ²t follows the GARCH variance equation. If β2 is positive (and significant), then increased risk leads to a rise in the mean return; β2σ²t can be interpreted as a risk premium.

Exponential GARCH (EGARCH) models were designed to capture the leverage effect noted in Black (1976) and French et al. (1987). A simple variance specification of EGARCH is given by:

log(σ²t) = ω + β log(σ²t−1) + α |εt−1/σt−1| + γ (εt−1/σt−1)

The logarithmic form of the conditional variance implies that the leverage effect is exponential (so the variance is always non-negative). The presence of leverage effects can be tested by the hypothesis that γ < 0. If γ ≠ 0, then the impact is asymmetric.

Furthermore, the Threshold GARCH (TGARCH) model was introduced by Zakoian (1994) and Glosten, Jagannathan and Runkle (1993). The TGARCH specification for the conditional variance is given by:

σ²t = ω + α ε²t−1 + γ ε²t−1 dt−1 + β σ²t−1

where dt = 1 if εt < 0 and dt = 0 otherwise.

In this model, good news (εt > 0) and bad news (εt < 0) have differential effects on the conditional variance. Good news has an impact of α, while bad news has an impact of α + γ. If γ > 0 then the leverage effect exists and bad news increases volatility, while if γ ≠ 0 the news impact is asymmetric.
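The asymmetry can be made concrete with a one-step sketch of the TGARCH variance recursion; the parameter values and shock sizes below are illustrative:

```python
def tgarch_variance(eps_prev, sigma2_prev, omega=0.05, alpha=0.08, gamma=0.10, beta=0.85):
    """One step of the TGARCH conditional variance:
    sigma2_t = omega + alpha*eps2_{t-1} + gamma*eps2_{t-1}*d_{t-1} + beta*sigma2_{t-1},
    where d_{t-1} = 1 if eps_{t-1} < 0 (bad news) and 0 otherwise.
    Parameter values are illustrative."""
    d = 1.0 if eps_prev < 0 else 0.0
    return omega + (alpha + gamma * d) * eps_prev ** 2 + beta * sigma2_prev

# Equal-sized good and bad news: bad news raises next-period variance more
good = tgarch_variance(+1.0, 1.0)   # news impact alpha
bad = tgarch_variance(-1.0, 1.0)    # news impact alpha + gamma
print(good, bad)                    # prints 0.98 1.08
```

With γ > 0, a negative shock of the same magnitude produces a strictly larger conditional variance, which is exactly the leverage effect described above.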

An alternative specification for the conditional volatility process is the Component GARCH (CGARCH) model. The conditional variance in the CGARCH(1,1) model is given by:

σ²t − qt = α (ε²t−1 − qt−1) + β (σ²t−1 − qt−1)   (5.2)

qt = ω + ρ (qt−1 − ω) + φ (ε²t−1 − σ²t−1)   (5.3)

The component model shows mean reversion to ω (constant over time), while it allows mean reversion to a varying level qt; see (5.2) and (5.3). In equations (5.2) and (5.3), σt is the volatility and qt the time-varying long-run volatility. Equation (5.2) describes the transitory component, σ²t − qt, while equation (5.3) describes the long-run component qt.

An extension of the CGARCH model is the Asymmetric Component GARCH (AGARCH) model, which combines the component model with the asymmetric TGARCH model. This specification introduces asymmetric effects in the transitory equation:

σ²t − qt = α (ε²t−1 − qt−1) + β1 (ε²t−1 − qt−1) dt−1 + β (σ²t−1 − qt−1)

where d is the dummy variable indicating negative shocks (optional exogenous variables may also enter the variance equations). β1 > 0 implies transitory leverage effects in the conditional variance.

Finally, Taylor (1986) and Schwert (1989) introduced the standard deviation GARCH model, where the standard deviation is modeled rather than the variance. This model is generalized in Ding et al. (1993) with the Power ARCH specification. In the Power GARCH (PGARCH) model, the power parameter β1 of the standard deviation can be estimated rather than imposed, and optional γ parameters are added to capture asymmetry of up to order r:

σt^β1 = ω + Σ(i=1..q) αi (|εt−i| − γi εt−i)^β1 + Σ(j=1..p) θj σ^β1 t−j

where β1 > 0, |γi| ≤ 1 for i = 1,...,r, γi = 0 for all i > r, and r ≤ q. The symmetric model sets γi = 0 for all i. Note that if β1 = 2 and γi = 0 for all i, the PGARCH model is simply a standard GARCH specification. As in the previous models, the asymmetric effects are present if γi ≠ 0.
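The reduction to standard GARCH can be checked with a minimal sketch of the PGARCH(1,1) recursion; the parameter values are illustrative:

```python
def pgarch_sigma(eps_prev, sigma_prev, omega, alpha, beta, power=2.0, gamma=0.0):
    """One step of a PGARCH(1,1) recursion on sigma^power:
    sigma_t^power = omega + alpha*(|eps_{t-1}| - gamma*eps_{t-1})**power
                    + beta*sigma_{t-1}**power.
    Illustrative sketch; with power = 2 and gamma = 0 this is
    exactly the standard GARCH(1,1) variance recursion."""
    s_pow = (omega
             + alpha * (abs(eps_prev) - gamma * eps_prev) ** power
             + beta * sigma_prev ** power)
    return s_pow ** (1.0 / power)

# With power=2 and gamma=0 the recursion matches GARCH(1,1) on the variance
sig = pgarch_sigma(0.5, 1.0, omega=0.05, alpha=0.1, beta=0.85, power=2.0, gamma=0.0)
garch_var = 0.05 + 0.1 * 0.5 ** 2 + 0.85 * 1.0 ** 2
print(sig ** 2, garch_var)
```

Setting the power to 1 instead recovers a standard-deviation (Taylor/Schwert-type) recursion, which is the other special case mentioned above.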

4.3 Cascade Models of Volatility

This analysis technique is devoted to the direct study of the multifractal properties of the fluctuations of any scalar field S, and is based on the qth-order structure functions:

〈(ΔSt)q〉 = 〈|S(x + t) − S(x)|q〉

where for a given time lag t the fluctuations of the scalar S are averaged over all available values ("〈.〉" indicates statistical averaging). For scaling processes, one way to statistically characterize intermittency is based on the study of the scale-invariant structure function exponent ζ(q), defined by the following:

〈(ΔSt)q〉 ≈ (t/T)ζ(q)

where T is the largest period (external scale) of the scaling regime. The scaling exponent ζ(q) is estimated by the slope of the linear trend of 〈(ΔSt)q〉 vs. t in a log-log plot. The first moment ζ(1), characterizing the scaling of the average absolute fluctuations, corresponds to the Hurst exponent H = ζ(1), which characterizes the degree of nonconservation of a given field. The second moment is linked to the power spectrum exponent β as:

β = 1 + ζ(2)

For simple (monofractal) processes, the scaling exponent of the structure function ζ(q) is linear; that is, ζ(q) = qH. In particular, ζ(q) = q/2 for Brownian motion, and ζ(q) = q/3 for nonintermittent turbulence. For multifractal processes, this exponent is nonlinear and concave, and relates to the moment scaling function K(q) as:

ζ(q) = qH − K(q)

K(q) is then an intermittency correction that expresses the deviation of the function ζ(q) from linearity due to intermittency.
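A minimal numerical sketch of the method: estimate ζ(q) from a simulated Brownian motion, for which theory predicts the linear (monofractal) spectrum ζ(q) = q/2. The lag range and sample size are illustrative choices:

```python
import numpy as np

def zeta_exponents(signal, qs, lags):
    """Estimate structure-function exponents zeta(q) from
    <|S(x+t) - S(x)|^q> ~ t^zeta(q), via OLS slope in log-log coordinates."""
    log_lags = np.log(lags)
    zetas = []
    for q in qs:
        log_sf = [np.log(np.mean(np.abs(signal[lag:] - signal[:-lag]) ** q))
                  for lag in lags]
        slope = np.polyfit(log_lags, log_sf, 1)[0]   # slope = zeta(q)
        zetas.append(slope)
    return np.array(zetas)

rng = np.random.default_rng(1)
bm = np.cumsum(rng.standard_normal(200_000))   # Brownian motion, H = 1/2
qs = np.array([1.0, 2.0, 3.0])
z = zeta_exponents(bm, qs, lags=np.arange(1, 33))
# Theory predicts zeta(q) = q/2 here; a concave departure from this
# straight line on empirical data would signal multifractality.
print(np.round(z, 2))
```

On an intermittent series (e.g., absolute returns), the same estimator would produce a concave ζ(q), and K(q) = qH − ζ(q) quantifies the intermittency correction.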

Since the first attempt to provide a quantitative description of the Richardson cascade was made by Yaglom (1966) and Gurvich and Yaglom (1967), a range of discrete and continuous cascade models have been introduced to describe intermittent fluxes (see Seuront et al. [2005] for an exhaustive review).

A first family of such models is composed of discrete cascade models, for which the scale ratio between a parent structure and its daughter structures is a discrete integer. Due to their discrete nature, these models are not realistic but have been introduced for their simplicity and ability to reproduce experimental intermittency. These models include the lognormal model, the monofractal β-model, the α-model, the p-model, the random β-model, and the B-model. Detailed reviews of these models may be found in Paladin and Vulpiani (1987), Meneveau and Sreenivasan (1991), Frisch (1996), and Seuront et al. (2005). In addition, the limitations of these models and their limited ability to fit experimental data, especially for the higher orders of moment q, are detailed in Frisch (1996).

The continuous log-infinitely divisible (log-ID) stochastic models represent a more realistic family of cascade models. Specifically, infinite divisibility specifies that any random variable belonging to this law may be written as a sum of an arbitrarily large number of independent random variables, each having the same law (independent identically distributed) (see, for example, Feller, 1971).

This property intrinsically limits the number of probability laws; the most known ID laws are the Gaussian, Lévy stable, Poisson, and Gamma.

4.3.1 Discrete Cascade Models

4.3.2 Continuous Cascade Models

Since one can simply build a multifractal model for return fluctuations from a multifractal trading time, let us first focus on explicit constructions of nondecreasing multifractal processes. Such processes are called multifractal measures. The main paradigm of multifractal measures is the multiplicative cascade, originally introduced by the Russian school for modelling the energy cascade in fully developed turbulence. After the early works of Mandelbrot [16-18], many mathematical studies have been devoted to random cascades [19-23]. Very recently, continuous versions of these processes have been defined: they share most of the original properties; however, they display continuous scaling and possess stationary increments [24-28], whereas the original multifractal cascades only display discrete scaling and do not possess stationary increments.

In general, we will denote by T the time scale above which the process ceases to be multifractal. This scale will be called the integral scale.

The large integral scale T, below which the multiscaling (3) holds, can be defined as the scale where the cascading process "starts". The simplest multifractal cascade can be constructed as follows: one starts with an interval of length T where the measure is uniformly spread (meaning that the density is constant) and splits this interval in two equal parts. On each part, the density is multiplied by (positive) i.i.d. random factors W. Each of the two subintervals is again cut in two equal parts and the process is repeated infinitely. At construction step n, if one addresses a dyadic interval of length T 2−n by a kneading sequence k1, . . . , kn, with ki = 0, 1, the measure of this interval (denoted Ik1,...,kn) is simply:

θ(Ik1,...,kn) = 2−n Wk1 Wk1k2 · · · Wk1...kn

where all the Wk = eδωk are i.i.d. such that E[W] = 1. Since Peyrière and Kahane [19], it is well known that the previous construction converges almost surely towards a stochastic measure θ∞ provided E[W ln W] < 1. The multifractality of θ∞ (hereafter simply denoted θ) directly results from its recursive construction. Indeed, from the previous definition, it is easy to show that θ is self-similar in the generalized sense of Eq. (9).

If In = [tn, tn + l] is a short notation for dyadic intervals of size l = T 2−n, then the order-q moments of δlθ(tn) = θ[In] behave as a power law:

E[(δlθ(tn))q] = Cq (l/T)ζ(q)

Comparison of Eqs. (16) and (4), with ln = T 2−n, directly yields the expression of the spectrum ζ(q) in terms of the cumulant generating function of δω = ln W:

ζ(q) = q − log2 E[Wq]
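The recursive construction described above is straightforward to sketch numerically. Here the weights W = exp(δω) are lognormal with E[W] = 1, an illustrative choice of law; T = 1 and the number of cascade steps are arbitrary:

```python
import numpy as np

def lognormal_cascade(n_levels, lam2=0.05, seed=2):
    """Discrete multiplicative cascade on [0, 1] (T = 1).

    Start from a uniform density, repeatedly split each interval in two,
    and multiply each half's density by an i.i.d. lognormal weight
    W = exp(dw), dw ~ N(-lam2/2, lam2), so that E[W] = 1.  Returns the
    measure of the 2**n_levels finest dyadic intervals.  lam2 is an
    illustrative intermittency parameter."""
    rng = np.random.default_rng(seed)
    density = np.ones(1)
    for _ in range(n_levels):
        W = np.exp(rng.normal(-lam2 / 2.0, np.sqrt(lam2), size=2 * density.size))
        density = np.repeat(density, 2) * W       # split and reweight
    # measure of each interval = density * interval length (2**-n_levels)
    return density / density.size

theta = lognormal_cascade(12)
print(theta.size, round(float(theta.sum()), 3))
```

Because each density value is a product of unit-mean weights, the total mass fluctuates around 1, while the measure itself concentrates on a few "hot" dyadic intervals, the intermittent pattern discussed in Section 4.1.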

The previous cascade construction, though simple, does not provide a satisfying solution for modelling volatility fluctuations. Indeed, it is built on a fixed time interval [0, T], and it is neither causal nor stationary. Moreover, it involves an arbitrary fixed scale ratio s = 2. Very recently, several constructions have been proposed to generalize Mandelbrot cascades to stationary, causal, continuous cascades. The idea on which such generalizations rely is illustrated in fig. 4.2.

One starts with the discrete construction (fig. 4.2(a)): it can be conveniently represented in some 2D half-plane (t, s), where the parameter s can be identified as a time scale. If one associates, at construction step n, with each dyadic interval In of size T 2−n, located at tn = kT 2−n, a point (tn, sn = T 2−n), one constructs a dyadic tree as illustrated in fig. 4.2(a). The measure θ(t) at some time t is roughly obtained as the product of the weights Wi associated with each point inside a cone-like domain C(t) represented in the figure:

θ(dt) ∼ Π(ti,si)∈C(t) Wi   (18)


Figure 4.2

The nonstationarity of this construction appears immediately as being associated with the fixed dyadic grid corresponding to successive refinements of the interval [0, T]. A natural way to obtain a stationary model is to replace, at each scale sn = T 2−n, the periodic grid by points located randomly according to a Poisson process with a rate rn that is precisely rn = sn−1. We will refer to this type of construction as the "semi-continuous" Poisson construction. It is illustrated in fig. 4.2(b) and corresponds roughly to the "Poisson Multifractal Model" proposed by Calvet and Fisher. One can however go a step further and use a "fully continuous" Poisson construction: instead of keeping the scales sn at the exact values T 2−n, one can draw the whole grid randomly over the plane (t, s) using a non-homogeneous Poisson process with rate r(s) = s−1. Then one associates with each Poisson point (ti, si) an independent weight Wi and builds the measure θ according to (18); one then obtains exactly the "Multifractal Product of Cylindrical Pulses" (MPCP) introduced by Barral and Mandelbrot [25]. Finally, there is a last possible extension if one considers limits of products of MPCP: this amounts to replacing the compound Poisson density by some arbitrary infinitely divisible random density in the plane (t, s). In that case, the discrete sum over Poisson points in (18) is replaced by a stochastic integral over the cone-like domain C(t).

This construction, which involves the concept of an "independently scattered random measure" dω(t, s), has been proposed by Bacry and Muzy [26, 27] and allows one to build stationary random cascades with continuous scale-invariance properties and with a multifractal spectrum ζ(q) that can be associated with an arbitrary infinitely divisible law. The precise description of this construction is beyond the scope of this chapter, and we refer the reader to the cited references for more details. Here we will rather focus on the simplest infinitely divisible law, namely the normal law, which corresponds to the case where dω(t, s) is a Gaussian white noise.

4.4 Modeling Stochastic Volatility

The models we describe in the following are based on regime-switching, which was advanced in economics and finance by the seminal work of James Hamilton (1988, 1989). While the theoretical formulation of regime-switching is very general, researchers typically employ only a small number of discrete states in empirical applications. This partly stems from the common view that regimes change infrequently. A more practical limitation is that the transition matrix, and therefore the number of parameters, grows quadratically with the cardinality of the state space. Restrictions on switching probabilities offer a natural solution, as pursued, for example, by Bollen, Gray, and Whaley (2000) in a four-regime model.

We start with the assumption that volatility is determined by components that have different degrees of persistence. These components randomly switch over time, generating a volatility process that can be both highly persistent and highly variable. The transition probabilities are heterogeneous across components and follow a tight geometric specification. We obtain additional parsimony by assuming that when a component switches, its new value is drawn from a fixed distribution that does not depend on the frequency. Our model therefore assumes that volatility shocks have the same magnitude at all time scales. These restrictions, which are inspired by earlier research on multifractals in the natural sciences, provide parsimony and appear broadly consistent with financial data at standard confidence levels.

This specification, which we call the Markov-Switching Multifractal (MSM), offers a number of appealing features to the practitioner and applied researcher. Because it is based on a Markov chain, MSM is a highly tractable multifrequency stochastic volatility model. The empiricist can apply Bayesian updating to compute the conditional distribution of the latent state and thus disentangle volatility components of different durations. Multistep forecasting is convenient, and estimation can be efficiently conducted by maximizing the likelihood function, which is available in closed form.

Research shows that MSM can outperform some of the most reliable forecasting models currently in use, including Generalized ARCH ("GARCH," Bollerslev, 1986) and related models, both in- and out-of-sample. These improvements are especially pronounced in the medium and long run, and have been confirmed and extended in a variety of financial series (e.g., Bacry, Kozhemyak, and Muzy, 2008; Lux, 2008). MSM also captures well the power variation or "moment-scaling" of returns, works equally well in discrete time and in continuous time, and generalizes to multivariate settings.

In contrast to the GARCH volatility models discussed earlier, stochastic regime-switching models permit the conditional mean and variance of financial returns to depend on an unobserved latent "state" that may change unpredictably.

The general approach considers a latent state Mt ∈ {m1, ...,md}, where the positive integer d describes the number of possible states. Returns are given by

rt = μ(Mt) + σ(Mt) εt

where μ(Mt) and σ(Mt) are, respectively, the state-dependent conditional mean and volatility of returns, and the εt are i.i.d. standard Gaussian innovations. The dynamics of the Markov chain Mt are fully characterized by the transition matrix A = (ai,j), 1 ≤ i, j ≤ d, with components aij = P(Mt+1 = mj | Mt = mi).
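A minimal simulation sketch of this regime-switching return process, with an illustrative two-state (calm/turbulent) transition matrix A:

```python
import numpy as np

def simulate_switching(n, mus, sigmas, A, seed=3):
    """Simulate r_t = mu(M_t) + sigma(M_t)*eps_t where the latent state M_t
    follows a first-order Markov chain with transition matrix A,
    a_ij = P(M_{t+1} = j | M_t = i).  All values are illustrative."""
    rng = np.random.default_rng(seed)
    d = len(mus)
    states = np.empty(n, dtype=int)
    states[0] = 0
    for t in range(1, n):
        states[t] = rng.choice(d, p=A[states[t - 1]])   # next regime
    eps = rng.standard_normal(n)
    returns = np.asarray(mus)[states] + np.asarray(sigmas)[states] * eps
    return returns, states

# Persistent regimes: calm state 0 (low vol) vs turbulent state 1 (high vol)
A = np.array([[0.98, 0.02],
              [0.05, 0.95]])
r, s = simulate_switching(10_000, mus=[0.0, 0.0], sigmas=[0.5, 2.0], A=A)
print(r[s == 1].std(), r[s == 0].std())
```

Because the diagonal entries of A are close to 1, regimes are persistent, and the simulated returns display the long quiet and turbulent stretches characteristic of regime-switching volatility.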

Traditional Markov-switching approaches such as MS-GARCH use regime-switching only for low-frequency events, while also using linear autoregressive transitions at medium frequencies and a thick-tailed conditional distribution of returns. By contrast, MSM captures long-memory features, intermediate frequency volatility dynamics, and thick tails in returns all with a single regime-switching approach. It is noteworthy that a single mechanism can play all three of these roles so effectively, and the innovation that achieves this surprising economy of modeling technique is based on scale-invariance.

4.4.1 Markov-Switching Multifractal (MSM) Process

We consider a financial series Pt defined in discrete time on the regular grid t = 0, 1, 2, . . . , ∞. In applications, Pt will be the price of a financial asset or an exchange rate. Let rt ≡ ln(Pt/Pt−1) denote the log-return. The economy is driven by a first-order Markov state vector with k̄ components:

Mt = (M1,t; M2,t; . . . ; Mk̄,t) ∈ R+k̄

The components of Mt have the same marginal distribution but evolve at different frequencies, as we now explain.

Assume that the volatility state vector has been constructed up to date t − 1. For each k ∈ {1, . . . , k̄}, the next-period multiplier Mk,t is drawn from a fixed distribution M with probability γk, and is otherwise equal to its previous value: Mk,t = Mk,t−1. The dynamics of Mk,t can be summarized as:

Mk,t drawn from distribution M with probability γk,
Mk,t = Mk,t−1 with probability 1 − γk,

where the switching events and new draws from M are assumed to be independent across k and t. We require that the distribution of M has positive support and unit mean: M ≥ 0 and E(M) = 1.

Under these assumptions, the random multipliers Mk,t are persistent and nonnegative, and satisfy E(Mk,t) = 1. The multipliers differ in their transition probabilities γk but not in their marginal distribution M. Components of different frequencies are mutually independent; that is, the variables Mk,t and Mk',t' are independent if k differs from k'. These features greatly contribute to the parsimony of the model.
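The component dynamics can be sketched in a few lines. The binomial specification for M and the values of γk below are illustrative placeholders only:

```python
import numpy as np

rng = np.random.default_rng(1)

m0 = 1.4                             # illustrative binomial multiplier value
draw_M = lambda n: rng.choice([m0, 2.0 - m0], size=n)   # E(M) = 1 by construction

def step_multipliers(M_prev, gammas):
    """One period of MSM dynamics: component k is redrawn from M with
    probability gamma_k and kept at its previous value otherwise."""
    switch = rng.random(M_prev.size) < gammas
    return np.where(switch, draw_M(M_prev.size), M_prev)

kbar = 5
gammas = np.array([0.02 * 2.0 ** k for k in range(kbar)])  # toy frequencies
M = draw_M(kbar)
for _ in range(200):
    M = step_multipliers(M, gammas)
```

Because switching and the fresh draws are independent across k and t, each component evolves as its own two-state chain, which is what makes the construction so parsimonious.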

We model stochastic volatility by

σ(Mt) ≡ σ̄ (M1,t M2,t · · · Mk̄,t)^1/2,

where σ̄ is a positive constant. Returns rt are then

rt = σ(Mt) εt,

where the random variables {εt} are i.i.d. standard Gaussians N(0, 1). Since the multipliers are statistically independent with unit means, the parameter σ̄ coincides with the unconditional standard deviation of the innovation rt.
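Putting the pieces together, a minimal binomial-MSM simulator might look as follows. All parameter values are placeholders, not estimates:

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_msm(T, kbar=8, m0=1.4, sigma_bar=1.0, gamma1=0.05, b=3.0):
    """Binomial MSM(kbar): r_t = sigma_bar * sqrt(prod_k M_{k,t}) * eps_t."""
    gammas = 1.0 - (1.0 - gamma1) ** (b ** np.arange(kbar))  # eq. (4.1)
    draw = lambda n: rng.choice([m0, 2.0 - m0], size=n)
    M = draw(kbar)
    r = np.empty(T)
    for t in range(T):
        switch = rng.random(kbar) < gammas
        M = np.where(switch, draw(kbar), M)          # redraw the hit components
        r[t] = sigma_bar * np.sqrt(M.prod()) * rng.standard_normal()
    return r

r = simulate_msm(500)
```

Each multiplier has unit mean, so the unconditional variance of r is sigma_bar squared, matching the interpretation of σ̄ in the text.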

The transition probabilities γ ≡ (γ1, γ2, . . . , γk̄) are specified as

γk = 1 − (1 − γ1)^(b^(k−1)), (4.1)

where γ1 ∈ (0, 1) and b ∈ (1,∞). This specification was initially introduced in connection with the discretization of Poisson arrivals with exponentially increasing intensities. Consider a process with very persistent components and thus a very small parameter γ1. For small values of k, the quantity γ1b^(k−1) remains small, and the transition probability satisfies the approximate relation:

γk ≈ γ1 b^(k−1). (4.2)

The transition probabilities of low-frequency components grow approximately at geometric rate b. At higher frequencies, the rate of increase slows down, and specification (4.1) guarantees that the parameter γk remains lower than 1. In empirical applications, it is numerically convenient to estimate parameters of the same magnitude. Since γ1 < · · · < γk̄ < 1 < b, we choose (γk̄, b) to specify the set of transition probabilities.
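The exact specification (4.1) and its low-frequency approximation (4.2) are easy to compare numerically. The values of γ1 and b below are chosen only to make the contrast visible:

```python
import numpy as np

gamma1, b, kbar = 0.001, 3.0, 8
k = np.arange(1, kbar + 1)
gammas = 1.0 - (1.0 - gamma1) ** (b ** (k - 1.0))   # exact specification (4.1)
approx = gamma1 * b ** (k - 1.0)                    # approximation (4.2)
# gammas is increasing and stays below 1 for every k, while the geometric
# approximation eventually exceeds 1 (here approx[-1] = 0.001 * 3**7 > 2).
```

For small k the two sequences are nearly identical, which is exactly the regime in which (4.2) is intended to hold.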

We call this construct the Markov-Switching Multifractal (or Markov-Switching Multifrequency) process. The notation MSM(k̄) refers to versions of the model with k̄ frequencies, and we view the choice of k̄ as a model selection problem. Economic intuition suggests that the multiplicative structure of volatility is well suited to modeling the high variability and high volatility persistence exhibited by financial time series. When a low-frequency multiplier changes, volatility varies discontinuously and has strong persistence. In addition, high-frequency multipliers produce substantial outliers.

MSM imposes only minimal restrictions on the marginal distribution of the multipliers: M ≥ 0 and E(M) = 1, allowing flexible parametric or nonparametric specifications of M. A simple example is binomial MSM, in which the random variable M takes only two values, m0 or m1. For simplicity, we often assume that these two outcomes occur with equal probability, which implies that m1 = 2 − m0. The full parameter vector is then:

ψ ≡ (m0, σ̄, b, γk̄),

where m0 characterizes the distribution of the multipliers, σ̄ is the unconditional standard deviation of returns, and b and γk̄ define the set of switching probabilities.

We can naturally consider other parametric specifications for the distribution M. For example, multinomial MSM extends binomial MSM by allowing any discrete distribution satisfying the positivity and unit mean requirements. Continuous densities can also be useful. If the distribution of M is lognormal, we define lognormal MSM.
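The admissible specifications differ only in how M is sampled; each must have positive support and unit mean. A sketch of a sampler for each family, with arbitrary parameter choices, could be:

```python
import numpy as np

rng = np.random.default_rng(3)

# Binomial: two equally likely values m0 and 2 - m0, so E(M) = 1 exactly.
def draw_binomial(n, m0=1.4):
    return rng.choice([m0, 2.0 - m0], size=n)

# Multinomial: any discrete law with positive support and unit mean works.
def draw_multinomial(n, values=(0.5, 1.0, 1.5), probs=(0.25, 0.5, 0.25)):
    assert abs(np.dot(values, probs) - 1.0) < 1e-12   # unit-mean check
    return rng.choice(values, size=n, p=probs)

# Lognormal: ln M ~ N(-s**2/2, s**2) implies E(M) = 1.
def draw_lognormal(n, s=0.5):
    return rng.lognormal(mean=-s * s / 2.0, sigma=s, size=n)

samples = {f.__name__: f(100_000) for f in
           (draw_binomial, draw_multinomial, draw_lognormal)}
```

For the lognormal case the mean correction in the location parameter is what enforces the unit-mean restriction.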

4.4.2 Some Basic Properties of the MSM Process

MSM(k̄) permits the parsimonious specification of a high-dimensional state space. Assume, for instance, that the distribution M is binomial. Each volatility component Mk,t is either high or low, and the state vector Mt can take d = 2^k̄ possible values. We will routinely work with models that have k̄ = 10 components, or 2^10 = 1,024 states.

MSM is also remarkably parsimonious. In a general Markov chain, the size of the transition matrix is equal to the square of the number of states. For instance, a Markov chain with 1,024 states generally needs to be parameterized by 1,024 × 1,024, or more than a million, elements. In contrast, binomial MSM only requires four parameters.

Because binomial MSM is a pure regime-switching model, we can use all the tools that commonly apply to this class of processes. For example, we can use Bayesian updating and write the closed-form likelihood function.
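The Bayesian-updating recursion can be written out directly for a small binomial MSM: build the 2^k̄-state space, form the transition matrix from the per-component switching probabilities, and filter. This is a deliberately naive sketch with placeholder parameters, not an efficient or estimated implementation:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(4)

def msm_filter(returns, kbar=3, m0=1.4, sigma_bar=1.0, gamma1=0.1, b=2.0):
    """Exact Bayesian updating for binomial MSM over d = 2**kbar states."""
    gammas = 1.0 - (1.0 - gamma1) ** (b ** np.arange(kbar))
    states = np.array(list(product([m0, 2.0 - m0], repeat=kbar)))  # (d, kbar)
    d = len(states)
    # Component k stays put with prob 1 - g/2 (no switch, or a switch that
    # redraws the same value) and flips with prob g/2.
    A = np.ones((d, d))
    for i in range(d):
        for j in range(d):
            for k in range(kbar):
                g = gammas[k]
                A[i, j] *= (1.0 - g / 2.0) if states[i, k] == states[j, k] else (g / 2.0)
    vols = sigma_bar * np.sqrt(states.prod(axis=1))   # sigma(M_t) per state
    pi = np.full(d, 1.0 / d)                          # uniform prior over states
    loglik = 0.0
    for r in returns:
        dens = np.exp(-0.5 * (r / vols) ** 2) / (np.sqrt(2.0 * np.pi) * vols)
        joint = (pi @ A) * dens                       # predict, then update
        loglik += np.log(joint.sum())
        pi = joint / joint.sum()
    return loglik, pi

ll, pi = msm_filter(rng.standard_normal(50))
```

The returned log-likelihood is the closed-form object referred to in the text; in practice the transition matrix would be built from Kronecker products rather than a triple loop.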

This class of stochastic volatility models with multiple degrees of persistence creates a bridge between Markov-switching models and multifractals, and permits the application of standard inference techniques to multifractal processes.

The MSM construction permits low-frequency regime shifts and long volatility cycles in sample paths. We will see that in exchange rate series, the duration of the most persistent component, 1/γ1, is typically of the same order as the length of the data. Estimated processes thus tend to generate volatility cycles with periods proportional to the sample size, a property also apparent in the sample paths of long-memory processes. Long memory is often defined by a hyperbolic decline in the autocovariance function as the lag goes to infinity. Fractionally integrated processes generate such patterns by assuming that an innovation linearly affects future periods at a hyperbolically declining weight. We now show that over a large range of intermediate lags, MSM similarly provides a slow decline in autocovariances, and hence mimics a defining characteristic of long memory with a Markov regime-switching mechanism that also gives abrupt volatility changes.

MSM thus illustrates that a Markov chain can imitate one of the defining features of long memory, a hyperbolic decline of the autocovariogram. The combination of long-memory behavior with sudden volatility movements in MSM has a natural appeal for financial econometrics.
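The slow autocovariance decline can be inspected by simulating a path and computing the sample autocorrelation of absolute returns at widely spaced lags. All parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulate a long binomial-MSM path.
kbar, m0, gamma1, b, T = 8, 1.5, 0.01, 3.0, 20_000
gammas = 1.0 - (1.0 - gamma1) ** (b ** np.arange(kbar))
draw = lambda n: rng.choice([m0, 2.0 - m0], size=n)
M = draw(kbar)
r = np.empty(T)
for t in range(T):
    M = np.where(rng.random(kbar) < gammas, draw(kbar), M)
    r[t] = np.sqrt(M.prod()) * rng.standard_normal()

def acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    x = x - x.mean()
    return float((x[:-lag] * x[lag:]).mean() / x.var())

# Autocorrelations of |r_t| across several decades of lags.
rho = [acf(np.abs(r), lag) for lag in (1, 10, 100, 1000)]
```

For a short-memory process these correlations would collapse geometrically; here they taper off gradually across the intermediate lags, mimicking the hyperbolic decline discussed above.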

A representative return series is illustrated in Figure 4.3.

Figure 4.3

The graph reveals large heterogeneity in volatility levels and substantial outliers. This is notable since the return process has by construction finite moments of every order. It would be easy to obtain thick tails by considering i.i.d. shocks εt with Paretian distributions. In this chapter, however, we focus on the Gaussian case for several reasons. First, the likelihood is then available in closed form. Second, we will show that even when εt is Gaussian, high frequency regime switches are sufficient to mimic in finite samples the heavy tails exhibited by financial data. Finally, the basic specification performs well relative to existing competitors and provides a useful benchmark for future refinements.

4.4.3 Bivariate MSM

We consider two financial series α and β defined on the regular grid t = 0, 1, 2, . . . ,∞. Their log-returns rα,t and rβ,t in period t are stacked into the column vector:

rt ≡ (rα,t, rβ,t)′.

As in univariate MSM, volatility is stochastic and is hit by shocks of heterogeneous frequencies indexed by k ∈ {1, . . . , k̄}. For every frequency k, the currencies have volatility components Mα,k,t and Mβ,k,t. Consider the vector:

Mk,t = (Mα,k,t; Mβ,k,t)′.

The period-t volatility column vectors are stacked into the 2 × k̄ matrix:

Mt = (M1,t, . . . , Mk̄,t).

Each row of the matrix Mt contains the volatility components of a particular currency, while each column corresponds to a particular frequency. As in univariate MSM, we assume that the vectors M1,t, . . . , Mk̄,t at a given time t are statistically independent. The main task is to choose appropriate dynamics for each vector Mk,t.

Economic intuition suggests that volatility arrivals are correlated but not necessarily simultaneous across currency markets. For this reason, we allow arrivals across series to be characterized by a correlation coefficient λ.

Assume that the volatility vector Mk,t associated with the kth frequency has been constructed up to date t − 1. In period t, each series c ∈ {α, β} is hit by an arrival with probability γk. Let 1c,k,t denote the indicator function equal to 1 if there is an arrival on series c at frequency k in period t, and equal to 0 otherwise. The arrival vector 1k,t = (1α,k,t; 1β,k,t) is specified to be i.i.d., and its unconditional distribution is defined by three conditions.

First, the arrival vector is symmetrically distributed: (1α,k,t; 1β,k,t) and (1β,k,t; 1α,k,t) have the same distribution.

Second, the switching probability of a series is equal to an exogenous constant:

P(1c,k,t = 1) = γk.

Third, there exists λ ∈ [0, 1] such that:

P(1α,k,t = 1 | 1β,k,t = 1) = (1 − λ)γk + λ.

As shown in the literature, these three conditions define a unique distribution for 1k,t. Arrivals are independent if λ = 0 and simultaneous if λ = 1. More generally, λ is the unconditional correlation between 1α,k,t and 1β,k,t.
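The three conditions pin down the joint distribution of the arrival vector in closed form, which a small helper makes concrete (the inputs 0.2 and 0.5 are arbitrary):

```python
import numpy as np

def arrival_joint(gamma_k, lam):
    """Joint law of (1_a, 1_b): marginals P(1_c = 1) = gamma_k and
    P(1_a = 1 | 1_b = 1) = (1 - lam) * gamma_k + lam."""
    p11 = gamma_k * ((1.0 - lam) * gamma_k + lam)
    p10 = p01 = gamma_k - p11           # = gamma_k * (1 - lam) * (1 - gamma_k)
    p00 = 1.0 - p11 - p10 - p01
    return np.array([[p00, p01], [p10, p11]])

P = arrival_joint(0.2, 0.5)
# Implied correlation: (p11 - gamma_k**2) / (gamma_k * (1 - gamma_k)) = lam.
```

Setting lam = 0 makes p11 factor into the product of the marginals (independence), while lam = 1 forces p10 = p01 = 0 (simultaneous arrivals), matching the limiting cases in the text.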

Given the realization of the arrival vector 1k,t, the construction of the volatility component Mk,t is based on a bivariate distribution M = (Mα, Mβ) with positive support.

We assume for now that each component of M has a unit mean: E(Mα) = E(Mβ) = 1. If arrivals hit both series (1α,k,t = 1β,k,t = 1), the state vector Mk,t is drawn from M. If only series c ∈ {α, β} receives an arrival, the new component Mc,k,t is sampled from the corresponding marginal of the bivariate distribution M. Finally, Mk,t = Mk,t−1 if there is no arrival.

Consistent with previous notation, let:

σc(Mt) ≡ σ̄c (Mc,1,t Mc,2,t · · · Mc,k̄,t)^1/2, c ∈ {α, β},

where σ̄α, σ̄β > 0. Individual returns satisfy:

rc,t = σc(Mt) εc,t.

The vector εt = (εα,t; εβ,t)′ is i.i.d. Gaussian N(0, Σ), where Σ has unit diagonal elements and off-diagonal correlation ρε:

Σ = [[1, ρε], [ρε, 1]].

The construction permits correlation in volatility through the bivariate distribution M and correlation in returns through the Gaussian vector εt. As in the univariate case, the transition probabilities are defined by

γk = 1 − (1 − γ1)^(b^(k−1)),

where γ1 ∈ (0, 1) and b ∈ (1,∞). This completes the specification of bivariate MSM.
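A rough end-to-end sketch of the bivariate construction follows. For simplicity it assumes the multiplier draws are independent across the two series, and every parameter value is a placeholder:

```python
import numpy as np

rng = np.random.default_rng(6)

def simulate_bivariate_msm(T, kbar=4, m0=(1.4, 1.3), sigma_bar=(1.0, 1.2),
                           gamma1=0.05, b=3.0, lam=0.5, rho_eps=0.6):
    """Bivariate MSM sketch: correlated arrivals (lam), correlated Gaussian
    innovations (rho_eps), independent multiplier draws across series."""
    gammas = 1.0 - (1.0 - gamma1) ** (b ** np.arange(kbar))
    M = np.ones((2, kbar))                       # rows: series alpha, beta
    L = np.linalg.cholesky([[1.0, rho_eps], [rho_eps, 1.0]])
    r = np.empty((T, 2))
    for t in range(T):
        for k in range(kbar):
            g = gammas[k]
            hit_a = rng.random() < g
            # P(beta arrival | alpha arrival) = (1 - lam) * g + lam;
            # P(beta arrival | no alpha arrival) = (1 - lam) * g.
            p_b = (1.0 - lam) * g + lam if hit_a else (1.0 - lam) * g
            hit_b = rng.random() < p_b
            for c, hit in enumerate((hit_a, hit_b)):
                if hit:                          # redraw the hit component
                    M[c, k] = rng.choice([m0[c], 2.0 - m0[c]])
        vol = np.asarray(sigma_bar) * np.sqrt(M.prod(axis=1))
        r[t] = vol * (L @ rng.standard_normal(2))
    return r

r = simulate_bivariate_msm(500)
```

The conditional arrival probabilities used above follow from the closed-form joint distribution of the arrival vector, so each series still switches with marginal probability γk.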

Under bivariate MSM, univariate dynamics coincide with the univariate model presented in Section 4.4.1. The parameter σ̄c is again the unconditional standard deviation of each univariate series c ∈ {α, β}, and the univariate transition probabilities are again governed by γ1 ∈ (0, 1) and b ∈ (1,∞) as in (4.1).

Focusing on the simple specification where each Mk,t is drawn from a bivariate binomial distribution, the first element Mα takes values mα,0 and mα,1 = 2 − mα,0 with equal probability.

Similarly, Mβ is a binomial taking values mβ,0 and mβ,1 = 2 − mβ,0 with equal probability. Consequently, the random vector M has four possible values, whose unconditional probabilities are given by the matrix (pi,j) = (P{M = (mα,i; mβ,j)}), 0 ≤ i, j ≤ 1. The conditions P(Mα = mα,0) = 1/2 and P(Mβ = mβ,0) = 1/2 impose that

p0,0 = p1,1 = (1 + ρm)/4 and p0,1 = p1,0 = (1 − ρm)/4

for some ρm ∈ [−1, 1], where ρm is the correlation between components Mα and Mβ under the distribution M.
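Given the correlation between Mα and Mβ under the distribution M (written rho_m below), the matrix of joint probabilities can be written down directly:

```python
import numpy as np

def binomial_joint(rho_m):
    """p_ij = P(M = (m_i, m_j)) for the bivariate binomial specification with
    equal-probability marginals and multiplier correlation rho_m."""
    p_same = (1.0 + rho_m) / 4.0     # both components high, or both low
    p_diff = (1.0 - rho_m) / 4.0     # one high, one low
    return np.array([[p_same, p_diff], [p_diff, p_same]])

P = binomial_joint(0.3)
```

With standardized two-point marginals, the quantity p00 + p11 − p01 − p10 equals rho_m, so the parameterization recovers the stated correlation exactly; rho_m = 0 gives the independent case with all four cells equal to 1/4.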