RDP 2025-03: Fast Posterior Sampling in Tightly Identified SVARs Using ‘Soft’ Sign Restrictions

2. Framework

This section describes the SVAR and its orthogonal reduced-form parameterisation, explains the range of identifying restrictions considered, and outlines the standard and robust Bayesian approaches to inference in this class of models.

2.1 SVAR and orthogonal reduced form

Let $y_t = (y_{1t},\ldots,y_{nt})'$ be an $n \times 1$ vector of random variables following the SVAR($p$) process:

(1) $A_0 y_t = A_+ x_t + \varepsilon_t$

where $A_0$ is an invertible $n \times n$ matrix with positive diagonal elements and $x_t = (y_{t-1}',\ldots,y_{t-p}')'$. Conditional on past information, $\varepsilon_t \sim N(0_{n \times 1}, I_n)$. The orthogonal reduced-form parameterisation is

(2) $y_t = B x_t + \Sigma_{tr} Q \varepsilon_t$

where: $B = (B_1,\ldots,B_p) = A_0^{-1} A_+$ are the reduced-form coefficients; $\Sigma_{tr}$ is the lower-triangular Cholesky factor of the variance-covariance matrix of the reduced-form VAR innovations, $\Sigma = \mathbb{E}(u_t u_t') = A_0^{-1}(A_0^{-1})'$, with $u_t = y_t - B x_t$; and $Q$ is an $n \times n$ orthonormal matrix. The reduced-form parameters are denoted by $\phi = (\mathrm{vec}(B)', \mathrm{vech}(\Sigma_{tr})')'$ and the space of $n \times n$ orthonormal matrices by $\mathcal{O}(n)$.
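To make the parameterisation concrete, the following sketch (in Python with NumPy) maps a reduced-form pair $(B,\Sigma)$ and an orthonormal $Q$ to a structural parameterisation via $A_0^{-1} = \Sigma_{tr} Q$. The matrices are made-up values for a bivariate example rather than anything taken from the paper.

```python
import numpy as np

# Illustrative reduced-form objects for a bivariate VAR(1); values are made up.
Sigma = np.array([[1.0, 0.3],
                  [0.3, 0.5]])        # variance of reduced-form innovations u_t
B = np.array([[0.5, 0.1],
              [0.2, 0.4]])            # reduced-form coefficient matrix B_1

Sigma_tr = np.linalg.cholesky(Sigma)  # lower-triangular Cholesky factor of Sigma

# Any orthonormal Q delivers a structural parameterisation consistent with (B, Sigma):
# A_0^{-1} = Sigma_tr @ Q, so A_0 = Q' Sigma_tr^{-1} and A_+ = A_0 B.
Q = np.eye(2)                         # e.g. the 'recursive' rotation
A0 = Q.T @ np.linalg.inv(Sigma_tr)
A_plus = A0 @ B

# The reduced-form variance is invariant to the choice of Q.
assert np.allclose(np.linalg.inv(A0) @ np.linalg.inv(A0).T, Sigma)
```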

Impulse responses to standard deviation shocks can be obtained from the coefficients of the vector moving average representation:

(3) $y_t = \sum_{h=0}^{\infty} C_h \Sigma_{tr} Q \varepsilon_{t-h}$

where $C_h$ is defined recursively by $C_h = \sum_{l=1}^{\min\{h,p\}} B_l C_{h-l}$ for $h \geq 1$, with $C_0 = I_n$. Element $(i,j)$ of $C_h \Sigma_{tr} Q$ is the horizon-$h$ impulse response of variable $i$ to structural shock $j$, denoted by $\eta_{ijh}(\phi,Q) = c_{ih}(\phi) q_j$, where $c_{ih}(\phi) = e_{i,n}' C_h \Sigma_{tr}$ is row $i$ of $C_h \Sigma_{tr}$ and $q_j = Q e_{j,n}$ is column $j$ of $Q$.
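The recursion for $C_h$ translates directly into code. The sketch below (function names are ours) takes the lag matrices $B_1,\ldots,B_p$ as a list and returns the matrices $C_h \Sigma_{tr} Q$, whose $(i,j)$ elements are the impulse responses $\eta_{ijh}(\phi,Q)$.

```python
import numpy as np

def vma_coefficients(B_list, n, horizons):
    """C_0, ..., C_H from C_h = sum_{l=1}^{min{h,p}} B_l C_{h-l}, with C_0 = I_n."""
    p = len(B_list)
    C = [np.eye(n)]
    for h in range(1, horizons + 1):
        C_h = np.zeros((n, n))
        for l in range(1, min(h, p) + 1):
            C_h += B_list[l - 1] @ C[h - l]
        C.append(C_h)
    return C

def impulse_responses(B_list, Sigma_tr, Q, horizons):
    """eta[h][i, j]: horizon-h response of variable i to structural shock j."""
    n = Sigma_tr.shape[0]
    return [C_h @ Sigma_tr @ Q for C_h in vma_coefficients(B_list, n, horizons)]
```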

2.2 Identifying restrictions

Imposing identifying restrictions on functions of the structural parameters is equivalent to imposing restrictions on $Q$, where the restrictions depend on $\phi$.[5] Our algorithms allow for a wide variety of sign restrictions, including restrictions on:

  • Impulse responses. The restriction $\eta_{ijh}(\phi,Q) \geq 0$ is equivalent to $c_{ih}(\phi) q_j \geq 0$, which is a linear inequality restriction on $q_j$ with a coefficient vector that depends on $\phi$ (see the code sketch following this list). We also allow for ‘shape’ or ‘ranking’ restrictions on impulse responses (e.g. Amir-Ahmadi and Drautzburg 2021). An example is that $\eta_{ijh}(\phi,Q) \geq \eta_{ijl}(\phi,Q)$ for $l \neq h$, which is equivalent to $(c_{ih}(\phi) - c_{il}(\phi)) q_j \geq 0$.
  • Structural coefficients. A restriction on the matrix of contemporaneous structural coefficients is $e_{j,n}' A_0 e_{i,n} \geq 0$, which is equivalent to $(\Sigma_{tr}^{-1} e_{i,n})' q_j \geq 0$. We can also consider restrictions on $A_+ = Q' \Sigma_{tr}^{-1} B$.
  • Elasticities. Kilian and Murphy (2012) propose augmenting sign restrictions with restrictions on the magnitudes of particular elasticities, which they define as ratios of impulse responses. For example, a lower bound on the impact impulse response of variable $i$ to a shock in the first variable that raises the first variable by one unit is $(e_{i,n}' \Sigma_{tr} q_1)/(e_{1,n}' \Sigma_{tr} q_1) \geq \lambda$, where $\lambda$ is a known scalar. If the impulse response entering the denominator is restricted to be positive, we can equivalently represent this restriction as the linear inequality restriction $(e_{i,n}' - \lambda e_{1,n}') \Sigma_{tr} q_1 \geq 0$. We can similarly allow for bounds on ratios of elements of $A_0$.
  • Functions of the structural shocks (‘narrative restrictions’). Narrative restrictions are inequality restrictions on functions of the structural shocks in specific periods (Antolín-Díaz and Rubio-Ramírez 2018; Ludvigson, Ma and Ng 2017, 2021; Giacomini et al 2023). For example, the restriction that shock $j$ in period $k$ was non-negative is $\varepsilon_{jk} = (\Sigma_{tr}^{-1} u_k)' q_j \geq 0$. The restriction on the historical decomposition that shock $j$ was the ‘most important contributor’ to the observed unexpected change in variable $i$ between periods $k$ and $k+h$ is $|H_{i,j,k,k+h}| \geq \max_{l \neq j} |H_{i,l,k,k+h}|$, where

(4) $H_{i,j,k,k+h} = \sum_{l=0}^{h} c_{il}(\phi) q_j q_j' \Sigma_{tr}^{-1} u_{k+h-l}$

    Similarly, the restriction that shock $j$ was the ‘least important contributor’ to the observed unexpected change in variable $i$ between periods $k$ and $k+h$ is $|H_{i,j,k,k+h}| \leq \min_{l \neq j} |H_{i,l,k,k+h}|$.[6]
  • Other restrictions. We can also allow for other types of inequality restrictions, including on long-run cumulative impulse responses (e.g. Furlanetto et al 2025), forecast error variance decompositions (e.g. Volpicella 2022), and the relationships between proxy variables and structural shocks (e.g. Arias et al 2021; Giacomini et al 2022b; Braun and Brüggemann 2023).[7] We do not detail these types of restrictions here, except to note that they can also be cast as (potentially nonlinear) inequality restrictions on $Q$.
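As an illustration of how several of the restriction types above reduce to inequality restrictions on $Q$, the sketch below stacks a handful of example restrictions into a vector $S(\phi,Q)$ and checks whether a given $Q$ satisfies them all. The particular restrictions, the indices and the bound $\lambda = -0.5$ are invented for illustration, and the code reuses `impulse_responses` from the earlier sketch.

```python
import numpy as np

def restriction_vector(B_list, Sigma_tr, Q, u, horizons=4):
    """Stack an illustrative set of sign restrictions into S(phi, Q) >= 0."""
    Sigma_tr_inv = np.linalg.inv(Sigma_tr)
    eta = impulse_responses(B_list, Sigma_tr, Q, horizons)  # earlier sketch
    S = []

    # (i) Impulse responses: variable 0 responds non-negatively to shock 0
    #     at horizons 0, ..., horizons.
    S += [eta[h][0, 0] for h in range(horizons + 1)]

    # (ii) Structural coefficients: element (1, 0) of A_0 = Q' Sigma_tr^{-1}
    #      is non-negative.
    S.append((Q.T @ Sigma_tr_inv)[1, 0])

    # (iii) Elasticity bound: the impact response of variable 1 to shock 0,
    #       per unit increase in variable 0, is at least lam (the denominator,
    #       the impact response of variable 0, is restricted in (i)).
    lam = -0.5
    S.append((Sigma_tr[1, :] - lam * Sigma_tr[0, :]) @ Q[:, 0])

    # (iv) Narrative restriction: structural shock 0 was non-negative in the
    #      period whose reduced-form innovation is u.
    S.append((Sigma_tr_inv @ u) @ Q[:, 0])

    return np.array(S)

def in_identified_set(B_list, Sigma_tr, Q, u):
    """Check whether Q is in Q(phi | S), i.e. S(phi, Q) >= 0 element-wise."""
    return bool(np.all(restriction_vector(B_list, Sigma_tr, Q, u) >= 0))
```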

As general notation, let $S(\phi,Q) \geq 0_{s \times 1}$ represent a collection of $s$ sign restrictions.[8] Given the sign restrictions, the identified set for $Q$ is

(5) $\mathcal{Q}(\phi|S) = \{Q \in \mathcal{O}(n) : S(\phi,Q) \geq 0_{s \times 1}\}$

The identified set $\mathcal{Q}(\phi|S)$ collects observationally equivalent parameter values, which are parameters corresponding to the same value of the likelihood function (Rothenberg 1971). The identified set for $Q$ induces identified sets for other parameters of interest, such as impulse responses. For example, the identified set for $\eta_{ijh}(\phi,Q)$ is $\{\eta_{ijh}(\phi,Q) : Q \in \mathcal{Q}(\phi|S)\}$.

2.3 Bayesian inference

The typical approach to conducting Bayesian inference in sign-restricted SVARs involves specifying a normal-inverse-Wishart prior for $\phi$ along with a uniform prior for $Q$ (e.g. Uhlig 2005; Rubio-Ramírez et al 2010; Arias et al 2018; Inoue and Kilian 2024, 2025). The uniform prior for $Q$ can be motivated by the fact that it assigns equal prior weight to observationally equivalent models or vectors of impulse responses (Arias et al 2025). As discussed below, it is also computationally convenient to obtain draws from a uniform distribution over $\mathcal{O}(n)$.
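One standard construction of such draws (e.g. Rubio-Ramírez et al 2010) takes the QR decomposition of a matrix of independent standard normal variates and normalises the signs so that the diagonal of $R$ is positive. A minimal sketch:

```python
import numpy as np

def draw_orthonormal(n, rng):
    """One draw from the uniform (Haar) distribution over O(n)."""
    Z = rng.standard_normal((n, n))
    Q, R = np.linalg.qr(Z)
    # Flip column signs so that diag(R) > 0; without this normalisation the
    # draw would depend on the QR routine's sign convention.
    return Q * np.sign(np.diag(R))

rng = np.random.default_rng(0)
Q = draw_orthonormal(3, rng)
assert np.allclose(Q.T @ Q, np.eye(3))
```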

In practice, obtaining draws from the resulting posterior for $\theta = (\phi', \mathrm{vec}(Q)')'$ requires drawing $\phi$ from its normal-inverse-Wishart posterior and $Q$ from a uniform distribution over $\mathcal{O}(n)$, and rejecting draws of $\theta$ that do not satisfy $S(\phi,Q) \geq 0_{s \times 1}$. As discussed in Uhlig (2017), this procedure implicitly assigns higher prior density – relative to the notional normal-inverse-Wishart prior – to values of $\phi$ corresponding to ‘larger’ identified sets. It may therefore be appealing to instead use a ‘conditionally uniform’ prior, under which the prior for $Q$ given $\phi$ is uniform over $\mathcal{Q}(\phi|S)$ and the prior for $\phi$ is left unchanged. The conditionally uniform prior also implies that the marginal likelihood is invariant to the choice of conditional prior for $Q$ when the identified set is never empty (Amir-Ahmadi and Drautzburg 2021).

Given these considerations, we focus on the conditionally uniform prior and refer to the corresponding prior (and posterior) for $\theta$ as ‘conditionally uniform normal-inverse-Wishart’. Under this prior, draws from the posterior can be obtained by drawing $\phi$ from its normal-inverse-Wishart posterior, checking whether the identified set is non-empty and, if so, obtaining a fixed number of draws of $Q$ from a uniform distribution over $\mathcal{Q}(\phi|S)$.
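A sketch of this accept/reject step for a single draw of $\phi$ follows, reusing `draw_orthonormal` and `in_identified_set` from the earlier sketches. The draw of $\phi$ from its normal-inverse-Wishart posterior is taken as given (any standard Bayesian VAR routine could supply it), and an empty return value is treated as numerical evidence that $\mathcal{Q}(\phi|S)$ is empty at that draw; these choices, and the cap on attempts, are illustrative rather than the paper's.

```python
def draw_Q_conditionally_uniform(B_list, Sigma_tr, u, rng,
                                 n_keep=10, max_attempts=100_000):
    """Accept/reject draws of Q, uniform over Q(phi | S), for one draw of phi."""
    n = Sigma_tr.shape[0]
    accepted = []
    for _ in range(max_attempts):
        Q = draw_orthonormal(n, rng)                   # uniform over O(n)
        if in_identified_set(B_list, Sigma_tr, Q, u):  # S(phi, Q) >= 0
            accepted.append(Q)
            if len(accepted) == n_keep:
                break
    return accepted  # empty list: treat Q(phi | S) as (numerically) empty

# Usage, given draws of phi from a normal-inverse-Wishart posterior (not shown):
# for B_list, Sigma_tr in phi_draws:
#     Q_draws = draw_Q_conditionally_uniform(B_list, Sigma_tr, u, rng)
#     if not Q_draws:
#         continue  # discard draws of phi with a (numerically) empty identified set
```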

It is possible that $\mathcal{Q}(\phi|S)$ is empty. When this is the case, the support of the reduced-form prior is implicitly truncated to parameter values such that $\mathcal{Q}(\phi|S)$ is non-empty. It is possible to verify whether $\mathcal{Q}(\phi|S)$ is non-empty under particular types of identifying restrictions before attempting to draw values of $Q$ (e.g. Amir-Ahmadi and Drautzburg 2021; Giacomini, Kitagawa and Volpicella 2022; Read 2022). However, we are unaware of approaches to do this that are applicable under the wide class of sign restrictions that we consider. For the purposes of describing the sampling problem in Section 3, we assume that $\mathcal{Q}(\phi|S)$ is non-empty. In Section 5, we describe how to handle the possibility that $\mathcal{Q}(\phi|S)$ is empty in the context of the empirical application.

2.4 Robust Bayesian inference

Let $\pi_{\phi}$ be the prior for $\phi$ (truncated so that $\mathcal{Q}(\phi|S)$ is non-empty) and let $\pi_{Q|\phi}$ be the conditionally uniform prior for $Q$, which is proportional to $\mathbb{1}(Q \in \mathcal{Q}(\phi|S))$. After observing the data $Y$, the posterior for the joint parameter vector $\theta$ is $\pi_{\theta|Y} = \pi_{\phi|Y}\,\pi_{Q|\phi}$, where $\pi_{\phi|Y}$ is the posterior for $\phi$. The prior $\pi_{\phi}$ is therefore updated by the data (via the likelihood), whereas the conditional prior $\pi_{Q|\phi}$ is not. This implies that the posterior for $\theta$ may be sensitive to the choice of conditional prior, even asymptotically (e.g. Poirier 1998; Moon and Schorfheide 2012; Baumeister and Hamilton 2015).

Giacomini and Kitagawa (2021a) propose a prior-robust approach to Bayesian inference that eliminates this posterior sensitivity. Their approach can be used to quantify the degree to which posterior inferences are sensitive to the choice of conditional prior and to assess the informativeness of identifying restrictions. In some applications of set-identified SVARs, robust Bayesian methods have revealed that much of the apparent information in the standard Bayesian posterior is contributed by the conditional prior for Q (e.g. Giacomini and Kitagawa 2021a; Giacomini et al 2022b, 2023, forthcoming; Read forthcoming), though this is not necessarily always the case, particularly when rich sets of identifying restrictions are imposed (e.g. Inoue and Kilian 2024).

Conceptually, the prior-robust approach involves replacing $\pi_{Q|\phi}$ with the class of all conditional priors that are consistent with the identifying restrictions, and summarising the corresponding class of posterior distributions. Practically, implementing this procedure requires computing the bounds of the identified set for each parameter of interest at each draw from the posterior for $\phi$. Giacomini and Kitagawa (2021a) suggest approximating the bounds by obtaining many draws of $Q$ from the uniform distribution over $\mathcal{Q}(\phi|S)$ and computing the minimum and maximum of the parameter of interest over the draws.[9] A large number of draws may be required to approximate the identified set with a high degree of accuracy (e.g. Montiel Olea and Nesbit 2021).
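A sketch of this approximation for a single impulse response $\eta_{ijh}(\phi,Q)$, reusing the helper functions above and assuming $\mathcal{Q}(\phi|S)$ is non-empty at the given draw of $\phi$:

```python
def approximate_bounds(B_list, Sigma_tr, u, rng, i=0, j=0, h=0, n_draws=1_000):
    """Approximate the identified-set bounds for eta_{ijh}(phi, Q) by the min
    and max over many draws of Q that are uniform over Q(phi | S)."""
    Q_draws = draw_Q_conditionally_uniform(B_list, Sigma_tr, u, rng,
                                           n_keep=n_draws)
    etas = [impulse_responses(B_list, Sigma_tr, Q, h)[h][i, j] for Q in Q_draws]
    return min(etas), max(etas)
```

Because the minimum and maximum are taken over a finite set of draws, this approximation always lies inside the true identified set, so too few draws can make the restrictions appear more informative than they actually are.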

In the empirical application (Section 5.3), we summarise the class of posteriors using the ‘set of posterior medians’ and a ‘robust credible interval’. The set of posterior medians is an interval with lower (upper) bound equal to the posterior median of the lower (upper) bound of the identified set; this interval contains all posterior medians that could be obtained given the class of priors consistent with the identifying restrictions. The $\alpha$-level robust credible interval is a robust Bayesian analogue of an equi-tailed Bayesian credible interval; it is an interval that is assigned posterior probability of at least $\alpha$ under any posterior in the class of posteriors.
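Given draws of the identified-set bounds across the posterior draws of $\phi$ (for example, from `approximate_bounds` above), these summaries can be computed as in the sketch below. The grid search for the robust credible interval implements one common construction, the shortest interval that contains the identified set with posterior probability of at least $\alpha$; the default value of $\alpha$ is illustrative only.

```python
import numpy as np

def robust_summaries(lb, ub, alpha=0.68):
    """Set of posterior medians and an alpha-level robust credible interval
    from draws (lb, ub) of the identified-set bounds across draws of phi."""
    lb, ub = np.asarray(lb), np.asarray(ub)
    m = len(lb)

    # Set of posterior medians: posterior medians of the lower and upper bounds.
    median_set = (np.median(lb), np.median(ub))

    # Shortest [l, u] containing the identified set in at least a share alpha
    # of the posterior draws (grid search over candidate lower endpoints).
    k = int(np.ceil(alpha * m))
    best = (lb.min(), ub.max())
    for l in np.sort(lb):
        ub_inside = np.sort(ub[lb >= l])
        if ub_inside.size >= k:
            u = ub_inside[k - 1]
            if u - l < best[1] - best[0]:
                best = (l, u)
    return median_set, best
```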

Footnotes

See Stock and Watson (2016) or Kilian and Lütkepohl (2017) for overviews of identification in SVARs. [5]

Restrictions on the contribution of shock $j$ to the realisation of (as opposed to the forecast error in) variable $i$ could be imposed by placing restrictions on $H_{i,j,1,t}$. [6]

To be clear, our approach cannot handle exogeneity restrictions related to proxy variables, since these are equivalent to zero restrictions. [7]

Some types of restrictions depend on other parameters or objects not captured in the current definition of $\phi$. For instance, narrative restrictions depend on the data in specific periods via the reduced-form VAR innovations (Antolín-Díaz and Rubio-Ramírez 2018; Giacomini et al 2023). We leave this potential dependence implicit. [8]

An alternative approach is to obtain the bounds by solving a numerical optimisation problem using, for example, gradient-based methods, but this can be computationally burdensome and convergence to the true bounds is not always guaranteed (e.g. Amir-Ahmadi and Drautzburg 2021; Giacomini and Kitagawa 2021a; Montiel Olea and Nesbit 2021). Gafarov et al (2018) propose an active-set algorithm for computing the bounds, but this is only applicable when there are linear restrictions on a single column of Q; their algorithm can also become burdensome when there are many restrictions (Read 2022). [9]