RDP 2012-07: Estimates of Uncertainty around the RBA's Forecasts Appendix B: Data

The data used in this paper are available at <www.rba.gov.au/publications/rdp/2012/2012-07-data.html>, except for the proprietary forecasts from Consensus Economics, which are available via subscription at consensuseconomics.com.

B.1 Sample

Our main results use forecasts beginning in 1993:Q1, when the RBA began targeting an inflation rate of 2 to 3 per cent. Errors before this period were larger, and are arguably unrepresentative of those likely to be encountered under the existing policy framework. Furthermore, the forecasts were less detailed before 1993 – for example, horizons were shorter – which makes comparisons difficult. The latest quarter for which we calculate forecast errors is 2011:Q4.

B.2 Forecasts

Our dataset of forecasts has been put together over the years by a long series of RBA staff, including Dan Andrews, Andrea Brischetto, Adam Cagliarini, David Norman, Anna Park and Ivan Roberts. That data collection represented an enormous effort without which this paper would not have been possible. Previous public uses of these data include Stevens (2004, 2011) and Edey and Stone (2004).

We have spot-checked these data against original and other data sources, but have not sought to rebuild the dataset. Previous compilers of the data made many choices regarding what to include, and we largely follow their judgement. One consequence of that approach is that our dataset includes forecasts for different variables made at different times in the quarter. That inconsistency may matter for some questions but does not seem important for our purposes.

The RBA produces several different forecasts throughout a quarter, of which we use one. For the past few years, we use the detailed forecasts summarised in the SMP. Before these were available, our choices largely follow those made in previous internal RBA research, summarised in Table B1. The table lists the main data sources; however, there are many exceptions, for example when forecasts are missing, when they are superseded by more authoritative sources, or when an alternative source has a longer horizon. Forecasts sometimes combined the general contour from the SMP with detail from other sources.

Table B1: Main Data Sources for Forecasts
Forecast date | Underlying inflation | CPI inflation | GDP | Unemployment
1991:Q1–2000:Q1 | JEFG, SMP text and Board papers | JEFG and SMP text | JEFG | JEFG
2000:Q2–2004:Q2 | PDG and SMP text | PDG | JEFG | PDG
2004:Q3–2007:Q4 | SMP | SMP | JEFG and Board papers | PDG
2008:Q1–present | SMP | SMP | SMP | SMP
Notes: ‘JEFG’ represents the forecast taken to the Joint Economic Forecasting Group meeting.
‘SMP text’ refers to the verbal description of the inflation outlook in the Statement on Monetary Policy.
‘PDG’ is the forecast prepared for the internal Policy Discussion Group in the middle month of the quarter.
‘SMP’ represents the detailed quarterly forecasts prepared for the Statement on Monetary Policy. The forecasts actually presented in the Statement, typically for year-ended growth rates, have less quarterly detail and precision.
‘Board papers’ represents the forecast prepared in the third month of the quarter for the next Board meeting.

The various data sources differ in terms of detail, intended audience and in other ways, but perhaps their most important difference concerns timing. The forecasts prepared for the Joint Economic Forecasting Group (JEFG) were prepared toward the end of the quarter, following the release of the national accounts. Forecasts for the Statement on Monetary Policy (SMP) and the internal Policy Discussion Group (PDG) were prepared in the middle of the quarter, between the release of the CPI data and the national accounts.

At the beginning of our sample the forecast horizon varied between 3 and 6 quarters ahead. It has been gradually extended since then, recently varying between 9 and 11 quarters ahead. Because short-horizon forecasting began earlier, and because those forecasts overlap less, we have many more independent observations of short-horizon errors than we have at longer horizons. So we can talk more confidently about uncertainty regarding the next few quarters than we can about uncertainty regarding the next few years. Indeed, we believe we have too few errors at horizons beyond eight quarters for a reliable sample, so we do not include these in our formal analysis.[18]
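To make the overlap concrete, the following sketch (Python, with an illustrative sample size rather than our actual dataset) counts the roughly independent errors available at each horizon: forecasts are made every quarter, so the target window of an h-quarter-ahead error overlaps those of the h − 1 neighbouring forecasts.

# Sketch: why long-horizon errors give fewer independent observations.
# An h-quarter-ahead error made each quarter reflects news over a window
# that overlaps the windows of the h-1 neighbouring forecasts, so a sample
# of T quarterly forecasts contains only about T/h independent errors.
# The 76 quarters correspond to the span 1993:Q1-2011:Q4.

def approx_independent_errors(n_forecasts: int, horizon: int) -> int:
    """Rough count of non-overlapping h-quarter-ahead forecast errors."""
    return n_forecasts // horizon

for h in (1, 2, 4, 8):
    print(f"horizon {h}: about {approx_independent_errors(76, h)} independent errors")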

As mentioned in Section 2, the forecasts have often been conditioned on an assumption of unchanged interest rates. Alternative assumptions, such as choosing a path in line with market expectations, might give more accurate forecasts; however, the potential improvement seems likely to be very small. That assessment is partly based on internal post-mortems on specific forecast errors, which have concluded that the constant interest rate assumption was not important. More generally, we regress RBA GDP errors on a measure of the yield curve (the difference between one-year and overnight Treasuries) at the time of the forecast. The coefficient is highly statistically significant (p = 0.001 at a 3-quarter-ahead horizon) and correctly signed, suggesting that market assumptions about the path of interest rates could potentially reduce RBA errors. However, the effect on forecast accuracy is tiny: subtracting predicted values from the errors lowers the RMSE by only 5 per cent at a 3-quarter-ahead horizon and by 2 per cent at a 7-quarter-ahead horizon. This difference is barely discernible in charts, and would not qualitatively affect most of the comparisons we make in this paper. Even this overstates the effect, given that it maximises accuracy after the event rather than using information available in real time.
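The accuracy exercise just described can be sketched in a few lines; the series below are simulated placeholders (the names and magnitudes are assumptions for illustration), not our actual errors or the actual yield-curve spread.

import numpy as np

# Sketch: regress GDP forecast errors on the yield-curve slope at the time
# of the forecast, then measure how much removing the fitted component
# would have lowered the RMSE. All data here are simulated stand-ins.
rng = np.random.default_rng(0)
n = 76                                          # quarterly forecasts
slope = rng.normal(0.0, 0.5, n)                 # placeholder yield-curve slope
errors = 0.3 * slope + rng.normal(0.0, 1.0, n)  # placeholder forecast errors

# OLS of errors on a constant and the slope
X = np.column_stack([np.ones(n), slope])
beta, *_ = np.linalg.lstsq(X, errors, rcond=None)
fitted = X @ beta

rmse_raw = np.sqrt(np.mean(errors ** 2))
rmse_adj = np.sqrt(np.mean((errors - fitted) ** 2))
print(f"RMSE unadjusted: {rmse_raw:.3f}")
print(f"RMSE net of fitted component: {rmse_adj:.3f}")
print(f"Reduction: {100 * (1 - rmse_adj / rmse_raw):.1f} per cent")

Note that, as in the text, this calculation is in-sample: it removes the best-fitting linear component after the event, so it is an upper bound on the gain available in real time.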

The unimportance of interest rate assumptions is partly because the yield curve predicts short-term interest rates only slightly better than a random walk (see Guidolin and Thornton (2010), and references therein). It also reflects a large part of the ‘transmission mechanism’ being projected separately. Market expectations are already priced into the exchange rate, asset prices and longer-term interest rates; a change in the short-term interest rate assumption need not affect the anticipated path of these variables. Similarly, many of the models used to construct the forecast (most obviously, univariate time series) implicitly embody historical interest rate behaviour. Goodhart (2009) discusses the role of interest rate conditioning assumptions.

As we discuss in the text, there have been many other changes to the forecasts over this period. For example, the forecasts are informed by models that evolve in response to new data and research. The RBA is continually learning, including from examination of past forecast errors.

Finally, before 2000, the source data for GDP forecasts are paper records showing quarterly changes. These records provide quite a limited history, sometimes only two or three quarters. This is insufficient to calculate near-term forecasts of year-ended or year-average changes. Accordingly, we splice the forecast data with real-time estimates from the Stone and Wardrop (2002) database. This is not an issue for inflation forecasts, where the source data are 4-quarter changes.
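A minimal sketch of this splicing step, with made-up growth rates standing in for both the paper records and the Stone and Wardrop (2002) real-time estimates:

# Sketch: splice a short history of forecast quarterly growth rates onto
# real-time published estimates so that year-ended changes can be computed.
# All numbers are hypothetical.

realtime_qtrly = [0.8, 0.6, 0.9]  # last three published quarterly growth rates (%)
forecast_qtrly = [0.7, 0.8]       # forecast quarterly growth rates (%)

spliced = realtime_qtrly + forecast_qtrly

def year_ended(growth_rates):
    """Year-ended growth implied by the last four quarterly growth rates (%)."""
    level = 1.0
    for g in growth_rates[-4:]:
        level *= 1 + g / 100
    return 100 * (level - 1)

# The year-ended change for the first forecast quarter combines three
# published quarters with one forecast quarter.
print(f"year-ended growth: {year_ended(spliced[:4]):.2f} per cent")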

B.3 Outcomes

Defining actual outcomes or ‘truth’ involves judgement. The most recently published estimates are easily available, and reflect more information and better methods than earlier estimates. In that sense, they may be closer to ultimate ‘truth’. However, they often do not correspond to the series being forecast because definitions have changed.

This problem is most important for underlying inflation, which has changed definition several times, as discussed below. In practice, redefinitions of other variables have not been empirically important in our sample, though two examples may illustrate problems that could occur in the future. First, forecasts up to 2009 assumed that GDP included research and development as an intermediate input. However, the data published since then treat this expenditure as final output of capital. GDP under the later definition (and other changes introduced at the same time) is about 4 per cent larger than GDP measured according to the earlier definition, though average growth rates were little affected. As a second example, until the adoption of chain-weighting in 1998, the ABS periodically updated the base period from which constant-price estimates were calculated, reducing the weight of goods with declining relative prices, such as computers. This gave rise to predictable revisions to both growth rates and levels. Even though these and other revisions were predictable, our data sources do not include forecasts of revisions to published data. Implicitly, the variable being forecast is an early version, not the ‘final’ estimate.

Revisions that arise from changes in definitions are difficult to classify as forecast errors, or as examples of economic uncertainty. One obvious case is when the redefinition is known in advance, but not incorporated in the forecast or in ‘backcasts’. Another is when multiple forecasts are generated for different definitions, as occurs for inflation. Changing views on the merits of each definition should not be confused with the accuracy of each forecast. This problem can be reduced by using multiple measures of outcomes or by measuring outcomes with real-time data – that is, data published soon after the event.

Using real-time measures of outcomes has other advantages. First, if one can take the data available at the time of the forecast as given, then forecast errors apply to both changes and levels of the variable being forecast. This is a substantial simplification, especially for unemployment. Second, to perform statistical tests, it is helpful if the forecast errors are independent of each other, but that is not the case when subsequent redefinitions or benchmarking impose serial correlation.

However, there are also substantial costs to using real-time data. First, the initial estimates published by the ABS reflect a partial inclusion of source data, combined with various interpolations and extrapolations. Errors defined using these preliminary estimates may reflect skill in mimicking ABS internal procedures rather than an understanding of macroeconomic behaviour. Second, real-time data for some variables can be difficult to obtain.

The literature on forecast errors has generally assumed that the problems from changing definitions outweigh those from incomplete incorporation of source data. So outcomes are typically measured with near-real-time data. Examples include forecast evaluations conducted by the OECD (Vogel 2007), the IMF (Timmermann 2007), the US Federal Reserve (Reifschneider and Tulip 2007), and the ECB (ECB 2009). For a discussion see Robertson and Tallman (1998).

The different timing of data revisions in Australia leads us to a balance that is similar in principle, though slightly different in practice. For GDP, we use the fourth-published estimate, released four quarters after the relevant event. That permits inclusion of most source data, including one round of annual data, while minimising the effect of data redefinitions. For the unemployment rate, we use the estimate available at the time of the forecast made one quarter after the event.
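In code, taking the fourth-published estimate amounts to reading, for each reference quarter, the vintage released four quarters later. The sketch below uses hypothetical numbers and a simple nested-dictionary layout for the real-time dataset; both the layout and the figures are assumptions for illustration.

# Sketch: select the fourth-published estimate of year-ended GDP growth.
# vintages[release_quarter][reference_quarter] = estimate (%); all made up.
vintages = {
    "2001:Q1": {"2000:Q4": 2.1},  # first-published estimate of 2000:Q4
    "2001:Q2": {"2000:Q4": 2.3},
    "2001:Q3": {"2000:Q4": 2.2},
    "2001:Q4": {"2000:Q4": 2.5},  # fourth-published estimate of 2000:Q4
}

def shift_quarter(q: str, k: int) -> str:
    """Move a 'YYYY:Qn' label forward by k quarters."""
    idx = int(q[:4]) * 4 + int(q[-1]) - 1 + k
    return f"{idx // 4}:Q{idx % 4 + 1}"

def fourth_published(reference: str) -> float:
    """Estimate for a reference quarter in the vintage four quarters later."""
    return vintages[shift_quarter(reference, 4)][reference]

print(fourth_published("2000:Q4"))  # 2.5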

For inflation, we have not judged the benefits of compiling a real-time dataset as being worth the costs, and instead we use recent estimates. For the total CPI, that is unimportant, given that the data are not revised (forecasts are on a not seasonally adjusted basis). Instead of the headline CPI, some previous researchers have used the CPI excluding interest charges, in an attempt to correct for the constant interest rate assumption, but we did not find the rationale for this complication compelling.

For underlying inflation we use recent estimates of various measures, and match these with the definition used at the time of the forecast. For recent forecasts, which are seasonally adjusted, ‘truth’ is the 15th series CPI through 2011:Q2, and 16th series estimates for 2011:Q3 and 2011:Q4. (Because the distribution of price changes is skewed, changes in seasonal adjustment have a noticeable effect on estimates of year-ended underlying inflation.) With the exception of the change in seasonal adjustment just mentioned, our forecast errors do not reflect changes in the definition of underlying inflation, though they will include subsequent revisions to each measure. Table B2 shows the series we use for both forecasts and actual outcomes.

Table B2: Measures of Underlying Inflation
Date of forecast(a) Measure
1991:Q1–1995:Q1 CPI excluding interest charges, fresh fruit and vegetables and automotive fuel.
1995:Q2–1998:Q2 Treasury's underlying rate.
1998:Q3–2005:Q3 Weighted median CPI, excluding interest and tax, not seasonally adjusted.
2005:Q4–2006:Q4 Trimmed mean. Outcomes are seasonally adjusted using 15th series seasonal factors.
2007:Q1–2009:Q2 Average of trimmed mean and weighted median. Outcomes are seasonally adjusted using 15th series seasonal factors.
2009:Q3–2011:Q4 Trimmed mean. Outcomes through 2011:Q2 are seasonally adjusted using 15th series seasonal factors. Outcomes for 2011:Q3 and 2011:Q4 use 16th series seasonal factors.
Notes: (a) Definitions vary with the date of the forecast, not the date of the event. For quarters following a change in definition, we carry two measures of truth: errors are measured using the old measure of truth for long-horizon forecasts and the new measure of truth for short-horizon forecasts.
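The matching rule in Table B2 reduces to a lookup keyed on the date of the forecast. The sketch below (Python, with abbreviated labels) illustrates that lookup; handling the carry-over in note (a) would add a second lookup against the old definition for long-horizon forecasts that span a change.

# Sketch: choose the measure of 'truth' for underlying inflation by the
# date the forecast was made (Table B2). Labels are abbreviated.
MEASURES = [  # (first forecast quarter using the measure, label)
    ("1991:Q1", "CPI ex interest, fresh fruit and vegetables, fuel"),
    ("1995:Q2", "Treasury underlying rate"),
    ("1998:Q3", "weighted median ex interest and tax, nsa"),
    ("2005:Q4", "trimmed mean (15th series seasonal factors)"),
    ("2007:Q1", "average of trimmed mean and weighted median"),
    ("2009:Q3", "trimmed mean"),
]

def quarter_index(q: str) -> int:
    return int(q[:4]) * 4 + int(q[-1]) - 1

def truth_measure(forecast_date: str) -> str:
    """Measure used to score a forecast, chosen by the forecast's date."""
    chosen = MEASURES[0][1]
    for start, label in MEASURES:
        if quarter_index(start) <= quarter_index(forecast_date):
            chosen = label
    return chosen

# A forecast made in 2006:Q3 is scored against the trimmed mean, even if
# the target quarter falls after the 2007:Q1 switch to the average measure.
print(truth_measure("2006:Q3"))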

Croushore (2006) shows that the definition of truth can make a substantial difference to how forecasts are evaluated. However, for the questions raised in this paper, the definition of truth matters only slightly. As one might expect, using a definition closer to that used at the time of the forecast results in smaller errors. For example, as shown in Table 1, the RMSE of RBA 3-quarter-ahead forecasts of underlying inflation is 0.54 percentage points when outcomes are measured using the definitions used at the time of the forecast. These forecasts were more accurate than forecasts using the midpoint of the target at marginal significance levels (p = 0.06). However, if we measure these errors using the most recent data, the RMSE increases to 0.64, which is no longer statistically different from that of the target (p = 0.60). Similarly, early estimates of GDP growth have tended to be revised toward the historical mean (but not toward the forecast), so measuring actual GDP growth using recent data would result in a further deterioration in the explanatory power of the GDP forecasts.

Footnote

For underlying inflation, we have 57 six-quarter-ahead errors, 36 seven-quarter-ahead errors, 22 eight-quarter-ahead errors and 10 nine-quarter-ahead errors. [18]