Blog Archive


Wednesday, December 15, 2021

Conducting Augmented ARDL in Eviews Using Addin

Introduction

The Augmented ARDL is an approach designed to address the question of whether the dependent variable must be I(0) or I(1). With an I(0) dependent variable, it is difficult to infer a long-run relationship between the dependent variable and the regressor(s) even if the F-statistic is above the upper critical bound in the widely used bounds-testing procedure. The reason is that an I(0) dependent variable is, by definition, stationary (permit my tautological rigmarole!). So when the long-run relationship is tested jointly via the F statistic, a computed F value above the upper bound might merely reflect the I(0)-ness of the dependent variable. What is more, the other regressors may turn out to be insignificant, meaning that if the I(0) dependent variable were not tested along with them, the resulting t statistic (if there is only one such regressor) or F statistic (if there are more) would be insignificant. Thus, the I(0) dependent variable may dominate the joint test regardless of whether the other variables contribute significantly to the long-run relationship. The result is wrong inference.

ARDL at a glance

While the PSS-ARDL approach is a workhorse for estimating and testing long-run relationships when I(0) and I(1) variables occur jointly, there are certain assumptions applied researchers often take for granted, thereby violating the conditions necessary for using the PSS-ARDL in the first place. For a bivariate specification, the PSS-ARDL(p,q), in its most general form, is given by\[\Delta y_t=\alpha+\beta t+\rho y_{t-1}+\gamma x_{t-1}+\sum_{j=1}^{p-1}\delta_j\Delta y_{t-j}+\sum_{j=0}^{q-1}\theta_j\Delta x_{t-j} +z_t^\prime\Phi+\epsilon_t\]where \(z_t\) represents the exogenous variables and could contain other deterministic variables like dummy variables, and \(\Phi\) is the vector of the associated parameters. Based on this specification, Pesaran et al. (2001) highlight five different cases for bounds testing, each implying a different null hypothesis. Although some of them are less interesting because they have little practical value, it is instructive to be aware of them:

        • CASE 1: No intercept and no trend
        • CASE 2: Restricted intercept and no trend
        • CASE 3: Unrestricted intercept and no trend
        • CASE 4: Unrestricted intercept and restricted trend
        • CASE 5: Unrestricted intercept and unrestricted trend

An intercept or trend is restricted if it is included in the long-run or levels relationship. For each of these cases, Pesaran et al. (2001) compute the associated t- and F-statistic critical values. These critical values are reported in that paper, and readers are invited to consult it to obtain the values they need (they are, after all, reported pro bono). 

The cases above correspond to the following restrictions on the model:
  • CASE 1: The estimated model is given by \[\Delta y_t=\rho y_{t-1}+\gamma x_{t-1}+\sum_{j=1}^{p-1}\delta_j\Delta y_{t-j}+\sum_{j=0}^{q-1}\theta_j\Delta x_{t-j} +z_t^\prime\Phi+\epsilon_t\] and the null hypothesis is \(H_0:\rho=\gamma=0\). This model is appropriate if the series have been demeaned and/or detrended. Absent these operations, it should not be used for any analysis unless the researcher is strongly persuaded that it is the most suitable specification for the work, or it is used simply for pedagogical purposes.
  • CASE 2: The estimated model is \[\Delta y_t=\alpha+\rho y_{t-1}+\gamma x_{t-1}+\sum_{j=1}^{p-1}\delta_j\Delta y_{t-j}+\sum_{j=0}^{q-1}\theta_j\Delta x_{t-j} +z^\prime_t\Phi+\epsilon_t\]where in this case \(\beta=0\). The null hypothesis is \(H_0:\alpha=\rho=\gamma=0\). The restrictions imply that both the dependent variable and the regressors move around their respective mean values. Think of the parameter \(\alpha\) as \(\alpha=-\rho\zeta_y-\gamma\zeta_x\), where the \(\zeta_i\) are the respective mean values, or the steady-state values to which the variables gravitate in the long run. Substituting this restriction into the model, we have \[\Delta y_t=\rho(y_{t-1}-\zeta_y)+\gamma (x_{t-1}-\zeta_x)+\sum_{j=1}^{p-1}\delta_j\Delta y_{t-j}+\sum_{j=0}^{q-1}\theta_j\Delta x_{t-j} +z^\prime_t\Phi+\epsilon_t\]This model therefore has some practical value and is suitable for modelling the long-run behaviour of some variables. However, because the dependent variable has no trend (there is no intercept in the short-run dynamics), the specification's utility is limited given that most economic variables are I(1).
  • CASE 3: The estimated model is the same as in CASE 2, with \(\beta=0\). However, \(H_0:\rho=\gamma=0\). This means the intercept is pushed into the short-run relationship, so the dependent variable has a linear trend, sloping upwards or downwards in the direction dictated by \(\alpha\). This is benign if the dependent variable really does have a trend, but it is not a feature of an I(0) dependent variable. As most macroeconomic variables are I(1), this specification is often recommended, and in Eviews it is the default setting for model specification.
  • CASE 4: The model estimated for CASE 4 is the full model. Here, the trend is restricted while the intercept is unrestricted. The null hypothesis is therefore \(H_0:\beta=\rho=\gamma=0\). This specification implies that the dependent variable is trending in the long run. If the dependent variable is not trending in the long run, this specification is likely the wrong choice for modelling it.
  • CASE 5: The last case, where both the intercept and the trend are unrestricted, is rarely a sensible description of macroeconomic variables. It is the full model, but it implies that the dependent variable trends quadratically. This does not fit most cases and is rarely used. The null hypothesis is \(H_0:\rho=\gamma=0\).
The F statistic and the associated t statistic for bounds testing are reported in PSS. 
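
To make these cases concrete, the unrestricted regressions on which they are based can be written out as ordinary least squares specifications. Below is a minimal sketch with hypothetical series y and x and one lag of differences; Cases 2 and 3 share one regression and Cases 4 and 5 another, since restricting the intercept or the trend only changes the null hypothesis, not the estimated equation.

    ' CASE 1: no intercept, no trend
    equation eq_c1.ls d(y) y(-1) x(-1) d(y(-1)) d(x) d(x(-1))
    ' CASES 2 and 3: intercept, no trend (Case 2 includes the intercept in the null hypothesis)
    equation eq_c23.ls d(y) c y(-1) x(-1) d(y(-1)) d(x) d(x(-1))
    ' CASES 4 and 5: intercept and linear trend (Case 4 includes the trend in the null hypothesis)
    equation eq_c45.ls d(y) c @trend y(-1) x(-1) d(y(-1)) d(x) d(x(-1))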

Getting More Gist about ARDL from ADF

The F statistic for bounds testing referred to above is necessary, but not sufficient, to detect whether there is a long-run relationship between the dependent variable and the regressors. The reason is that either an I(0) or an I(1) variable may end up as the dependent variable in the given model, whereas one of the requirements for valid inference about cointegration between the dependent variable and the regressors is that the dependent variable be I(1). We can get the gist of this point by looking more closely at the relationship between the ARDL and the ADF model. You may be wondering why the dependent variable must be I(1) in the ARDL specification. The first thing to observe is that the ARDL is a multivariate formulation of the augmented Dickey-Fuller (ADF) regression. Does that sound strange? 

Suppose \(H_0: \gamma=\theta_0=\theta_1=\cdots=\theta_{q-1}=0\), that is, the joint insignificance of the regressor terms in the model, cannot be rejected. Then the model reduces to the standard ADF regression\[\Delta y_t=\alpha+\rho y_{t-1}+\sum_{j=1}^{p-1}\delta_j\Delta y_{t-j}+\epsilon_t\]From this we can see that if \(\rho\) is significantly negative, stationarity is established and \(y_t\) will be reckoned as I(0). The fact that \(y_t\) is stationary in levels means that \(\rho\) must be significant whether or not the coefficients on the other variables are. Therefore, in a test involving this I(0) variable as the dependent variable and possibly I(1) independent variable(s) whose coefficients are insignificant, it is still possible to find cointegration, not because there is any between these variables, but because the significance of the (lag of the) dependent variable dominates the joint test and because only a subset of the associated alternative hypothesis is being considered. This is what the bounds test does without separating the significance of \(\rho\) and \(\gamma\): the F test for bounds testing is based on the joint significance of these parameters, and the joint test of \(\rho\) and \(\gamma\) tells us nothing about the significance of \(\gamma\) on its own.
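
Continuing the sketch above, if the restriction on the x terms cannot be rejected, the Cases 2 and 3 regression collapses to exactly this ADF regression (again with the hypothetical series names used earlier):

    ' The Case 2/3 regression of the earlier sketch with the x terms dropped is the ADF regression
    equation eq_adf.ls d(y) c y(-1) d(y(-1))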

Degenerate cases

How then can we proceed? More tests are needed. To find out how, we must first understand what the issues really are. At the centre of this are the two cases of degeneracy. They arise because the bounds test (a joint F test) involves both the coefficient on the lagged dependent variable, \(\rho\), and the coefficient(s) on the lagged exogenous variable(s). Although PSS report the t statistic for \(\rho\) separately with a view to making inference more robust, researchers often ignore it, and even the t statistic reported along with the F statistic is not enough to avoid the pitfall. In short, the null hypothesis for the bounds test, \(H_{0}: \rho=\gamma=0\), can be seen as a compound one involving \(H_{0,1}: \rho=0\) and \(H_{0,2}: \gamma=0\), so rejection of the compound null is not proof of cointegration. This is because the alternative is not just \(H_{1}: \rho\neq0,\ \gamma\neq 0\) as often assumed in applications; it also involves \(H_{1,1}: \rho\neq0\) and \(H_{1,2}: \gamma\neq0\) separately. In other words, a more comprehensive testing procedure must involve the null hypotheses corresponding to each of these alternatives. Thus, we have the following null hypotheses:
        1. \(H_{0}: \rho=\gamma= 0\) against \(H_{1}: \rho\neq0,\ \gamma\neq0\) 
        2. \(H_{0,1}: \rho=0\) against \(H_{1,1}: \rho\neq0\)
        3. \(H_{0,2}: \gamma=0\) against \(H_{1,2}: \gamma\neq0\) 

Taxonomies of Augmented Bounds Test 

Therefore, we state the following taxonomy for hypothesis testing:
      • if the null hypotheses (1) and (2) are rejected but (3) is not, we have a case of a degenerate lagged independent variable. This case implies absence of cointegration;
      • if the null hypotheses (1) and (3) are rejected but (2) is not, we have a case of a degenerate lagged dependent variable. This case also implies absence of cointegration; and
      • if the null hypotheses (1), (2) and (3) are all rejected, then there is cointegration.
We now have a clear roadmap to follow. What this implies is that one needs to augment the bounds testing as stated above; hence the augmented ARDL testing procedure. With this procedure for testing for cointegration, it no longer matters whether the dependent variable is I(0) or I(1), as long as all three null hypotheses are rejected.
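
Continuing the earlier sketch (hypothetical series y and x, with eq_c23 denoting the Case 3 regression), the three null hypotheses can be checked by hand as follows. Keep in mind that the resulting statistics must be compared with the bounds critical values tabulated in Pesaran et al. (2001) and Sam, McNown and Goh (2018), not with the standard asymptotic p-values Eviews prints for a Wald test.

    ' (1) Overall F-bounds test: H0: rho = gamma = 0 (the coefficients on y(-1) and x(-1))
    eq_c23.wald c(2)=0, c(3)=0
    ' (2) t-bounds test on the lagged dependent variable: H0: rho = 0
    scalar t_rho = eq_c23.@tstats(2)
    ' (3) Exogenous F-bounds test: H0: gamma = 0 (a single restriction here, with one regressor)
    eq_c23.wald c(3)=0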

Now the Eviews addin...

First, note that this addin has been written in Eviews 12. Its functionality in lower versions is therefore not guaranteed. 

Testing these hypotheses in Eviews is straightforward but can be laborious, and this is where the addin helps. All that is needed is to report all three tests noted above, as against the two reported natively by Eviews. The following addin handles all the computations you might need. 

To use it, just estimate your ARDL model as usual and then use the Proc tab to locate the add-ins. In Figure 1, we have the ARDL method environment. Two variables are included. I choose a maximum lag of eight because I have enough quarterly data, 596 observations in total.

Figure 1

Once the model is estimated, use the Proc tab to locate Add-ins as shown in Figure 2 

Figure 2

Click on Augmented ARDL Bound Test and you will see the dialog shown in Figure 3. The tests are reported underneath what you see here; just scroll down to look them up.


Figure 3

Figure 4


What is shown in Figure 4 should be the same as what Eviews reports natively. What this addin adds is the Exogenous F-Bounds Test shown in Figure 5. To confirm the test, the Wald test for the exogenous variables is appended to the spool under the title exogenous_wald_table; you can click it to view it.

Figure 5

In this example, we are sure of cointegration because all three computed statistics are above the upper bound, suggesting that no case of degeneracy is lurking in our results.

Note the following...

Before working on the ARDL output, be sure to name it. At the moment, if the output is UNTITLED, Error 169 will be generated. The glitch is a really slippery error. It will be corrected later. 

The results have the feel of the existing bounds-testing table in Eviews but are appended with the tests for the exogenous variables, for which the F statistic is used. Thus, we have the section for the Overall F-Bounds Test, which corresponds to Null Hypothesis (1) above; the section for the t-Bounds Test, which corresponds to Null Hypothesis (2); and the section for the Exogenous F-Bounds Test, which corresponds to Null Hypothesis (3). The first two sections should be the same as in the native Eviews report. The last is an addition based on the paper by Sam, McNown and Goh (2018).  

From the application point of view, the Exogenous F-Bounds test is the same for Cases 2 and 3:
  • CASE 2: Restricted intercept and no trend
  • CASE 3: Unrestricted intercept and no trend
just as it is the same for Cases 4 and 5:
  • CASE 4: Unrestricted intercept and restricted trend
  • CASE 5: Unrestricted intercept and unrestricted trend
Therefore, the same critical values are reported for each pair in the literature, and the addin accordingly reports the long-run results for both cases in each pair. 

    The link to the addin is here. The data used in this example is here.

    Thank you for reading a long post😀.

    Sunday, December 5, 2021

    Bootstrap for the critical values in small sample: Addin download

    Introduction

    The bootstrap is a Monte Carlo strategy employed in statistical analysis with growing popularity among practitioners, for a couple of reasons. First, it helps overcome the small-sample problem. Most econometric critical values derive from asymptotic distributions, and it is often difficult to reconcile the small sample used in estimation with these asymptotic results when testing hypotheses. Secondly, the approach is non-parametric in the sense that it is not based on a priori assumptions about the distribution: the distribution is data-based and therefore bespoke to the data used. Moreover, in some cases analytical results are simply difficult to derive. Thus, when asymptotic critical values are difficult to justify because the number of observations is small, or because the distributional assumptions underlying the results are questionable, one can instead employ the bootstrap approach. 

    If you bootstrap, it figuratively means you are pulling yourself out of the quicksand all by yourself, using the straps of your boots. In the same way, the approach does not require the stringent assumptions that clog the wheel of analysis. The bootstrap uses the available data as the basis for computing critical values through the process of sampling the data with replacement. This sampling is iterated many times, each iteration using the same number of observations as the original data, and each observation having an equal chance of being included in each iteration. The sampled data, also called pseudo or synthetic data, are used to perform the analysis. The result is a collection of statistics that mimics the distribution from which the observed data come. 


    How it works...

    The generic procedure for carrying out a bootstrap simulation is as follows:

    Suppose one has \(n\) observations \(\{y_i\}_{i=1}^n\) and is interested in computing a statistic \(\tau_n\). Let \(\hat{\tau}_n\) be its estimate using the \(n\) observations. Now, assume one samples \(n\) observations with replacement. In a given iteration, some of these observations may be sampled more than once while others are not sampled at all, but if one repeats the iterations long enough, hopefully all of them will be included. The set of statistics based on a large number of iterations, say \(B\), is represented as\[\hat{\tau}_n^1, \cdots,\hat{\tau}_n^B\]Basic statistics tells us that a statistic has a distribution, and the collection above is its bootstrap distribution. One can then compute all manner of statistics of interest: mean, median, variance, standard deviation, and so on. In fact, one can plot the graph to see what the distribution looks like. 
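
    As a minimal illustration of these steps, the sketch below bootstraps the sample mean of a series named y (a hypothetical name) and reads off a quantile of the resulting distribution; it is only meant to show the mechanics.

    ' Bootstrap the sample mean of y: resample with replacement B times
    vector v = @convert(y)                    ' the observed data as a vector
    !n = @rows(v)
    !b = 999                                  ' number of bootstrap replications
    vector(!n) draw                           ' holds one resampled data set
    vector(!b) tau_boot                       ' holds the bootstrap statistics
    for !s = 1 to !b
      for !i = 1 to !n
        draw(!i) = v(@floor(rnd*!n) + 1)      ' draw one observation with replacement
      next
      tau_boot(!s) = @mean(draw)              ' the statistic of interest, here the mean
    next
    scalar q95 = @quantile(tau_boot, 0.95)    ' the 95th percentile of the bootstrap distribution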

    Within regression analysis, the same steps are involved, although the focus is now on the parameters and the restrictions derived from them. To concretize the analysis, suppose we have the following model:\[y_t=\alpha + \theta_1 y_{t-1} + \theta_2 y_{t-2}+ \theta_3 y_{t-3} +\eta_1 x_{t-1} + \eta_2 x_{t-2}+ \eta_3 x_{t-3} +\epsilon_t\] This equation can be viewed as the y-equation of a bivariate VAR(3) model. A natural exercise in this context is the non-causality test. For this, one can set up the null hypothesis that\[H_0: \eta_1=\eta_2=\eta_3=0\]Of course, in Eviews this is easily carried out. After estimation, click on the View tab and hover over Coefficient Diagnostics. Follow the right arrow and click on Wald Test - Coefficient Restrictions.... You'll be prompted by a dialog box, where you can input the restriction as

    C(5)=C(6)=C(7)=0

    The critical values in this case are based on asymptotic distribution of \(\chi^2\) or \(F\) depending on which distribution is chosen. Indeed, the two of them are related. However, they may not give appropriate answers either because they are asymptotic or because parametric assumptions are made. 
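
    The same test can also be issued from the command line or inside a program; a minimal sketch, assuming the estimated equation object has been named eq01 (a hypothetical name):

    ' Wald test of non-causality from x to y: the coefficients on x(-1), x(-2) and x(-3)
    eq01.wald c(5)=0, c(6)=0, c(7)=0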

    Bootstrapping the regression model

    You can use the bootstrap strategy instead. Using the model above, for example, you can proceed as follows: 

    1. Estimate the model and obtain the residuals; 
    2. De-mean the residuals by subtracting the mean of the residuals. The reason for this is to ensure that the residuals are somewhat centered around zero, the same way that the random errors they represent center around zero. That is, \(\epsilon_t\sim N(0,\sigma^2)\). Let the (centered) residuals be represented as \(\epsilon_t^*\);
    3. Using the centered residuals and conditional on the estimated parameters, reconstruct the model as \[y_t^*=\hat{\alpha} + \hat{\theta}_1 y_{t-1}^* + \hat{\theta}_2 y_{t-2}^*+ \hat{\theta}_3 y_{t-3}^* +\hat{\eta}_1 x_{t-1} + \hat{\eta}_2 x_{t-2}+ \hat{\eta}_3 x_{t-3} +\epsilon_t^*\]where the hats symbolize the values estimated in Step 1. In this particular exercise, the process is carried out recursively because of the lags of the endogenous variable among the regressors, and it is initialized by setting \(y_0^*=y_{-1}^*=y_{-2}^*=0\). In static models, it is sufficient to substitute the estimated coefficients and the exogenous regressors. This is the stage where the (centered) residuals are sampled with replacement;
    4. Using the computed pseudo data for endogenous variable, \(y_t^*\), estimate the model: \[y_t^*=\gamma + \mu_1 y_{t-1}^* + \mu_2 y_{t-2}^*+ \mu_3 y_{t-3}^* +\zeta_1 x_{t-1} + \zeta_2 x_{t-2}+ \zeta_3 x_{t-3} +\xi_t\] 
    5. Set up the restriction, C(5)=C(6)=C(7)=0, test the implied hypothesis, and save the statistic;  
    6. Repeat Steps 3-5 B times. I suggest 999 times. 

    These are the steps involved in a bootstrap simulation. The resulting distribution can then be used to decide whether or not there is a causal effect from variable \(x_t\) to variable \(y_t\). Of course, you can compute various percentiles. For \(\chi^2\) or \(F\), whose domain is positive, the critical values for the 1, 5 and 10 percent levels can be computed by issuing @quantile(wchiq,\(\tau\)), where wchiq is the vector of 999 statistics and \(\tau=0.99\) for 1 percent, \(\tau=0.95\) for 5 percent and \(\tau=0.90\) for 10 percent. What @quantile() internally does is order the elements of the vector and then pick the values corresponding to the percentiles of interest. In the absence of this inbuilt Eviews function, you could therefore manually order the elements from the lowest to the highest and then select the values that correspond to the respective percentile positions.
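
    For the record, here is a minimal sketch of Steps 1 to 6 for the non-causality test above. It uses hypothetical series names y and x, forms the F statistic from a restricted-versus-unrestricted comparison of the sums of squared residuals rather than reading it off the Wald output, and is only meant to illustrate the mechanics, not to reproduce the addin.

    !b = 999                                       ' number of bootstrap replications
    smpl @all
    equation eq_u.ls y c y(-1 to -3) x(-1 to -3)   ' Step 1: estimate the unrestricted model
    eq_u.makeresids res                            ' ... and obtain the residuals
    series res_c = res - @mean(res)                ' Step 2: centre the residuals
    series ystar = y                               ' start-up values for the recursion
    series eb = 0                                  ' placeholder for the resampled residuals
    smpl @first+3 @last                            ' the estimation sample (three lags are lost)
    vector rv = @convert(res_c)                    ' centred residuals as a vector
    !n = @rows(rv)
    vector(!n) rb                                  ' one draw of resampled residuals
    vector(!b) fboot                               ' bootstrap F statistics
    for !s = 1 to !b
      ' Step 3: sample the centred residuals with replacement
      for !i = 1 to !n
        rb(!i) = rv(@floor(rnd*!n) + 1)
      next
      mtos(rb, eb)                                 ' resampled residuals back as a series
      ' ... and rebuild y recursively from the estimated coefficients
      series xpart = eq_u.@coefs(1) + eq_u.@coefs(5)*x(-1) + eq_u.@coefs(6)*x(-2) + eq_u.@coefs(7)*x(-3) + eb
      series ystar = xpart + eq_u.@coefs(2)*ystar(-1) + eq_u.@coefs(3)*ystar(-2) + eq_u.@coefs(4)*ystar(-3)
      ' Steps 4 and 5: re-estimate with and without the restriction and save the F statistic
      equation eqb_u.ls ystar c ystar(-1 to -3) x(-1 to -3)
      equation eqb_r.ls ystar c ystar(-1 to -3)
      fboot(!s) = ((eqb_r.@ssr - eqb_u.@ssr)/3)/(eqb_u.@ssr/(eqb_u.@regobs - eqb_u.@ncoef))
    next
    ' Step 6: the bootstrap critical value at the 5 percent level
    scalar cv05 = @quantile(fboot, 0.95)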

    This addin

    The Eviews addin, which can be downloaded here, carries out all the steps listed above for the LS, ARDL, BREAKLS, THRESHOLD, COINTREG, VARSEL and QREG methods. It works directly with the equation object. This means you first estimate your model as usual and then run the addin on the estimated equation. In what follows, I show how it can be used with two examples.

    An example... 

    I estimate the break least squares model: LD C LD(-1 to -2) LE(-1 to -2). This is with a view to testing the symmetry of the causal effect across regimes. Four regimes are detected using the Bai-Perron L+1 vs L sequentially determined breaks approach. The break dates are 1914Q2, 1937Q1 and 1959Q2. The coefficients of the lagged LE for the first regime are C(4) and C(5); for the second regime they are C(9) and C(10). The third and fourth regimes have C(14) and C(15), and C(19) and C(20), respectively. The restriction we want to test is whether the causal effects are similar across these four regimes, where the causal effect is the sum of the estimated coefficients in each regime. Thus, we test the following hypothesis, the rejection of which will indicate that the causal effects are asymmetric across the regimes:

    C(4)+C(5)=C(9)+C(10)=C(14)+C(15)=C(19)+C(20)

    The model is estimated and the output is in Figure 1. 


    Figure 1
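
    Incidentally, the same joint restriction can be checked asymptotically from the command line before bootstrapping; a minimal sketch, assuming the estimated break least squares equation has been named eq_break (a hypothetical name):

    ' Asymptotic Wald test of equal causal effects across the four regimes
    eq_break.wald c(4)+c(5)=c(9)+c(10), c(9)+c(10)=c(14)+c(15), c(14)+c(15)=c(19)+c(20)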

    After estimating the model, you can access the addin from the Proc tab, where you can locate Add-ins. Follow the right arrow to locate Bootstrap MC Restriction Simulation, as shown in Figure 2. If you have other addins for the equation object, they'll be listed here.

    Figure 2

    In Figure 3, the Bootstrap Restrictions dialog box shows up, and you can interact with its options and prompts. In the edit box, you input the restrictions just as you would in the Eviews Wald test environment. Four options are listed under Bootstrap Monte Carlos, with Bootstrap as the default. The number of iterations can also be selected; there are three choices: 99 (good for a quick look at preliminary results), 499 and 999. If you want the graph generated, check Distribution graph. Lastly, a follow me @ olayeniolaolu.blogspot.com prompt also stares at you with 💗.

    Figure 3

    In Figure 4, I input the restriction discussed above and also check Distribution graph because I want one (who would not want that?). 

    Figure 4

    The result is the graph reported in Figure 5. The computed value is in the acceptance region, so we cannot reject the null hypothesis that the causal effects are the same across the four regimes. 
    Figure 5

    Another example...

    Consider an ARDL(3,2) model. The model is given by \[ld_t=\alpha+\sum_{j=1}^3\theta_j ld_{t-j}+\sum_{j=0}^2\eta_j le_{t-j}+\xi_t\]This model can be reparameterized as \[ld_t=\gamma+\beta le_t +\sum_{j=1}^3\theta_j ld_{t-j}+\sum_{j=0}^1\eta_j \Delta le_{t-j}+\xi_t\]The long-run relationship is then stated as\[ld_t=\mu +\varphi le_t+u_t\]where \(\mu=\gamma(1-\sum_{j=1}^3\theta_j)^{-1}\) and \(\varphi=\beta(1-\sum_{j=1}^3\theta_j)^{-1}\).

    To estimate this model, I use the reparameterized version and then input the following expression using the LS (note!) method:

    ld c le ld(-1 to -3) d(le) d(le(-1))

    From the same equation output, we want to compute the distributions for the long-run parameters \(\mu\) and \(\varphi\). For the intercept (\(\mu\)), I input the following restriction

    C(1)/(1-C(3)-C(4)-C(5))=0

    while for the slope (\(\varphi\)), I input the following restriction

    C(2)/(1-C(3)-C(4)-C(5))=0
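
    These two ratios are, of course, also the long-run point estimates themselves. As a quick check, they can be recovered directly from the estimated coefficients; a minimal sketch, assuming the estimated equation has been named eq_ardl (a hypothetical name):

    ' Long-run intercept and slope implied by the estimated conditional ARDL
    scalar mu_hat = eq_ardl.@coefs(1)/(1 - eq_ardl.@coefs(3) - eq_ardl.@coefs(4) - eq_ardl.@coefs(5))
    scalar phi_hat = eq_ardl.@coefs(2)/(1 - eq_ardl.@coefs(3) - eq_ardl.@coefs(4) - eq_ardl.@coefs(5))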

    Although the addin does not plot the graphs for these distributions, the vectors of their respective values are generated and stored in the workfile, so you can work on them as you wish. In this example, I report the distributions of the two estimates of the long-run coefficients in Figures 6 and 7.

    Figure 6


    Figure 7
    You can use these estimates for bias correction if there are reasons to suspect overestimation or underestimation of the intercept or slope.

    Working with Output

    Apart from plotting the graphs, which you can present in your work, the addin gives you access to the simulated results in vectors. The three vectors that you will have after testing your restriction are bootcoef##, boottvalue## and f_bootwald##, where ## indicates the precedence number appended to every instance of these objects generated after the first time. The first is the vector of coefficient estimates if only one restriction is involved (the hypothesis involves only one restriction if there is only one equality (=) sign in it); boottvalue is the vector of the corresponding t-values in this case; and f_bootwald refers to the F statistic for a hypothesis having more than one restriction, that is, the joint test. You can use these for specific, tailor-made analysis in your research. Let me take you through one that may interest you: can we compute the bootstrap confidence interval for the restriction tested previously? I suggest you do it for the restricted coefficient. To do this, you need both the mean value and the standard deviation of the distribution, the latter depending on the former. But you don't need to work these out by hand, because Eviews has inbuilt routines that deliver a good number of statistics for everyday use. For example, you can compute the standard deviation directly using the following: 

    scalar sdev=@stdev(bootcoef)

    This routine computes the standard deviation given by:\[\hat{SE}(\hat{\tau}^b)=\left[\frac{1}{B-1}\sum_{b=1}^B (\hat{\tau}^b-\bar{\hat{\tau}}^b)^2\right]^{1/2}\]where the mean value is given by\[\bar{\hat{\tau}}^b=\frac{1}{B}\sum_{b=1}^B \hat{\tau}^b\]The following code snippet computes the bootstrap confidence interval:

    scalar meanv=@mean(bootcoef)              ' bootstrap mean of the coefficient
    scalar sdev=@stdev(bootcoef)              ' bootstrap standard deviation
    scalar z_u=@quantile(bootcoef, 0.025)     ' 2.5% quantile of the bootstrap distribution
    scalar z_l=@quantile(bootcoef, 0.975)     ' 97.5% quantile of the bootstrap distribution
    scalar lowerbound=meanv-z_l*sdev 
    scalar upperbound=meanv-z_u*sdev 

    Using this code, you will find the lower bound to be 0.208 and the upper bound to be 0.466, while the mean value will be 0.439. 

    Perhaps you are interested in a further robustness check. The bootstrap-t confidence interval can be computed using the following code snippet:

    scalar meanv=@mean(bootcoef)
    scalar sdev=@stdev(bootcoef) 
    vector zboot=(bootcoef-meanv)/sdev        ' studentized bootstrap values
    scalar t_u=@quantile(zboot, 0.025)        ' 2.5% quantile of the studentized values
    scalar t_l=@quantile(zboot, 0.975)        ' 97.5% quantile of the studentized values
    scalar lowerbound=meanv-t_l*sdev 
    scalar upperbound=meanv-t_u*sdev 

    For more Eviews statistical routines that you can use for specific analysis, look them up here. Again, this addin can be accessed here. The data used can be accessed here as well.

    Glad that you've followed me to this point. 







