- 1
- Ahn, S. C. 1995. Robust GMM Tests for Model Specification. Arizona State University (Working Paper).

Total in-text references: 1- In-text reference with the coordinate start=47525
- Prefix
- For excluded instruments, this is equivalent to dropping them from the instrument list. For included instruments, the $C$ test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The $C$ test, 14 See
- Exact
- Ahn (1995),
- Suffix
- Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2. 15 See Hayashi (2000), pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are pro

- In-text reference with the coordinate start=47525
- 2
- Arellano, M. 1987. Computing robust standard errors for within–groups estimators.

Total in-text references: 1- In-text reference with the coordinate start=21869
- Prefix
- 0 $\hat{\Sigma}_M$ (36) then an estimator of $S$ that is consistent in the presence of arbitrary intra–cluster correlation is $\hat{S} = \frac{1}{n}(Z'\hat{\Omega}_C Z)$ (37) The earliest reference to this approach to robust estimation in the presence of clustering of which we are aware is White (1984), pp. 135–6. It is commonly employed in the context of panel data estimation; see Wooldridge (2002), p. 193,
- Exact
- Arellano (1987) and
- Suffix
- Kézdi (2002). It is the standard Stata approach to clustering, implemented in, e.g., robust, regress and ivreg2.4 The cluster–robust covariance matrix for IV estimation is obtained exactly as in the preceding subsection except using $\hat{S}$ as defined in Equation (37).

- In-text reference with the coordinate start=21869
- 4
- Basmann, R. 1960. On finite sample distributions of generalized classical linear identifiability test statistics. Journal of the American Statistical Association 55(292): 650–659.

Total in-text references: 2- In-text reference with the coordinate start=5138
- Prefix
- We may cast some light on whether the instruments satisfy the orthogonality conditions in the context of an overidentified model: that is, one in which a surfeit of instruments are available. In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to
- Exact
- Sargan (1958), Basmann (1960) and,
- Suffix
- in the GMM context, L. Hansen (1982), and show how the generalization of this test, the $C$ or “difference–in–Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non–orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis-à-vis OLS.

- In-text reference with the coordinate start=42832
- Prefix
- The literature contains several variations on this test. The main idea behind these variations is that there is more than one way to consistently estimate the variance in the denominator of (43). The most important of these is that of
- Exact
- Basmann (1960).
- Suffix
- Independently of Sargan, Basmann proposed an $F(L-K,\, n-L)$ test of overidentifying restrictions: Basmann’s $F$-statistic $= \dfrac{\hat{u}'P_Z\hat{u}/(L-K)}{\hat{u}'M_Z\hat{u}/(n-L)}$ (44) where $M_Z \equiv I - P_Z$ is the “annihilator” matrix and $L$ is the total number of instruments.
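As a rough illustration of equation (44), the sketch below computes Basmann's pseudo-F from IV residuals and, alongside it, the usual Sargan form n·û′P_Z û / û′û. It is a minimal NumPy sketch under stated assumptions, not the `overid` implementation: the function name `overid_stats` and the simulated data are hypothetical, and homoskedastic errors are assumed.

```python
import numpy as np

def overid_stats(y, X, Z):
    """Sargan statistic and Basmann pseudo-F for an overidentified IV regression.

    X is n x K (regressors), Z is n x L (all instruments), with L > K.
    """
    n, K = X.shape
    L = Z.shape[1]
    if L <= K:
        raise ValueError("the equation must be overidentified (L > K)")
    # IV estimate via first-stage fitted values X_hat = P_Z X
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    beta = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    u = y - X @ beta                                  # IV residuals
    Pu = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]     # P_Z u
    explained = u @ Pu                                # u' P_Z u
    unexplained = u @ u - explained                   # u' M_Z u
    sargan = n * explained / (u @ u)
    basmann_f = (explained / (L - K)) / (unexplained / (n - L))
    return sargan, basmann_f

# Hypothetical simulated data: one endogenous regressor, two excluded instruments
rng = np.random.default_rng(0)
n = 500
Z1 = rng.normal(size=(n, 2))
u0 = rng.normal(size=n)
x1 = Z1 @ np.array([1.0, 0.5]) + 0.8 * u0 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + u0
X = np.column_stack([x1, np.ones(n)])
Z = np.column_stack([Z1, np.ones(n)])

sargan, basmann_f = overid_stats(y, X, Z)
```

Both statistics are built from the same quantity û′P_Z û; they differ only in how the error variance in the denominator is estimated.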

- In-text reference with the coordinate start=5138
- 5
- Bound, J., D. A. Jaeger, and R. Baker. 1995. Problems with instrumental variables estimation when the correlation between the instruments and the endogeneous explanatory variable is weak. Journal of the American Statistical Association 90: 443–450.

Total in-text references: 1- In-text reference with the coordinate start=33642
- Prefix
- The first stage regressions are reduced form regressions of the endogenous variables $X_1$ on the full set of instruments $Z$; the relevant test statistics here relate to the explanatory power of the excluded instruments $Z_1$ in these regressions. A statistic commonly used, as recommended e.g., by
- Exact
- Bound et al. (1995),
- Suffix
- is the $R^2$ of the first–stage regression with the included instruments “partialled-out”.8 Alternatively, this may be expressed as the $F$–test of the joint significance of the $Z_1$ instruments in the first–stage regression.
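The link between the partial R² and the first-stage F-test can be sketched as follows. This is an illustrative NumPy sketch for a single endogenous regressor with the classical (non-robust) F formula; the helper names `residualize` and `first_stage_diagnostics` and the simulated data are assumptions made for the example.

```python
import numpy as np

def residualize(A, B):
    """Residuals from regressing A (vector or matrix) on the columns of B."""
    return A - B @ np.linalg.lstsq(B, A, rcond=None)[0]

def first_stage_diagnostics(x1, Z1, Z2):
    """Partial R^2 of the excluded instruments Z1 and the matching F statistic.

    x1: endogenous regressor (n,), Z1: excluded instruments (n x L1),
    Z2: included instruments, constant included (n x L2).
    """
    n = x1.shape[0]
    L1 = Z1.shape[1]
    L = L1 + Z2.shape[1]
    # Partial out the included instruments from x1 and from Z1
    x_t = residualize(x1, Z2)
    Z1_t = residualize(Z1, Z2)
    resid = residualize(x_t, Z1_t)
    partial_r2 = 1.0 - (resid @ resid) / (x_t @ x_t)
    # F-test of joint significance of Z1, written in terms of the partial R^2
    f_stat = (partial_r2 / L1) / ((1.0 - partial_r2) / (n - L))
    return partial_r2, f_stat

# Hypothetical simulated first stage
rng = np.random.default_rng(1)
n = 400
Z1 = rng.normal(size=(n, 2))
Z2 = np.column_stack([rng.normal(size=n), np.ones(n)])
x1 = Z1 @ np.array([0.6, 0.3]) + 0.5 * Z2[:, 0] + rng.normal(size=n)

partial_r2, f_stat = first_stage_diagnostics(x1, Z1, Z2)
```

The same F value results from comparing the residual sums of squares of the first stage with and without Z1, which is why the two diagnostics carry the same information.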

- In-text reference with the coordinate start=33642
- 6
- Bowden, R. J. and D. A. Turkington. 1984. Instrumental Variables. Cambridge: Cambridge University Press.

Total in-text references: 1- In-text reference with the coordinate start=56953
- Prefix
- The Hausman statistic for the endogeneity test can also be expressed in terms of a test of the coefficients of the endogenous regressors alone and the rest of the $\beta$s removed. In this alternate form, the matrix difference in the expression equivalent to (47) is positive definite and a generalized inverse is not required. See
- Exact
- Bowden and Turkington (1984),
- Suffix
- pp. 50–51. This version of the statistic was proposed separately by Wu (1973) (his $T_3$ statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the (undocumented) sigmaless option.

- In-text reference with the coordinate start=56953
- 7
- Breusch, T. S. and A. R. Pagan. 1979. A simple test for heteroskedasticity and random coefficient variation. Econometrica 47: 1287–1294.

Total in-text references: 1- In-text reference with the coordinate start=27271
- Prefix
- We describe this test in the next section. 3 Testing for heteroskedasticity The Breusch–Pagan/Godfrey/Cook–Weisberg and White/Koenker statistics are standard tests of the presence of heteroskedasticity in an OLS regression. The principle is to test for a relationship between the residuals of the regression and $p$ indicator variables that are hypothesized to be related to the heteroskedasticity.
- Exact
- Breusch and Pagan (1979), Godfrey (1978), and Cook and Weisberg (1983)
- Suffix
- separately derived the same test statistic. This statistic is distributed as $\chi^2$ with $p$ degrees of freedom under the null of no heteroskedasticity, and under the maintained hypothesis that the error of the regression is normally distributed.
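One common way to compute this statistic is as one half of the explained sum of squares from regressing the scaled squared OLS residuals on a constant and the p indicator variables. The NumPy sketch below follows that textbook form; the function name `breusch_pagan` and the simulated data are assumptions for illustration, and the normality assumption stated above is maintained.

```python
import numpy as np

def breusch_pagan(y, X, W):
    """Breusch-Pagan statistic: ESS/2 from regressing u_i^2 / sigma^2 on [1, W].

    X: OLS regressors including a constant (n x K).
    W: the p indicator variables, without a constant (n x p).
    Under the null the statistic is asymptotically chi-squared with p d.f.
    """
    n = y.shape[0]
    u = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]    # OLS residuals
    g = u**2 / (u @ u / n)                              # scaled squared residuals, mean 1
    A = np.column_stack([np.ones(n), W])
    fitted = A @ np.linalg.lstsq(A, g, rcond=None)[0]
    ess = np.sum((fitted - g.mean()) ** 2)              # explained sum of squares
    return 0.5 * ess

# Hypothetical data whose error variance rises with |x|
rng = np.random.default_rng(2)
n = 300
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n) * (1.0 + np.abs(x))
X = np.column_stack([np.ones(n), x])
W = np.abs(x)[:, None]

bp_stat = breusch_pagan(y, X, W)
```

Because the squared residuals are divided by their own mean, the statistic does not change when y is rescaled.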

- In-text reference with the coordinate start=27271
- 8
- Chamberlain, G. 1982. Multivariate regression models for panel data. Journal of Econometrics 18: 5–46.

Total in-text references: 1- In-text reference with the coordinate start=24741
- Prefix
- but correct inference is still possible through the use of the Eicker–Huber–White “sandwich” robust covariance estimator, and this estimator can also be derived using the general formula for the asymptotic variance of a GMM estimator with a sub–optimal weighting matrix, Equation (24). A natural question is whether a more efficient GMM estimator exists, and the answer is “yes”
- Exact
- (Chamberlain (1982), Cragg (1983)).
- Suffix
- If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of Cragg (1983), dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600.

- In-text reference with the coordinate start=24741
- 9
- Cook, R. D. and S. Weisberg. 1983. Diagnostics for heteroscedasticity in regression. Biometrika 70: 1–10.

Total in-text references: 1- In-text reference with the coordinate start=27271
- Prefix
- We describe this test in the next section. 3 Testing for heteroskedasticity The Breusch–Pagan/Godfrey/Cook–Weisberg and White/Koenker statistics are standard tests of the presence of heteroskedasticity in an OLS regression. The principle is to test for a relationship between the residuals of the regression and $p$ indicator variables that are hypothesized to be related to the heteroskedasticity.
- Exact
- Breusch and Pagan (1979), Godfrey (1978), and Cook and Weisberg (1983)
- Suffix
- separately derived the same test statistic. This statistic is distributed as $\chi^2$ with $p$ degrees of freedom under the null of no heteroskedasticity, and under the maintained hypothesis that the error of the regression is normally distributed.

- In-text reference with the coordinate start=27271
- 10
- Cragg, J. 1983. More efficient estimation in the presence of heteroskedasticity of unknown form. Econometrica 51: 751–763.

Total in-text references: 2- In-text reference with the coordinate start=24741
- Prefix
- but correct inference is still possible through the use of the Eicker–Huber–White “sandwich” robust covariance estimator, and this estimator can also be derived using the general formula for the asymptotic variance of a GMM estimator with a sub–optimal weighting matrix, Equation (24). A natural question is whether a more efficient GMM estimator exists, and the answer is “yes”
- Exact
- (Chamberlain (1982), Cragg (1983)).
- Suffix
- If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of Cragg (1983), dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600.

- In-text reference with the coordinate start=25077
- Prefix
- If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of
- Exact
- Cragg (1983),
- Suffix
- dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600. It can be obtained in precisely the same way as feasible efficient two–step GMM except now the first–step inefficient but consistent estimator used to generate the residuals is OLS rather than IV.

- In-text reference with the coordinate start=24741
- 11
- Cumby, R. E., J. Huizinga, and M. Obstfeld. 1983. Two-step two-stage least squares estimation in models with rational expectations. Journal of Econometrics 21: 333–355.

Total in-text references: 1- In-text reference with the coordinate start=15183
- Prefix
- This yields $\hat{\beta}_{EGMM} = (X'Z(Z'\hat{\Omega}Z)^{-1}Z'X)^{-1}X'Z(Z'\hat{\Omega}Z)^{-1}Z'y$ (29) with asymptotic variance $V(\hat{\beta}_{EGMM}) = (X'Z(Z'\hat{\Omega}Z)^{-1}Z'X)^{-1}$ (30) 1 This estimator goes under various names: “2-stage instrumental variables” (2SIV), White (1982); “2-step 2-stage least squares”,
- Exact
- Cumby et al. (1983);
- Suffix
- “heteroskedastic 2-stage least squares” (H2SLS); Davidson and MacKinnon (1993), p. 599. A variety of other feasible GMM procedures are also possible. For example, the procedure above can be iterated by obtaining the residuals from the two–step GMM estimator, using these to calculate a new $\hat{S}$, using this in turn to calculate the three–step feasible efficient GMM estimator, and so forth,
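The two-step procedure behind equations (29) and (30) can be sketched in a few lines. The sketch assumes heteroskedasticity of unknown form, so that Ω̂ is the diagonal matrix of squared first-step IV residuals; the function name `two_step_gmm` and the simulated data are hypothetical, and the returned matrix is the expression in (30) as written, with no further scaling applied.

```python
import numpy as np

def two_step_gmm(y, X, Z):
    """Feasible efficient two-step GMM, following equations (29) and (30)."""
    # Step 1: inefficient but consistent IV estimate, used only for residuals
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    beta_iv = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    u = y - X @ beta_iv
    # Z' Omega_hat Z with Omega_hat = diag(u_i^2), built without an n x n matrix
    ZOZ = (Z * (u**2)[:, None]).T @ Z
    # Step 2: plug the estimated weighting matrix into (29)
    ZX = Z.T @ X
    Zy = Z.T @ y
    A = ZX.T @ np.linalg.solve(ZOZ, ZX)     # X'Z (Z' Omega Z)^{-1} Z'X
    b = ZX.T @ np.linalg.solve(ZOZ, Zy)     # X'Z (Z' Omega Z)^{-1} Z'y
    beta = np.linalg.solve(A, b)
    V = np.linalg.inv(A)                    # equation (30)
    return beta, V

# Hypothetical heteroskedastic data: one endogenous regressor, two excluded instruments
rng = np.random.default_rng(3)
n = 500
Z1 = rng.normal(size=(n, 2))
e = rng.normal(size=n) * (1.0 + np.abs(Z1[:, 0]))
x1 = Z1 @ np.array([1.0, 0.5]) + 0.5 * e + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + e
X = np.column_stack([x1, np.ones(n)])
Z = np.column_stack([Z1, np.ones(n)])

beta_gmm, V_gmm = two_step_gmm(y, X, Z)
```

Iterating the procedure amounts to recomputing `u` from `beta` and repeating step 2. In an exactly identified equation the weighting matrix drops out and the result coincides with the IV estimate.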

- In-text reference with the coordinate start=15183
- 12
- Davidson, R. and J. G. MacKinnon. 1993. Estimation and Inference in Econometrics. 2nd ed. New York: Oxford University Press.

Total in-text references: 10- In-text reference with the coordinate start=7042
- Prefix
- There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8;
- Exact
- Davidson and MacKinnon (1993), and Greene (2000).
- Suffix
- We begin with the standard IV estimator, and then relate it to the GMM framework. We then consider the issue of clustered errors, and finally turn to OLS. 2.1 The method of instrumental variables The equation to be estimated is, in matrix notation, $y = X\beta + u$, $E(uu') = \Omega$ (1) with typical row $y_i = X_i\beta + u_i$ (2) The matrix of regressors $X$ is $n \times K$, where $n$ is the number of observations.

- In-text reference with the coordinate start=9348
- Prefix
- The instrumental variables estimator of $\beta$ is $\hat{\beta}_{IV} = (X'Z(Z'Z)^{-1}Z'X)^{-1}X'Z(Z'Z)^{-1}Z'y = (X'P_ZX)^{-1}X'P_Zy$ (8) This estimator goes under a variety of names: the instrumental variables (IV) estimator, the generalized instrumental variables estimator (GIVE), or the two-stage least squares (2SLS) estimator, the last reflecting the fact that the estimator can be calculated in a two–step procedure. We follow
- Exact
- Davidson and MacKinnon (1993),
- Suffix
- p. 220 and refer to it as the IV estimator rather than 2SLS because the basic idea of instrumenting is central, and because it can be (and in Stata, is more naturally) calculated in one step as well as in two.
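The remark that the estimator can be calculated in one step as well as in two can be checked numerically. The NumPy sketch below computes the one-step formula from equation (8) and the two-step route (first-stage fitted values, then OLS); the function name `iv_estimate` and the simulated data are assumptions for illustration.

```python
import numpy as np

def iv_estimate(y, X, Z):
    """One-step IV estimator (X' P_Z X)^{-1} X' P_Z y, without forming P_Z."""
    # P_Z X: fitted values from regressing each column of X on Z
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # X_hat' X = X' P_Z X and X_hat' y = X' P_Z y because P_Z is symmetric
    return np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

# Hypothetical simulated data: one endogenous regressor, two excluded instruments
rng = np.random.default_rng(0)
n = 500
Z1 = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x1 = Z1 @ np.array([1.0, 0.5]) + 0.8 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + u
X = np.column_stack([x1, np.ones(n)])
Z = np.column_stack([Z1, np.ones(n)])

beta_one_step = iv_estimate(y, X, Z)

# Two-step route: replace X by its first-stage fitted values, then run OLS
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta_two_step = np.linalg.lstsq(X_hat, y, rcond=None)[0]
```

The coefficients agree, but the residuals from the second-stage OLS regression are not the IV residuals, so standard errors should be based on y − Xβ̂ rather than on the second-stage output.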

- In-text reference with the coordinate start=15253
- Prefix
- This yields $\hat{\beta}_{EGMM} = (X'Z(Z'\hat{\Omega}Z)^{-1}Z'X)^{-1}X'Z(Z'\hat{\Omega}Z)^{-1}Z'y$ (29) with asymptotic variance $V(\hat{\beta}_{EGMM}) = (X'Z(Z'\hat{\Omega}Z)^{-1}Z'X)^{-1}$ (30) 1 This estimator goes under various names: “2-stage instrumental variables” (2SIV), White (1982); “2-step 2-stage least squares”, Cumby et al. (1983); “heteroskedastic 2-stage least squares” (H2SLS);
- Exact
- Davidson and MacKinnon (1993),
- Suffix
- p. 599. A variety of other feasible GMM procedures are also possible. For example, the procedure above can be iterated by obtaining the residuals from the two–step GMM estimator, using these to calculate a new $\hat{S}$, using this in turn to calculate the three–step feasible efficient GMM estimator, and so forth, for as long as the user wishes or until the estimator converges; this is the “i

- In-text reference with the coordinate start=21329
- Prefix
- In effect, under conditional homoskedasticity, the continuously updated GMM estimator is the LIML estimator. Calculating the LIML estimator does not require numerical optimization methods; it can be calculated as the solution to an eigenvalue problem (see, e.g.,
- Exact
- Davidson and MacKinnon (1993),
- Suffix
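- pp. 644–51). define $\hat{\Omega}_C$ as the block–diagonal form $\hat{\Omega}_C = \begin{pmatrix} \hat{\Sigma}_1 & & & & 0 \\ & \ddots & & & \\ & & \hat{\Sigma}_m & & \\ & & & \ddots & \\ 0 & & & & \hat{\Sigma}_M \end{pmatrix}$ (36) then an estimator of $S$ that is consistent in the presence of arbitrary intra–cluster correlation is $\hat{S} = \frac{1}{n}(Z'\hat{\Omega}_C Z)$ (37) The earliest reference to this approach to robust estimation in the presence of clustering of which we are aware is White (1984), pp. 135–6.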

- In-text reference with the coordinate start=25130
- Prefix
- If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of Cragg (1983), dubbed “heteroskedastic OLS” (HOLS) by
- Exact
- Davidson and MacKinnon (1993),
- Suffix
- p. 600. It can be obtained in precisely the same way as feasible efficient two–step GMM except now the first–step inefficient but consistent estimator used to generate the residuals is OLS rather than IV.

- In-text reference with the coordinate start=40896
- Prefix
- This can be quite important in practice: Hoxby and Paserman (1998) have shown that the presence of intra–cluster correlation can readily cause a standard overidentification statistic to over–reject the null. 11 Thus
- Exact
- Davidson and MacKinnon (1993),
- Suffix
- p. 236: “Tests of overidentifying restrictions should be calculated routinely whenever one computes IV estimates.” Sargan’s own view, cited in Godfrey (1988), p. 145, was that regression analysis without testing the orthogonality assumptions is a “pious fraud”. 4.3 Overidentifying restrictions in IV In the special case of linear instrumental variables under conditional heteroskedasticity, t

- In-text reference with the coordinate start=44076
- Prefix
- Consequently, overid calculates the uncentered $R^2$ itself; the uncentered total sum of squares of the auxiliary regression needed for the denominator of $R^2_u$ is simply the residual sum of squares of the original IV regression. 13 See
- Exact
- Davidson and MacKinnon (1993),
- Suffix
- pp. 235–6. The Basmann statistic uses the error variance from the estimate of their equation (7.54), and the pseudo–$F$ form of the Basmann statistic is given by equation (7.55); the Sargan statistic is given by their (7.57).

- In-text reference with the coordinate start=69463
- Prefix
- A $t$-test of the significance of $\hat{v}$ in this auxiliary regression is then a direct test of the null hypothesis (in this context, that $\theta = 0$): $y = \beta_1x_1 + X_2\beta_2 + \theta\hat{v} + \varepsilon$ (51) 24 A more detailed presentation of the test can be found in
- Exact
- Davidson and MacKinnon (1993),
- Suffix
- pp. 237–42. The Wu–Hausman test may be readily generalized to multiple endogenous variables, since it merely requires the estimation of the first–stage regression for each of the endogenous variables, and augmentation of the original model with their residual series.
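The auxiliary-regression form of the test in equation (51) can be sketched for a single endogenous regressor as follows. This is a minimal NumPy sketch assuming conditional homoskedasticity and a conventional OLS t-statistic; the function name `wu_hausman_t` and the simulated data are hypothetical and do not reproduce the `ivendog` implementation.

```python
import numpy as np

def wu_hausman_t(y, x1, X2, Z1):
    """t-statistic on the first-stage residual in the augmented regression (51).

    x1: single endogenous regressor (n,), X2: exogenous regressors with constant,
    Z1: excluded instruments.
    """
    n = y.shape[0]
    Z = np.column_stack([Z1, X2])
    # First stage: residual of x1 after projecting on all instruments
    v_hat = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
    # Augmented regression y = b1*x1 + X2*b2 + theta*v_hat + error, by OLS
    A = np.column_stack([x1, X2, v_hat])
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    resid = y - A @ coef
    s2 = resid @ resid / (n - A.shape[1])
    cov = s2 * np.linalg.inv(A.T @ A)
    t_stat = coef[-1] / np.sqrt(cov[-1, -1])
    return coef, t_stat

# Hypothetical data in which x1 is correlated with the error
rng = np.random.default_rng(4)
n = 500
Z1 = rng.normal(size=(n, 2))
X2 = np.ones((n, 1))
e = rng.normal(size=n)
x1 = Z1 @ np.array([1.0, 0.5]) + 0.8 * e + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + e

coef, t_stat = wu_hausman_t(y, x1, X2, Z1)
```

A useful by-product is that the coefficients on x1 and X2 in the augmented regression equal the IV estimates, since the first-stage residual is orthogonal to the instruments. With several endogenous regressors, one residual series per regressor would be added and the single t-test replaced by a joint test.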

- In-text reference with the coordinate start=70420
- Prefix
- Wu (1974)’s Monte Carlo studies also suggest that this statistic is to be preferred to the statistic using just $\hat{\sigma}^2_{IV}$. A version of the Wu–Hausman statistic for testing a subset of regressors is also available, as
- Exact
- Davidson and MacKinnon (1993),
- Suffix
- pp. 241–242 point out. The modified test involves estimating the first–stage regression for each of the $K_{1B}$ variables in $X_{1B}$ in order to generate a residual series. These residual series $\hat{V}_B$ are then used to augment the original model: $y = X_{1A}\delta + X_{1B}\lambda + X_2\beta + \hat{V}_B\Theta + \varepsilon$ (52) which is then estimated via instrumental variables, with only $X_{1A}$ specified as included endogenous variables.

- In-text reference with the coordinate start=71308
- Prefix
- An inconvenient complication here is that an ordinary $F$-test for the significance of $\Theta$ in this auxiliary regression will not be valid, because the unrestricted sum of squares needed for the denominator is wrong, and obtaining the correct SSR requires further steps (see
- Exact
- Davidson and MacKinnon (1993),
- Suffix
- chapter 7). Only in the special case where the efficient estimator is OLS will an ordinary $F$-test yield the correct test statistic. The auxiliary regression approach to obtaining the Wu–Hausman statistic described above has the further disadvantage of being computationally expensive and practically cumbersome when there are more than a few endogenous variables to be test

- In-text reference with the coordinate start=7042
- 13
- Durbin, J. 1954. Errors in variables. Review of the International Statistical Institute 22: 23–32.

Total in-text references: 1- In-text reference with the coordinate start=55780
- Prefix
- If a common estimate of $\sigma$ is used, then the generalized inverse of $D$ is guaranteed to exist and a positive test statistic is guaranteed.19 If the Hausman statistic is formed using the OLS estimate of the error variance, then the $D$ matrix in Equation (45) becomes $D = \hat{\sigma}^2_{OLS}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (47) This version of the endogeneity test was first proposed by
- Exact
- Durbin (1954) and
- Suffix
- separately by Wu (1973) (his $T_4$ statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the $D$ matrix becomes $D = \hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) 18 Readers should also bear in mind here and below that the estimat

- In-text reference with the coordinate start=55780
- 14
- Eichenbaum, M. S., L. P. Hansen, and K. J. Singleton. 1988. A time series analysis of representative agent models of consumption and leisure. Quarterly Journal of Economics 103(1): 51–78.

Total in-text references: 1- In-text reference with the coordinate start=46676
- Prefix
- In these contexts, a “difference–in–Sargan” statistic may usefully be employed.15 The test is known under other names as well, e.g., Ruud (2000) calls it the “distance difference” statistic, and Hayashi (2000) follows
- Exact
- Eichenbaum et al. (1988) and
- Suffix
- dubs it the $C$ statistic; we will use the latter term. The $C$ test allows us to test a subset of the original set of orthogonality conditions. The statistic is computed as the difference between two Sargan statistics (or, for efficient GMM, two $J$ statistics): that for the (restricted, fully efficient) regression using the entire set of overidentifying restrictions, versus that for the (unres

- In-text reference with the coordinate start=46676
- 15
- Godfrey, L. G. 1978. Testing for multiplicative heteroskedasticity. Journal of Econometrics 8: 227–236. Godfrey, L. G. 1988. Misspecification tests in econometrics: The Lagrange multiplier principle and other approaches. Cambridge: Cambridge University Press. Godfrey, L. G. 1999. Instrument relevance in multivariate linear models. Review of Economics &

Total in-text references: 1- In-text reference with the coordinate start=27271
- Prefix
- We describe this test in the next section. 3 Testing for heteroskedasticity The Breusch–Pagan/Godfrey/Cook–Weisberg and White/Koenker statistics are standard tests of the presence of heteroskedasticity in an OLS regression. The principle is to test for a relationship between the residuals of the regression and $p$ indicator variables that are hypothesized to be related to the heteroskedasticity.
- Exact
- Breusch and Pagan (1979), Godfrey (1978), and Cook and Weisberg (1983)
- Suffix
- separately derived the same test statistic. This statistic is distributed as $\chi^2$ with $p$ degrees of freedom under the null of no heteroskedasticity, and under the maintained hypothesis that the error of the regression is normally distributed.

- In-text reference with the coordinate start=27271
- 16
- Greene, W. H. 2000. Econometric Analysis. 4th ed. Upper Saddle River, NJ: Prentice–Hall.

Total in-text references: 3- In-text reference with the coordinate start=7042
- Prefix
- There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8;
- Exact
- Davidson and MacKinnon (1993), and Greene (2000).
- Suffix
- We begin with the standard IV estimator, and then relate it to the GMM framework. We then consider the issue of clustered errors, and finally turn to OLS. 2.1 The method of instrumental variables The equation to be estimated is, in matrix notation, $y = X\beta + u$, $E(uu') = \Omega$ (1) with typical row $y_i = X_i\beta + u_i$ (2) The matrix of regressors $X$ is $n \times K$, where $n$ is the number of observations.

- In-text reference with the coordinate start=10444
- Prefix
- n (15) we obtain the estimated asymptotic variance–covariance matrix of the IV estimator: $V(\hat{\beta}_{IV}) = \hat{\sigma}^2(X'Z(Z'Z)^{-1}Z'X)^{-1} = \hat{\sigma}^2(X'P_ZX)^{-1}$ (16) Note that some packages, including Stata’s ivreg, include a degrees–of–freedom correction to the estimate of $\hat{\sigma}^2$ by replacing $n$ with $n-L$. This correction is not necessary, however, since the estimate of $\hat{\sigma}^2$ would not be unbiased anyway
- Exact
- (Greene (2000),
- Suffix
- p. 373). Our ivreg2 routine defaults to the large–sample formulas for the estimated error variance and covariance matrix; the user can request the small–sample versions with the option small. 2.2 The Generalized Method of Moments The standard IV estimator is a special case of a Generalized Method of Moments (GMM) estimator.

- In-text reference with the coordinate start=56482
- Prefix
- $\hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) 18 Readers should also bear in mind here and below that the estimates of the error variances may or may not have small-sample corrections, according to the estimation package used and the options chosen. If one of the variance-covariance matrices in $D$ uses a small-sample correction, then so should the other. 19 The matrix difference in (47) and (48) has rank $K_1$; see
- Exact
- Greene (2000),
- Suffix
- pp. 384–385. Intuitively, the variables being tested are those not shared by $X$ and $Z$, namely the $K_1$ endogenous regressors $X_1$. The Hausman statistic for the endogeneity test can also be expressed in terms of a test of the coefficients of the endogenous regressors alone and the rest of the $\beta$s removed.

- In-text reference with the coordinate start=7042
- 17
- Hahn, J. and J. Hausman. 2002a. A new specification test for the validity of instrumental variables. Econometrica 70(1): 163–89. Hahn, J. and J. Hausman. 2002b. Notes on bias in estimators for simultaneous equation models. Economics Letters 75(2): 237–41.

Total in-text references: 1- In-text reference with the coordinate start=37444
- Prefix
- Since the size of the IV bias is increasing in the number of instruments (Hahn and Hausman (2002b)), one recommendation when faced with this problem is to be parsimonious in the choice of instruments. For further discussion see, e.g.,
- Exact
- Staiger and Stock (1997), Hahn and Hausman (2002a), Hahn and Hausman (2002b), and
- Suffix
- the references cited therein. 9 The Shea partial $R^2$ statistic may be easily computed according to the simplification presented in Godfrey (1999), who demonstrates that Shea’s statistic for endogenous regressor $i$ may be expressed as $R^2_p = \dfrac{\nu^{OLS}_{i,i}}{\nu^{IV}_{i,i}}\left[\dfrac{(1-R^2_{IV})}{(1-R^2_{OLS})}\right]$ where $\nu_{i,i}$ is the estimated asymptotic variance of the coefficient. 10 One approach in the literature, following Staiger and Stock (1

- In-text reference with the coordinate start=37444
- 18
- Hansen, B. E. 2000. Econometrics. 1st ed. Madison, WI: http://www.ssc.wisc.edu/ bhansen/notes/notes.htm.

Total in-text references: 2- In-text reference with the coordinate start=6903
- Prefix
- There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is
- Exact
- Hansen (2000).
- Suffix
- The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework.

- In-text reference with the coordinate start=6954
- Prefix
- There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on
- Exact
- Hansen (2000),
- Suffix
- Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework.

- In-text reference with the coordinate start=6903
- 19
- Hansen, L. 1982. Large sample properties of generalized method of moments estimators. Econometrica 50(3): 1029–1054.

Total in-text references: 3- In-text reference with the coordinate start=2244
- Prefix
- The conventional IV estimator (though consistent) is, however, inefficient in the presence of heteroskedasticity. The usual approach today when facing heteroskedasticity of unknown form is to use the Generalized Method of Moments (GMM), introduced by L.
- Exact
- Hansen (1982).
- Suffix
- GMM makes use of the orthogonality conditions to allow for efficient estimation in the presence of heteroskedasticity of unknown form. In the twenty years since it was first introduced, GMM has become a very popular tool among empirical researchers.

- In-text reference with the coordinate start=5196
- Prefix
- In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to Sargan (1958), Basmann (1960) and, in the GMM context, L.
- Exact
- Hansen (1982), and
- Suffix
- show how the generalization of this test, the $C$ or “difference–in–Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non–orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis-à-vis OLS.

- In-text reference with the coordinate start=39315
- Prefix
- as a standard diagnostic in any overidentified instrumental variables estimation.11 These are tests of the joint hypotheses of correct model specification and the orthogonality conditions, and a rejection may properly call either or both of those hypotheses into question. In the context of GMM, the overidentifying restrictions may be tested via the commonly employed $J$ statistic of
- Exact
- Hansen (1982).
- Suffix
- This statistic is none other than the value of the GMM objective function (20), evaluated at the efficient GMM estimator $\hat{\beta}_{EGMM}$. Under the null, $J(\hat{\beta}_{EGMM}) = n\,g(\hat{\beta})'\hat{S}^{-1}g(\hat{\beta}) \overset{A}{\sim} \chi^2_{L-K}$ (41) In the case of heteroskedastic errors, the matrix $\hat{S}$ is estimated using the $\hat{\Omega}$ matrix (27), and the $J$ statistic becomes $J(\hat{\beta}_{EGMM}) = \hat{u}'Z(Z'\hat{\Omega}Z)^{-1}Z'\hat{u} \overset{A}{\sim} \chi^2_{L-K}$ (42) With clustered errors, the $\hat{\Omega}_C$ matrix (37) can be us

- In-text reference with the coordinate start=2244
- 20
- Hansen, L., J. Heaton, and A. Yaron. 1996. Finite sample properties of some alternative

Total in-text references: 1- In-text reference with the coordinate start=17767
- Prefix
- Instead of first obtaining an optimal weighting matrix and then taking it as given when minimizing Equation (20), we can write the optimal weighting matrix as a function of $\hat{\beta}$, and choose $\hat{\beta}$ to minimize $J(\hat{\beta}) = n\,g_n(\hat{\beta})'W(\hat{\beta})g_n(\hat{\beta})$. This is the “continuously updated GMM” of
- Exact
- Hansen et al. (1996);
- Suffix
- it requires numerical optimization methods. 3 It is worth noting that the IV estimator is not the only such efficient GMM estimator under conditional homoskedasticity. Instead of treating $\hat{\sigma}^2$ as a parameter to be estimated in a second stage, what if we return to the GMM criterion function and minimize by simultaneously choosing What are the implications of

- In-text reference with the coordinate start=17767
- 22
- Hausman, J. 1978. Specification tests in econometrics. Econometrica 46(3): 1251–1271.

Total in-text references: 4- In-text reference with the coordinate start=53494
- Prefix
- Denote by $\hat{\beta}^c$ the estimator that is consistent under both the null and the alternative hypotheses, and by $\hat{\beta}^e$ the estimator that is fully efficient under the null but inconsistent if the null is not true. The
- Exact
- Hausman (1978)
- Suffix
- specification test takes the quadratic form $H = n(\hat{\beta}^c - \hat{\beta}^e)'D^{-}(\hat{\beta}^c - \hat{\beta}^e)$ where $D = \left(V(\hat{\beta}^c) - V(\hat{\beta}^e)\right)$ (45) and where $V(\hat{\beta})$ denotes a consistent estimate of the asymptotic variance of $\beta$, and the operator $^{-}$ denotes a generalized inverse. 17 See Wooldridge (2002), pp. 58–61, and Wooldridge (1995) for more detailed discussion.

- In-text reference with the coordinate start=55842
- Prefix
- , then the generalized inverse of $D$ is guaranteed to exist and a positive test statistic is guaranteed.19 If the Hausman statistic is formed using the OLS estimate of the error variance, then the $D$ matrix in Equation (45) becomes $D = \hat{\sigma}^2_{OLS}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (47) This version of the endogeneity test was first proposed by Durbin (1954) and separately by Wu (1973) (his $T_4$ statistic) and
- Exact
- Hausman (1978).
- Suffix
- It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the $D$ matrix becomes $D = \hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) 18 Readers should also bear in mind here and below that the estimates of the error variances may or may not have small-sample cor

- In-text reference with the coordinate start=57140
- Prefix
- In this alternate form, the matrix difference in the expression equivalent to (47) is positive definite and a generalized inverse is not required. See Bowden and Turkington (1984), pp. 50–51. This version of the statistic was proposed separately by Wu (1973) (his $T_3$ statistic) and
- Exact
- Hausman (1978).
- Suffix
- It can be obtained within Stata by using hausman with the (undocumented) sigmaless option. Use of hausman with the sigmamore or sigmaless options avoids the additional annoyance that because Stata’s hausman tries to deduce the correct degrees of freedom for the test from the rank of the matrix D, it may sometimes come up with the wrong answer.

- In-text reference with the coordinate start=68356
- Prefix
- Yet another asymptotically equivalent flavor of the DWH test is available for standard IV estimation under conditional homoskedasticity, and is included in the output of ivendog. This is the test statistic introduced by Wu (1973) (his $T_2$), and separately shown by
- Exact
- Hausman (1978) to
- Suffix
- be calculated straightforwardly through the use of auxiliary regressions. We will refer to it as the Wu–Hausman statistic.²⁴ Consider a simplified version of our basic model (1) with a single endogenous regressor $x_1$: $y = \beta_1 x_1 + X_2\beta_2 + u$ (49), with $X_2 \equiv Z_2$ assumed exogenous (including the constant, if one is specified) and with excluded instruments $Z_1$ as usual.

- 23
- Hausman, J. A. and W. E. Taylor. 1981. A generalized specification test. Economics Letters 8: 239–245.

Total in-text references: 1
- In-text reference with the coordinate start=63226
- Prefix
- We can state this more precisely as follows: If $L^e - L^c \le K^c_1$, the C statistic and the Hausman statistic are numerically ²¹ Users beware: the sigmamore option following a robust estimation will not only fail to accomplish this, it will generate an invalid test statistic as well. ²² See
- Exact
- Hausman and Taylor (1981) and Newey (1985),
- Suffix
- summarized by Hayashi (2000), pp. 233–34. equivalent.²³ If $L^e - L^c > K^c_1$, the two statistics will be numerically different, the C statistic will have $L^e - L^c$ degrees of freedom, and the Hausman statistic will have $K^c_1$ degrees of freedom in the conditional homoskedasticity case (and an unknown number of degrees of freedom in the conditional heteroskedasticity case).

- 24
- Hayashi, F. 2000. Econometrics. 1st ed. Princeton, NJ: Princeton University Press.

Total in-text references: 7
- In-text reference with the coordinate start=6764
- Prefix
- The syntax diagrams for these commands are presented in the last section of the paper, and the electronic supplement presents annotated examples of their use. 2 IV and GMM estimation The “Generalized Method of Moments” was introduced by L. Hansen in his celebrated 1982 paper. There are a number of good modern texts that cover GMM, and one recent prominent text,
- Exact
- Hayashi (2000),
- Suffix
- presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000).

- In-text reference with the coordinate start=6985
- Prefix
- There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11;
- Exact
- Hayashi (2000),
- Suffix
- Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework. We then consider the issue of clustered errors, and finally turn to OLS. 2.1 The method of instrumental variables The equation to be estimated is, in matrix notation, $y = X\beta + u$, $E(uu') = \Omega$ (1), with typical row $y_i = X_i\beta + u_i$ (2). The matrix

- In-text reference with the coordinate start=25966
- Prefix
- The advantages of GMM over IV are clear: if heteroskedasticity is present, the GMM estimator is more efficient than the simple IV estimator, whereas if heteroskedasticity is not present, the GMM estimator is no worse asymptotically than the IV estimator. Nevertheless, the use of GMM does come with a price. The problem, as
- Exact
- Hayashi (2000)
- Suffix
- points out (p. 215), is that the optimal weighting matrix $\hat{S}$ at the core of efficient GMM is a function of fourth moments, and obtaining reasonable estimates of fourth moments may require very large sample sizes.

- In-text reference with the coordinate start=46653
- Prefix
- In these contexts, a “difference–in–Sargan” statistic may usefully be employed.¹⁵ The test is known under other names as well, e.g., Ruud (2000) calls it the “distance difference” statistic, and
- Exact
- Hayashi (2000)
- Suffix
- follows Eichenbaum et al. (1988) and dubs it the C statistic; we will use the latter term. The C test allows us to test a subset of the original set of orthogonality conditions. The statistic is computed as the difference between two Sargan statistics (or, for efficient GMM, two J statistics): that for the (restricted, fully efficient) regression using the entire set of overidentifying res

- In-text reference with the coordinate start=47626
- Prefix
- For included instruments, the C test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The C test, ¹⁴ See Ahn (1995), Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2. ¹⁵ See
- Exact
- Hayashi (2000),
- Suffix
- pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

- In-text reference with the coordinate start=48866
- Prefix
- More precisely, $\hat{S}$ from the restricted estimation is used to form the restricted J statistic, and the submatrix of $\hat{S}$ with rows/columns corresponding to the unrestricted estimation is used to form the J statistic for the unrestricted estimation; see
- Exact
- Hayashi (2000),
- Suffix
- p. 220. The C test is conducted in ivreg2 by specifying the orthog option, and listing the instruments (either included or excluded) to be challenged. The equation must still be identified with these instruments either removed or reconsidered as endogenous if the C statistic is to be calculated.

- In-text reference with the coordinate start=63284
- Prefix
- We can state this more precisely as follows: If $L^e - L^c \le K^c_1$, the C statistic and the Hausman statistic are numerically ²¹ Users beware: the sigmamore option following a robust estimation will not only fail to accomplish this, it will generate an invalid test statistic as well. ²² See Hausman and Taylor (1981) and Newey (1985), summarized by
- Exact
- Hayashi (2000),
- Suffix
- pp. 233–34. equivalent.²³ If $L^e - L^c > K^c_1$, the two statistics will be numerically different, the C statistic will have $L^e - L^c$ degrees of freedom, and the Hausman statistic will have $K^c_1$ degrees of freedom in the conditional homoskedasticity case (and an unknown number of degrees of freedom in the conditional heteroskedasticity case).

- 25
- Hoxby, C. and M. D. Paserman. 1998. Overidentification tests with grouped data.

Total in-text references: 1
- In-text reference with the coordinate start=40724
- Prefix
- The J statistic is calculated and displayed by ivreg2 when the gmm, robust, or cluster options are specified. In the last case, the J statistic will be consistent in the presence of arbitrary intra–cluster correlation. This can be quite important in practice:
- Exact
- Hoxby and Paserman (1998)
- Suffix
- have shown that the presence of intra–cluster correlation can readily cause a standard overidentification statistic to over–reject the null. ¹¹ Thus Davidson and MacKinnon (1993), p. 236: “Tests of overidentifying restrictions should be calculated routinely whenever one computes IV estimates.

- 28
- Koenker, R. 1981. A note on Studentizing a test for heteroskedasticity. Journal of Econometrics 17: 107–112.

Total in-text references: 1
- In-text reference with the coordinate start=27592
- Prefix
- Breusch and Pagan (1979), Godfrey (1978), and Cook and Weisberg (1983) separately derived the same test statistic. This statistic is distributed as $\chi^2$ with $p$ degrees of freedom under the null of no heteroskedasticity, and under the maintained hypothesis that the error of the regression is normally distributed.
- Exact
- Koenker (1981)
- Suffix
- noted that the power of this test is very sensitive to the normality assumption, and presented a version of the test that relaxed this assumption. Koenker’s test statistic, also distributed as $\chi^2_p$ under the null, is easily obtained as $nR^2_c$, where $R^2_c$ is the centered $R^2$ from an auxiliary regression of the squared residuals from the original regression on the indicator variables.
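
The auxiliary-regression recipe quoted above can be sketched in a few lines. This is only an illustrative sketch in plain Python, not the code behind ivhettest or hettest; the names `solve` and `koenker_nr2c` are hypothetical, and the residuals are assumed to come from a regression estimated elsewhere.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def koenker_nr2c(resid, indicators):
    """Return (n * centered R^2, p) from regressing squared residuals
    on a constant and the p indicator columns (hypothetical helper)."""
    n = len(resid)
    u2 = [e * e for e in resid]
    w = [[1.0] + list(row) for row in indicators]  # prepend the constant
    k = len(w[0])
    wtw = [[sum(r[i] * r[j] for r in w) for j in range(k)] for i in range(k)]
    wty = [sum(r[i] * y for r, y in zip(w, u2)) for i in range(k)]
    coef = solve(wtw, wty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in w]
    mean = sum(u2) / n
    tss = sum((y - mean) ** 2 for y in u2)          # centered total sum of squares
    rss = sum((y - f) ** 2 for y, f in zip(u2, fitted))
    return n * (1.0 - rss / tss), k - 1
```

The second return value is the number of indicator variables, which is the degrees of freedom of the reference $\chi^2$ distribution under the null.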

- 29
- Moulton, B. R. 1986. Random group effects and the precision of regression estimates. Journal of Econometrics 32: 385–397.

Total in-text references: 1
- In-text reference with the coordinate start=23451
- Prefix
- But users should take care that, if the cluster option is used, then it ought to be the case that $M \gg K$.⁵ ⁴ There are other approaches to dealing with clustering that put more structure on the Ω matrix and hence are more efficient but less robust. For example, the
- Exact
- Moulton (1986)
- Suffix
- approach to obtaining consistent standard errors is in effect to specify an “error components” (a.k.a. “random effects”) structure in Equation (36): $\Sigma_m$ is a matrix with diagonal elements $\sigma^2_u + \sigma^2_v$ and off-diagonal elements $\sigma^2_v$.
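
To make the error components structure concrete, the following sketch builds the block for a single cluster. The function name and arguments are hypothetical and serve only to illustrate the pattern of diagonal and off-diagonal elements described above.

```python
def error_components_block(size, sigma2_u, sigma2_v):
    """Build one cluster's covariance block: sigma2_u + sigma2_v on the
    diagonal and sigma2_v everywhere else (illustrative helper)."""
    return [
        [sigma2_u + sigma2_v if i == j else sigma2_v for j in range(size)]
        for i in range(size)
    ]


# A cluster with three observations, sigma2_u = 1.0 and sigma2_v = 0.5
block = error_components_block(3, 1.0, 0.5)
```

The whole block is governed by just two parameters, which is the extra structure that the quoted passage contrasts with the unrestricted cluster-robust approach.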

- 30
- Nakamura, A. and M. Nakamura. 1981. On the relationships among several specification error tests presented by Durbin, Wu, and Hausman. Econometrica 49(6): 1583–1588.

Total in-text references: 2
- In-text reference with the coordinate start=70148
- Prefix
- The test statistic then becomes an F-test, with numerator degrees of freedom equal to the number of included endogenous variables. One advantage of the Wu–Hausman F-statistic over the other DWH tests for IV vs. OLS is that with certain normality assumptions, it is a finite sample test exactly distributed as F (see
- Exact
- Wu (1973) and Nakamura and Nakamura (1981)). Wu (1974)
- Suffix
- ’s Monte Carlo studies also suggest that this statistic is to be preferred to the statistic using just $\hat{\sigma}^2_{IV}$. A version of the Wu–Hausman statistic for testing a subset of regressors is also available, as Davidson and MacKinnon (1993), pp. 241–242 point out.

- In-text reference with the coordinate start=72429
- Prefix
- n (53) and the Wu–Hausman F-statistic can be written Wu–Hausman: $F(K_{1B},\, n-K-K_{1B}) = \dfrac{Q^*/K_{1B}}{(USSR - Q^*)/(n-K-K_{1B})}$ (54), where $Q^*$ is the difference between the restricted and unrestricted sums of squares given by the auxiliary regression (51) or (52), and $USSR$ is the sum of squared residuals from the efficient estimate of the model.²⁵ From the discussion in the preceding section, ²⁵ See Wu (1973) or
- Exact
- Nakamura and Nakamura (1981).
- Suffix
- $Q^*$ can also be interpreted as the difference between the sums of squares of the second–stage estimation of the efficient model with and without however, we know that for tests of the endogeneity of regressors, the C statistic and the Hausman form of the DWH test are numerically equal, and when the error variance from the more efficient estimation is used, the Hausman form of the DWH test is the Durbi
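
Equation (54) is simple arithmetic once $Q^*$ and $USSR$ are available. The sketch below evaluates it in plain Python; the function and argument names are hypothetical, and the inputs are assumed to have been computed from the auxiliary regression and the efficient estimate beforehand.

```python
def wu_hausman_f(q_star, ussr, n, k, k1b):
    """Evaluate the ratio in Equation (54).

    q_star: difference between restricted and unrestricted sums of squares
    ussr:   sum of squared residuals from the efficient estimate
    n, k:   sample size and number of regressors
    k1b:    number of regressors being tested for endogeneity
    Returns (F, numerator df, denominator df).
    """
    df_num = k1b
    df_den = n - k - k1b
    if df_num <= 0 or df_den <= 0:
        raise ValueError("degrees of freedom must be positive")
    f_stat = (q_star / df_num) / ((ussr - q_star) / df_den)
    return f_stat, df_num, df_den


# Made-up inputs purely for illustration
f_stat, df_num, df_den = wu_hausman_f(q_star=10.0, ussr=110.0, n=103, k=2, k1b=1)
```

The two degrees-of-freedom values are returned alongside the statistic so that a p-value can be looked up in the corresponding F distribution.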

- 31
- Newey, W. 1985. Generalized method of moments specification testing. Journal of Econometrics 29: 229–256.

Total in-text references: 1
- In-text reference with the coordinate start=63226
- Prefix
- We can state this more precisely as follows: If $L^e - L^c \le K^c_1$, the C statistic and the Hausman statistic are numerically ²¹ Users beware: the sigmamore option following a robust estimation will not only fail to accomplish this, it will generate an invalid test statistic as well. ²² See
- Exact
- Hausman and Taylor (1981) and Newey (1985),
- Suffix
- summarized by Hayashi (2000), pp. 233–34. equivalent.²³ If $L^e - L^c > K^c_1$, the two statistics will be numerically different, the C statistic will have $L^e - L^c$ degrees of freedom, and the Hausman statistic will have $K^c_1$ degrees of freedom in the conditional homoskedasticity case (and an unknown number of degrees of freedom in the conditional heteroskedasticity case).

- 32
- Pagan, A. R. and D. Hall. 1983. Diagnostic tests as residual analysis. Econometric Reviews 2(2): 159–218.

Total in-text references: 5
- In-text reference with the coordinate start=4082
- Prefix
- The usual Breusch–Pagan/Godfrey/Cook–Weisberg and White/Koenker tests for the presence of heteroskedasticity in a regression equation can be applied to an IV regression only under restrictive assumptions. In Section 3 we discuss the test of
- Exact
- Pagan and Hall (1983)
- Suffix
- designed specifically for detecting the presence of heteroskedasticity in IV estimation, and its relationship to these other heteroskedasticity tests. Even when IV or GMM is judged to be the appropriate estimation technique, we may still question its validity in a given application: are our instruments “good instruments”?

- In-text reference with the coordinate start=26749
- Prefix
- If in fact the error is homoskedastic, IV would be preferable to efficient GMM. For this reason a test for the presence of heteroskedasticity when one or more regressors is endogenous may be useful in deciding whether IV or GMM is called for. Such a test was proposed by
- Exact
- Pagan and Hall (1983), and
- Suffix
- we have implemented it in Stata as ivhettest. We describe this test in the next section. 3 Testing for heteroskedasticity The Breusch–Pagan/Godfrey/Cook–Weisberg and White/Koenker statistics are standard tests of the presence of heteroskedasticity in an OLS regression.

- In-text reference with the coordinate start=28331
- Prefix
- When the indicator variables are the regressors of the original equation, their squares and their cross-products, Koenker’s test is identical to White’s $nR^2_c$ general test for heteroskedasticity (White (1980)). These tests are available in Stata, following estimation with regress, using our ivhettest as well as via hettest and whitetst. As
- Exact
- Pagan and Hall (1983)
- Suffix
- point out, the above tests will be valid tests for heteroskedasticity in an IV regression only if heteroskedasticity is present in that equation and nowhere else in the system. The other structural equations in the system (corresponding to the endogenous regressors $X_1$) must also be homoskedastic, even though they are not being explicitly estimated.⁶ Pagan and Hall derive a test which r

- In-text reference with the coordinate start=29949
- Prefix
- The levels, squares, and cross-products of the instruments $Z$ (excluding the constant), as in the White (1980) test. This is the default in ivhettest. 2. The levels only of the instruments $Z$ (excluding the constant). This is available in ivhettest by specifying the ivlev option. ⁶ For a more detailed discussion, see
- Exact
- Pagan and Hall (1983)
- Suffix
- or Godfrey (1988), pp. 189–90. ⁷ We note here that the original Pagan–Hall paper has a serious typo in the presentation of their non-normality-robust statistic. Their equation (58b), p. 195, is missing the term (in their terminology) $-2\mu_3\psi(\hat{X}'\hat{X})^{-1}\hat{X}'D(D'D)^{-1}$.

- In-text reference with the coordinate start=31333
- Prefix
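- Let $\bar{\Psi} = \frac{1}{n}\sum_{i=1}^{n}\Psi_i$ (dimension $= n \times p$); $\hat{D} \equiv \frac{1}{n}\sum_{i=1}^{n}\Psi_i'(\hat{u}_i^2 - \hat{\sigma}^2)$ (dimension $= n \times 1$); $\hat{\Gamma} = \frac{1}{n}\sum_{i=1}^{n}(\Psi_i - \bar{\Psi})'\hat{X}_i\hat{u}_i$ (dimension $= p \times K$) (38); $\hat{\mu}_3 = \frac{1}{n}\sum_{i=1}^{n}\hat{u}_i^3$; $\hat{\mu}_4 = \frac{1}{n}\sum_{i=1}^{n}\hat{u}_i^4$; $\hat{X} = P_zX$. If $u_i$ is homoskedastic and independent of $Z_i$, then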
- Exact
- Pagan and Hall (1983)
- Suffix
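- (Theorem 8) show that under the null of no heteroskedasticity, $n\hat{D}'\hat{B}^{-1}\hat{D} \overset{A}{\sim} \chi^2_p$ (39), where $\hat{B} = B_1 + B_2 + B_3 + B_4$, $B_1 = (\hat{\mu}_4 - \hat{\sigma}^4)\frac{1}{n}(\Psi_i - \bar{\Psi})'(\Psi_i - \bar{\Psi})$, $B_2 = -2\hat{\mu}_3\frac{1}{n}\Psi'\hat{X}\left(\frac{1}{n}\hat{X}'\hat{X}\right)^{-1}\hat{\Gamma}'$, $B_3 = B_2'$, $B_4 = 4\hat{\sigma}^2\frac{1}{n}\hat{\Gamma}'\left(\frac{1}{n}\hat{X}'\hat{X}\right)^{-1}\hat{\Gamma}$ (40). This is the default statistic produced by ivhettest.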

- 33
- Pesaran, M. H. and L. W. Taylor. 1999. Diagnostics for IV regressions. Oxford Bulletin of Economics & Statistics 61(2): 255–281.

Total in-text references: 2
- In-text reference with the coordinate start=30337
- Prefix
- Their equation (58b), p. 195, is missing the term (in their terminology) $-2\mu_3\psi(\hat{X}'\hat{X})^{-1}\hat{X}'D(D'D)^{-1}$. The typo reappears in the discussion of the test by Godfrey (1988). The correction published in
- Exact
- Pesaran and Taylor (1999)
- Suffix
- is incomplete, as it applies only to the version of the Pagan–Hall test with a single indicator variable. 3. The “fitted value” of the dependent variable. This is not the usual fitted value of the dependent variable, $X\hat{\beta}$.

- In-text reference with the coordinate start=32804
- Prefix
- The Pagan–Hall statistic has not been widely used in practice, perhaps because it is not a standard feature of most regression packages. For a discussion of the relative merits of the Pagan–Hall test, including some Monte Carlo results, see
- Exact
- Pesaran and Taylor (1999).
- Suffix
- Their findings suggest caution in the use of the Pagan–Hall statistic particularly in small samples; in these circumstances the $nR^2_c$ statistic may be preferred. 4 Testing the relevance and validity of instruments 4.1 Testing the relevance of instruments An instrumental variable must satisfy two requirements: it must be correlated with the included endogenous variable(s), and ortho

- 34
- Ruud, P. A. 2000. An Introduction to Classical Econometric Theory. Oxford: Oxford

Total in-text references: 6
- In-text reference with the coordinate start=46592
- Prefix
- Another common problem arises when the researcher has prior suspicions about the validity of a subset of instruments, and wishes to test them. In these contexts, a “difference–in–Sargan” statistic may usefully be employed.¹⁵ The test is known under other names as well, e.g.,
- Exact
- Ruud (2000)
- Suffix
- calls it the “distance difference” statistic, and Hayashi (2000) follows Eichenbaum et al. (1988) and dubs it the C statistic; we will use the latter term. The C test allows us to test a subset of the original set of orthogonality conditions.

- In-text reference with the coordinate start=47672
- Prefix
- For included instruments, the C test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The C test, ¹⁴ See Ahn (1995), Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2. ¹⁵ See Hayashi (2000), pp. 218–21 and pp. 232–34 or
- Exact
- Ruud (2000),
- Suffix
- Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

- In-text reference with the coordinate start=49783
- Prefix
- This illustrates how the Hansen–Sargan overidentification test is an “omnibus” test for the failure of any of the instruments to satisfy the orthogonality conditions, but at the same time requires that the investigator believe that at least some of the instruments are valid; see
- Exact
- Ruud (2000),
- Suffix
- p. 577. 4.5 Tests of overidentifying restrictions as Lagrange multiplier (score) tests The Sargan test can be viewed as analogous to a Lagrange multiplier (LM) or score test.¹⁶ In the case of OLS, the resemblance becomes exact.

- In-text reference with the coordinate start=51016
- Prefix
- If the gmm option is chosen, HOLS estimates are reported along with a robust LM statistic. As usual, the cluster option generates ¹⁶ For a detailed discussion of the relationship between the different types of tests in a GMM framework, see
- Exact
- Ruud (2000),
- Suffix
- Chapter 22. a statistic that is robust to arbitrary intra–cluster correlation. If the estimation method is OLS but the error is not homoskedastic, then the standard LM test is no longer valid. A heteroskedasticity–robust version is, however, available.¹⁷ The robust LM statistic for OLS is numerically equivalent to the J statistic from feasible efficient two–step GMM, i.e.

- In-text reference with the coordinate start=62737
- Prefix
- In the conditional heteroskedasticity case, the degrees of freedom will be $L^e - L^c$ if $L^e - L^c \le K^c_1$ but unknown otherwise (making the test impractical).²² What, then, is the difference between the GMM C test and the Hausman specification test? In fact, because the two estimators being tested are both GMM estimators, the Hausman specification test is a test of linear combinations of orthogonality conditions
- Exact
- (Ruud (2000),
- Suffix
- pp. 578–584). When the particular linear combination of orthogonality conditions being tested is the same for the C test and for the Hausman test, the two test statistics will be numerically equivalent.

- In-text reference with the coordinate start=65078
- Prefix
- faces a trade–off when deciding which of the two tests to use: when the two tests differ, the Hausman test is a test of linear combinations of moment conditions, and is more powerful than the C test at detecting violations on restrictions of these linear combinations, but the latter test will be able to detect other violations of moment conditions that the former test cannot. As
- Exact
- Ruud (2000),
- Suffix
- p. 585, points out, one of the appealing features of the Hausman test is that its particular linear combination of moment conditions also determines the consistency of the more efficient GMM estimator.

- 36
- Sargan, J. 1958. The estimation of economic relationships using instrumental variables. Econometrica 26(3): 393–415.

Total in-text references: 2
- In-text reference with the coordinate start=5138
- Prefix
- We may cast some light on whether the instruments satisfy the orthogonality conditions in the context of an overidentified model: that is, one in which a surfeit of instruments are available. In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to
- Exact
- Sargan (1958), Basmann (1960) and,
- Suffix
- in the GMM context, L. Hansen (1982), and show how the generalization of this test, the C or “difference–in–Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non–orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis-à-vis OLS.

- In-text reference with the coordinate start=41568
- Prefix
- analysis without testing the orthogonality assumptions is a “pious fraud”. 4.3 Overidentifying restrictions in IV In the special case of linear instrumental variables under conditional homoskedasticity, the concept of the J statistic considerably predates the development of GMM estimation techniques. The ivreg2 procedure routinely presents this test, labelled as Sargan’s statistic
- Exact
- (Sargan (1958))
- Suffix
- in the estimation output. Just as IV is a special case of GMM, Sargan’s statistic is a special case of Hansen’s J under the assumption of conditional homoskedasticity. Thus if we use the IV optimal weighting matrix (34) together with the expression for J (41), we obtain Sargan’s statistic $= \dfrac{1}{\hat{\sigma}^2}\hat{u}'Z(Z'Z)^{-1}Z'\hat{u} = \dfrac{\hat{u}'Z(Z'Z)^{-1}Z'\hat{u}}{\hat{u}'\hat{u}/n} = \dfrac{\hat{u}'P_Z\hat{u}}{\hat{u}'\hat{u}/n}$ (43). It is easy to see from (43) that Sargan’
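
As a rough illustration of Equation (43), the sketch below computes $\hat{u}'P_Z\hat{u}/(\hat{u}'\hat{u}/n)$ in plain Python without forming $P_Z$ explicitly. It is a sketch under stated assumptions, not the ivreg2 implementation: the names `solve` and `sargan_statistic` are hypothetical, the residuals are assumed to be IV residuals computed elsewhere, and `Z` is assumed to hold the full instrument matrix, one row per observation.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def sargan_statistic(resid, Z):
    """Compute u'Z (Z'Z)^{-1} Z'u divided by (u'u / n), as in Equation (43)."""
    n = len(resid)
    k = len(Z[0])
    ztz = [[sum(r[i] * r[j] for r in Z) for j in range(k)] for i in range(k)]
    ztu = [sum(r[i] * e for r, e in zip(Z, resid)) for i in range(k)]
    g = solve(ztz, ztu)                              # (Z'Z)^{-1} Z'u
    explained = sum(a * b for a, b in zip(ztu, g))   # u'P_Z u
    sigma2 = sum(e * e for e in resid) / n           # u'u / n
    return explained / sigma2
```

Residuals that are exactly orthogonal to every instrument give a statistic of zero, while residuals lying entirely in the column space of `Z` give a statistic of $n$.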

- 37
- Shea, J. 1997. Instrument relevance in multivariate linear models: A simple measure. Review of Economics & Statistics 79(2): 348–352.

Total in-text references: 1
- In-text reference with the coordinate start=35570
- Prefix
- The statistics proposed by Bound et al. are able to diagnose instrument relevance only in the presence of a single endogenous regressor. When multiple endogenous regressors are used, other statistics are required. One such statistic has been proposed by
- Exact
- Shea (1997)
- Suffix
- : a “partial $R^2$” measure that takes the intercorrelations among the instruments into account.⁹ For a model containing a single endogenous regressor, the two $R^2$ measures are equivalent. The distribution of Shea’s partial $R^2$ statistic has not been derived, but it may be interpreted like any $R^2$.

- 38
- Staiger, D. and J. H. Stock. 1997. Instrumental variables regression with weak instruments. Econometrica 65(3): 557–86.

Total in-text references: 5
- In-text reference with the coordinate start=36852
- Prefix
- in the first stage regression is nil, the model is in effect unidentified with respect to that endogenous variable; in this case, the bias of the IV estimator is the same as that of the OLS estimator, IV becomes inconsistent, and nothing is gained from instrumenting (ibid.). If the explanatory power is simply “weak”,¹⁰ conventional asymptotics fail. What is surprising is that, as
- Exact
- Staiger and Stock (1997) and
- Suffix
- others have shown, the “weak instrument” problem can arise even when the first stage tests are significant at conventional levels (5% or 1%) and the researcher is using a large sample. One rule of thumb is that for a single endogenous regressor, an F-statistic below 10 is cause for concern (Staiger and Stock (1997) p. 557).

- In-text reference with the coordinate start=37177
- Prefix
- What is surprising is that, as Staiger and Stock (1997) and others have shown, the “weak instrument” problem can arise even when the first stage tests are significant at conventional levels (5% or 1%) and the researcher is using a large sample. One rule of thumb is that for a single endogenous regressor, an F-statistic below 10 is cause for concern
- Exact
- (Staiger and Stock (1997)
- Suffix
- p. 557). Since the size of the IV bias is increasing in the number of instruments (Hahn and Hausman (2002b)), one recommendation when faced with this problem is to be parsimonious in the choice of instruments.

- In-text reference with the coordinate start=37444
- Prefix
- Since the size of the IV bias is increasing in the number of instruments (Hahn and Hausman (2002b)), one recommendation when faced with this problem is to be parsimonious in the choice of instruments. For further discussion see, e.g.,
- Exact
- Staiger and Stock (1997), Hahn and Hausman (2002a), Hahn and Hausman (2002b), and
- Suffix
- the references cited therein. ⁹ The Shea partial $R^2$ statistic may be easily computed according to the simplification presented in Godfrey (1999), who demonstrates that Shea’s statistic for endogenous regressor $i$ may be expressed as $R^2_p = \dfrac{\nu^{OLS}_{i,i}}{\nu^{IV}_{i,i}}\left[\dfrac{(1-R^2_{IV})}{(1-R^2_{OLS})}\right]$ where $\nu_{i,i}$ is the estimated asymptotic variance of the coefficient. ¹⁰ One approach in the literature, following Staiger and Stock (1
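
The Godfrey (1999) simplification quoted above reduces Shea's measure to a ratio of quantities available after OLS and IV estimation. The following sketch evaluates it, reading the flattened formula as the variance ratio times the bracketed ratio of $(1-R^2)$ terms; the function and argument names are hypothetical, and the inputs are assumed to have been taken from OLS and IV output computed elsewhere.

```python
def shea_partial_r2(var_ols, var_iv, r2_iv, r2_ols):
    """Shea partial R^2 for one endogenous regressor via the simplification
    attributed to Godfrey (1999) in the text.

    var_ols, var_iv: estimated asymptotic variances of the coefficient
                     on that regressor under OLS and IV
    r2_iv, r2_ols:   the R^2 values of the IV and OLS regressions
    """
    if var_iv <= 0 or r2_ols >= 1:
        raise ValueError("var_iv must be positive and r2_ols below 1")
    return (var_ols / var_iv) * ((1.0 - r2_iv) / (1.0 - r2_ols))


# Made-up inputs purely for illustration
example = shea_partial_r2(var_ols=0.5, var_iv=2.0, r2_iv=0.2, r2_ols=0.4)
```

Because only scalar summaries are needed, the measure can be computed for each endogenous regressor in turn from standard estimation output.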

- In-text reference with the coordinate start=37894
- Prefix
- (2002b), and the references cited therein. ⁹ The Shea partial $R^2$ statistic may be easily computed according to the simplification presented in Godfrey (1999), who demonstrates that Shea’s statistic for endogenous regressor $i$ may be expressed as $R^2_p = \dfrac{\nu^{OLS}_{i,i}}{\nu^{IV}_{i,i}}\left[\dfrac{(1-R^2_{IV})}{(1-R^2_{OLS})}\right]$ where $\nu_{i,i}$ is the estimated asymptotic variance of the coefficient. ¹⁰ One approach in the literature, following
- Exact
- Staiger and Stock (1997),
- Suffix
- is to define “weak” as meaning that the first stage reduced form coefficients are in a $N^{1/2}$ neighborhood of zero, or equivalently, holding the expectation of the first stage F statistic constant as the sample size increases.

- In-text reference with the coordinate start=58467
- Prefix
- Given the choice between forming the Hausman statistic using either $\hat{\sigma}^2_{OLS}$ or $\hat{\sigma}^2_{IV}$, the standard choice is the former (the Durbin statistic) because under the null both are consistent but the former is more efficient. The Durbin flavor of the test has the additional advantage of superior performance when instruments are weak
- Exact
- (Staiger and Stock (1997)).
- Suffix
- 5.2 Extensions: Testing a subset of the regressors for endogeneity, and heteroskedastic-robust testing for IV and GMM estimation In some contexts, the researcher may be certain that one or more regressors in $X_1$ is endogenous but may question the endogeneity of the others.

- 39
- White, H. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–838. White, H. 1982. Instrumental variables regression with independent observations. Econometrica 50(2): 483–499. White, H. 1984. Asymptotic Theory for Econometricians. 1st ed. Orlando, FL: Academic

Total in-text references: 3
- In-text reference with the coordinate start=28179
- Prefix
- , also distributed as $\chi^2_p$ under the null, is easily obtained as $nR^2_c$, where $R^2_c$ is the centered $R^2$ from an auxiliary regression of the squared residuals from the original regression on the indicator variables. When the indicator variables are the regressors of the original equation, their squares and their cross-products, Koenker’s test is identical to White’s $nR^2_c$ general test for heteroskedasticity
- Exact
- (White (1980)).
- Suffix
- These tests are available in Stata, following estimation with regress, using our ivhettest as well as via hettest and whitetst. As Pagan and Hall (1983) point out, the above tests will be valid tests for heteroskedasticity in an IV regression only if heteroskedasticity is present in that equation and nowhere else in the system.

- In-text reference with the coordinate start=29265
- Prefix
- Our implementation is of the simpler Pagan–Hall statistic, available with the command ivhettest after estimation by ivreg, ivreg2, or ivgmm0. We present the Pagan–Hall test here in the format and notation of the original
- Exact
- White (1980) and White (1982)
- Suffix
- tests, however, to facilitate comparisons with the other tests noted above.⁷ Let $\Psi$ be the $n \times p$ matrix of indicator variables hypothesized to be related to the heteroskedasticity in the equation, with typical row $\Psi_i$.

- In-text reference with the coordinate start=29737
- Prefix
- These indicator variables must be exogenous, typically either instruments or functions of the instruments. Common choices would be: 1. The levels, squares, and cross-products of the instruments $Z$ (excluding the constant), as in the
- Exact
- White (1980)
- Suffix
- test. This is the default in ivhettest. 2. The levels only of the instruments $Z$ (excluding the constant). This is available in ivhettest by specifying the ivlev option. ⁶ For a more detailed discussion, see Pagan and Hall (1983) or Godfrey (1988), pp. 189–90. ⁷ We note here that the original Pagan–Hall paper has a serious typo in the presentation of their non-normality-robust statistic.

- 40
- Wooldridge, J. M. 1995. Score diagnostics for linear models estimated by two stage least squares. In Advances in Econometrics and Quantitative Economics: Essays in honor of Professor C. R. Rao, eds. G. S. Maddala, P. C. B. Phillips, and T. N. Srinivasan, 66–87. Cambridge, MA: Blackwell Publishers. Wooldridge, J. M. 2002. Econometric Analysis of Cross Section and Panel Data. 1st ed. Cambridge,

Total in-text references: 2
- In-text reference with the coordinate start=47588
- Prefix
- For included instruments, the C test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The C test, ¹⁴ See Ahn (1995), Proposition 1, or, for an alternative formulation,
- Exact
- Wooldridge (1995),
- Suffix
- Procedure 3.2. ¹⁵ See Hayashi (2000), pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

- In-text reference with the coordinate start=53726
- Prefix
- The Hausman (1978) specification test takes the quadratic form $H = n(\hat\beta^c - \hat\beta^e)' D^{-} (\hat\beta^c - \hat\beta^e)$, where $D = V(\hat\beta^c) - V(\hat\beta^e)$ (45) and where $V(\hat\beta)$ denotes a consistent estimate of the asymptotic variance of $\beta$, and the 17 See Wooldridge (2002), pp. 58–61, and
- Exact
- Wooldridge (1995)
- Suffix
- for more detailed discussion. operator $^{-}$ denotes a generalized inverse. A Hausman statistic for a test of endogeneity in an IV regression is formed by choosing OLS as the efficient estimator $\hat\beta^e$ and IV as the inefficient but consistent estimator $\hat\beta^c$.

- 41
- Wu, D.-M. 1973. Alternative tests of independence between stochastic regressors and disturbances. Econometrica 41(4): 733–750. Wu, D.-M. 1974. Alternative tests of independence between stochastic regressors and disturbances: Finite sample results. Econometrica 42(3): 529–546.

Total in-text references: 5

- In-text reference with the coordinate start=55811
- Prefix
- If a common estimate of $\sigma$ is used, then the generalized inverse of $D$ is guaranteed to exist and a positive test statistic is guaranteed.19 If the Hausman statistic is formed using the OLS estimate of the error variance, then the $D$ matrix in Equation (45) becomes $D = \hat\sigma^2_{OLS}\left((X'P_Z X)^{-1} - (X'X)^{-1}\right)$ (47) This version of the endogeneity test was first proposed by Durbin (1954) and separately by
- Exact
- Wu (1973)
- Suffix
- (his T4 statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the $D$ matrix becomes $D = \hat\sigma^2_{IV}\left((X'P_Z X)^{-1} - (X'X)^{-1}\right)$ (48) 18 Readers should also bear in mind here and below that the estimates of the error variance
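
The quadratic form in Equations (45) and (47) quoted above can be illustrated with a minimal NumPy sketch. It is not the Stata `hausman` implementation: the function name `dwh_hausman`, the simulated data, and the pseudo-inverse tolerance are all assumptions made for illustration. Variances are written in finite-sample form, so the factor n in (45) is absorbed into D.

```python
import numpy as np

def dwh_hausman(y, X, Z):
    """Hausman statistic for IV vs. OLS using the OLS error variance, as in (47).

    X holds all regressors, Z holds all instruments (included and excluded).
    Hypothetical helper for illustration only.
    """
    n = X.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    b_ols = XtX_inv @ X.T @ y                          # efficient under the null
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # P_Z X
    XPX_inv = np.linalg.inv(X_hat.T @ X)               # (X' P_Z X)^{-1}
    b_iv = XPX_inv @ X_hat.T @ y                       # consistent under both hypotheses
    resid = y - X @ b_ols
    sigma2_ols = resid @ resid / n                     # common error-variance estimate
    D = sigma2_ols * (XPX_inv - XtX_inv)               # rank = number of endogenous regressors
    d = b_iv - b_ols
    # D is rank deficient, so use a generalized inverse (tolerance is an assumption).
    H = d @ np.linalg.pinv(D, rcond=1e-10) @ d
    return H, b_ols, b_iv

# Simulated example: x1 is endogenous because it shares the shock v with u.
rng = np.random.default_rng(0)
n = 500
z1, z2, x2, v, e = rng.standard_normal((5, n))
u = 0.8 * v + 0.6 * e
x1 = z1 + 0.5 * z2 + v
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), z1, z2, x2])

H, b_ols, b_iv = dwh_hausman(y, X, Z)
print(f"H = {H:.2f}, OLS slope = {b_ols[1]:.3f}, IV slope = {b_iv[1]:.3f}")
```

Under the null of exogeneity, H would be compared with a χ² distribution whose degrees of freedom equal the number of regressors being tested (one in this example).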

- In-text reference with the coordinate start=57109
- Prefix
- In this alternate form, the matrix difference in the expression equivalent to (47) is positive definite and a generalized inverse is not required. See Bowden and Turkington (1984), pp. 50–51. This version of the statistic was proposed separately by
- Exact
- Wu (1973)
- Suffix
- (his T3 statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the (undocumented) sigmaless option. Use of hausman with the sigmamore or sigmaless options avoids the additional annoyance that because Stata’s hausman tries to deduce the correct degrees of freedom for the test from the rank of the matrix $D$, it may sometimes come up with the wrong answer.

- In-text reference with the coordinate start=68308
- Prefix
- Yet another asymptotically equivalent flavor of the DWH test is available for standard IV estimation under conditional homoskedasticity, and is included in the output of ivendog. This is the test statistic introduced by
- Exact
- Wu (1973)
- Suffix
- (his T2), and separately shown by Hausman (1978) to be calculated straightforwardly through the use of auxiliary regressions. We will refer to it as the Wu–Hausman statistic.24 Consider a simplified version of our basic model (1) with a single endogenous regressor $x_1$: $y = \beta_1 x_1 + X_2\beta_2 + u$, (49) with $X_2 \equiv Z_2$ assumed exogenous (including the constant, if one is specified) and with excluded instruments $Z_1$ as usua

- In-text reference with the coordinate start=70148
- Prefix
- The test statistic then becomes an F-test, with numerator degrees of freedom equal to the number of included endogenous variables. One advantage of the Wu–Hausman F-statistic over the other DWH tests for IV vs. OLS is that with certain normality assumptions, it is a finite sample test exactly distributed as F (see
- Exact
- Wu (1973) and Nakamura and Nakamura (1981)). Wu (1974)
- Suffix
- ’s Monte Carlo studies also suggest that this statistic is to be preferred to the statistic using just $\hat\sigma^2_{IV}$. A version of the Wu–Hausman statistic for testing a subset of regressors is also available, as Davidson and MacKinnon (1993), pp. 241–242 point out.

- In-text reference with the coordinate start=72416
- Prefix
- ) = $\frac{Q^*}{USSR/n}$ (53) and the Wu–Hausman F-statistic can be written Wu–Hausman: $F(K_{1B},\, n-K-K_{1B}) = \frac{Q^*/K_{1B}}{(USSR - Q^*)/(n-K-K_{1B})}$ (54) where $Q^*$ is the difference between the restricted and unrestricted sums of squares given by the auxiliary regression (51) or (52), and $USSR$ is the sum of squared residuals from the efficient estimate of the model.25 From the discussion in the preceding section, 25 See
- Exact
- Wu (1973)
- Suffix
- or Nakamura and Nakamura (1981). $Q^*$ can also be interpreted as the difference between the sums of squares of the second–stage estimation of the efficient model with and without however, we know that for tests of the endogeneity of regressors, the C statistic and the Hausman form of the DWH test are numerically equal, and when the error variance from the more efficient estimation is used, the Hausman f
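
The F form in Equation (54) can be sketched in NumPy as follows. Auxiliary regressions (51) and (52) are not reproduced in this excerpt, so the sketch assumes the residual-augmented form: the original equation is re-estimated by OLS with the first-stage residuals of the endogenous regressors added, and $Q^*$ is the resulting drop in the sum of squared residuals. The names `ssr` and `wu_hausman_f` and the simulated data are hypothetical.

```python
import numpy as np

def ssr(y, W):
    """Sum of squared residuals from an OLS regression of y on W."""
    coef = np.linalg.lstsq(W, y, rcond=None)[0]
    r = y - W @ coef
    return r @ r

def wu_hausman_f(y, X, Z, endog_cols):
    """Wu-Hausman F-statistic following the layout of Equation (54).

    endog_cols lists the columns of X treated as endogenous (K_1B of them).
    Assumes the residual-augmented auxiliary regression described above.
    """
    n, K = X.shape
    k1 = len(endog_cols)
    X1 = X[:, endog_cols]
    # First-stage residuals: the part of X1 not explained by the instruments.
    V_hat = X1 - Z @ np.linalg.lstsq(Z, X1, rcond=None)[0]
    ussr = ssr(y, X)                                  # OLS fit of the original equation
    ssr_aug = ssr(y, np.column_stack([X, V_hat]))     # augmented (unrestricted) fit
    q_star = ussr - ssr_aug                           # Q*
    df2 = n - K - k1
    F = (q_star / k1) / ((ussr - q_star) / df2)
    return F, k1, df2

# Simulated example with one endogenous regressor (column 1 of X).
rng = np.random.default_rng(1)
n = 500
z1, z2, x2, v, e = rng.standard_normal((5, n))
u = 0.8 * v + 0.6 * e
x1 = z1 + 0.5 * z2 + v
y = 1.0 + 2.0 * x1 - 1.0 * x2 + u
X = np.column_stack([np.ones(n), x1, x2])
Z = np.column_stack([np.ones(n), z1, z2, x2])

F, df1, df2 = wu_hausman_f(y, X, Z, endog_cols=[1])
print(f"Wu-Hausman F({df1}, {df2}) = {F:.2f}")
```

Because only residual sums of squares are needed, the statistic comes from two OLS fits plus the first stage, which is the computational convenience the auxiliary-regression form provides.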
