- 1
- Ahn, S. C. 1995. Robust GMM Tests for Model Specification. Arizona State University (Working Paper).

Total in-text references: 1
- In-text reference with the coordinate start=47525
- Prefix
- For excluded instruments, this is equivalent to dropping them from the instrument list. For included instruments, the $C$ test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The $C$ test, [Footnote 14: See
- Exact
- Ahn (1995),
- Suffix
- Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2.] [Footnote 15: See Hayashi (2000), pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations.] distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are pro

- 2
- Arellano, M. 1987. Computing robust standard errors for within–groups estimators.

Total in-text references: 1
- In-text reference with the coordinate start=21869
- Prefix
- 0 $\hat{\Sigma}_M$ (36) then an estimator of $S$ that is consistent in the presence of arbitrary intra–cluster correlation is $\hat{S} = \frac{1}{n}(Z'\hat{\Omega}_C Z)$ (37) The earliest reference to this approach to robust estimation in the presence of clustering of which we are aware is White (1984), pp. 135–6. It is commonly employed in the context of panel data estimation; see Wooldridge (2002), p. 193,
- Exact
- Arellano (1987) and
- Suffix
- Kézdi (2002). It is the standard Stata approach to clustering, implemented in, e.g., robust, regress and ivreg2.$^{4}$ The cluster–robust covariance matrix for IV estimation is obtained exactly as in the preceding subsection except using $\hat{S}$ as defined in Equation (37).

- 4
- Basmann, R. 1960. On finite sample distributions of generalized classical linear identifiability test statistics. Journal of the American Statistical Association 55(292): 650–659.

Total in-text references: 2
- In-text reference with the coordinate start=5138
- Prefix
- We may cast some light on whether the instruments satisfy the orthogonality conditions in the context of an overidentified model: that is, one in which a surfeit of instruments are available. In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to
- Exact
- Sargan (1958), Basmann (1960) and,
- Suffix
- in the GMM context, L. Hansen (1982), and show how the generalization of this test, the $C$ or “difference–in–Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non–orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis–à–vis OLS.

- In-text reference with the coordinate start=42832
- Prefix
- The literature contains several variations on this test. The main idea behind these variations is that there is more than one way to consistently estimate the variance in the denominator of (43). The most important of these is that of
- Exact
- Basmann (1960).
- Suffix
- Independently of Sargan, Basmann proposed an $F(L-K,\, n-L)$-test of overidentifying restrictions: Basmann’s $F$-statistic $= \frac{\hat{u}'P_Z\hat{u}/(L-K)}{\hat{u}'M_Z\hat{u}/(n-L)}$ (44) where $M_Z \equiv I - P_Z$ is the “annihilator” matrix and $L$ is the total number of instruments.
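
A minimal sketch of Equation (44) in Python with NumPy, assuming the residuals come from standard IV (2SLS) estimation. The function names `iv_residuals` and `basmann_f` and the simulated data are hypothetical choices made for this illustration; they are not part of ivreg2 or of the original paper.

```python
import numpy as np


def iv_residuals(y, X, Z):
    """Residuals from standard IV (2SLS): beta = (X'P_Z X)^{-1} X'P_Z y."""
    PZX = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # P_Z X
    beta = np.linalg.solve(PZX.T @ X, PZX.T @ y)  # IV coefficients
    return y - X @ beta


def basmann_f(y, X, Z):
    """Basmann's F-statistic for the overidentifying restrictions, Equation (44)."""
    n, K = X.shape
    L = Z.shape[1]
    if L <= K:
        raise ValueError("the equation must be overidentified (L > K)")
    u = iv_residuals(y, X, Z)
    uPu = u @ (Z @ np.linalg.solve(Z.T @ Z, Z.T @ u))  # u'P_Z u
    uMu = u @ u - uPu                                   # u'M_Z u, with M_Z = I - P_Z
    return (uPu / (L - K)) / (uMu / (n - L))


# Simulated example (assumed data): one endogenous regressor, three excluded instruments.
rng = np.random.default_rng(0)
n = 500
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
e = rng.normal(size=n)
x1 = Z @ np.array([0.0, 1.0, 1.0, 1.0]) + 0.5 * e + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
y = X @ np.array([1.0, 2.0]) + e
print(basmann_f(y, X, Z))  # compare with an F(L-K, n-L) critical value
```

Here $L = 4$ and $K = 2$, so the statistic would be referred to an $F(2, 496)$ distribution under the null.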

- 8
- Chamberlain, G. 1982. Multivariate regression models for panel data. Journal of Econometrics 18: 5–46.

Total in-text references: 1
- In-text reference with the coordinate start=24741
- Prefix
- but correct inference is still possible through the use of the Eicker–Huber–White “sandwich” robust covariance estimator, and this estimator can also be derived using the general formula for the asymptotic variance of a GMM estimator with a sub–optimal weighting matrix, Equation (24). A natural question is whether a more efficient GMM estimator exists, and the answer is “yes”
- Exact
- (Chamberlain (1982), Cragg (1983)).
- Suffix
- If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of Cragg (1983), dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600.

- 10
- Cragg, J. 1983. More efficient estimation in the presence of heteroskedasticity of unknown form. Econometrica 51: 751–763.

Total in-text references: 2
- In-text reference with the coordinate start=24741
- Prefix
- but correct inference is still possible through the use of the Eicker–Huber–White “sandwich” robust covariance estimator, and this estimator can also be derived using the general formula for the asymptotic variance of a GMM estimator with a sub–optimal weighting matrix, Equation (24). A natural question is whether a more efficient GMM estimator exists, and the answer is “yes”
- Exact
- (Chamberlain (1982), Cragg (1983)).
- Suffix
- If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of Cragg (1983), dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600.

- In-text reference with the coordinate start=25077
- Prefix
- If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of
- Exact
- Cragg (1983),
- Suffix
- dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600. It can be obtained in precisely the same way as feasible efficient two–step GMM except now the first–step inefficient but consistent estimator used to generate the residuals is OLS rather than IV.
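
The two-step recipe described here can be sketched in Python with NumPy. This is an illustration under stated assumptions, not the ivreg2 implementation: the function name `hols` and the simulated data are hypothetical, the weighting matrix is taken to be $\hat{S} = \frac{1}{n} Z'\hat{\Omega}Z$ with $\hat{\Omega}$ the diagonal matrix of squared first-step residuals, and the second step uses the standard efficient GMM formula $(X'Z\hat{S}^{-1}Z'X)^{-1}X'Z\hat{S}^{-1}Z'y$.

```python
import numpy as np


def hols(y, X, Z):
    """Two-step efficient GMM with OLS as the first step (a HOLS sketch).

    X: n x K exogenous regressors.
    Z: n x L instruments, i.e. X plus additional exogenous variables.
    Returns (OLS coefficients, HOLS coefficients).
    """
    n = X.shape[0]
    # Step 1: consistent but inefficient OLS estimate, used only for residuals.
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    u = y - X @ b_ols
    # Step 2: S_hat = (1/n) * sum_i u_i^2 z_i z_i'
    S = (Z * (u ** 2)[:, None]).T @ Z / n
    XZ = X.T @ Z                                   # X'Z, K x L
    A = XZ @ np.linalg.solve(S, XZ.T)              # X'Z S^{-1} Z'X
    b_hols = np.linalg.solve(A, XZ @ np.linalg.solve(S, Z.T @ y))
    return b_ols, b_hols


# Simulated example (assumed data): heteroskedastic errors, two extra exogenous variables.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
w = rng.normal(size=(n, 2))
X = np.column_stack([np.ones(n), x])
Z = np.column_stack([X, w])
u = rng.normal(size=n) * np.exp(0.5 * w[:, 0])
y = X @ np.array([1.0, 2.0]) + u
b_ols, b_hols = hols(y, X, Z)
print(b_ols, b_hols)
```

When no additional moment conditions are available ($Z = X$), the second step collapses to OLS, which is a convenient sanity check on the sketch.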

- 13
- Durbin, J. 1954. Errors in variables. Review of the International Statistical Institute 22: 23–32.

Total in-text references: 1
- In-text reference with the coordinate start=55780
- Prefix
- If a common estimate of $\sigma$ is used, then the generalized inverse of $D$ is guaranteed to exist and a positive test statistic is guaranteed.$^{19}$ If the Hausman statistic is formed using the OLS estimate of the error variance, then the $D$ matrix in Equation (45) becomes $D = \hat{\sigma}^2_{OLS}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (47) This version of the endogeneity test was first proposed by
- Exact
- Durbin (1954) and
- Suffix
- separately by Wu (1973) (his $T_4$ statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the $D$ matrix becomes $D = \hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) [Footnote 18] Readers should also bear in mind here and below that the estimat

- 15
- Godfrey, L. G. 1978. Testing for multiplicative heteroskedasticity. Journal of Econometrics 8: 227–236. Godfrey, L. G. 1988. Misspecification tests in econometrics: The Lagrange multiplier principle and other approaches. Cambridge: Cambridge University Press. Godfrey, L. G. 1999. Instrument relevance in multivariate linear models. Review of Economics &

Total in-text references: 1
- In-text reference with the coordinate start=27283
- Prefix
- this test in the next section. [Section 3: Testing for heteroskedasticity] The Breusch–Pagan/Godfrey/Cook–Weisberg and White/Koenker statistics are standard tests of the presence of heteroskedasticity in an OLS regression. The principle is to test for a relationship between the residuals of the regression and $p$ indicator variables that are hypothesized to be related to the heteroskedasticity. Breusch and
- Exact
- Pagan (1979), Godfrey (1978), and
- Suffix
- Cook and Weisberg (1983) separately derived the same test statistic. This statistic is distributed as $\chi^2$ with $p$ degrees of freedom under the null of no heteroskedasticity, and under the maintained hypothesis that the error of the regression is normally distributed.

- 16
- Greene, W. H. 2000. Econometric Analysis. 4th ed. Upper Saddle River, NJ: Prentice–Hall.

Total in-text references: 3
- In-text reference with the coordinate start=7077
- Prefix
- A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and
- Exact
- Greene (2000).
- Suffix
- We begin with the standard IV estimator, and then relate it to the GMM framework. We then consider the issue of clustered errors, and finally turn to OLS. [Section 2.1: The method of instrumental variables] The equation to be estimated is, in matrix notation, $y = X\beta + u$, $E(uu') = \Omega$ (1) with typical row $y_i = X_i\beta + u_i$ (2) The matrix of regressors $X$ is $n \times K$, where $n$ is the number of observations.

- In-text reference with the coordinate start=10444
- Prefix
- n (15) we obtain the estimated asymptotic variance–covariance matrix of the IV estimator: $V(\hat{\beta}_{IV}) = \hat{\sigma}^2\left(X'Z(Z'Z)^{-1}Z'X\right)^{-1} = \hat{\sigma}^2(X'P_ZX)^{-1}$ (16) Note that some packages, including Stata’s ivreg, include a degrees–of–freedom correction to the estimate of $\hat{\sigma}^2$ by replacing $n$ with $n-L$. This correction is not necessary, however, since the estimate of $\hat{\sigma}^2$ would not be unbiased anyway
- Exact
- (Greene (2000),
- Suffix
- p. 373). Our ivreg2 routine defaults to the large–sample formulas for the estimated error variance and covariance matrix; the user can request the small–sample versions with the option small. [Section 2.2: The Generalized Method of Moments] The standard IV estimator is a special case of a Generalized Method of Moments (GMM) estimator.

- In-text reference with the coordinate start=56482
- Prefix
- $\hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) [Footnote 18: Readers should also bear in mind here and below that the estimates of the error variances may or may not have small-sample corrections, according to the estimation package used and the options chosen. If one of the variance-covariance matrices in $D$ uses a small-sample correction, then so should the other.] [Footnote 19] The matrix difference in (47) and (48) has rank $K_1$; see
- Exact
- Greene (2000),
- Suffix
- pp. 384–385. Intuitively, the variables being tested are those not shared by $X$ and $Z$, namely the $K_1$ endogenous regressors $X_1$. The Hausman statistic for the endogeneity test can also be expressed in terms of a test of the coefficients of the endogenous regressors alone and the rest of the $\beta$s removed.

- 18
- Hansen, B. E. 2000. Econometrics. 1st ed. Madison, WI: http://www.ssc.wisc.edu/~bhansen/notes/notes.htm.

Total in-text references: 2
- In-text reference with the coordinate start=6903
- Prefix
- There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is
- Exact
- Hansen (2000).
- Suffix
- The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework.

- In-text reference with the coordinate start=6954
- Prefix
- There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on
- Exact
- Hansen (2000),
- Suffix
- Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework.

- 19
- Hansen, L. 1982. Large sample properties of generalized method of moments estimators. Econometrica 50(3): 1029–1054.

Total in-text references: 3
- In-text reference with the coordinate start=2244
- Prefix
- The conventional IV estimator (though consistent) is, however, inefficient in the presence of heteroskedasticity. The usual approach today when facing heteroskedasticity of unknown form is to use the Generalized Method of Moments (GMM), introduced by L.
- Exact
- Hansen (1982).
- Suffix
- GMM makes use of the orthogonality conditions to allow for efficient estimation in the presence of heteroskedasticity of unknown form. In the twenty years since it was first introduced, GMM has become a very popular tool among empirical researchers.

- In-text reference with the coordinate start=5196
- Prefix
- In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to Sargan (1958), Basmann (1960) and, in the GMM context, L.
- Exact
- Hansen (1982), and
- Suffix
- show how the generalization of this test, the $C$ or “difference–in–Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non–orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis–à–vis OLS.

- In-text reference with the coordinate start=39315
- Prefix
- as a standard diagnostic in any overidentified instrumental variables estimation.$^{11}$ These are tests of the joint hypotheses of correct model specification and the orthogonality conditions, and a rejection may properly call either or both of those hypotheses into question. In the context of GMM, the overidentifying restrictions may be tested via the commonly employed $J$ statistic of
- Exact
- Hansen (1982).
- Suffix
- This statistic is none other than the value of the GMM objective function (20), evaluated at the efficient GMM estimator $\hat{\beta}_{EGMM}$. Under the null, $J(\hat{\beta}_{EGMM}) = n\,g(\hat{\beta})'\hat{S}^{-1}g(\hat{\beta}) \overset{A}{\sim} \chi^2_{L-K}$ (41) In the case of heteroskedastic errors, the matrix $\hat{S}$ is estimated using the $\hat{\Omega}$ matrix (27), and the $J$ statistic becomes $J(\hat{\beta}_{EGMM}) = \hat{u}'Z(Z'\hat{\Omega}Z)^{-1}Z'\hat{u} \overset{A}{\sim} \chi^2_{L-K}$ (42) With clustered errors, the $\hat{\Omega}_C$ matrix (37) can be us

- 22
- Hausman, J. 1978. Specification tests in econometrics. Econometrica 46(3): 1251–1271.

Total in-text references: 4
- In-text reference with the coordinate start=53494
- Prefix
- Denote by $\hat{\beta}^c$ the estimator that is consistent under both the null and the alternative hypotheses, and by $\hat{\beta}^e$ the estimator that is fully efficient under the null but inconsistent if the null is not true. The
- Exact
- Hausman (1978)
- Suffix
- specification test takes the quadratic form $H = n(\hat{\beta}^c - \hat{\beta}^e)'D^{-}(\hat{\beta}^c - \hat{\beta}^e)$ where $D = \left(V(\hat{\beta}^c) - V(\hat{\beta}^e)\right)$ (45) and where $V(\hat{\beta})$ denotes a consistent estimate of the asymptotic variance of $\beta$, and the [Footnote 17: See Wooldridge (2002), pp. 58–61, and Wooldridge (1995) for more detailed discussion.] operator $^{-}$ denotes a generalized inverse.

- In-text reference with the coordinate start=55842
- Prefix
- , then the generalized inverse of $D$ is guaranteed to exist and a positive test statistic is guaranteed.$^{19}$ If the Hausman statistic is formed using the OLS estimate of the error variance, then the $D$ matrix in Equation (45) becomes $D = \hat{\sigma}^2_{OLS}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (47) This version of the endogeneity test was first proposed by Durbin (1954) and separately by Wu (1973) (his $T_4$ statistic) and
- Exact
- Hausman (1978).
- Suffix
- It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the $D$ matrix becomes $D = \hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) [Footnote 18] Readers should also bear in mind here and below that the estimates of the error variances may or may not have small-sample cor

- In-text reference with the coordinate start=57140
- Prefix
- In this alternate form, the matrix difference in the expression equivalent to (47) is positive definite and a generalized inverse is not required. See Bowden and Turkington (1984), pp. 50–51. This version of the statistic was proposed separately by Wu (1973) (his $T_3$ statistic) and
- Exact
- Hausman (1978).
- Suffix
- It can be obtained within Stata by using hausman with the (undocumented) sigmaless option. Use of hausman with the sigmamore or sigmaless options avoids the additional annoyance that because Stata’s hausman tries to deduce the correct degrees of freedom for the test from the rank of the matrix $D$, it may sometimes come up with the wrong answer.

- In-text reference with the coordinate start=68356
- Prefix
- Yet another asymptotically equivalent flavor of the DWH test is available for standard IV estimation under conditional homoskedasticity, and is included in the output of ivendog. This is the test statistic introduced by Wu (1973) (his $T_2$), and separately shown by
- Exact
- Hausman (1978) to
- Suffix
- be calculated straightforwardly through the use of auxiliary regressions. We will refer to it as the Wu–Hausman statistic.$^{24}$ Consider a simplified version of our basic model (1) with a single endogenous regressor $x_1$: $y = \beta_1 x_1 + X_2\beta_2 + u$, (49) with $X_2 \equiv Z_2$ assumed exogenous (including the constant, if one is specified) and with excluded instruments $Z_1$ as usual.

- 24
- Hayashi, F. 2000. Econometrics. 1st ed. Princeton, NJ: Princeton University Press.

Total in-text references: 7
- In-text reference with the coordinate start=6764
- Prefix
- The syntax diagrams for these commands are presented in the last section of the paper, and the electronic supplement presents annotated examples of their use. [Section 2: IV and GMM estimation] The “Generalized Method of Moments” was introduced by L. Hansen in his celebrated 1982 paper. There are a number of good modern texts that cover GMM, and one recent prominent text,
- Exact
- Hayashi (2000),
- Suffix
- presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000).

- In-text reference with the coordinate start=6985
- Prefix
- There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11;
- Exact
- Hayashi (2000),
- Suffix
- Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework. We then consider the issue of clustered errors, and finally turn to OLS. [Section 2.1: The method of instrumental variables] The equation to be estimated is, in matrix notation, $y = X\beta + u$, $E(uu') = \Omega$ (1) with typical row $y_i = X_i\beta + u_i$ (2) The matrix

- In-text reference with the coordinate start=25966
- Prefix
- The advantages of GMM over IV are clear: if heteroskedasticity is present, the GMM estimator is more efficient than the simple IV estimator, whereas if heteroskedasticity is not present, the GMM estimator is no worse asymptotically than the IV estimator. Nevertheless, the use of GMM does come with a price. The problem, as
- Exact
- Hayashi (2000)
- Suffix
- points out (p. 215), is that the optimal weighting matrix $\hat{S}$ at the core of efficient GMM is a function of fourth moments, and obtaining reasonable estimates of fourth moments may require very large sample sizes.

- In-text reference with the coordinate start=46653
- Prefix
- In these contexts, a “difference–in–Sargan” statistic may usefully be employed.$^{15}$ The test is known under other names as well, e.g., Ruud (2000) calls it the “distance difference” statistic, and
- Exact
- Hayashi (2000)
- Suffix
- follows Eichenbaum et al. (1988) and dubs it the $C$ statistic; we will use the latter term. The $C$ test allows us to test a subset of the original set of orthogonality conditions. The statistic is computed as the difference between two Sargan statistics (or, for efficient GMM, two $J$ statistics): that for the (restricted, fully efficient) regression using the entire set of overidentifying res

- In-text reference with the coordinate start=47626
- Prefix
- For included instruments, the $C$ test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The $C$ test, [Footnote 14: See Ahn (1995), Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2.] [Footnote 15: See
- Exact
- Hayashi (2000),
- Suffix
- pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations.] distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

- In-text reference with the coordinate start=48866
- Prefix
- More precisely, $\hat{S}$ from the restricted estimation is used to form the restricted $J$ statistic, and the submatrix of $\hat{S}$ with rows/columns corresponding to the unrestricted estimation is used to form the $J$ statistic for the unrestricted estimation; see
- Exact
- Hayashi (2000),
- Suffix
- p. 220. The $C$ test is conducted in ivreg2 by specifying the orthog option, and listing the instruments (either included or excluded) to be challenged. The equation must still be identified with these instruments either removed or reconsidered as endogenous if the $C$ statistic is to be calculated.

- In-text reference with the coordinate start=63284
- Prefix
- We can state this more precisely as follows: If $L^e - L^c \le K_1^c$, the $C$ statistic and the Hausman statistic are numerically [Footnote 21: Users beware: the sigmamore option following a robust estimation will not only fail to accomplish this, it will generate an invalid test statistic as well.] [Footnote 22: See Hausman and Taylor (1981) and Newey (1985), summarized by
- Exact
- Hayashi (2000),
- Suffix
- pp. 233–34.] equivalent.$^{23}$ If $L^e - L^c > K_1^c$, the two statistics will be numerically different, the $C$ statistic will have $L^e - L^c$ degrees of freedom, and the Hausman statistic will have $K_1^c$ degrees of freedom in the conditional homoskedasticity case (and an unknown number of degrees of freedom in the conditional heteroskedasticity case).

- 28
- Koenker, R. 1981. A note on Studentizing a test for heteroskedasticity. Journal of Econometrics 17: 107–112.

Total in-text references: 1
- In-text reference with the coordinate start=27592
- Prefix
- Breusch and Pagan (1979), Godfrey (1978), and Cook and Weisberg (1983) separately derived the same test statistic. This statistic is distributed as $\chi^2$ with $p$ degrees of freedom under the null of no heteroskedasticity, and under the maintained hypothesis that the error of the regression is normally distributed.
- Exact
- Koenker (1981)
- Suffix
- noted that the power of this test is very sensitive to the normality assumption, and presented a version of the test that relaxed this assumption. Koenker’s test statistic, also distributed as $\chi^2_p$ under the null, is easily obtained as $nR^2_c$, where $R^2_c$ is the centered $R^2$ from an auxiliary regression of the squared residuals from the original regression on the indicator variables.
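
The $nR^2_c$ recipe lends itself to a short Python/NumPy sketch. The function name `koenker_nr2` and the simulated data are hypothetical and serve only as an illustration; the auxiliary regression is assumed to include a constant alongside the $p$ indicator variables.

```python
import numpy as np


def koenker_nr2(resid, indicators):
    """n times the centered R^2 from regressing squared residuals
    on a constant and the indicator variables."""
    u2 = np.asarray(resid, dtype=float) ** 2
    n = u2.shape[0]
    W = np.column_stack([np.ones(n), indicators])
    coef, *_ = np.linalg.lstsq(W, u2, rcond=None)    # auxiliary regression
    ss_res = np.sum((u2 - W @ coef) ** 2)            # residual sum of squares
    ss_tot = np.sum((u2 - u2.mean()) ** 2)           # centered total sum of squares
    return n * (1.0 - ss_res / ss_tot)


# Simulated example (assumed data): error variance depends on the first indicator.
rng = np.random.default_rng(0)
n = 300
w = rng.uniform(size=(n, 2))
x = np.column_stack([np.ones(n), rng.normal(size=n)])
y = x @ np.array([1.0, 0.5]) + rng.normal(size=n) * (1.0 + 2.0 * w[:, 0])
b = np.linalg.solve(x.T @ x, x.T @ y)                # OLS on the original regression
stat = koenker_nr2(y - x @ b, w)
print(stat)  # compare with a chi-squared(p) critical value, here p = 2
```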

- 29
- Moulton, B. R. 1986. Random group effects and the precision of regression estimates. Journal of Econometrics 32: 385–397.

Total in-text references: 1
- In-text reference with the coordinate start=23451
- Prefix
- But users should take care that, if the cluster option is used, then it ought to be the case that $M \gg K$.$^{5}$ [Footnote 4] There are other approaches to dealing with clustering that put more structure on the $\Omega$ matrix and hence are more efficient but less robust. For example, the
- Exact
- Moulton (1986)
- Suffix
- approach to obtaining consistent standard errors is in effect to specify an “error components” (a.k.a. “random effects”) structure in Equation (36): $\Sigma_m$ is a matrix with diagonal elements $\sigma^2_u + \sigma^2_v$ and off-diagonal elements $\sigma^2_v$.

- 30
- Nakamura, A. and M. Nakamura. 1981. On the relationships among several specification error tests presented by Durbin, Wu, and Hausman. Econometrica 49(6): 1583–1588.

Total in-text references: 2
- In-text reference with the coordinate start=70176
- Prefix
- One advantage of the Wu–Hausman $F$-statistic over the other DWH tests for IV vs. OLS is that with certain normality assumptions, it is a finite sample test exactly distributed as $F$ (see Wu (1973) and Nakamura and
- Exact
- Nakamura (1981)). Wu (1974)
- Suffix
- ’s Monte Carlo studies also suggest that this statistic is to be preferred to the statistic using just $\hat{\sigma}^2_{IV}$. A version of the Wu–Hausman statistic for testing a subset of regressors is also available, as Davidson and MacKinnon (1993), pp. 241–242 point out.

- In-text reference with the coordinate start=72442
- Prefix
- Wu–Hausman $F$-statistic can be written Wu-Hausman: $F(K_{1B},\, n-K-K_{1B}) = \frac{Q^*/K_{1B}}{(USSR - Q^*)/(n-K-K_{1B})}$ (54) where $Q^*$ is the difference between the restricted and unrestricted sums of squares given by the auxiliary regression (51) or (52), and $USSR$ is the sum of squared residuals from the efficient estimate of the model.$^{25}$ From the discussion in the preceding section, [Footnote 25: See Wu (1973) or Nakamura and
- Exact
- Nakamura (1981).
- Suffix
- $Q^*$ can also be interpreted as the difference between the sums of squares of the second–stage estimation of the efficient model with and without] however, we know that for tests of the endogeneity of regressors, the $C$ statistic and the Hausman form of the DWH test are numerically equal, and when the error variance from the more efficient estimation is used, the Hausman form of the DWH test is the Durbi

- 31
- Newey, W. 1985. Generalized method of moments specification testing. Journal of Econometrics 29: 229–256.

Total in-text references: 1
- In-text reference with the coordinate start=63256
- Prefix
- We can state this more precisely as follows: If $L^e - L^c \le K_1^c$, the $C$ statistic and the Hausman statistic are numerically [Footnote 21: Users beware: the sigmamore option following a robust estimation will not only fail to accomplish this, it will generate an invalid test statistic as well.] [Footnote 22: See Hausman and Taylor (1981) and
- Exact
- Newey (1985),
- Suffix
- summarized by Hayashi (2000), pp. 233–34.] equivalent.$^{23}$ If $L^e - L^c > K_1^c$, the two statistics will be numerically different, the $C$ statistic will have $L^e - L^c$ degrees of freedom, and the Hausman statistic will have $K_1^c$ degrees of freedom in the conditional homoskedasticity case (and an unknown number of degrees of freedom in the conditional heteroskedasticity case).

- 34
- Ruud, P. A. 2000. An Introduction to Classical Econometric Theory. Oxford: Oxford

Total in-text references: 6
- In-text reference with the coordinate start=46592
- Prefix
- Another common problem arises when the researcher has prior suspicions about the validity of a subset of instruments, and wishes to test them. In these contexts, a “difference–in–Sargan” statistic may usefully be employed.$^{15}$ The test is known under other names as well, e.g.,
- Exact
- Ruud (2000)
- Suffix
- calls it the “distance difference” statistic, and Hayashi (2000) follows Eichenbaum et al. (1988) and dubs it the $C$ statistic; we will use the latter term. The $C$ test allows us to test a subset of the original set of orthogonality conditions.

- In-text reference with the coordinate start=47672
- Prefix
- For included instruments, the $C$ test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The $C$ test, [Footnote 14: See Ahn (1995), Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2.] [Footnote 15: See Hayashi (2000), pp. 218–21 and pp. 232–34 or
- Exact
- Ruud (2000),
- Suffix
- Chapter 22, for comprehensive presentations.] distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

- In-text reference with the coordinate start=49783
- Prefix
- This illustrates how the Hansen–Sargan overidentification test is an “omnibus” test for the failure of any of the instruments to satisfy the orthogonality conditions, but at the same time requires that the investigator believe that at least some of the instruments are valid; see
- Exact
- Ruud (2000),
- Suffix
- p. 577. [Section 4.5: Tests of overidentifying restrictions as Lagrange multiplier (score) tests] The Sargan test can be viewed as analogous to a Lagrange multiplier (LM) or score test.$^{16}$ In the case of OLS, the resemblance becomes exact.

- In-text reference with the coordinate start=51016
- Prefix
- If the gmm option is chosen, HOLS estimates are reported along with a robust LM statistic. As usual, the cluster option generates [Footnote 16: For a detailed discussion of the relationship between the different types of tests in a GMM framework, see
- Exact
- Ruud (2000),
- Suffix
- Chapter 22.] a statistic that is robust to arbitrary intra–cluster correlation. If the estimation method is OLS but the error is not homoskedastic, then the standard LM test is no longer valid. A heteroskedasticity–robust version is, however, available.$^{17}$ The robust LM statistic for OLS is numerically equivalent to the $J$ statistic from feasible efficient two–step GMM, i.e.

- In-text reference with the coordinate start=62737
- Prefix
- In the conditional heteroskedasticity case, the degrees of freedom will be $L^e - L^c$ if $L^e - L^c \le K_1^c$ but unknown otherwise (making the test impractical).$^{22}$ What, then, is the difference between the GMM $C$ test and the Hausman specification test? In fact, because the two estimators being tested are both GMM estimators, the Hausman specification test is a test of linear combinations of orthogonality conditions
- Exact
- (Ruud (2000),
- Suffix
- pp. 578–584). When the particular linear combination of orthogonality conditions being tested is the same for the $C$ test and for the Hausman test, the two test statistics will be numerically equivalent.

- In-text reference with the coordinate start=65078
- Prefix
- faces a trade–off when deciding which of the two tests to use: when the two tests differ, the Hausman test is a test of linear combinations of moment conditions, and is more powerful than the $C$ test at detecting violations on restrictions of these linear combinations, but the latter test will be able to detect other violations of moment conditions that the former test cannot. As
- Exact
- Ruud (2000),
- Suffix
- p. 585, points out, one of the appealing features of the Hausman test is that its particular linear combination of moment conditions also determines the consistency of the more efficient GMM estimator.

- 36
- Sargan, J. 1958. The estimation of economic relationships using instrumental variables. Econometrica 26(3): 393–415.

Total in-text references: 2
- In-text reference with the coordinate start=5138
- Prefix
- We may cast some light on whether the instruments satisfy the orthogonality conditions in the context of an overidentified model: that is, one in which a surfeit of instruments are available. In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to
- Exact
- Sargan (1958), Basmann (1960) and,
- Suffix
- in the GMM context, L. Hansen (1982), and show how the generalization of this test, the C or “difference–in–Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non–orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis-à-vis OLS.

- In-text reference with the coordinate start=41568
- Prefix
- analysis without testing the orthogonality assumptions is a “pious fraud”. 4.3 Overidentifying restrictions in IV In the special case of linear instrumental variables under conditional homoskedasticity, the concept of the J statistic considerably predates the development of GMM estimation techniques. The ivreg2 procedure routinely presents this test, labelled as Sargan’s statistic
- Exact
- (Sargan (1958))
- Suffix
- in the estimation output. Just as IV is a special case of GMM, Sargan’s statistic is a special case of Hansen’s J under the assumption of conditional homoskedasticity. Thus if we use the IV optimal weighting matrix (34) together with the expression for J (41), we obtain Sargan’s statistic $= \frac{1}{\hat\sigma^2}\hat u' Z (Z'Z)^{-1} Z' \hat u = \frac{\hat u' Z (Z'Z)^{-1} Z' \hat u}{\hat u'\hat u / n} = \frac{\hat u' P_Z \hat u}{\hat u'\hat u / n}$ (43) It is easy to see from (43) that Sargan’
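The quadratic form in (43) can be computed directly from the IV residuals. The following is a minimal NumPy sketch, not the ivreg2 implementation: the function name `sargan_statistic` and its argument layout are hypothetical, `X` is assumed to hold all regressors, and `Z` is assumed to hold all instruments (excluded instruments plus the exogenous regressors).

```python
import numpy as np

def sargan_statistic(y, X, Z):
    """Sargan's statistic u'P_Z u / (u'u / n), with u the IV (2SLS) residuals.

    Hypothetical helper for illustration; y is (n,), X is (n, K), Z is (n, L).
    """
    n = len(y)
    # P_Z X = Z (Z'Z)^{-1} Z'X, the projection of X on the instrument space
    PzX = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    # IV coefficients: (X'P_Z X)^{-1} X'P_Z y
    beta_iv = np.linalg.solve(PzX.T @ X, PzX.T @ y)
    u = y - X @ beta_iv
    Zu = Z.T @ u
    # Numerator u'Z (Z'Z)^{-1} Z'u, denominator u'u / n
    return (Zu @ np.linalg.solve(Z.T @ Z, Zu)) / (u @ u / n)
```

In an exactly identified equation the IV residuals are orthogonal to the instruments, so the statistic is numerically zero; it carries information only when there are more instruments than regressors.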

- In-text reference with the coordinate start=5138
- 37
- Shea, J. 1997. Instrument relevance in multivariate linear models: A simple measure. Review of Economics & Statistics 79(2): 348–352.

Total in-text references: 1- In-text reference with the coordinate start=35570
- Prefix
- The statistics proposed by Bound et al. are able to diagnose instrument relevance only in the presence of a single endogenous regressor. When multiple endogenous regressors are used, other statistics are required. One such statistic has been proposed by
- Exact
- Shea (1997)
- Suffix
- : a “partial $R^2$” measure that takes the intercorrelations among the instruments into account.9 For a model containing a single endogenous regressor, the two $R^2$ measures are equivalent. The distribution of Shea’s partial $R^2$ statistic has not been derived, but it may be interpreted like any $R^2$.

- In-text reference with the coordinate start=35570
- 39
- White, H. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–838. White, H. 1982. Instrumental variables regression with independent observations. Econometrica 50(2): 483–499. White, H. 1984. Asymptotic Theory for Econometricians. 1st ed. Orlando, FL: Academic

Total in-text references: 3- In-text reference with the coordinate start=28179
- Prefix
- , also distributed as $\chi^2_p$ under the null, is easily obtained as $nR^2_c$, where $R^2_c$ is the centered $R^2$ from an auxiliary regression of the squared residuals from the original regression on the indicator variables. When the indicator variables are the regressors of the original equation, their squares and their cross-products, Koenker’s test is identical to White’s $nR^2_c$ general test for heteroskedasticity
- Exact
- (White (1980)).
- Suffix
- These tests are available in Stata, following estimation with regress, using our ivhettest as well as via hettest and whitetst. As Pagan and Hall (1983) point out, the above tests will be valid tests for heteroskedasticity in an IV regression only if heteroskedasticity is present in that equation and nowhere else in the system.
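The $nR^2_c$ computation described in this excerpt can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the code behind ivhettest, hettest, or whitetst: the name `koenker_nr2` is hypothetical, and a constant is added to the auxiliary regression so that the centered $R^2$ is well defined.

```python
import numpy as np

def koenker_nr2(resid, Psi):
    """n times the centered R^2 from regressing squared residuals on [1, Psi].

    Hypothetical helper: resid is (n,), Psi is (n,) or (n, p) and holds the
    indicator variables WITHOUT a constant (the constant is added here).
    """
    n = len(resid)
    e2 = np.asarray(resid) ** 2
    W = np.column_stack([np.ones(n), Psi])
    coef, *_ = np.linalg.lstsq(W, e2, rcond=None)
    fitted = W @ coef
    ss_res = np.sum((e2 - fitted) ** 2)     # residual sum of squares
    ss_tot = np.sum((e2 - e2.mean()) ** 2)  # centered total sum of squares
    return n * (1.0 - ss_res / ss_tot)
```

Under the null the result would be compared with a $\chi^2$ critical value whose degrees of freedom equal the number of columns of `Psi`.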

- In-text reference with the coordinate start=29265
- Prefix
- Our implementation is of the simpler Pagan–Hall statistic, available with the command ivhettest after estimation by ivreg, ivreg2, or ivgmm0. We present the Pagan–Hall test here in the format and notation of the original
- Exact
- White (1980) and White (1982)
- Suffix
- tests, however, to facilitate comparisons with the other tests noted above.7 Let $\Psi$ be the $n \times p$ matrix of indicator variables hypothesized to be related to the heteroskedasticity in the equation, with typical row $\Psi_i$.

- In-text reference with the coordinate start=29737
- Prefix
- These indicator variables must be exogenous, typically either instruments or functions of the instruments. Common choices would be: 1. The levels, squares, and cross-products of the instruments Z (excluding the constant), as in the
- Exact
- White (1980)
- Suffix
- test. This is the default in ivhettest. 2. The levels only of the instruments Z (excluding the constant). This is available in ivhettest by specifying the ivlev option. 6 For a more detailed discussion, see Pagan and Hall (1983) or Godfrey (1988), pp. 189–90. 7 We note here that the original Pagan–Hall paper has a serious typo in the presentation of their non-normality-robust statistic.

- In-text reference with the coordinate start=28179
- 40
- Wooldridge, J. M. 1995. Score diagnostics for linear models estimated by two stage least squares. In Advances in Econometrics and Quantitative Economics: Essays in honor of Professor C. R. Rao, eds. G. S. Maddala, P. C. B. Phillips, and T. N. Srinivasan, 66–87. Cambridge, MA: Blackwell Publishers. Wooldridge, J. M. 2002. Econometric Analysis of Cross Section and Panel Data. 1st ed. Cambridge,

Total in-text references: 2- In-text reference with the coordinate start=47588
- Prefix
- For included instruments, the C test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The C test, 14 See Ahn (1995), Proposition 1, or, for an alternative formulation,
- Exact
- Wooldridge (1995),
- Suffix
- Procedure 3.2. 15 See Hayashi (2000), pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

- In-text reference with the coordinate start=53726
- Prefix
- The Hausman (1978) specification test takes the quadratic form $H = n(\hat\beta^c - \hat\beta^e)' D^{-} (\hat\beta^c - \hat\beta^e)$ where $D = \left( V(\hat\beta^c) - V(\hat\beta^e) \right)$ (45) and where $V(\hat\beta)$ denotes a consistent estimate of the asymptotic variance of $\beta$, and the 17 See Wooldridge (2002), pp. 58–61, and
- Exact
- Wooldridge (1995)
- Suffix
- for more detailed discussion. operator $^{-}$ denotes a generalized inverse. A Hausman statistic for a test of endogeneity in an IV regression is formed by choosing OLS as the efficient estimator $\hat\beta^e$ and IV as the inefficient but consistent estimator $\hat\beta^c$.

- In-text reference with the coordinate start=47588
- 41
- Wu, D.-M. 1973. Alternative tests of independence between stochastic regressors and disturbances. Econometrica 41(4): 733–750. Wu, D.-M. 1974. Alternative tests of independence between stochastic regressors and disturbances: Finite sample results. Econometrica 42(3): 529–546.

Total in-text references: 5- In-text reference with the coordinate start=55811
- Prefix
- If a common estimate of $\sigma$ is used, then the generalized inverse of $D$ is guaranteed to exist and a positive test statistic is guaranteed.19 If the Hausman statistic is formed using the OLS estimate of the error variance, then the $D$ matrix in Equation (45) becomes $D = \hat\sigma^2_{OLS} \left( (X'P_Z X)^{-1} - (X'X)^{-1} \right)$ (47) This version of the endogeneity test was first proposed by Durbin (1954) and separately by
- Exact
- Wu (1973)
- Suffix
- (his $T_4$ statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the $D$ matrix becomes $D = \hat\sigma^2_{IV} \left( (X'P_Z X)^{-1} - (X'X)^{-1} \right)$ (48) 18 Readers should also bear in mind here and below that the estimates of the error variance
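As a rough illustration of the version based on (47), the sketch below forms the quadratic form in the IV/OLS coefficient contrast with a generalized inverse. It is not Stata's hausman: the function name `durbin_hausman` is hypothetical, and the scaling is an assumption, namely that $D$ is treated as a difference of finite-sample covariance estimates, so no separate factor of $n$ multiplies the quadratic form.

```python
import numpy as np

def durbin_hausman(y, X, Z):
    """d' D^- d with D = sigma2_OLS * ((X'P_Z X)^{-1} - (X'X)^{-1}), as in (47).

    Hypothetical helper; X is (n, K) regressors, Z is (n, L) instruments.
    """
    n = len(y)
    PzX = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # P_Z X
    XPzX_inv = np.linalg.inv(PzX.T @ X)           # (X'P_Z X)^{-1}
    XX_inv = np.linalg.inv(X.T @ X)               # (X'X)^{-1}
    beta_iv = XPzX_inv @ (PzX.T @ y)              # consistent estimator
    beta_ols = XX_inv @ (X.T @ y)                 # efficient under the null
    u_ols = y - X @ beta_ols
    sigma2_ols = u_ols @ u_ols / n                # common error-variance estimate
    D = sigma2_ols * (XPzX_inv - XX_inv)
    d = beta_iv - beta_ols
    # D is rank deficient when X contains exogenous regressors, hence the
    # pseudo-inverse; rcond is an assumed tolerance for numerical zeros.
    return d @ np.linalg.pinv(D, rcond=1e-10) @ d
```

Replacing `sigma2_ols` with the error variance from the IV residuals would give the variant based on (48).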

- In-text reference with the coordinate start=57109
- Prefix
- In this alternate form, the matrix difference in the expression equivalent to (47) is positive definite and a generalized inverse is not required. See Bowden and Turkington (1984), pp. 50–51. This version of the statistic was proposed separately by
- Exact
- Wu (1973)
- Suffix
- (his $T_3$ statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the (undocumented) sigmaless option. Use of hausman with the sigmamore or sigmaless options avoids the additional annoyance that because Stata’s hausman tries to deduce the correct degrees of freedom for the test from the rank of the matrix $D$, it may sometimes come up with the wrong answer.

- In-text reference with the coordinate start=68308
- Prefix
- Yet another asymptotically equivalent flavor of the DWH test is available for standard IV estimation under conditional homoskedasticity, and is included in the output of ivendog. This is the test statistic introduced by
- Exact
- Wu (1973)
- Suffix
- (his $T_2$), and separately shown by Hausman (1978) to be calculated straightforwardly through the use of auxiliary regressions. We will refer to it as the Wu–Hausman statistic.24 Consider a simplified version of our basic model (1) with a single endogenous regressor $x_1$: $y = \beta_1 x_1 + X_2 \beta_2 + u$, (49) with $X_2 \equiv Z_2$ assumed exogenous (including the constant, if one is specified) and with excluded instruments $Z_1$ as usua

- In-text reference with the coordinate start=70148
- Prefix
- The test statistic then becomes an $F$-test, with numerator degrees of freedom equal to the number of included endogenous variables. One advantage of the Wu–Hausman $F$-statistic over the other DWH tests for IV vs. OLS is that with certain normality assumptions, it is a finite sample test exactly distributed as $F$ (see
- Exact
- Wu (1973) and
- Suffix
- Nakamura and Nakamura (1981)). Wu (1974)’s Monte Carlo studies also suggest that this statistic is to be preferred to the statistic using just $\hat\sigma^2_{IV}$. A version of the Wu–Hausman statistic for testing a subset of regressors is also available, as Davidson and MacKinnon (1993), pp. 241–242 point out.

- In-text reference with the coordinate start=72416
- Prefix
- ) $= \frac{Q^*}{USSR/n}$ (53) and the Wu–Hausman $F$-statistic can be written Wu–Hausman: $F(K_{1B},\, n-K-K_{1B}) = \frac{Q^*/K_{1B}}{(USSR - Q^*)/(n-K-K_{1B})}$ (54) where $Q^*$ is the difference between the restricted and unrestricted sums of squares given by the auxiliary regression (51) or (52), and $USSR$ is the sum of squared residuals from the efficient estimate of the model.25 From the discussion in the preceding section, 25 See
- Exact
- Wu (1973)
- Suffix
- or Nakamura and Nakamura (1981). $Q^*$ can also be interpreted as the difference between the sums of squares of the second–stage estimation of the efficient model with and without however, we know that for tests of the endogeneity of regressors, the C statistic and the Hausman form of the DWH test are numerically equal, and when the error variance from the more efficient estimation is used, the Hausman f
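A hedged sketch of how (54) might be evaluated follows. The auxiliary regressions (51) and (52) are not reproduced in this excerpt, so the code assumes the familiar augmented-regression form: OLS of $y$ on the regressors plus the first-stage residuals of the endogenous regressors. The function name `wu_hausman_f` and its argument layout are hypothetical; this is not the ivendog code.

```python
import numpy as np

def _rss(y, W):
    """Residual sum of squares from OLS of y on W."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    r = y - W @ coef
    return r @ r

def wu_hausman_f(y, X1, X2, Z1):
    """Wu-Hausman F from (54), under the assumed auxiliary regression.

    X1: endogenous regressors, X2: exogenous regressors (incl. constant),
    Z1: excluded instruments. Returns (F, numerator df, denominator df).
    """
    n = len(y)
    X1 = np.asarray(X1).reshape(n, -1)
    X2 = np.asarray(X2).reshape(n, -1)
    X = np.column_stack([X1, X2])
    Z = np.column_stack([Z1, X2])
    K, K1B = X.shape[1], X1.shape[1]
    # First-stage residuals of the endogenous regressors
    pi, *_ = np.linalg.lstsq(Z, X1, rcond=None)
    V = X1 - Z @ pi
    ussr = _rss(y, X)                                   # efficient (OLS) model
    q_star = ussr - _rss(y, np.column_stack([X, V]))    # restricted minus unrestricted
    df2 = n - K - K1B
    F = (q_star / K1B) / ((ussr - q_star) / df2)
    return F, K1B, df2
```

With a single endogenous regressor this reduces to the squared t-statistic on the first-stage residual in the augmented regression.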

- In-text reference with the coordinate start=55811