The 22 references with contexts in paper Christopher F Baum, Mark E. Schaffer, Steven Stillman (2002) “Instrumental variables and GMM: Estimation and testing” / RePEc:boc:bocoec:545

1
Ahn, S. C. 1995. Robust GMM Tests for Model Specification. Arizona State University (Working Paper).
Total in-text references: 1
  1. In-text reference with the coordinate start=47525
    Prefix
    For excluded instruments, this is equivalent to dropping them from the instrument list. For included instruments, the C test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The C test, [14] See
    Exact
    Ahn (1995),
    Suffix
    Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2. [15] See Hayashi (2000), pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are pro

2
Arellano, M. 1987. Computing robust standard errors for within–groups estimators.
Total in-text references: 1
  1. In-text reference with the coordinate start=21869
    Prefix
    0 $\hat{\Sigma}_M$ (36) then an estimator of S that is consistent in the presence of arbitrary intra–cluster correlation is $\hat{S} = \frac{1}{n}(Z'\hat{\Omega}_C Z)$ (37) The earliest reference to this approach to robust estimation in the presence of clustering of which we are aware is White (1984), pp. 135–6. It is commonly employed in the context of panel data estimation; see Wooldridge (2002), p. 193,
    Exact
    Arellano (1987) and
    Suffix
    Kézdi (2002). It is the standard Stata approach to clustering, implemented in, e.g., robust, regress and ivreg2.[4] The cluster–robust covariance matrix for IV estimation is obtained exactly as in the preceding subsection except using $\hat{S}$ as defined in Equation (37).
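
The estimator in Equation (37) can be illustrated with a short sketch. The Python below is not the ivreg2 implementation; it assumes the usual construction in which each diagonal block of $\hat{\Omega}_C$ is the outer product of that cluster's residual vector, so that $\hat{S} = \frac{1}{n}\sum_m (Z_m'\hat{u}_m)(Z_m'\hat{u}_m)'$. The function name and argument layout are hypothetical.

```python
from collections import defaultdict


def cluster_robust_S(Z, u, cluster_ids):
    """Sketch of S_hat = (1/n) * sum_m (Z_m' u_m)(Z_m' u_m)'.

    Z           : list of n rows, each a list of L instrument values
    u           : list of n residuals
    cluster_ids : list of n hashable cluster labels
    (Hypothetical interface, for illustration only.)
    """
    n = len(u)
    L = len(Z[0])
    # Accumulate the L-vector Z_m' u_m for each cluster m.
    scores = defaultdict(lambda: [0.0] * L)
    for z_row, u_i, cid in zip(Z, u, cluster_ids):
        s = scores[cid]
        for j in range(L):
            s[j] += z_row[j] * u_i
    # Sum the outer products of the cluster scores and scale by 1/n.
    S = [[0.0] * L for _ in range(L)]
    for s in scores.values():
        for j in range(L):
            for k in range(L):
                S[j][k] += s[j] * s[k] / n
    return S
```

If every observation is given its own cluster label, the same function returns the heteroskedasticity-robust form $\frac{1}{n}\sum_i \hat{u}_i^2 Z_i'Z_i$, which may be a convenient check.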

4
Basmann, R. 1960. On finite sample distributions of generalized classical linear identifiability test statistics. Journal of the American Statistical Association 55(292): 650–659.
Total in-text references: 2
  1. In-text reference with the coordinate start=5138
    Prefix
    We may cast some light on whether the instruments satisfy the orthogonality conditions in the context of an overidentified model: that is, one in which a surfeit of instruments are available. In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to
    Exact
    Sargan (1958), Basmann (1960) and,
    Suffix
    in the GMM context, L. Hansen (1982), and show how the generalization of this test, the C or “difference–in–Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non–orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis-à-vis OLS.

  2. In-text reference with the coordinate start=42832
    Prefix
    The literature contains several variations on this test. The main idea behind these variations is that there is more than one way to consistently estimate the variance in the denominator of (43). The most important of these is that of
    Exact
    Basmann (1960).
    Suffix
    Independently of Sargan, Basmann proposed an $F(L-K,\, n-L)$-test of overidentifying restrictions: Basmann’s F-statistic $= \dfrac{\hat{u}'P_Z\hat{u}/(L-K)}{\hat{u}'M_Z\hat{u}/(n-L)}$ (44) where $M_Z \equiv I - P_Z$ is the “annihilator” matrix and L is the total number of instruments.
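
As an illustration of Equations (43) and (44), the sketch below computes Sargan's statistic and Basmann's F-statistic from a vector of IV residuals and the instrument matrix. It is a hedged, stdlib-only example rather than the ivreg2 code: the function names, the list-of-rows layout, and the small Gaussian-elimination helper are assumptions made for this sketch, and the residuals are taken as given (they should come from the IV regression).

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(b_i)] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def overid_stats(u, Z, K):
    """Return (Sargan statistic, Basmann F) for residuals u, instruments Z.

    u : list of n IV residuals
    Z : list of n rows with L instrument values each
    K : number of regressors (requires L > K and n > L)
    """
    n, L = len(u), len(Z[0])
    if L <= K or n <= L:
        raise ValueError("need n > L > K")
    # u' P_Z u = (Z'u)' (Z'Z)^{-1} (Z'u), avoiding the n x n projection matrix.
    Ztu = [sum(Z[i][j] * u[i] for i in range(n)) for j in range(L)]
    ZtZ = [[sum(Z[i][j] * Z[i][k] for i in range(n)) for k in range(L)]
           for j in range(L)]
    uPu = sum(a * b for a, b in zip(Ztu, solve(ZtZ, Ztu)))
    uu = sum(v * v for v in u)
    sargan = uPu / (uu / n)                               # Equation (43)
    basmann_f = (uPu / (L - K)) / ((uu - uPu) / (n - L))  # Equation (44)
    return sargan, basmann_f
```

Because $\hat{u}'M_Z\hat{u} = \hat{u}'\hat{u} - \hat{u}'P_Z\hat{u}$, the two statistics are monotone transformations of one another, which is why only the numerator quadratic form needs to be computed.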

8
Chamberlain, G. 1982. Multivariate regression models for panel data. Journal of Econometrics 18: 5–46.
Total in-text references: 1
  1. In-text reference with the coordinate start=24741
    Prefix
    but correct inference is still possible through the use of the Eicker–Huber–White “sandwich” robust covariance estimator, and this estimator can also be derived using the general formula for the asymptotic variance of a GMM estimator with a sub–optimal weighting matrix, Equation (24). A natural question is whether a more efficient GMM estimator exists, and the answer is “yes”
    Exact
    (Chamberlain (1982), Cragg (1983)).
    Suffix
    If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of Cragg (1983), dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600.

10
Cragg, J. 1983. More efficient estimation in the presence of heteroskedasticity of unknown form. Econometrica 51: 751–763.
Total in-text references: 2
  1. In-text reference with the coordinate start=24741
    Prefix
    but correct inference is still possible through the use of the Eicker–Huber–White “sandwich” robust covariance estimator, and this estimator can also be derived using the general formula for the asymptotic variance of a GMM estimator with a sub–optimal weighting matrix, Equation (24). A natural question is whether a more efficient GMM estimator exists, and the answer is “yes”
    Exact
    (Chamberlain (1982), Cragg (1983)).
    Suffix
    If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of Cragg (1983), dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600.

  2. In-text reference with the coordinate start=25077
    Prefix
    If the disturbance is heteroskedastic, there are no endogenous regressors, and the researcher has available additional moment conditions, i.e., additional variables that do not appear in the regression but that are known to be exogenous, then the efficient GMM estimator is that of
    Exact
    Cragg (1983),
    Suffix
    dubbed “heteroskedastic OLS” (HOLS) by Davidson and MacKinnon (1993), p. 600. It can be obtained in precisely the same way as feasible efficient two–step GMM except now the first–step inefficient but consistent estimator used to generate the residuals is OLS rather than IV.

13
Durbin, J. 1954. Errors in variables. Review of the International Statistical Institute 22: 23–32.
Total in-text references: 1
  1. In-text reference with the coordinate start=55780
    Prefix
    If a common estimate of σ is used, then the generalized inverse of D is guaranteed to exist and a positive test statistic is guaranteed.[19] If the Hausman statistic is formed using the OLS estimate of the error variance, then the D matrix in Equation (45) becomes $D = \hat{\sigma}^2_{OLS}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (47) This version of the endogeneity test was first proposed by
    Exact
    Durbin (1954) and
    Suffix
    separately by Wu (1973) (his $T_4$ statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the D matrix becomes $D = \hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) [18] Readers should also bear in mind here and below that the estimat

15
Godfrey, L. G. 1978. Testing for multiplicative heteroskedasticity. Journal of Econometrics 8: 227–236. Godfrey, L. G. 1988. Misspecification tests in econometrics: The Lagrange multiplier principle and other approaches. Cambridge: Cambridge University Press. Godfrey, L. G. 1999. Instrument relevance in multivariate linear models. Review of Economics &
Total in-text references: 1
  1. In-text reference with the coordinate start=27283
    Prefix
    this test in the next section. 3 Testing for heteroskedasticity The Breusch–Pagan/Godfrey/Cook–Weisberg and White/Koenker statistics are standard tests of the presence of heteroskedasticity in an OLS regression. The principle is to test for a relationship between the residuals of the regression and p indicator variables that are hypothesized to be related to the heteroskedasticity. Breusch and
    Exact
    Pagan (1979), Godfrey (1978), and
    Suffix
    Cook and Weisberg (1983) separately derived the same test statistic. This statistic is distributed as $\chi^2$ with p degrees of freedom under the null of no heteroskedasticity, and under the maintained hypothesis that the error of the regression is normally distributed.

16
Greene, W. H. 2000. Econometric Analysis. 4th ed. Upper Saddle River, NJ: Prentice–Hall.
Total in-text references: 3
  1. In-text reference with the coordinate start=7077
    Prefix
    A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and
    Exact
    Greene (2000).
    Suffix
    We begin with the standard IV estimator, and then relate it to the GMM framework. We then consider the issue of clustered errors, and finally turn to OLS. 2.1 The method of instrumental variables The equation to be estimated is, in matrix notation, $y = X\beta + u$, $E(uu') = \Omega$ (1) with typical row $y_i = X_i\beta + u_i$ (2) The matrix of regressors X is n × K, where n is the number of observations.

  2. In-text reference with the coordinate start=10444
    Prefix
    n (15) we obtain the estimated asymptotic variance–covariance matrix of the IV estimator: $V(\hat{\beta}_{IV}) = \hat{\sigma}^2(X'Z(Z'Z)^{-1}Z'X)^{-1} = \hat{\sigma}^2(X'P_ZX)^{-1}$ (16) Note that some packages, including Stata’s ivreg, include a degrees–of–freedom correction to the estimate of $\hat{\sigma}^2$ by replacing n with n − L. This correction is not necessary, however, since the estimate of $\hat{\sigma}^2$ would not be unbiased anyway
    Exact
    (Greene (2000),
    Suffix
    p. 373). Our ivreg2 routine defaults to the large–sample formulas for the estimated error variance and covariance matrix; the user can request the small–sample versions with the option small. 2.2 The Generalized Method of Moments The standard IV estimator is a special case of a Generalized Method of Moments (GMM) estimator.

  3. In-text reference with the coordinate start=56482
    Prefix
    $\hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) [18] Readers should also bear in mind here and below that the estimates of the error variances may or may not have small-sample corrections, according to the estimation package used and the options chosen. If one of the variance-covariance matrices in D uses a small-sample correction, then so should the other. [19] The matrix difference in (47) and (48) has rank $K_1$; see
    Exact
    Greene (2000),
    Suffix
    pp. 384–385. Intuitively, the variables being tested are those not shared by X and Z, namely the $K_1$ endogenous regressors $X_1$. The Hausman statistic for the endogeneity test can also be expressed in terms of a test of the coefficients of the endogenous regressors alone and the rest of the βs removed.

18
Hansen, B. E. 2000. Econometrics. 1st ed. Madison, WI: http://www.ssc.wisc.edu/~bhansen/notes/notes.htm.
Total in-text references: 2
  1. In-text reference with the coordinate start=6903
    Prefix
    There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is
    Exact
    Hansen (2000).
    Suffix
    The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework.

  2. In-text reference with the coordinate start=6954
    Prefix
    There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on
    Exact
    Hansen (2000),
    Suffix
    Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework.

19
Hansen, L. 1982. Large sample properties of generalized method of moments estimators. Econometrica 50(3): 1029–1054.
Total in-text references: 3
  1. In-text reference with the coordinate start=2244
    Prefix
    The conventional IV estimator (though consistent) is, however, inefficient in the presence of heteroskedasticity. The usual approach today when facing heteroskedasticity of unknown form is to use the Generalized Method of Moments (GMM), introduced by L.
    Exact
    Hansen (1982).
    Suffix
    GMM makes use of the orthogonality conditions to allow for efficient estimation in the presence of heteroskedasticity of unknown form. In the twenty years since it was first introduced, GMM has become a very popular tool among empirical researchers.

  2. In-text reference with the coordinate start=5196
    Prefix
    In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to Sargan (1958), Basmann (1960) and, in the GMM context, L.
    Exact
    Hansen (1982), and
    Suffix
    show how the generalization of this test, the C or “difference–in–Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non–orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis-à-vis OLS.

  3. In-text reference with the coordinate start=39315
    Prefix
    as a standard diagnostic in any overidentified instrumental variables estimation.[11] These are tests of the joint hypotheses of correct model specification and the orthogonality conditions, and a rejection may properly call either or both of those hypotheses into question. In the context of GMM, the overidentifying restrictions may be tested via the commonly employed J statistic of
    Exact
    Hansen (1982).
    Suffix
    This statistic is none other than the value of the GMM objective function (20), evaluated at the efficient GMM estimator $\hat{\beta}_{EGMM}$. Under the null, $J(\hat{\beta}_{EGMM}) = n\,g(\hat{\beta})'\hat{S}^{-1}g(\hat{\beta}) \overset{A}{\sim} \chi^2_{L-K}$ (41) In the case of heteroskedastic errors, the matrix $\hat{S}$ is estimated using the $\hat{\Omega}$ matrix (27), and the J statistic becomes $J(\hat{\beta}_{EGMM}) = \hat{u}'Z(Z'\hat{\Omega}Z)^{-1}Z'\hat{u} \overset{A}{\sim} \chi^2_{L-K}$ (42) With clustered errors, the $\hat{\Omega}_C$ matrix (37) can be us

22
Hausman, J. 1978. Specification tests in econometrics. Econometrica 46(3): 1251–1271.
Total in-text references: 4
  1. In-text reference with the coordinate start=53494
    Prefix
    Denote by $\hat{\beta}^c$ the estimator that is consistent under both the null and the alternative hypotheses, and by $\hat{\beta}^e$ the estimator that is fully efficient under the null but inconsistent if the null is not true. The
    Exact
    Hausman (1978)
    Suffix
    specification test takes the quadratic form $H = n(\hat{\beta}^c - \hat{\beta}^e)'D^{-}(\hat{\beta}^c - \hat{\beta}^e)$ where $D = \left(V(\hat{\beta}^c) - V(\hat{\beta}^e)\right)$ (45) and where $V(\hat{\beta})$ denotes a consistent estimate of the asymptotic variance of β, and the [17] See Wooldridge (2002), pp. 58–61, and Wooldridge (1995) for more detailed discussion. operator $^{-}$ denotes a generalized inverse.

  2. In-text reference with the coordinate start=55842
    Prefix
    , then the generalized inverse of D is guaranteed to exist and a positive test statistic is guaranteed.[19] If the Hausman statistic is formed using the OLS estimate of the error variance, then the D matrix in Equation (45) becomes $D = \hat{\sigma}^2_{OLS}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (47) This version of the endogeneity test was first proposed by Durbin (1954) and separately by Wu (1973) (his $T_4$ statistic) and
    Exact
    Hausman (1978).
    Suffix
    It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the D matrix becomes $D = \hat{\sigma}^2_{IV}\left((X'P_ZX)^{-1} - (X'X)^{-1}\right)$ (48) [18] Readers should also bear in mind here and below that the estimates of the error variances may or may not have small-sample cor

  3. In-text reference with the coordinate start=57140
    Prefix
    In this alternate form, the matrix difference in the expression equivalent to (47) is positive definite and a generalized inverse is not required. See Bowden and Turkington (1984), pp. 50–51. This version of the statistic was proposed separately by Wu (1973) (his $T_3$ statistic) and
    Exact
    Hausman (1978).
    Suffix
    It can be obtained within Stata by using hausman with the (undocumented) sigmaless option. Use of hausman with the sigmamore or sigmaless options avoids the additional annoyance that because Stata’s hausman tries to deduce the correct degrees of freedom for the test from the rank of the matrix D, it may sometimes come up with the wrong answer.

  4. In-text reference with the coordinate start=68356
    Prefix
    Yet another asymptotically equivalent flavor of the DWH test is available for standard IV estimation under conditional homoskedasticity, and is included in the output of ivendog. This is the test statistic introduced by Wu (1973) (his $T_2$), and separately shown by
    Exact
    Hausman (1978) to
    Suffix
    be calculated straightforwardly through the use of auxiliary regressions. We will refer to it as the Wu–Hausman statistic.[24] Consider a simplified version of our basic model (1) with a single endogenous regressor $x_1$: $y = \beta_1 x_1 + X_2\beta_2 + u$, (49) with $X_2 \equiv Z_2$ assumed exogenous (including the constant, if one is specified) and with excluded instruments $Z_1$ as usual.

24
Hayashi, F. 2000. Econometrics. 1st ed. Princeton, NJ: Princeton University Press.
Total in-text references: 7
  1. In-text reference with the coordinate start=6764
    Prefix
    The syntax diagrams for these commands are presented in the last section of the paper, and the electronic supplement presents annotated examples of their use. 2 IV and GMM estimation The “Generalized Method of Moments” was introduced by L. Hansen in his celebrated 1982 paper. There are a number of good modern texts that cover GMM, and one recent prominent text,
    Exact
    Hayashi (2000),
    Suffix
    presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11; Hayashi (2000), Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000).

  2. In-text reference with the coordinate start=6985
    Prefix
    There are a number of good modern texts that cover GMM, and one recent prominent text, Hayashi (2000), presents virtually all the estimation techniques discussed in the GMM framework. A concise on–line text that covers GMM is Hansen (2000). The exposition below draws on Hansen (2000), Chapter 11;
    Exact
    Hayashi (2000),
    Suffix
    Chapter 3; Wooldridge (2002), Chapter 8; Davidson and MacKinnon (1993), and Greene (2000). We begin with the standard IV estimator, and then relate it to the GMM framework. We then consider the issue of clustered errors, and finally turn to OLS. 2.1 The method of instrumental variables The equation to be estimated is, in matrix notation, $y = X\beta + u$, $E(uu') = \Omega$ (1) with typical row $y_i = X_i\beta + u_i$ (2) The matrix

  3. In-text reference with the coordinate start=25966
    Prefix
    The advantages of GMM over IV are clear: if heteroskedasticity is present, the GMM estimator is more efficient than the simple IV estimator, whereas if heteroskedasticity is not present, the GMM estimator is no worse asymptotically than the IV estimator. Nevertheless, the use of GMM does come with a price. The problem, as
    Exact
    Hayashi (2000)
    Suffix
    points out (p. 215), is that the optimal weighting matrix $\hat{S}$ at the core of efficient GMM is a function of fourth moments, and obtaining reasonable estimates of fourth moments may require very large sample sizes.

  4. In-text reference with the coordinate start=46653
    Prefix
    In these contexts, a “difference–in–Sargan” statistic may usefully be employed.[15] The test is known under other names as well, e.g., Ruud (2000) calls it the “distance difference” statistic, and
    Exact
    Hayashi (2000)
    Suffix
    follows Eichenbaum et al. (1988) and dubs it the C statistic; we will use the latter term. The C test allows us to test a subset of the original set of orthogonality conditions. The statistic is computed as the difference between two Sargan statistics (or, for efficient GMM, two J statistics): that for the (restricted, fully efficient) regression using the entire set of overidentifying res

  5. In-text reference with the coordinate start=47626
    Prefix
    For included instruments, the C test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The C test, [14] See Ahn (1995), Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2. [15] See
    Exact
    Hayashi (2000),
    Suffix
    pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

  6. In-text reference with the coordinate start=48866
    Prefix
    More precisely, $\hat{S}$ from the restricted estimation is used to form the restricted J statistic, and the submatrix of $\hat{S}$ with rows/columns corresponding to the unrestricted estimation is used to form the J statistic for the unrestricted estimation; see
    Exact
    Hayashi (2000),
    Suffix
    p. 220. The C test is conducted in ivreg2 by specifying the orthog option, and listing the instruments (either included or excluded) to be challenged. The equation must still be identified with these instruments either removed or reconsidered as endogenous if the C statistic is to be calculated.

  7. In-text reference with the coordinate start=63284
    Prefix
    We can state this more precisely as follows: If $L^e - L^c \le K_1^c$, the C statistic and the Hausman statistic are numerically [21] Users beware: the sigmamore option following a robust estimation will not only fail to accomplish this, it will generate an invalid test statistic as well. [22] See Hausman and Taylor (1981) and Newey (1985), summarized by
    Exact
    Hayashi (2000),
    Suffix
    pp. 233–34. equivalent.[23] If $L^e - L^c > K_1^c$, the two statistics will be numerically different, the C statistic will have $L^e - L^c$ degrees of freedom, and the Hausman statistic will have $K_1^c$ degrees of freedom in the conditional homoskedasticity case (and an unknown number of degrees of freedom in the conditional heteroskedasticity case).

28
Koenker, R. 1981. A note on Studentizing a test for heteroskedasticity. Journal of Econometrics 17: 107–112.
Total in-text references: 1
  1. In-text reference with the coordinate start=27592
    Prefix
    Breusch and Pagan (1979), Godfrey (1978), and Cook and Weisberg (1983) separately derived the same test statistic. This statistic is distributed as $\chi^2$ with p degrees of freedom under the null of no heteroskedasticity, and under the maintained hypothesis that the error of the regression is normally distributed.
    Exact
    Koenker (1981)
    Suffix
    noted that the power of this test is very sensitive to the normality assumption, and presented a version of the test that relaxed this assumption. Koenker’s test statistic, also distributed as $\chi^2_p$ under the null, is easily obtained as $nR^2_c$, where $R^2_c$ is the centered $R^2$ from an auxiliary regression of the squared residuals from the original regression on the indicator variables.
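
Koenker's $nR^2_c$ statistic can be sketched for the simplest case of a single indicator variable (p = 1) plus a constant in the auxiliary regression. In that case the centered $R^2$ equals the squared sample correlation between the squared residuals and the indicator, so no matrix algebra is needed. The function name and inputs below are hypothetical, and the sketch does not cover p > 1.

```python
def koenker_nr2(residuals, indicator):
    """Koenker-style n * R^2 statistic for one indicator variable.

    Regresses squared residuals on a constant and `indicator`; with a
    single regressor the centered R^2 is the squared correlation.
    (Illustrative sketch; covers the p = 1 case only.)
    """
    n = len(residuals)
    e2 = [e * e for e in residuals]
    mean_y = sum(e2) / n
    mean_x = sum(indicator) / n
    sxx = sum((x - mean_x) ** 2 for x in indicator)
    syy = sum((y - mean_y) ** 2 for y in e2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(indicator, e2))
    if sxx == 0 or syy == 0:
        raise ValueError("indicator and squared residuals must both vary")
    r2_centered = sxy ** 2 / (sxx * syy)
    return n * r2_centered
```

Under the null the returned value would be compared with a $\chi^2$ distribution with one degree of freedom.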

29
Moulton, B. R. 1986. Random group effects and the precision of regression estimates. Journal of Econometrics 32: 385–397.
Total in-text references: 1
  1. In-text reference with the coordinate start=23451
    Prefix
    But users should take care that, if the cluster option is used, then it ought to be the case that M >> K.[5] [4] There are other approaches to dealing with clustering that put more structure on the Ω matrix and hence are more efficient but less robust. For example, the
    Exact
    Moulton (1986)
    Suffix
    approach to obtaining consistent standard errors is in effect to specify an “error components” (a.k.a. “random effects”) structure in Equation (36): $\Sigma_m$ is a matrix with diagonal elements $\sigma^2_u + \sigma^2_v$ and off-diagonal elements $\sigma^2_v$.
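
The error-components block described above is easy to write down explicitly. The following sketch builds $\Sigma_m$ for a cluster of a given size; the function name and parameter names are hypothetical, chosen only for this illustration.

```python
def error_components_sigma(cluster_size, sigma2_u, sigma2_v):
    """Build the cluster block Sigma_m under an error-components structure.

    Diagonal elements are sigma2_u + sigma2_v; off-diagonal elements
    are sigma2_v. Returns a list of lists.
    """
    return [
        [sigma2_u + sigma2_v if i == j else sigma2_v
         for j in range(cluster_size)]
        for i in range(cluster_size)
    ]


# Example: a cluster of three observations.
block = error_components_sigma(3, sigma2_u=1.0, sigma2_v=0.5)
# The implied correlation between two observations in the same cluster
# is the off-diagonal element divided by the diagonal element.
intra_cluster_corr = block[0][1] / block[0][0]
```

Only two variance parameters describe the whole block, which is the extra structure that makes this approach more efficient but less robust than the unrestricted cluster estimator.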

30
Nakamura, A. and M. Nakamura. 1981. On the relationships among several specification error tests presented by Durbin, Wu, and Hausman. Econometrica 49(6): 1583–1588.
Total in-text references: 2
  1. In-text reference with the coordinate start=70176
    Prefix
    One advantage of the Wu–Hausman F-statistic over the other DWH tests for IV vs. OLS is that with certain normality assumptions, it is a finite sample test exactly distributed as F (see Wu (1973) and Nakamura and
    Exact
    Nakamura (1981)). Wu (1974)
    Suffix
    ’s Monte Carlo studies also suggest that this statistic is to be preferred to the statistic using just $\hat{\sigma}^2_{IV}$. A version of the Wu–Hausman statistic for testing a subset of regressors is also available, as Davidson and MacKinnon (1993), pp. 241–242 point out.

  2. In-text reference with the coordinate start=72442
    Prefix
    Wu–Hausman F-statistic can be written Wu–Hausman: $F(K_{1B},\, n-K-K_{1B}) = \dfrac{Q^*/K_{1B}}{(USSR - Q^*)/(n-K-K_{1B})}$ (54) where $Q^*$ is the difference between the restricted and unrestricted sums of squares given by the auxiliary regression (51) or (52), and USSR is the sum of squared residuals from the efficient estimate of the model.[25] From the discussion in the preceding section, [25] See Wu (1973) or Nakamura and
    Exact
    Nakamura (1981).
    Suffix
    $Q^*$ can also be interpreted as the difference between the sums of squares of the second–stage estimation of the efficient model with and without however, we know that for tests of the endogeneity of regressors, the C statistic and the Hausman form of the DWH test are numerically equal, and when the error variance from the more efficient estimation is used, the Hausman form of the DWH test is the Durbi
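
Equation (54) is plain arithmetic once $Q^*$ and USSR are available, as the following sketch shows. It takes those two sums of squares as given; the function and argument names are hypothetical, and obtaining $Q^*$ from the auxiliary regression is outside the scope of the example.

```python
def wu_hausman_f(q_star, ussr, n, K, K1B):
    """Wu-Hausman F-statistic as written in Equation (54).

    q_star : difference between restricted and unrestricted sums of squares
    ussr   : sum of squared residuals from the efficient estimate
    n      : number of observations
    K      : number of regressors
    K1B    : number of regressors being tested for endogeneity
    Returns (F, numerator df, denominator df).
    """
    df_num = K1B
    df_den = n - K - K1B
    if df_num <= 0 or df_den <= 0:
        raise ValueError("degrees of freedom must be positive")
    if ussr <= q_star:
        raise ValueError("USSR must exceed Q*")
    f_stat = (q_star / df_num) / ((ussr - q_star) / df_den)
    return f_stat, df_num, df_den


# Made-up inputs, purely to show the call:
f_stat, df_num, df_den = wu_hausman_f(q_star=2.0, ussr=12.0, n=25, K=4, K1B=1)
```

The returned degrees of freedom are the ones needed to look up the F distribution named in the equation.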

31
Newey, W. 1985. Generalized method of moments specification testing. Journal of Econometrics 29: 229–256.
Total in-text references: 1
  1. In-text reference with the coordinate start=63256
    Prefix
    We can state this more precisely as follows: If $L^e - L^c \le K_1^c$, the C statistic and the Hausman statistic are numerically [21] Users beware: the sigmamore option following a robust estimation will not only fail to accomplish this, it will generate an invalid test statistic as well. [22] See Hausman and Taylor (1981) and
    Exact
    Newey (1985),
    Suffix
    summarized by Hayashi (2000), pp. 233–34. equivalent.[23] If $L^e - L^c > K_1^c$, the two statistics will be numerically different, the C statistic will have $L^e - L^c$ degrees of freedom, and the Hausman statistic will have $K_1^c$ degrees of freedom in the conditional homoskedasticity case (and an unknown number of degrees of freedom in the conditional heteroskedasticity case).

34
Ruud, P. A. 2000. An Introduction to Classical Econometric Theory. Oxford: Oxford
Total in-text references: 6
  1. In-text reference with the coordinate start=46592
    Prefix
    Another common problem arises when the researcher has prior suspicions about the validity of a subset of instruments, and wishes to test them. In these contexts, a “difference–in–Sargan” statistic may usefully be employed.[15] The test is known under other names as well, e.g.,
    Exact
    Ruud (2000)
    Suffix
    calls it the “distance difference” statistic, and Hayashi (2000) follows Eichenbaum et al. (1988) and dubs it the C statistic; we will use the latter term. The C test allows us to test a subset of the original set of orthogonality conditions.

  2. In-text reference with the coordinate start=47672
    Prefix
    For included instruments, the C test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The C test, [14] See Ahn (1995), Proposition 1, or, for an alternative formulation, Wooldridge (1995), Procedure 3.2. [15] See Hayashi (2000), pp. 218–21 and pp. 232–34 or
    Exact
    Ruud (2000),
    Suffix
    Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

  3. In-text reference with the coordinate start=49783
    Prefix
    This illustrates how the Hansen–Sargan overidentification test is an “omnibus” test for the failure of any of the instruments to satisfy the orthogonality conditions, but at the same time requires that the investigator believe that at least some of the instruments are valid; see
    Exact
    Ruud (2000),
    Suffix
    p. 577. 4.5 Tests of overidentifying restrictions as Lagrange multiplier (score) tests The Sargan test can be viewed as analogous to a Lagrange multiplier (LM) or score test.[16] In the case of OLS, the resemblance becomes exact.

  4. In-text reference with the coordinate start=51016
    Prefix
    If the gmm option is chosen, HOLS estimates are reported along with a robust LM statistic. As usual, the cluster option generates [16] For a detailed discussion of the relationship between the different types of tests in a GMM framework, see
    Exact
    Ruud (2000),
    Suffix
    Chapter 22. a statistic that is robust to arbitrary intra–cluster correlation. If the estimation method is OLS but the error is not homoskedastic, then the standard LM test is no longer valid. A heteroskedasticity–robust version is, however, available.[17] The robust LM statistic for OLS is numerically equivalent to the J statistic from feasible efficient two–step GMM, i.e.

  5. In-text reference with the coordinate start=62737
    Prefix
    In the conditional heteroskedasticity case, the degrees of freedom will be $L^e - L^c$ if $L^e - L^c \le K_1^c$ but unknown otherwise (making the test impractical).[22] What, then, is the difference between the GMM C test and the Hausman specification test? In fact, because the two estimators being tested are both GMM estimators, the Hausman specification test is a test of linear combinations of orthogonality conditions
    Exact
    (Ruud (2000),
    Suffix
    pp. 578–584). When the particular linear combination of orthogonality conditions being tested is the same for the C test and for the Hausman test, the two test statistics will be numerically equivalent.

  6. In-text reference with the coordinate start=65078
    Prefix
    faces a trade–off when deciding which of the two tests to use: when the two tests differ, the Hausman test is a test of linear combinations of moment conditions, and is more powerful than the C test at detecting violations on restrictions of these linear combinations, but the latter test will be able to detect other violations of moment conditions that the former test cannot. As
    Exact
    Ruud (2000),
    Suffix
    p. 585, points out, one of the appealing features of the Hausman test is that its particular linear combination of moment conditions also determines the consistency of the more efficient GMM estimator.

36
Sargan, J. 1958. The estimation of economic relationships using instrumental variables. Econometrica 26(3): 393–415.
Total in-text references: 2
  1. In-text reference with the coordinate start=5138
    Prefix
    We may cast some light on whether the instruments satisfy the orthogonality conditions in the context of an overidentified model: that is, one in which a surfeit of instruments are available. In that context we may test the overidentifying restrictions in order to provide some evidence of the instruments’ validity. We present the variants of this test due to
    Exact
    Sargan (1958), Basmann (1960) and,
    Suffix
    in the GMM context, L. Hansen (1982), and show how the generalization of this test, the C or “difference-in-Sargan” test, can be used to test the validity of subsets of the instruments. Although there may well be reason to suspect non-orthogonality between regressors and errors, the use of IV estimation to address this problem must be balanced against the inevitable loss of efficiency vis-à-vis OLS.

  2. In-text reference with the coordinate start=41568
    Prefix
    analysis without testing the orthogonality assumptions is a “pious fraud”. 4.3 Overidentifying restrictions in IV. In the special case of linear instrumental variables under conditional homoskedasticity, the concept of the J statistic considerably predates the development of GMM estimation techniques. The ivreg2 procedure routinely presents this test, labelled as Sargan’s statistic
    Exact
    (Sargan (1958))
    Suffix
    in the estimation output. Just as IV is a special case of GMM, Sargan’s statistic is a special case of Hansen’s J under the assumption of conditional homoskedasticity. Thus if we use the IV optimal weighting matrix (34) together with the expression for J (41), we obtain Sargan’s statistic $= \frac{1}{\hat\sigma^2}\,\hat u' Z (Z'Z)^{-1} Z' \hat u = \frac{\hat u' Z (Z'Z)^{-1} Z' \hat u}{\hat u'\hat u / n} = \frac{\hat u' P_Z \hat u}{\hat u'\hat u / n}$ (43) It is easy to see from (43) that Sargan’
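
The form of Sargan's statistic in (43) can be illustrated with a short numerical sketch. This is not the ivreg2 implementation; the function name `sargan_statistic` and the use of NumPy are assumptions made for illustration, and the residuals are taken from a plain 2SLS fit under conditional homoskedasticity.

```python
import numpy as np

def sargan_statistic(y, X, Z):
    """Sargan's statistic u'P_Z u / (u'u / n) computed from IV (2SLS) residuals.

    y: (n,) dependent variable; X: (n, K) regressors; Z: (n, L) instruments.
    Illustrative sketch only (hypothetical helper, not ivreg2 code).
    """
    n = y.shape[0]
    # P_Z X: projection of the regressors onto the instrument space
    PZ_X = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
    # 2SLS coefficients: (X'P_Z X)^{-1} X'P_Z y
    beta = np.linalg.solve(PZ_X.T @ X, PZ_X.T @ y)
    u = y - X @ beta
    # u'P_Z u, computed without forming the n x n projection matrix
    Zu = Z.T @ u
    u_PZ_u = Zu @ np.linalg.solve(Z.T @ Z, Zu)
    return u_PZ_u / (u @ u / n)
```

In an exactly identified model (as many instruments as regressors) the IV residuals are orthogonal to Z by construction, so the statistic is zero up to rounding error; this is why the test requires overidentification.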

37
Shea, J. 1997. Instrument relevance in multivariate linear models: A simple measure. Review of Economics & Statistics 79(2): 348–352.
Total in-text references: 1
  1. In-text reference with the coordinate start=35570
    Prefix
    The statistics proposed by Bound et al. are able to diagnose instrument relevance only in the presence of a single endogenous regressor. When multiple endogenous regressors are used, other statistics are required. One such statistic has been proposed by
    Exact
    Shea (1997)
    Suffix
    : a “partial $R^2$” measure that takes the intercorrelations among the instruments into account.9 For a model containing a single endogenous regressor, the two $R^2$ measures are equivalent. The distribution of Shea’s partial $R^2$ statistic has not been derived, but it may be interpreted like any $R^2$.

39
White, H. 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48: 817–838. White, H. 1982. Instrumental variables regression with independent observations. Econometrica 50(2): 483–499. White, H. 1984. Asymptotic Theory for Econometricians. 1st ed. Orlando, FL: Academic
Total in-text references: 3
  1. In-text reference with the coordinate start=28179
    Prefix
    , also distributed as $\chi^2_p$ under the null, is easily obtained as $nR^2_c$, where $R^2_c$ is the centered $R^2$ from an auxiliary regression of the squared residuals from the original regression on the indicator variables. When the indicator variables are the regressors of the original equation, their squares and their cross-products, Koenker’s test is identical to White’s $nR^2_c$ general test for heteroskedasticity
    Exact
    (White (1980)).
    Suffix
    These tests are available in Stata, following estimation with regress, using our ivhettest as well as via hettest and whitetst. As Pagan and Hall (1983) point out, the above tests will be valid tests for heteroskedasticity in an IV regression only if heteroskedasticity is present in that equation and nowhere else in the system.
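
The $nR^2_c$ statistic described in this excerpt can be sketched as an auxiliary regression. The function name `n_r2_centered` is hypothetical, NumPy is an assumption, and the sketch assumes a constant is added to the auxiliary regression; it is not the code behind ivhettest, hettest, or whitetst.

```python
import numpy as np

def n_r2_centered(resid, Psi):
    """n times the centered R^2 from regressing squared residuals on [1, Psi].

    resid: (n,) residuals from the original regression.
    Psi:   (n, p) indicator variables (assumed not to include a constant).
    Illustrative sketch only (hypothetical helper).
    """
    n = resid.shape[0]
    e2 = resid ** 2
    # Auxiliary regressors: a constant plus the indicator variables
    W = np.column_stack([np.ones(n), Psi])
    coef, *_ = np.linalg.lstsq(W, e2, rcond=None)
    fitted = W @ coef
    tss = np.sum((e2 - e2.mean()) ** 2)   # centered total sum of squares
    rss = np.sum((e2 - fitted) ** 2)      # residual sum of squares
    return n * (1.0 - rss / tss)
```

Under the null the result would be compared with a $\chi^2_p$ critical value, where p is the number of columns of Psi.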

  2. In-text reference with the coordinate start=29265
    Prefix
    Our implementation is of the simpler Pagan–Hall statistic, available with the command ivhettest after estimation by ivreg, ivreg2, or ivgmm0. We present the Pagan–Hall test here in the format and notation of the original
    Exact
    White (1980) and White (1982)
    Suffix
    tests, however, to facilitate comparisons with the other tests noted above.7 Let $\Psi$ be the $n \times p$ matrix of indicator variables hypothesized to be related to the heteroskedasticity in the equation, with typical row $\Psi_i$.

  3. In-text reference with the coordinate start=29737
    Prefix
    These indicator variables must be exogenous, typically either instruments or functions of the instruments. Common choices would be: 1. The levels, squares, and cross-products of the instruments Z (excluding the constant), as in the
    Exact
    White (1980)
    Suffix
    test. This is the default in ivhettest. 2. The levels only of the instruments Z (excluding the constant). This is available in ivhettest by specifying the ivlev option. 6 For a more detailed discussion, see Pagan and Hall (1983) or Godfrey (1988), pp. 189–90. 7 We note here that the original Pagan–Hall paper has a serious typo in the presentation of their non-normality-robust statistic.

40
Wooldridge, J. M. 1995. Score diagnostics for linear models estimated by two stage least squares. In Advances in Econometrics and Quantitative Economics: Essays in honor of Professor C. R. Rao, eds. G. S. Maddala, P. C. B. Phillips, and T. N. Srinivasan, 66–87. Cambridge, MA: Blackwell Publishers. Wooldridge, J. M. 2002. Econometric Analysis of Cross Section and Panel Data. 1st ed. Cambridge,
Total in-text references: 2
  1. In-text reference with the coordinate start=47588
    Prefix
    For included instruments, the C test hypothecates placing them in the list of included endogenous variables: in essence, treating them as endogenous regressors. The C test, 14 See Ahn (1995), Proposition 1, or, for an alternative formulation,
    Exact
    Wooldridge (1995),
    Suffix
    Procedure 3.2. 15 See Hayashi (2000), pp. 218–21 and pp. 232–34 or Ruud (2000), Chapter 22, for comprehensive presentations. distributed $\chi^2$ with degrees of freedom equal to the loss of overidentifying restrictions (i.e., the number of suspect instruments being tested), has the null hypothesis that the specified variables are proper instruments.

  2. In-text reference with the coordinate start=53726
    Prefix
    The Hausman (1978) specification test takes the quadratic form $H = n(\hat\beta_c - \hat\beta_e)' D^{-} (\hat\beta_c - \hat\beta_e)$ where $D = V(\hat\beta_c) - V(\hat\beta_e)$ (45) and where $V(\hat\beta)$ denotes a consistent estimate of the asymptotic variance of $\beta$, and the 17 See Wooldridge (2002), pp. 58–61, and
    Exact
    Wooldridge (1995)
    Suffix
    for more detailed discussion. operator $^{-}$ denotes a generalized inverse. A Hausman statistic for a test of endogeneity in an IV regression is formed by choosing OLS as the efficient estimator $\hat\beta_e$ and IV as the inefficient but consistent estimator $\hat\beta_c$.

41
Wu, D.-M. 1973. Alternative tests of independence between stochastic regressors and disturbances. Econometrica 41(4): 733–750. Wu, D.-M. 1974. Alternative tests of independence between stochastic regressors and disturbances: Finite sample results. Econometrica 42(3): 529–546.
Total in-text references: 5
  1. In-text reference with the coordinate start=55811
    Prefix
    If a common estimate of $\sigma$ is used, then the generalized inverse of $D$ is guaranteed to exist and a positive test statistic is guaranteed.19 If the Hausman statistic is formed using the OLS estimate of the error variance, then the $D$ matrix in Equation (45) becomes $D = \hat\sigma^2_{OLS}\left((X'P_Z X)^{-1} - (X'X)^{-1}\right)$ (47) This version of the endogeneity test was first proposed by Durbin (1954) and separately by
    Exact
    Wu (1973)
    Suffix
    (his $T_4$ statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the sigmamore option in conjunction with estimation by regress, ivreg and/or ivreg2. If the Hausman statistic is formed using the IV estimate of the error variance, then the $D$ matrix becomes $D = \hat\sigma^2_{IV}\left((X'P_Z X)^{-1} - (X'X)^{-1}\right)$ (48) 18 Readers should also bear in mind here and below that the estimates of the error variance

  2. In-text reference with the coordinate start=57109
    Prefix
    In this alternate form, the matrix difference in the expression equivalent to (47) is positive definite and a generalized inverse is not required. See Bowden and Turkington (1984), pp. 50–51. This version of the statistic was proposed separately by
    Exact
    Wu (1973)
    Suffix
    (his $T_3$ statistic) and Hausman (1978). It can be obtained within Stata by using hausman with the (undocumented) sigmaless option. Using hausman with the sigmamore or sigmaless option also avoids a further annoyance: because Stata’s hausman tries to deduce the correct degrees of freedom for the test from the rank of the matrix $D$, it may sometimes come up with the wrong answer.

  3. In-text reference with the coordinate start=68308
    Prefix
    Yet another asymptotically equivalent flavor of the DWH test is available for standard IV estimation under conditional homoskedasticity, and is included in the output of ivendog. This is the test statistic introduced by
    Exact
    Wu (1973)
    Suffix
    (his $T_2$), and separately shown by Hausman (1978) to be calculated straightforwardly through the use of auxiliary regressions. We will refer to it as the Wu–Hausman statistic.24 Consider a simplified version of our basic model (1) with a single endogenous regressor $x_1$: $y = \beta_1 x_1 + X_2\beta_2 + u$, (49) with $X_2 \equiv Z_2$ assumed exogenous (including the constant, if one is specified) and with excluded instruments $Z_1$ as usua

  4. In-text reference with the coordinate start=70148
    Prefix
    The test statistic then becomes an F-test, with numerator degrees of freedom equal to the number of included endogenous variables. One advantage of the Wu–Hausman F-statistic over the other DWH tests for IV vs. OLS is that with certain normality assumptions, it is a finite sample test exactly distributed as F (see
    Exact
    Wu (1973) and
    Suffix
    Nakamura and Nakamura (1981)). Wu (1974)’s Monte Carlo studies also suggest that this statistic is to be preferred to the statistic using just $\hat\sigma^2_{IV}$. A version of the Wu–Hausman statistic for testing a subset of regressors is also available, as Davidson and MacKinnon (1993), pp. 241–242 point out.

  5. In-text reference with the coordinate start=72416
    Prefix
    ) = $\frac{Q^*}{USSR/n}$ (53) and the Wu–Hausman F-statistic can be written Wu-Hausman: $F(K_{1B},\, n-K-K_{1B}) = \frac{Q^*/K_{1B}}{(USSR-Q^*)/(n-K-K_{1B})}$ (54) where $Q^*$ is the difference between the restricted and unrestricted sums of squares given by the auxiliary regression (51) or (52), and $USSR$ is the sum of squared residuals from the efficient estimate of the model.25 From the discussion in the preceding section, 25 See
    Exact
    Wu (1973)
    Suffix
    or Nakamura and Nakamura (1981). $Q^*$ can also be interpreted as the difference between the sums of squares of the second-stage estimation of the efficient model with and without however, we know that for tests of the endogeneity of regressors, the C statistic and the Hausman form of the DWH test are numerically equal, and when the error variance from the more efficient estimation is used, the Hausman f
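
The arithmetic in (53) and (54) can be sketched as two small functions that take $Q^*$ and $USSR$ as given. The function and argument names are hypothetical, and the computation of $Q^*$ from the auxiliary regressions (51) or (52) is not reproduced here.

```python
def dwh_chi2(q_star, ussr, n):
    """Statistic in (53): Q* / (USSR / n)."""
    return q_star / (ussr / n)

def wu_hausman_f(q_star, ussr, n, k, k1b):
    """Wu-Hausman F in (54): (Q*/K1B) / ((USSR - Q*) / (n - K - K1B)).

    k is the total number of regressors K; k1b is K1B, the number of
    regressors whose endogeneity is being tested (numerator d.f.).
    """
    return (q_star / k1b) / ((ussr - q_star) / (n - k - k1b))

# Example with made-up inputs: Q* = 2, USSR = 100, n = 50, K = 3, K1B = 1
chi2_stat = dwh_chi2(2.0, 100.0, 50)            # 2 / (100 / 50) = 1.0
f_stat = wu_hausman_f(2.0, 100.0, 50, 3, 1)     # 2 / (98 / 46)
```

Both statistics are built from the same $Q^*$; they differ only in the variance estimate in the denominator and in the degrees-of-freedom scaling.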