Posterior Distributions

This section provides the details of the calculation of posterior means, variances, and the standard error of the posterior means in AM. Currently, procedures that are based on MML regression allow you to save this posterior information in the database. The technical details follow.

Calculation of the posterior means and variance for subscales

For each subscale, we obtain the mean and variance of the posterior distribution for each individual. We estimate the values using numeric quadrature on the same fixed-distance points used to estimate the MML models. Hence, for any single subscale, the posterior mean is estimated as

 

and the posterior variance is estimated as

 

 
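As a minimal sketch of the quadrature calculation described above, the following Python fragment computes a posterior mean and variance on a grid of equally spaced points. The function names, the grid, and the toy likelihood are illustrative assumptions rather than AM's implementation; the likelihood values stand in for whatever measurement model the MML procedure uses.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_moments(points, likelihood, prior_mean, prior_sd):
    """Posterior mean and variance by quadrature on fixed points.

    points     -- equally spaced quadrature points (assumed grid)
    likelihood -- likelihood of the observed responses at each point
    prior_mean, prior_sd -- moments of the normal prior for this individual
    """
    # Unnormalized posterior weight at each quadrature point.
    weights = [lik * normal_pdf(t, prior_mean, prior_sd)
               for t, lik in zip(points, likelihood)]
    total = sum(weights)
    mean = sum(t * w for t, w in zip(points, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(points, weights)) / total
    return mean, var

# Toy example (hypothetical values): 41 points on [-5, 5] and a
# normal-shaped likelihood centered at 1 with unit spread.
points = [-5.0 + 0.25 * q for q in range(41)]
likelihood = [normal_pdf(t, 1.0, 1.0) for t in points]
mean, var = posterior_moments(points, likelihood, prior_mean=0.0, prior_sd=1.0)
```

Because the points are equally spaced, the spacing cancels in the ratio, so no explicit quadrature weights are needed in this sketch.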

Calculation of the moments of the posterior distribution for composite scales

The calculation of the moments of a multivariate posterior distribution is most tractable when analytic results are available, as is the case for the normal distribution. The prior distribution is multivariate normal, but the measurement distribution is not. Often the measurement distributions are asymmetrical, and sometimes they are degenerate. Therefore, identifying the parameters of the normal distributions that provide the best approximation is not simple. This is the key problem addressed by Thomas (1993).

Here, we arrive at normal approximations by identifying the means and variances of the normal distributions that would have given rise to the estimated posteriors and . For convenience, we omit the subscript i in what follows. To arrive at this approximation let and . Similarly, denote the moments of the prior distribution as and . Finally, let the normal approximation to the measurement distribution have moments and . Standard Bayesian calculations for normal distributions give , where , and . This is all that is required to solve for the appropriate moments of the approximate multivariate normal distribution: and .
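The step described above, solving the normal updating equations for the measurement moments, can be sketched as follows. This is a hedged illustration that relies only on the standard result that precisions add for normal distributions; the function and argument names are hypothetical and are not AM's notation.

```python
def measurement_moments(post_mean, post_var, prior_mean, prior_var):
    """Solve the normal-normal updating equations for the measurement moments.

    For normal distributions:
        1 / post_var = 1 / prior_var + 1 / meas_var
        post_mean    = post_var * (prior_mean / prior_var + meas_mean / meas_var)
    Given the posterior and prior moments, invert these for the
    measurement mean and variance.
    """
    meas_precision = 1.0 / post_var - 1.0 / prior_var
    if meas_precision <= 0.0:
        # No proper normal measurement distribution reproduces this posterior.
        raise ValueError("posterior variance must be smaller than prior variance")
    meas_var = 1.0 / meas_precision
    meas_mean = meas_var * (post_mean / post_var - prior_mean / prior_var)
    return meas_mean, meas_var

# Hypothetical values: posterior N(0.5, 0.5) with prior N(0, 1).
meas_mean, meas_var = measurement_moments(0.5, 0.5, 0.0, 1.0)
```

The guard on the precision reflects the degenerate cases mentioned earlier: if the estimated posterior is not tighter than the prior, no proper normal measurement distribution exists for it.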

To obtain the moments of the composite posterior distributions we introduce the information about the correlations among the subscales obtained from the MML composite regression. Define where COV is the matrix formed from the estimated covariances among subscales discussed in the previous section. The approximate posterior means and variances are and .

The moments of the composite posteriors are formed as and
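A minimal two-subscale sketch of this composite step is shown below. It assumes the standard multivariate normal update, with the prior covariance taken from the estimated covariances among subscales and a diagonal measurement covariance, and it assumes the composite is a fixed weighted sum of the subscales. The helper names and the composite weights are hypothetical, and the exact definitions used in AM may differ.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def composite_moments(prior_mean, cov, meas_mean, meas_var, weights):
    """Approximate posterior moments for two subscales and their composite."""
    prior_prec = inv2(cov)
    # Measurement precision is diagonal, so it only adds to the diagonal.
    post_prec = [[prior_prec[0][0] + 1.0 / meas_var[0], prior_prec[0][1]],
                 [prior_prec[1][0], prior_prec[1][1] + 1.0 / meas_var[1]]]
    post_cov = inv2(post_prec)
    # Precision-weighted combination of prior and measurement means.
    rhs = matvec(prior_prec, prior_mean)
    rhs = [rhs[0] + meas_mean[0] / meas_var[0],
           rhs[1] + meas_mean[1] / meas_var[1]]
    post_mean = matvec(post_cov, rhs)
    # Composite = weighted sum of subscales: mean w'mu, variance w'Sigma w.
    comp_mean = weights[0] * post_mean[0] + weights[1] * post_mean[1]
    cw = matvec(post_cov, weights)
    comp_var = weights[0] * cw[0] + weights[1] * cw[1]
    return post_mean, post_cov, comp_mean, comp_var

# Hypothetical inputs: correlated subscales with equal composite weights.
result = composite_moments(prior_mean=[0.0, 0.0],
                           cov=[[1.0, 0.5], [0.5, 1.0]],
                           meas_mean=[1.0, 0.5],
                           meas_var=[1.0, 2.0],
                           weights=[0.5, 0.5])
```

With more than two subscales, the same calculation would use a general matrix inverse in place of the 2x2 helper.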

The approximation used here works well. Appendix C presents some simple evidence demonstrating that this approximation effectively recovers variances and covariances even under extreme conditions.

Approximate standard error of the posterior mean

Formulas for the estimation of the percent of population groups above achievement levels (presented in the next section) require an estimate of the standard error around the posterior means at each observation. This section describes a first-order approximation of that standard error.

Our estimate of the standard error of the posterior mean begins as though the posterior distributions were approximated as normal, although in the case of subscales, they need not be. Readers should note that the posterior distributions for individual subscales are calculated on a finite set of points and may take on any shape. As described above, the composite posteriors use a normal approximation. We have found, however, that standard errors for the corresponding normal approximation work well in either case.

For this section, we change our notation slightly, and use subscripts to indicate whether parameter estimates are from the measurement (m) or prior (p) distribution. We continue to use to indicate the mean of the posterior distribution.

Define , where is the variance of the measurement distribution and is the variance of the empirical prior distribution. Also, note that where represents the covariance matrix of the parameter estimates from the MML regression. Estimates of are themselves approximated with a first-order Taylor linearization as discussed in Binder (1983) and applied to marginal maximum likelihood estimates above and by Cohen and Jiang (1999). The specific formula for a single subscale is given in Section 4 above, and the formula for composite scales appears in Section 5.

The normal approximation of the posterior mean would give , where is the estimated empirical prior mean for examinee i, and is the mean of the measurement distribution for examinee i. Here, as in operational NAEP, the measurement distribution is taken as known. In what follows, we drop the subscript i to simplify the notation. We can see that

.

Recognizing that is taken as fixed, this constant drops out of the variance calculation leaving

(4)

The third line of Equation 4 removes the constant from variance terms, and in the final term, substitutes , and again drops the constant. The final line recognizes

In practice, we have found the last term in the final line of Equation 4 to be typically small, generally amounting to five percent or less of the total variance, and usually substantially less. Using a first-order approximation, we find that , where and are the measurement and empirical prior variance, respectively. Notice that the first term in this equation will tend to be small. The first term is only relatively large when the prior variance is small relative to the measurement variance. In these cases, the also tends to be quite small. In the interest of simplicity, we omit this term in our approximation of the variance of the mean of the posterior distribution. Hence,
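The simplified approximation can be illustrated with a short sketch. It assumes the normal-approximation posterior mean is a weighted average of the measurement mean and the prior mean, that the measurement mean and the weight are treated as fixed, and that the sampling variance of the prior mean is a quadratic form in the covariates and the covariance matrix of the regression estimates. All names are hypothetical, and the weight definition is an assumption rather than the document's own notation.

```python
import math

def quadratic_form(x, v):
    """Compute x' V x for a covariate vector x and covariance matrix V."""
    n = len(x)
    return sum(x[i] * v[i][j] * x[j] for i in range(n) for j in range(n))

def posterior_mean_se(meas_var, prior_var, prior_mean_var):
    """First-order standard error of a posterior mean (normal approximation).

    Posterior mean = w * meas_mean + (1 - w) * prior_mean, with
    w = prior_var / (prior_var + meas_var). Treating meas_mean and w as
    fixed, only the sampling variance of prior_mean contributes.
    """
    w = prior_var / (prior_var + meas_var)
    return math.sqrt((1.0 - w) ** 2 * prior_mean_var)

# Hypothetical inputs: two covariates and a diagonal covariance matrix
# for the regression parameter estimates.
x = [1.0, 2.0]
V = [[0.04, 0.0], [0.0, 0.01]]
se = posterior_mean_se(meas_var=1.0, prior_var=1.0,
                       prior_mean_var=quadratic_form(x, V))
```

Under these assumptions, the standard error shrinks toward zero as the measurement variance becomes small relative to the prior variance, because the posterior mean then depends little on the estimated prior mean.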
