
Descriptive

Descriptive statistics are a useful tool to assist analysts in simplifying and reducing the information contained in the sample data to a few descriptive numbers, called sample statistics. They are often used as a preliminary step to further analysis to examine the properties of the data and are helpful in detecting possible data problems (e.g., out-of-range values, skewness of responses). The current descriptive statistics procedure provides the sample mean of a variable with its corresponding standard error, the standard deviation, and the minimum and maximum values. Standard errors and standard deviations are adjusted to take into account the particular features of the sample design.

The descriptive statistics procedure provides information on both the center (i.e., mean) and spread (i.e., standard deviation; minimum; maximum) of a distribution. In addition, it provides a measure of the accuracy of the estimate of the sample mean (i.e., its standard error).

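An important measure of central tendency, the sample mean of a variable is simply its average, computed by summing the values of all observations in the sample and dividing by the number of observations. This can be written as:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

where $x_i$ is the value of the *i*th observation and *n* is the number of observations in the sample.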

Although the average may be the most important single statistic, it is also important to know how spread out or varied the observations are. A commonly used measure of spread is the standard deviation. The standard deviation is simply the square root of the variance and provides a standard way to measure the deviation from the mean. In a simple random sample (SRS) situation, the variance is the sum of the squared deviations from the mean (squared so that their sum is not equal to zero) divided by *n* - 1 (to adjust for the number of degrees of freedom). Thus, in SRS the standard deviation can be written as:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

where $\bar{x}$ is the sample mean and *n* is the number of observations.
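As a minimal sketch of the simple random sample case described above, the following Python snippet computes the sample mean and the standard deviation with the *n* - 1 divisor. The function name `srs_mean_and_sd` and the `scores` data are hypothetical and serve only as an illustration; this is not the program's own code and it ignores sample weights and design features.

```python
import math


def srs_mean_and_sd(values):
    """Return (mean, standard deviation) assuming a simple random sample."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(values) / n
    # Squared deviations from the mean, divided by n - 1 degrees of freedom.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, math.sqrt(variance)


# Hypothetical example data.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, sd = srs_mean_and_sd(scores)
print(f"mean = {mean:.3f}, sd = {sd:.3f}")
```

The minimum and maximum reported by the procedure correspond to `min(scores)` and `max(scores)` in this sketch.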

The above formulation assumes that the elements in the sample are selected independently of one another. Most large-scale surveys, however, employ complex sample designs that rely on stratification and/or clustering methods (e.g., as in NAEP) to reduce the cost of administration. These designs violate the assumption of independence of observations (for example, by selecting students in entire classrooms), resulting in a non-zero covariance between the elements included in the sample. In such a situation, the formula for computing the variance (as well as the formula for any statistic that uses the variance) must be adjusted to take into account the combined effects of stratification, clustering, and the sample weights. This is done through robust variance estimation techniques such as Taylor series expansion or jackknife repeated replication. In the current procedure, the standard deviation is estimated using the Taylor series expansion variance estimation technique.
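To make the idea of a design-based adjustment more concrete, here is a rough sketch of one common textbook form of Taylor series linearization for the variance of a weighted mean in a stratified, clustered sample. It assumes that primary sampling units (PSUs) are treated as sampled with replacement within strata and applies no finite population correction. The function name, the record layout, and the example data are all hypothetical, and the program's actual computations may differ.

```python
from collections import defaultdict


def linearized_mean_variance(records):
    """Weighted mean and its Taylor-linearized variance.

    records: iterable of (stratum, psu, weight, value) tuples.
    Assumes with-replacement sampling of PSUs within strata.
    """
    records = list(records)
    total_weight = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / total_weight

    # Linearized contribution of each observation, summed to PSU totals.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, y in records:
        psu_totals[stratum][psu] += w * (y - mean) / total_weight

    variance = 0.0
    for stratum, totals in psu_totals.items():
        n_psu = len(totals)
        if n_psu < 2:
            raise ValueError(f"stratum {stratum!r} needs at least two PSUs")
        stratum_mean = sum(totals.values()) / n_psu
        between = sum((z - stratum_mean) ** 2 for z in totals.values())
        # Between-PSU variation within the stratum, scaled by n / (n - 1).
        variance += n_psu / (n_psu - 1) * between
    return mean, variance


# Hypothetical data: two strata, two PSUs each.
records = [
    ("A", 1, 1.0, 10.0), ("A", 1, 1.0, 12.0),
    ("A", 2, 1.0, 20.0), ("A", 2, 1.0, 22.0),
    ("B", 1, 2.0, 30.0), ("B", 2, 2.0, 34.0),
]
weighted_mean, variance = linearized_mean_variance(records)
print(weighted_mean, variance ** 0.5)
```

In this sketch the variance is driven by differences between PSU totals within each stratum, so observations that share a PSU are not treated as independent.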

Additional information on the spread of the observations is provided through the minimum and maximum values of the sample data.

While information about the sample mean is useful in a variety of situations, in itself it says little about the corresponding target population mean it represents. The deviation of the sample mean from its target population mean represents the estimation error and is typically measured by the standard error. The larger the standard error of a sample mean, the less precisely that sample mean estimates its population mean. Under simple random sampling, the formula for the standard error is:

$$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$$

where s represents the sample standard deviation. Again, to take into account the features of a complex survey design, the estimation of the standard error of the sample mean is based on a Taylor series expansion variance estimation technique.
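For illustration only, the simple random sample form of the standard error (the sample standard deviation divided by the square root of the number of observations) can be sketched in Python as follows. The function name and data are hypothetical, and the sketch deliberately omits the Taylor series adjustment that the procedure applies for complex designs.

```python
import math
import statistics


def srs_standard_error(values):
    """Standard error of the mean under simple random sampling: s / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(n)


# Hypothetical example data.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = srs_standard_error(scores)
print(f"standard error = {se:.4f}")
```

With clustered or weighted data, this value will generally differ from the design-based standard error that the procedure reports.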

Wonnacott, T. H., & Wonnacott, R. J. (1984). *Introductory Statistics for Business and Economics, 3rd Edition*. New York: John Wiley & Sons.

To run Descriptive, left-click on the **Statistics** menu and select "Descriptive." The following dialogue box will open:

Specify the independent variables and the dependent variable. You may also select the desired output format.

When you are finished, click the *OK* button on the Descriptive dialogue box to begin the analysis.

Once the analysis is completed, you may perform a correlation or t-tests on the results.