I. Important Concepts
Introduction to Statistics
Statistics is a set of mathematical procedures used to organize, summarize, and interpret information. As the primary means of analyzing and interpreting quantitative data, statistics help us organize and understand the information contained in our data, communicate our results and describe the characteristics of our data to others, and answer the questions that drive our research.
Normal distributions enable researchers to analyze and compare scores and find out the proportion of individuals who fall above or below a score. If a data set is Normal, there are many calculations that can be performed and scores can be standardized to make comparisons easier. This topic will explain how to define and describe density curves, measure position using percentiles and z-scores, describe Normal distributions, apply the 68-95-99.7 Rule, and perform Normal calculations.
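The standardization and proportion calculations described above can be sketched in a few lines. The numbers here are hypothetical (a test score of 85 against a population mean of 70 and standard deviation of 10); the course tutorials use SPSS, but Python's scipy illustrates the same arithmetic:

```python
from scipy import stats

# Standardize a score: z = (x - mean) / sd.
# Hypothetical example: score 85, population mean 70, sd 10.
x, mu, sigma = 85, 70, 10
z = (x - mu) / sigma  # 1.5 standard deviations above the mean

# Proportion of individuals falling below x under a Normal model
p_below = stats.norm.cdf(z)

# The 68-95-99.7 Rule: area within 1, 2, and 3 sd of the mean
within = [stats.norm.cdf(k) - stats.norm.cdf(-k) for k in (1, 2, 3)]
# within is approximately [0.683, 0.954, 0.997]
```

A z of 1.5 places the score at roughly the 93rd percentile, which is exactly the kind of comparison standardization makes possible.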
Hypothesis testing is a goal common to all inferential statistics. This presentation discusses the logic of hypothesis testing and important related concepts, including errors, alpha, effect sizes, and statistical power, and walks through a detailed example of the hypothesis testing process. Although the example is set within the context of z scores, the concepts described generalize to other inferential techniques.
II. Data Collection & Management
Sampling & Data Collection
The following presentation discusses the difference between populations and samples, and describes several sample selection procedures. The presentation also includes a brief description of data collection procedures, such as surveys.
Data Management in SPSS
The following software tutorials demonstrate how to create or open a data file, how to define variables, and how to enter new data in SPSS. Further, the tutorials demonstrate how to conduct basic data management procedures such as selecting a subset of individuals, recoding variables, or computing new variables.
III. Univariate & Bivariate Analyses
Exploratory or descriptive analyses enable researchers to analyze and describe their data prior to running any statistical tests. They are always the first steps in analyzing data. Statistics measuring spread, central tendency, or frequency can be used to provide an overall description of the data without forcing researchers to look at every value in the set. Additionally, various charts can be used to illustrate the distribution of quantitative or categorical variables.
- Exploratory Statistics
Descriptive statistics enable researchers to analyze and describe their data prior to running any statistical tests. Measures of central tendency and variance explain a data set without forcing researchers to look at every value in the set. This topic will explain how to find measures of central tendency, including mean, median, and mode, and measures of spread, including quartiles and standard deviation. Descriptive analyses are always the first steps in analyzing data.
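The measures of central tendency and spread listed above can be computed directly. The sample below is a hypothetical set of eleven quiz scores; Python's standard library and numpy stand in for the SPSS procedures shown in the tutorials:

```python
import statistics
import numpy as np

# Hypothetical sample of 11 quiz scores
scores = [4, 7, 7, 8, 9, 10, 10, 10, 12, 13, 15]

mean = statistics.mean(scores)            # arithmetic mean
median = statistics.median(scores)        # middle value of the ordered set
mode = statistics.mode(scores)            # most frequent value
sd = statistics.stdev(scores)             # sample standard deviation
q1, q3 = np.percentile(scores, [25, 75])  # first and third quartiles
iqr = q3 - q1                             # interquartile range
```

With an odd number of ordered values the median is simply the middle score (here 10), and the quartiles summarize spread without listing every observation.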
Charts and figures are often used for exploratory purposes. Depending on the nature of the data, different types of graphical representations can be used to summarize information. The narrated presentation discusses the components of a data set (individuals and variables) and explains the differences between different types of variables. Several methods to graphically summarize the data are then described for each type of variable. The software tutorials show how to generate different types of graphs in SPSS.
Scatterplots & Correlations
Scatterplots and correlation coefficients are measures of association between two quantitative variables. For instance, researchers may want to know whether an increased amount of time spent on homework is associated with higher scores on a standardized test. Scatterplots provide descriptive information on the direction, form, and strength of the relationship between the two variables by representing individuals as points on a two-dimensional graph. These points may aggregate to describe a linear relationship, curvilinear relationship, or no relationship. Scatterplots may also indicate whether there is positive or negative association between variables, and suggest the strength of their relationship. A positive association means that high values in one variable are associated with high values in the other variable, whereas a negative association shows that high values in one variable are associated with low values in the other variable (e.g. the relationship between poverty and student achievement).
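The homework example above can be made concrete with a correlation coefficient. The data below are hypothetical (eight students' weekly homework hours and test scores); scipy's Pearson correlation quantifies the direction and strength that a scatterplot would show visually:

```python
import numpy as np
from scipy import stats

# Hypothetical data: hours of homework per week vs. standardized test score
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
score = np.array([52, 55, 61, 60, 68, 70, 74, 79])

r, p_value = stats.pearsonr(hours, score)
# r near +1 indicates a strong positive linear association:
# high values of one variable go with high values of the other
```

For these made-up values r is close to +1, the positive-association pattern described above; a negative r (as in the poverty example) would indicate the opposite pattern.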
Simple Linear Regression
Simple linear regression allows researchers to predict or explain the variance of a response variable using a predictor variable. For instance, simple linear regression may be used in educational research to predict college GPA based on SAT scores. The narrated presentation below provides an introduction to the topic of simple linear regression. It discusses basic concepts related to simple linear regression, the assumptions on which this procedure is based, and how to interpret and use the regression equation. The software tutorial demonstrates how to conduct a simple linear regression in SPSS.
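The SAT-and-GPA example can be sketched with hypothetical numbers. scipy's linregress fits the regression equation, which can then be used for prediction exactly as the presentation describes:

```python
import numpy as np
from scipy import stats

# Hypothetical data: SAT score (predictor) and college GPA (response)
sat = np.array([900, 1000, 1100, 1200, 1300, 1400, 1500])
gpa = np.array([2.4, 2.7, 2.9, 3.1, 3.2, 3.5, 3.8])

res = stats.linregress(sat, gpa)
# Regression equation: predicted GPA = intercept + slope * SAT
predicted = res.intercept + res.slope * 1250
# res.rvalue ** 2 is the proportion of GPA variance explained (R squared)
```

The fitted slope is positive, so each additional SAT point raises the predicted GPA slightly; R squared summarizes how much of the variance in GPA the predictor explains.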
z Procedures: Confidence Intervals for the Population Mean
A confidence interval is a range of values that a parameter may take in the population. This range is estimated using information collected from a sample, such as the mean, the degree to which values vary across individuals, or the sample size. For instance, a researcher may be interested in estimating the achievement motivation of first year college students. The researcher must select a random sample of students, administer a motivation scale, and then compute an average score for the entire sample. Results from the sample can then be used to make an inference about the motivation of the entire population of first year college students. The narrated presentation below provides an introduction to the topic of confidence intervals and demonstrates how to estimate the population mean of a normally distributed variable after computing the mean for a specific sample. The software tutorial shows how to calculate confidence intervals using SPSS.
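The motivation-scale example can be worked through with hypothetical numbers. For z procedures the population standard deviation is treated as known; here a sample of 25 students with a mean score of 27.5 and an assumed sigma of 4 yields a 95% interval:

```python
import numpy as np
from scipy import stats

# Hypothetical values: sample mean motivation score, known population sd,
# and sample size (z procedures assume sigma is known)
sample_mean, sigma, n = 27.5, 4.0, 25
conf = 0.95

z_star = stats.norm.ppf(1 - (1 - conf) / 2)  # critical value, about 1.96
margin = z_star * sigma / np.sqrt(n)         # margin of error
ci = (sample_mean - margin, sample_mean + margin)
```

The interval is the sample mean plus or minus the margin of error; widening the confidence level or shrinking the sample widens the interval.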
z Procedures: Testing a Hypothesis about the Population Mean
A hypothesis is a statement about a parameter such as the population proportion or the population mean. To determine whether this statement is true, researchers use tests of significance to compare observed values of a statistic to the given parameters. Results from such tests show whether the difference between the sample statistic and the given parameter is statistically significant. The results of a significance test are expressed in terms of a probability that indicates the extent to which the data from the sample and the hypothesis agree. The narrated presentation provides an introduction to this topic. It demonstrates how to formulate hypotheses, and how to conduct a test of significance for a population mean using the properties of the normal distribution.
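A one-sample z test can be sketched with hypothetical numbers: a sample of 36 observations with mean 103 is tested against a hypothesized population mean of 100, with a known population standard deviation of 9:

```python
from math import sqrt
from scipy import stats

# Hypothetical example: H0 says mu = 100; the sample gives x-bar = 103
xbar, mu0, sigma, n = 103, 100, 9, 36

z = (xbar - mu0) / (sigma / sqrt(n))  # test statistic: z = 2.0
p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value
reject = p_value < 0.05               # compare to alpha = .05
```

Here the p-value (about .046) falls just below alpha, so the null hypothesis would be rejected; the p-value is the probability of observing data this extreme if the null hypothesis were true.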
z Procedures: Practical Issues
To be able to use the z procedures, certain assumptions must be met. First, data should have a normal distribution. Second, the sample must have an adequate size, and individuals must be randomly selected. However, these conditions are often difficult to meet in practice. The following narrated presentation describes the necessary conditions for making inferences based on the z procedures, demonstrates how to determine the sample size needed for a certain level of error, and discusses the notions of Type I and Type II error, and the power of a significance test.
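The sample-size calculation mentioned above follows from solving the margin-of-error formula for n. With hypothetical planning values (a desired margin of 1 scale point, 95% confidence, and an assumed population sd of 4):

```python
from math import ceil
from scipy import stats

# Hypothetical planning values: desired margin of error m, known sigma
sigma, m, conf = 4.0, 1.0, 0.95

z_star = stats.norm.ppf(1 - (1 - conf) / 2)
# Solve m = z* sigma / sqrt(n) for n, rounding up to a whole participant
n = ceil((z_star * sigma / m) ** 2)
```

For these values n works out to 62; halving the desired margin of error would quadruple the required sample, which is why precise estimates are expensive.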
t Procedures: Inferences about the Population Mean
t procedures are very similar to the z procedures and are used when the distribution of the data is not perfectly normal and when the population standard deviation of a variable is unknown. The interpretation of t scores is similar to the interpretation of z scores. However, the t distribution has a slightly different shape. It is symmetric, but not normal. It has a single peak and a mean of 0, which is the center of the distribution, but its tails are higher and fatter, and the distribution has more spread. Further, the t distribution looks different for different sample sizes. The following presentation describes the properties of the t distribution, and demonstrates how to use t scores to make inferences about a population mean and to compare matched samples. The software tutorials show how to conduct these procedures in SPSS.
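A one-sample t test with hypothetical data shows the key difference from the z procedures: the standard deviation is estimated from the sample itself, and the statistic is referred to a t distribution with n - 1 degrees of freedom:

```python
import numpy as np
from scipy import stats

# Hypothetical sample (population sd unknown), testing H0: mu = 50
data = np.array([52, 48, 55, 51, 49, 53, 54, 50, 47, 56])

t_stat, p_value = stats.ttest_1samp(data, popmean=50)
df = len(data) - 1  # the t distribution has n - 1 degrees of freedom
```

Because the t distribution's tails are fatter than the normal's, the same statistic yields a somewhat larger p-value under t than it would under z, especially for small samples.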
t Procedures: Comparing Independent Samples
t procedures can be used to compare variables across two independent samples. For instance, researchers may want to know whether the average performance on a certain achievement test differs significantly between males and females. The following narrated presentation demonstrates how to estimate a confidence interval for the mean difference between two populations, and how to test whether this difference is statistically significant. The presentation also discusses the assumptions that must be met, and the robustness of t procedures to the violation of these assumptions. The software tutorial demonstrates how to perform an independent samples t test using SPSS.
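An independent samples t test can be sketched with two hypothetical groups of achievement-test scores. The Welch variant (equal_var=False) does not assume equal variances, one of the robustness issues the presentation discusses:

```python
import numpy as np
from scipy import stats

# Hypothetical achievement-test scores for two independent groups
group_a = np.array([78, 85, 90, 72, 88, 80, 84])
group_b = np.array([70, 75, 68, 74, 72, 77, 71])

# Welch's t test: robust to unequal group variances
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
```

For these made-up scores the 10-point difference in group means is large relative to the within-group variability, so the test comes out significant.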
Chi-Square Test
The chi-square test is used to determine whether there is a statistically significant association between two or more categorical variables. For instance, educational researchers may want to determine whether the proportions of students preferring online instruction and face-to-face instruction differ significantly across undergraduate and graduate students. This procedure allows researchers to compare categorical variables across more than two groups and uses the chi-square statistic to determine statistical significance. The following narrated presentation describes the properties of the chi-square distribution and explains how to conduct and interpret the results of chi-square tests. The software tutorial demonstrates how to conduct this procedure in SPSS.
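The instruction-preference example can be worked through with a hypothetical 2 x 2 contingency table. scipy's chi2_contingency computes the expected counts, the chi-square statistic, and the p-value in one call:

```python
import numpy as np
from scipy import stats

# Hypothetical counts: instruction preference by student level
#                      online   face-to-face
observed = np.array([[40,      60],    # undergraduate
                     [60,      40]])   # graduate

chi2, p_value, df, expected = stats.chi2_contingency(observed)
# df = (rows - 1) * (columns - 1) = 1 for a 2 x 2 table
```

A significant result here would indicate that preference and student level are associated, i.e. that the preference proportions differ between undergraduates and graduates.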
One-Way Analysis of Variance
Analysis of variance (ANOVA) is used to compare means across groups of similar individuals. ANOVA compares the variation of means across several samples to the variation of scores within each sample. It allows researchers to compare more than two groups, and uses the F statistic to determine statistical significance. The narrated presentation describes the F distribution and discusses ANOVA and its assumptions in more detail. The software tutorial demonstrates how to conduct a one-way ANOVA in SPSS.
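A one-way ANOVA can be sketched with three hypothetical groups, here labeled as teaching methods. The F statistic is the ratio of between-group to within-group variation that the paragraph above describes:

```python
from scipy import stats

# Hypothetical scores for three teaching-method groups
method_a = [82, 85, 88, 75, 80]
method_b = [70, 72, 68, 74, 71]
method_c = [90, 88, 93, 85, 89]

# F = (variation of group means) / (variation of scores within groups)
f_stat, p_value = stats.f_oneway(method_a, method_b, method_c)
```

The group means here (82, 71, 89) differ far more than the scores within each group vary, so F is large and the p-value small; a significant F indicates that at least one group mean differs from the others.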
IV. Multivariate Analyses
Multiple Linear Regression
Multiple linear regression allows researchers to predict or to explain the variance of a response variable using multiple predictors. For instance, college GPA can be predicted based on SAT scores, the amount of time spent studying, variables measuring students' motivation, etc. This procedure allows researchers to compare the predictive power of each explanatory variable, to identify the strongest predictors, and to eliminate those that are not statistically significant. The following narrated presentation shows how to conduct multiple linear regression using the stepwise approach. It explains how to estimate regression parameters, the inferences that can be made based on multiple regression, and how to determine whether regression parameters are statistically significant. The software tutorial demonstrates how to conduct a stepwise multiple regression in SPSS.
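A minimal sketch of the two-predictor GPA example, using hypothetical data and an ordinary least-squares fit via numpy (the stepwise selection described in the presentation would require additional machinery beyond this sketch):

```python
import numpy as np

# Hypothetical predictors of college GPA: SAT score and weekly study hours
sat = np.array([900, 1000, 1100, 1200, 1300, 1400])
hours = np.array([5, 8, 6, 10, 9, 12])
gpa = np.array([2.3, 2.8, 2.7, 3.3, 3.2, 3.8])

# Design matrix with an intercept column; least-squares solution gives
# coef = [intercept, b_sat, b_hours]
X = np.column_stack([np.ones_like(sat, dtype=float), sat, hours])
coef, *_ = np.linalg.lstsq(X, gpa, rcond=None)

predicted = X @ coef
r_squared = 1 - np.sum((gpa - predicted) ** 2) / np.sum((gpa - gpa.mean()) ** 2)
```

R squared now reflects the variance explained by both predictors jointly; comparing each predictor's contribution is what the stepwise procedure automates.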
Exploratory Factor Analysis
Exploratory Factor Analysis (EFA) is a statistical procedure that investigates patterns of variation across multiple variables to identify groups of variables that vary together. EFA can be used for data reduction purposes (Principal Component Analysis) or to identify the latent constructs or dimensions that underlie the data (Common Factor Analysis). The following presentation provides a brief introduction to the topic of EFA and provides an example of common factor analysis.
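The data-reduction side of EFA can be sketched with simulated data: four hypothetical items are generated so that items 1-2 and items 3-4 each track a common underlying factor, and the eigenvalues of the correlation matrix recover that two-dimensional structure (this is principal components extraction, not the common factor model the presentation demonstrates):

```python
import numpy as np

# Simulate 100 respondents on 4 items; items 1-2 share one latent factor,
# items 3-4 share another (all values hypothetical)
rng = np.random.default_rng(0)
f1 = rng.normal(size=100)
f2 = rng.normal(size=100)
data = np.column_stack([
    f1 + 0.3 * rng.normal(size=100),
    f1 + 0.3 * rng.normal(size=100),
    f2 + 0.3 * rng.normal(size=100),
    f2 + 0.3 * rng.normal(size=100),
])

corr = np.corrcoef(data, rowvar=False)   # 4 x 4 correlation matrix
eigvals = np.linalg.eigvalsh(corr)[::-1] # eigenvalues, largest first

# Kaiser criterion: retain components with eigenvalue > 1
n_retained = int(np.sum(eigvals > 1))
```

Two eigenvalues come out well above 1 and two near 0, so the four items reduce to the two dimensions that generated them; in a real analysis those retained components would then be interpreted and, for common factor analysis, rotated.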
V. Other Topics
- Aligning Research Questions & Methods
- Canonical Correlation
- Cluster Analysis
- Confirmatory Factor Analysis
- Data Management & Quality
- Discriminant Analysis
- Experimental vs. Non-Experimental Designs
- Hypothesis Testing
- Interviews: Best Practices
- Kruskal-Wallis Test
- Logistic Regression
- Longitudinal Data Analysis Techniques
- Mail Surveys: Best Practices
- Mann-Whitney U Test
- Principal Components Factoring
- Profile Analysis
- Psychology of Survey Response
- Reliability & Generalizability
- Response Bias
- Sample Size & Attrition
- Software Options
- Statistical & Practical Significance (Effect Sizes)
- Survey Design
- Survey Item Development
- Survey Research Methods
- Survival Analysis
- Variables: Everything You Need to Know
- Web-based Surveys: Best Practices
- Wilcoxon Signed Ranks Test