I. Important Concepts
Introduction to Statistics
Statistics is a set of mathematical procedures used to organize, summarize, and interpret information. As the primary means of analyzing and interpreting quantitative data, statistics help us organize and understand the information contained in our data, communicate our results and describe the characteristics of our data to others, and answer the questions that drive our research.
Normal distributions enable researchers to analyze and compare scores and find out the proportion of individuals who fall above or below a score. If a data set is Normal, there are many calculations that can be performed and scores can be standardized to make comparisons easier. This topic will explain how to define and describe density curves, measure position using percentiles and z-scores, describe Normal distributions, apply the 68-95-99.7 Rule, and perform Normal calculations.
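The standardization and proportion calculations described above can be sketched in a few lines. The numbers here are hypothetical (a test score of 85 against a population mean of 70 and standard deviation of 10); the course tutorials use SPSS, but Python's scipy illustrates the same arithmetic:

```python
from scipy import stats

# Standardize a score: z = (x - mean) / sd.
# Hypothetical example: score 85, population mean 70, sd 10.
x, mu, sigma = 85, 70, 10
z = (x - mu) / sigma  # 1.5 standard deviations above the mean

# Proportion of individuals falling below x under a Normal model
p_below = stats.norm.cdf(z)

# The 68-95-99.7 Rule: area within 1, 2, and 3 sd of the mean
within = [stats.norm.cdf(k) - stats.norm.cdf(-k) for k in (1, 2, 3)]
# within is approximately [0.683, 0.954, 0.997]
```

A z of 1.5 places the score at roughly the 93rd percentile, which is exactly the kind of comparison standardization makes possible.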
Hypothesis testing is a goal common to all inferential statistics. This presentation discusses the logic of hypothesis testing and important related concepts, including errors, alpha, effect sizes, and statistical power, and walks through a detailed example of the hypothesis testing process. Although the example is set within the context of z scores, the concepts described generalize to other inferential techniques.
II. Data Collection & Management
Sampling & Data Collection
The following presentation discusses the difference between populations and samples, and describes several sample selection procedures. The presentation also includes a brief description of data collection procedures, such as surveys.
Data Management in SPSS
The following software tutorials demonstrate how to create or open a data file, how to define variables, and how to enter new data in SPSS. Further, the tutorials demonstrate how to conduct basic data management procedures such as selecting a subset of individuals, recoding variables, or computing new variables.
III. Univariate & Bivariate Analyses
Exploratory or descriptive analyses enable researchers to analyze and describe their data prior to running any statistical tests. They are always the first steps in analyzing data. Statistics measuring spread, central tendency, or frequency can be used to provide an overall description of the data without forcing researchers to look at every value in the set. Additionally, various charts can be used to illustrate the distribution of quantitative or categorical variables.
- Exploratory Statistics
Descriptive statistics enable researchers to analyze and describe their data prior to running any statistical tests. Measures of central tendency and variance explain a data set without forcing researchers to look at every value in the set. This topic will explain how to find measures of central tendency, including mean, median, and mode, and measures of spread, including quartiles and standard deviation. Descriptive analyses are always the first steps in analyzing data.
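The measures of central tendency and spread listed above can be computed directly. The sample below is a hypothetical set of eleven quiz scores; Python's standard library and numpy stand in for the SPSS procedures shown in the tutorials:

```python
import statistics
import numpy as np

# Hypothetical sample of 11 quiz scores
scores = [4, 7, 7, 8, 9, 10, 10, 10, 12, 13, 15]

mean = statistics.mean(scores)            # arithmetic mean
median = statistics.median(scores)        # middle value of the ordered set
mode = statistics.mode(scores)            # most frequent value
sd = statistics.stdev(scores)             # sample standard deviation
q1, q3 = np.percentile(scores, [25, 75])  # first and third quartiles
iqr = q3 - q1                             # interquartile range
```

With an odd number of ordered values the median is simply the middle score (here 10), and the quartiles summarize spread without listing every observation.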
Charts and figures are often used for exploratory purposes. Depending on the nature of the data, different types of graphical representations can be used to summarize information. The narrated presentation discusses the components of a data set (individuals and variables) and explains the differences between different types of variables. Several methods to graphically summarize the data are then described for each type of variable. The software tutorials show how to generate different types of graphs in SPSS.
Scatterplots & Correlations
Scatterplots and correlation coefficients are measures of association between two quantitative variables. For instance, researchers may want to know whether an increased amount of time spent on homework is associated with higher scores on a standardized test. Scatterplots provide descriptive information on the direction, form, and strength of the relationship between the two variables by representing individuals as points on a two-dimensional graph. These points may aggregate to describe a linear relationship, curvilinear relationship, or no relationship. Scatterplots may also indicate whether there is positive or negative association between variables, and suggest the strength of their relationship. A positive association means that high values in one variable are associated with high values in the other variable, whereas a negative association shows that high values in one variable are associated with low values in the other variable (e.g. the relationship between poverty and student achievement).
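The homework example above can be made concrete with a correlation coefficient. The data below are hypothetical (eight students' weekly homework hours and test scores); scipy's Pearson correlation quantifies the direction and strength that a scatterplot would show visually:

```python
import numpy as np
from scipy import stats

# Hypothetical data: hours of homework per week vs. standardized test score
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
score = np.array([52, 55, 61, 60, 68, 70, 74, 79])

r, p_value = stats.pearsonr(hours, score)
# r near +1 indicates a strong positive linear association:
# high values of one variable go with high values of the other
```

For these made-up values r is close to +1, the positive-association pattern described above; a negative r (as in the poverty example) would indicate the opposite pattern.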
Simple Linear Regression
Simple linear regression allows researchers to predict or explain the variance of a response variable using a predictor variable. For instance, simple linear regression may be used in educational research to predict college GPA based on SAT scores. The narrated presentation below provides an introduction to the topic of simple linear regression. It discusses basic concepts related to simple linear regression, the assumptions on which this procedure is based, and how to interpret and use the regression equation. The software tutorial demonstrates how to conduct a simple linear regression in SPSS.
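The SAT-and-GPA example can be sketched with hypothetical numbers. scipy's linregress fits the regression equation, which can then be used for prediction exactly as the presentation describes:

```python
import numpy as np
from scipy import stats

# Hypothetical data: SAT score (predictor) and college GPA (response)
sat = np.array([900, 1000, 1100, 1200, 1300, 1400, 1500])
gpa = np.array([2.4, 2.7, 2.9, 3.1, 3.2, 3.5, 3.8])

res = stats.linregress(sat, gpa)
# Regression equation: predicted GPA = intercept + slope * SAT
predicted = res.intercept + res.slope * 1250
# res.rvalue ** 2 is the proportion of GPA variance explained (R squared)
```

The fitted slope is positive, so each additional SAT point raises the predicted GPA slightly; R squared summarizes how much of the variance in GPA the predictor explains.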
z Procedures: Confidence Intervals for the Population Mean
A confidence interval is a range of values that a parameter may take in the population. This range is estimated using information collected from a sample, such as the mean, the degree to which values vary across individuals, or the sample size. For instance, a researcher may be interested in estimating the achievement motivation of first year college students. The researcher must select a random sample of students, administer a motivation scale, and then compute an average score for the entire sample. Results from the sample can then be used to make an inference about the motivation of the entire population of first year college students. The narrated presentation below provides an introduction to the topic of confidence intervals and demonstrates how to estimate the population mean of a normally distributed variable after computing the mean for a specific sample. The software tutorial shows how to calculate confidence intervals using SPSS.
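The motivation-scale example can be worked through with hypothetical numbers. For z procedures the population standard deviation is treated as known; here a sample of 25 students with a mean score of 27.5 and an assumed sigma of 4 yields a 95% interval:

```python
import numpy as np
from scipy import stats

# Hypothetical values: sample mean motivation score, known population sd,
# and sample size (z procedures assume sigma is known)
sample_mean, sigma, n = 27.5, 4.0, 25
conf = 0.95

z_star = stats.norm.ppf(1 - (1 - conf) / 2)  # critical value, about 1.96
margin = z_star * sigma / np.sqrt(n)         # margin of error
ci = (sample_mean - margin, sample_mean + margin)
```

The interval is the sample mean plus or minus the margin of error; widening the confidence level or shrinking the sample widens the interval.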
z Procedures: Testing a Hypothesis about the Population Mean
A hypothesis is a statement about a parameter such as the population proportion or the population mean. To determine whether this statement is true, researchers use tests of significance to compare observed values of a statistic to the given parameters. Results from such tests show whether the difference between the sample statistic and the given parameter is statistically significant. The results of a significance test are expressed in terms of a probability that indicates the extent to which the data from the sample and the hypothesis agree. The narrated presentation provides an introduction to this topic. It demonstrates how to formulate hypotheses, and how to conduct a test of significance for a population mean using the properties of the normal distribution.
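A one-sample z test can be sketched with hypothetical numbers: a sample of 36 observations with mean 103 is tested against a hypothesized population mean of 100, with a known population standard deviation of 9:

```python
from math import sqrt
from scipy import stats

# Hypothetical example: H0 says mu = 100; the sample gives x-bar = 103
xbar, mu0, sigma, n = 103, 100, 9, 36

z = (xbar - mu0) / (sigma / sqrt(n))  # test statistic: z = 2.0
p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value
reject = p_value < 0.05               # compare to alpha = .05
```

Here the p-value (about .046) falls just below alpha, so the null hypothesis would be rejected; the p-value is the probability of observing data this extreme if the null hypothesis were true.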
z Procedures: Practical Issues
To be able to use the z procedures, certain assumptions must be met. First, data should have a normal distribution. Second, the sample must have an adequate size, and individuals must be randomly selected. However, these conditions are often difficult to meet in practice. The following narrated presentation describes the necessary conditions for making inferences based on the z procedures, demonstrates how to determine the sample size needed for a certain level of error, and discusses the notions of Type I and Type II error, and the power of a significance test.
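The sample-size calculation mentioned above follows from solving the margin-of-error formula for n. With hypothetical planning values (a desired margin of 1 scale point, 95% confidence, and an assumed population sd of 4):

```python
from math import ceil
from scipy import stats

# Hypothetical planning values: desired margin of error m, known sigma
sigma, m, conf = 4.0, 1.0, 0.95

z_star = stats.norm.ppf(1 - (1 - conf) / 2)
# Solve m = z* sigma / sqrt(n) for n, rounding up to a whole participant
n = ceil((z_star * sigma / m) ** 2)
```

For these values n works out to 62; halving the desired margin of error would quadruple the required sample, which is why precise estimates are expensive.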
t Procedures: Inferences about the Population Mean
t procedures are very similar to the z procedures and are used when the distribution of the data is not perfectly normal and when the population standard deviation of a variable is unknown. The interpretation of t scores is similar to the interpretation of z scores. However, the t distribution has a slightly different shape. It is symmetric, but not normal. It has a single peak and a mean of 0, which is the center of the distribution, but its tails are higher and fatter, and the distribution has more spread. Further, the t distribution looks different for different sample sizes. The following presentation describes the properties of the t distribution, and demonstrates how to use t scores to make inferences about a population mean and to compare matched samples. The software tutorials show how to conduct these procedures in SPSS.
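A one-sample t test with hypothetical data shows the key difference from the z procedures: the standard deviation is estimated from the sample itself, and the statistic is referred to a t distribution with n - 1 degrees of freedom:

```python
import numpy as np
from scipy import stats

# Hypothetical sample (population sd unknown), testing H0: mu = 50
data = np.array([52, 48, 55, 51, 49, 53, 54, 50, 47, 56])

t_stat, p_value = stats.ttest_1samp(data, popmean=50)
df = len(data) - 1  # the t distribution has n - 1 degrees of freedom
```

Because the t distribution's tails are fatter than the normal's, the same statistic yields a somewhat larger p-value under t than it would under z, especially for small samples.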
t Procedures: Comparing Independent Samples
t procedures can be used to compare variables across two independent samples. For instance, researchers may want to know whether the average performance on a certain achievement test differs significantly between males and females. The following narrated presentation demonstrates how to estimate a confidence interval for the mean difference between two populations, and how to test whether this difference is statistically significant. The presentation also discusses the assumptions that must be met, and the robustness of t procedures to the violation of these assumptions. The software tutorial demonstrates how to perform an independent samples t test using SPSS.
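An independent samples t test can be sketched with two hypothetical groups of achievement-test scores. The Welch variant (equal_var=False) does not assume equal variances, one of the robustness issues the presentation discusses:

```python
import numpy as np
from scipy import stats

# Hypothetical achievement-test scores for two independent groups
group_a = np.array([78, 85, 90, 72, 88, 80, 84])
group_b = np.array([70, 75, 68, 74, 72, 77, 71])

# Welch's t test: robust to unequal group variances
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
```

For these made-up scores the 10-point difference in group means is large relative to the within-group variability, so the test comes out significant.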
Chi-Square Test
The chi-square test is used to determine whether there is a statistically significant association between two or more categorical variables. For instance, educational researchers may want to determine whether the proportions of students preferring online instruction and face-to-face instruction differ significantly across undergraduate and graduate students. This procedure allows researchers to compare categorical variables across more than two groups and uses the chi-square statistic to determine statistical significance. The following narrated presentation describes the properties of the chi-square distribution and explains how to conduct and interpret the results of chi-square tests. The software tutorial demonstrates how to conduct this procedure in SPSS.
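The instruction-preference example can be worked through with a hypothetical 2 x 2 contingency table. scipy's chi2_contingency computes the expected counts, the chi-square statistic, and the p-value in one call:

```python
import numpy as np
from scipy import stats

# Hypothetical counts: instruction preference by student level
#                      online   face-to-face
observed = np.array([[40,      60],    # undergraduate
                     [60,      40]])   # graduate

chi2, p_value, df, expected = stats.chi2_contingency(observed)
# df = (rows - 1) * (columns - 1) = 1 for a 2 x 2 table
```

A significant result here would indicate that preference and student level are associated, i.e. that the preference proportions differ between undergraduates and graduates.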
One-Way Analysis of Variance
Analysis of variance (ANOVA) is used to compare means across groups of similar individuals. ANOVA compares the variation of means across several samples to the variation of scores within each sample. It allows researchers to compare more than two groups, and uses the F statistic to determine statistical significance. The narrated presentation describes the F distribution and discusses ANOVA and its assumptions in more detail. The software tutorial demonstrates how to conduct a one-way ANOVA in SPSS.
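A one-way ANOVA can be sketched with three hypothetical groups, here labeled as teaching methods. The F statistic is the ratio of between-group to within-group variation that the paragraph above describes:

```python
from scipy import stats

# Hypothetical scores for three teaching-method groups
method_a = [82, 85, 88, 75, 80]
method_b = [70, 72, 68, 74, 71]
method_c = [90, 88, 93, 85, 89]

# F = (variation of group means) / (variation of scores within groups)
f_stat, p_value = stats.f_oneway(method_a, method_b, method_c)
```

The group means here (82, 71, 89) differ far more than the scores within each group vary, so F is large and the p-value small; a significant F indicates that at least one group mean differs from the others.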
IV. Multivariate Analyses
Multiple Linear Regression
Multiple linear regression allows researchers to predict or to explain the variance of a response variable using multiple predictors. For instance, college GPA can be predicted based on SAT scores, the amount of time spent studying, variables measuring students' motivation, etc. This procedure allows researchers to compare the predictive power of each explanatory variable, to identify the strongest predictors, and to eliminate those that are not statistically significant. The following narrated presentation shows how to conduct multiple linear regression using the stepwise approach. It explains how to estimate regression parameters, the inferences that can be made based on multiple regression, and how to determine whether regression parameters are statistically significant. The software tutorial demonstrates how to conduct a stepwise multiple regression in SPSS.
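A minimal sketch of the two-predictor GPA example, using hypothetical data and an ordinary least-squares fit via numpy (the stepwise selection described in the presentation would require additional machinery beyond this sketch):

```python
import numpy as np

# Hypothetical predictors of college GPA: SAT score and weekly study hours
sat = np.array([900, 1000, 1100, 1200, 1300, 1400])
hours = np.array([5, 8, 6, 10, 9, 12])
gpa = np.array([2.3, 2.8, 2.7, 3.3, 3.2, 3.8])

# Design matrix with an intercept column; least-squares solution gives
# coef = [intercept, b_sat, b_hours]
X = np.column_stack([np.ones_like(sat, dtype=float), sat, hours])
coef, *_ = np.linalg.lstsq(X, gpa, rcond=None)

predicted = X @ coef
r_squared = 1 - np.sum((gpa - predicted) ** 2) / np.sum((gpa - gpa.mean()) ** 2)
```

R squared now reflects the variance explained by both predictors jointly; comparing each predictor's contribution is what the stepwise procedure automates.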
Exploratory Factor Analysis
Exploratory Factor Analysis (EFA) is a statistical procedure that investigates patterns of variation across multiple variables to identify groups of variables that vary together. EFA can be used for data reduction purposes (Principal Component Analysis) or to identify the latent constructs or dimensions that underlie the data (Common Factor Analysis). The following presentation provides a brief introduction to the topic of EFA and provides an example of common factor analysis.
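The data-reduction side of EFA can be sketched with simulated data: four hypothetical items are generated so that items 1-2 and items 3-4 each track a common underlying factor, and the eigenvalues of the correlation matrix recover that two-dimensional structure (this is principal components extraction, not the common factor model the presentation demonstrates):

```python
import numpy as np

# Simulate 100 respondents on 4 items; items 1-2 share one latent factor,
# items 3-4 share another (all values hypothetical)
rng = np.random.default_rng(0)
f1 = rng.normal(size=100)
f2 = rng.normal(size=100)
data = np.column_stack([
    f1 + 0.3 * rng.normal(size=100),
    f1 + 0.3 * rng.normal(size=100),
    f2 + 0.3 * rng.normal(size=100),
    f2 + 0.3 * rng.normal(size=100),
])

corr = np.corrcoef(data, rowvar=False)   # 4 x 4 correlation matrix
eigvals = np.linalg.eigvalsh(corr)[::-1] # eigenvalues, largest first

# Kaiser criterion: retain components with eigenvalue > 1
n_retained = int(np.sum(eigvals > 1))
```

Two eigenvalues come out well above 1 and two near 0, so the four items reduce to the two dimensions that generated them; in a real analysis those retained components would then be interpreted and, for common factor analysis, rotated.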
V. Other Topics
- Aligning Research Questions & Methods
- Canonical Correlation
- Cluster Analysis
- Confirmatory Factor Analysis
- Data Management & Quality
- Discriminant Analysis
- Experimental vs. Non-Experimental Designs
- Hypothesis Testing
- Interviews: Best Practices
- Kruskal-Wallis Test
- Logistic Regression
- Longitudinal Data Analysis Techniques
- Mail Surveys: Best Practices
- Mann-Whitney U Test
- Principal Components Factoring
- Profile Analysis
- Psychology of Survey Response
- Reliability & Generalizability
- Response Bias
- Sample Size & Attrition
- Software Options
- Statistical & Practical Significance (Effect Sizes)
- Survey Design
- Survey Item Development
- Survey Research Methods
- Survival Analysis
- Variables: Everything You Need to Know
- Web-based Surveys: Best Practices
- Wilcoxon Signed Ranks Test