Statistics 186 (Introduction to Statistics for Social Sciences): Exam Notes for STS 186 at Stellenbosch University

Statistics 186 at Stellenbosch University introduces the core logic of statistics as it is used in the social sciences: how to describe data clearly, how to reason from samples to populations, and how to judge evidence without overclaiming. The course is usually most challenging not because the calculations are difficult in isolation, but because students must learn to choose the correct method, interpret output responsibly, and connect numerical results to substantive social-science questions. These notes bring the main ideas together in a structured, exam-focused format.

1. What Statistics 186 is trying to teach

Statistics 186 is not only a module about formulas. It is a foundational course in statistical thinking for students who need to work with human behaviour, attitudes, education, health, inequality, and other social-science data. In the social sciences, researchers often cannot run perfect experiments, cannot control every variable, and cannot measure every concept directly. Statistics therefore becomes a disciplined way of reducing uncertainty. It allows a student to move from “what seems to be happening” to “what the data support,” while still recognising the limits of the evidence.

A useful way to think about the module is that it teaches three linked skills:

Describing data accurately and efficiently.
Inferring patterns from samples to broader groups.
Evaluating relationships and differences in a way that is logically defensible.

These three skills appear repeatedly across the course, whether the data come from a questionnaire on study habits, a survey on political attitudes, a psychology experiment, or a dataset on household income. The same statistical vocabulary keeps returning: population, sample, variable, distribution, mean, median, variance, standard deviation, confidence interval, significance, and hypothesis test.

The role of statistics in social-science research

Social-science research asks questions such as:

Do students who study more hours perform better?
Is there a difference in anxiety levels between two groups?
Are attitudes toward a policy related to age or education?
Can we estimate the proportion of a population that supports a particular view?

Statistics helps answer these questions because social-science data are usually incomplete. Researchers do not observe every person in the population. Instead, they work with samples and use probability to estimate what the broader population might look like. This is why the distinction between descriptive statistics and inferential statistics matters so much.

Descriptive statistics summarise the data that have been observed.
Inferential statistics use sample data to draw conclusions about a larger population.

If a class of 50 students is surveyed about commuting time, descriptive statistics might tell us the average commuting time in that class, the spread of responses, and whether the distribution is skewed. Inferential statistics would ask whether that class can be used to estimate commuting patterns among all Stellenbosch students or among students in a wider university system. The descriptive level is about the sample itself; the inferential level is about the leap beyond it.

Why statistical literacy matters in the social sciences

The social sciences are especially vulnerable to misinterpretation because many concepts are abstract and because human behaviour is variable. A result may be statistically significant but socially small. A correlation may exist without implying causation. A group difference may reflect measurement bias, unequal sample sizes, or an omitted factor rather than a genuine effect. For this reason, Statistics 186 usually rewards students who can explain not only what a result is, but also what it means and what it does not mean.

Statistical literacy includes the ability to:

read graphs critically;
distinguish between population parameters and sample statistics;
recognise when a conclusion is stronger than the data justify;
understand how probability underpins confidence intervals and hypothesis tests;
communicate uncertainty clearly, not as weakness but as scientific honesty.

A common exam trap is the temptation to treat every numerical outcome as a direct fact about reality. In social-science contexts, numbers are often proxies for more complex phenomena. For example, a score on a depression scale is not depression itself; it is a measured indicator. A response on a five-point Likert item is not a complete belief system; it is one snapshot of a person’s response pattern. Statistics helps manage these imperfect measurements responsibly.

Key terms that must be mastered early

The foundation of the module rests on a small set of ideas that recur across every later topic. The following terms should be understood precisely:

Population: the full group of interest.
Sample: the subset actually observed.
Parameter: a numerical characteristic of the population.
Statistic: a numerical characteristic of the sample.
Variable: a characteristic measured on each case.
Observation/case: one unit of data, such as one student or one respondent.
Distribution: the pattern of values a variable takes.
Random variable: a numerical outcome of a chance process.
Bias: systematic error that pushes results away from the truth.
Variability: the degree to which data differ from one another.

These ideas are not memorisation exercises alone. They shape interpretation. If a sample mean differs from a population mean, the gap may be due to random sampling variation, measurement error, or bias. Identifying which explanation is plausible is part of statistical reasoning.

The language of uncertainty

A major conceptual shift in Statistics 186 is the move away from certainty. Social data rarely allow statements like “this always happens” or “this is definitely true.” More often, one says:

the data suggest a relationship;
the estimate is approximately a stated value, give or take a margin of error;
the difference is unlikely to be due to chance alone;
the evidence is insufficient to reject the null hypothesis.

This cautious language is essential because statistics is built on probability, not certainty. In practice, students are expected to make disciplined claims:

strong enough to be meaningful,
but careful enough not to overstate what the sample can prove.

A student who understands uncertainty well will usually perform better in both conceptual questions and interpretation questions. Examiners often reward statements that refer correctly to the sample, the population, the level of significance, and the direction of an effect.

Typical structure of the module

While course structure may vary slightly by year, an introductory social-science statistics module at Stellenbosch generally moves through the following sequence:

Introduction to data and measurement.
Graphical and numerical description of variables.
Probability and distributions.
Sampling and estimation.
Confidence intervals and hypothesis testing.
Comparison of groups and relationships between variables.
Interpretation, reporting, and applied use of statistical results.

Each part depends on the previous one. Descriptive statistics are not isolated from inferential statistics; they support them. For example, understanding the mean and standard deviation is essential before understanding a confidence interval, because the confidence interval is built from a sample statistic and its estimated variability.

What students should prioritise from the beginning

Students often waste time trying to memorise isolated formulas without understanding the structure behind them. A better strategy is to prioritise:

the meaning of each statistic;
when each method is appropriate;
how to interpret output in ordinary language;
which assumptions are being made;
how to detect whether a result is practical, not only statistically significant.

A successful exam answer usually combines calculation, interpretation, and context. If a question asks about a test result, the best response does not simply give the p-value. It states whether the null hypothesis is rejected, explains what that means in the context of the question, and notes the limitations of the conclusion.

2. Data, variables, measurement, and graphical summaries

Before any test can be performed, data must be understood properly. Statistics 186 therefore begins with the nature of data itself: what is being measured, how it is measured, and what kind of summaries are appropriate. In social sciences, poor measurement can create misleading results even when the calculations are correct. This is why the early part of the module is so important.

Types of variables

Variables can be grouped according to the kind of values they take.

Qualitative variables

Qualitative variables describe categories rather than numerical quantities. Examples include:

gender identity;
marital status;
province;
political party preference;
employment status.

These variables may be nominal or ordinal.

Nominal variables have categories without a natural order, such as province or eye colour.
Ordinal variables have a meaningful order, but the distances between categories are not necessarily equal, such as satisfaction levels: very dissatisfied, dissatisfied, neutral, satisfied, very satisfied.

Quantitative variables

Quantitative variables measure amounts. They are often divided into:

Discrete variables: countable values, such as number of siblings, number of study sessions, or number of absences.
Continuous variables: values on a continuum, such as height, time, income, or test score.

The distinction matters because different summaries and graphs suit different variables. It is generally inappropriate to calculate a mean for a purely nominal variable, because categories like “female,” “male,” and “other” are not quantities. Likewise, a pie chart may be suitable for categories, while a histogram is better suited to continuous numerical data.

Levels of measurement

A common exam topic is the distinction between measurement scales. These are:

Nominal – categories only.
Ordinal – ordered categories.
Interval – equal intervals, no true zero.
Ratio – equal intervals with a true zero.

In the social sciences, many measurement instruments are treated carefully because the theoretical scale and the observed score are not always identical. A questionnaire scale often generates ordinal responses, but researchers may sometimes treat the summed score as approximately interval for analysis, provided the scale is sufficiently robust and the assumptions are reasonable.

Examples:

Nominal: home language.
Ordinal: class rank.
Interval: temperature in Celsius.
Ratio: age, income, reaction time.

The true zero in ratio data means that zero indicates absence of the quantity. Age zero means no elapsed years; income zero means no income. By contrast, a temperature of 0°C does not mean absence of temperature, only a point on a scale.
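A short sketch of why ratio statements need a true zero. The two temperatures are made-up values, and the helper name `celsius_to_kelvin` is only illustrative. Differences are meaningful on an interval scale, but "twice as hot" is not, because the ratio changes when the same temperatures are expressed on a scale with a true zero (Kelvin).

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius (interval scale) to Kelvin (ratio scale, true zero)."""
    return celsius + 273.15

warm, cool = 20.0, 10.0  # hypothetical temperatures in degrees Celsius

difference = warm - cool  # meaningful on an interval scale
ratio_celsius = warm / cool  # looks like "twice as hot", but depends on where 0 °C sits
ratio_kelvin = celsius_to_kelvin(warm) / celsius_to_kelvin(cool)

print(f"Difference: {difference:.1f} degrees")
print(f"Ratio using Celsius values: {ratio_celsius:.3f}")
print(f"Ratio using Kelvin values:  {ratio_kelvin:.3f}")
```

The difference of 10 degrees is the same on both scales, but the ratio is not, which is why "twice as much" claims are reserved for ratio variables such as age, income, or reaction time.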

Frequency distributions and relative frequency

A frequency distribution shows how many cases fall into each category or value. A relative frequency distribution expresses these counts as proportions or percentages. These summaries are foundational because they turn raw data into interpretable patterns.

Suppose 40 students are surveyed about their preferred study location:

Library: 18
Home: 14
Café: 6
Other: 2

The total frequency is 40. Relative frequencies are:

Library: 18/40 = 0.45 = 45%
Home: 14/40 = 0.35 = 35%
Café: 6/40 = 0.15 = 15%
Other: 2/40 = 0.05 = 5%

Relative frequencies are useful because they allow comparison even when sample sizes differ. A class of 40 and a class of 100 may have different absolute counts, but percentages make comparison straightforward.
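The study-location example can be reproduced in a few lines. This is a minimal sketch using the counts given above; the variable names are illustrative.

```python
# Frequency distribution from the study-location example
counts = {"Library": 18, "Home": 14, "Café": 6, "Other": 2}

total = sum(counts.values())  # total number of students surveyed

# Relative frequency = category count divided by the total count
relative = {place: n / total for place, n in counts.items()}

for place, share in relative.items():
    print(f"{place}: {counts[place]}/{total} = {share:.2f} = {share:.0%}")
```

A quick check worth building into exam working: relative frequencies must sum to 1 (or 100%).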

Graphs used in Statistics 186

Graphical display is a major part of communicating data. In exams, students often lose marks not because they misunderstand the numbers, but because they choose the wrong graph or misread the shape.

Bar charts

Bar charts are appropriate for categorical variables. Each bar represents a category, and the height shows the frequency or percentage. Bars should be separated to emphasise that categories are distinct.

Bar charts are especially useful for:

gender categories;
employment sectors;
preference categories;
responses to survey items.

Pie charts

Pie charts show proportions of a whole. They are visually intuitive but can become cluttered if there are many categories or if differences are small. They are best used when categories are few and the goal is to communicate composition rather than precise comparison.

Histograms

Histograms are used for quantitative data, especially continuous variables. The bars touch because the values are on a continuum. The shape of the histogram reveals:

centre;
spread;
skewness;
possible outliers.

A histogram of test scores may show whether most students cluster near the average or whether results are spread widely. If the distribution is skewed to the left, most students scored high, with a few low scores stretching the tail to the left. If it is skewed to the right, most scores are low, with a few high scores stretching the tail to the right.

Boxplots

Boxplots are compact summaries that display:

median;
quartiles;
spread;
outliers.

They are useful for comparing groups. For example, boxplots of exam marks for two different tutorial groups may show whether one group tends to score higher and whether the spread differs between groups.

Scatterplots

Scatterplots show the relationship between two quantitative variables. Each point represents one case. The pattern can reveal:

direction (positive or negative);
form (linear or curved);
strength (tight or loose clustering);
outliers.

For example, a scatterplot of study hours and test scores may show a positive trend: more hours associated with higher marks. But the plot may also reveal that one student studied many hours yet scored poorly, suggesting an outlier or a case influenced by factors other than study time.

Numerical summaries: centre and spread

Graphs are not enough. Statistics 186 requires numerical summaries of the distribution.

Measures of central tendency

The main measures are:

Mean: arithmetic average.
Median: middle value when data are ordered.
Mode: most frequent value.

The mean is sensitive to extreme values, while the median is resistant to them. This distinction matters in social data where outliers are common. For example, income distributions are often right-skewed: a few very high incomes can pull the mean upward above the typical person’s income. In such cases, the median may better represent the “typical” case.

A quick guide:

Use the mean for roughly symmetric numerical data without strong outliers.
Use the median for skewed data or data with outliers.
Use the mode when discussing the most common category or value.
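The effect of an outlier on the mean can be seen in a small sketch. The income values below are invented for illustration (in thousands of rand per month); one very high income is included on purpose.

```python
from statistics import mean, median, mode

# Hypothetical monthly incomes in R1000s; the last value is an extreme case
incomes = [8, 9, 10, 10, 11, 12, 13, 95]

income_mean = mean(incomes)      # pulled upward by the single high income
income_median = median(incomes)  # middle of the ordered values, resistant to the outlier
income_mode = mode(incomes)      # most frequent value

print(f"Mean:   {income_mean}")
print(f"Median: {income_median}")
print(f"Mode:   {income_mode}")
```

Seven of the eight people earn 13 or less, yet the mean is 21. The median of 10.5 is the better description of the typical case here.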

Measures of dispersion

Spread tells us how much variability exists.

Common measures include:

Range: maximum minus minimum.
Interquartile range (IQR): Q3 minus Q1.
Variance: average squared deviation from the mean (for a sample, the sum of squared deviations is divided by n - 1 rather than n).
Standard deviation: square root of the variance.

The standard deviation is one of the most important concepts in the module. It tells us, in a rough sense, how far observations typically lie from the mean. A small standard deviation indicates data clustered closely around the mean. A large standard deviation indicates more spread.

The following conceptual comparison is useful:

Situation	Mean	Standard deviation	Interpretation
Class A marks	65	5	Students performed similarly, with little spread
Class B marks	65	15	Students had the same average, but performance was much more uneven

This example shows why the mean alone can be misleading. Two distributions can share the same centre but differ greatly in variability.
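The same point can be checked numerically. The two lists of marks below are hypothetical and only roughly mirror the table (both have a mean of 65). The sketch computes the sample standard deviation by hand, using the usual n - 1 divisor, and compares it with the standard library's `stdev`.

```python
from math import sqrt
from statistics import mean, stdev

def sample_sd(values):
    """Sample standard deviation: squared deviations summed, divided by n - 1, then rooted."""
    centre = mean(values)
    squared_deviations = [(v - centre) ** 2 for v in values]
    return sqrt(sum(squared_deviations) / (len(values) - 1))

class_a = [60, 62, 65, 68, 70]  # hypothetical marks, tightly clustered
class_b = [45, 55, 65, 75, 85]  # hypothetical marks, widely spread

for name, marks in (("Class A", class_a), ("Class B", class_b)):
    print(f"{name}: mean = {mean(marks)}, "
          f"sd (by hand) = {sample_sd(marks):.2f}, sd (library) = {stdev(marks):.2f}")
```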

Shape of distributions

Distribution shape affects interpretation.

Symmetric distributions have similar tails on both sides.
Right-skewed distributions have a long right tail.
Left-skewed distributions have a long left tail.
Bimodal distributions have two peaks, often suggesting two subgroups in the data.

In social sciences, skewness is common. Income, reaction time, and number of absences often show skewed distributions. Skewness matters because it influences which measures are most representative and which inferential procedures are appropriate.

Z-scores and standardisation

A z-score shows how far a value is from the mean in units of standard deviation. It is computed as:

\[
z = \frac{x - \bar{x}}{s}
\]

where \(x\) is the raw score, \(\bar{x}\) is the sample mean, and \(s\) is the sample standard deviation.

A z-score of:

0 means the value is exactly at the mean;
positive values are above the mean;
negative values are below the mean.

If a student scores 80 on a test where the class mean is 70 and the standard deviation is 5, then:

\[
z = \frac{80 - 70}{5} = 2
\]

This means the score is 2 standard deviations above the mean. Z-scores are useful because they allow comparison across different scales. A mark of 80 on one test may not mean the same as 80 on another test unless the distributions are similar.
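As a quick sketch, the calculation above and a cross-test comparison look like this. The second test (mean 75, standard deviation 10) is a made-up example added to show why the same raw mark can mean different things.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mean) / sd

# The example from the notes: mark of 80, class mean 70, standard deviation 5
z_test_1 = z_score(80, mean=70, sd=5)

# Hypothetical second test: same raw mark, but a higher mean and wider spread
z_test_2 = z_score(80, mean=75, sd=10)

print(f"Test 1: z = {z_test_1}")
print(f"Test 2: z = {z_test_2}")
```

A mark of 80 is exceptional on the first test (2 standard deviations above the mean) but only moderately above average on the second (half a standard deviation).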

Why graphical and numerical summaries matter in exams

Exams in introductory statistics frequently test the ability to interpret output quickly. Students may be given a table, graph, or summary statistics and asked to explain what they imply. Strong answers usually mention:

the centre of the distribution;
the spread;
the shape;
any outliers;
how these features affect interpretation.

A complete answer does not simply repeat numbers. It explains what those numbers suggest in context. For example, if a distribution of study hours is right-skewed with a few students studying very long hours, one should note that the mean may exceed the median and that the typical student may study less than the average suggests.

3. Probability, sampling, and the logic of inference

Probability is the bridge between the sample and the population. Without probability, statistical inference would be guesswork. In Statistics 186, probability provides the mathematical language for uncertainty, allowing students to assess how surprising a sample result is if some background assumption were true. This is the basis of sampling distributions, confidence intervals, and hypothesis tests.

Why probability matters in social science

Social phenomena are rarely deterministic. Two students with the same background may perform differently. Two communities with similar resources may produce different outcomes. Probability helps describe this variability. It does not eliminate uncertainty, but it quantifies it.

A social-science researcher may ask: if there were actually no difference between two groups, how likely is it that we would observe a difference as large as the one in our sample? If the answer is “very unlikely,” then the observed difference may be evidence of a real effect. This is the logic behind significance testing.

Basic probability concepts

Probability ranges from 0 to 1.

0 means impossible.
1 means certain.
Values in between express likelihood.

For example:

0.25 = 25% chance
0.50 = 50% chance
0.90 = 90% chance

In statistical contexts, probabilities often refer not to everyday certainty but to long-run relative frequency. If an event has probability 0.10, then over many repeated trials it would occur about 10% of the time.

Rules of probability

Several basic rules are essential.

Complement rule

If the probability of an event \(A\) is \(P(A)\), then the probability of not \(A\) is:

\[
P(A^c) = 1 - P(A)
\]

If the probability that a student passes a module is 0.72, then the probability that the student fails is 0.28.

Addition rule

For mutually exclusive events, the probability of either event occurring is the sum of their probabilities.

If:

30% of students are in first year,
25% are in second year,
and the categories do not overlap,

then the probability of being in first or second year is 0.30 + 0.25 = 0.55.

Multiplication rule

For independent events, the probability of both events occurring is the product of their probabilities.

If the probability that a randomly selected student is female is 0.60, the probability that a randomly selected student owns a laptop is 0.80, and the two events are independent, then the probability that a randomly selected student is both female and a laptop owner is:

\[
0.60 \times 0.80 = 0.48
\]

Independence means the occurrence of one event does not affect the probability of the other.
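The three rules can be written as small helper functions. This is a sketch using the numbers from the examples above; the function names are illustrative, and each rule is only valid under the condition stated in its comment.

```python
from math import isclose

def complement(p_a: float) -> float:
    """P(not A) = 1 - P(A)."""
    return 1 - p_a

def either_of_exclusive(p_a: float, p_b: float) -> float:
    """P(A or B) = P(A) + P(B), valid only when A and B cannot both occur."""
    return p_a + p_b

def both_independent(p_a: float, p_b: float) -> float:
    """P(A and B) = P(A) * P(B), valid only when A and B are independent."""
    return p_a * p_b

p_fail = complement(0.72)                        # passing has probability 0.72
p_first_or_second = either_of_exclusive(0.30, 0.25)
p_female_and_laptop = both_independent(0.60, 0.80)

print(f"P(fail) = {p_fail:.2f}")
print(f"P(first or second year) = {p_first_or_second:.2f}")
print(f"P(female and laptop) = {p_female_and_laptop:.2f}")
```

In an exam, the marks usually lie in stating the condition (mutually exclusive, or independent) before applying the rule.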

Conditional probability

Conditional probability is one of the most important concepts in the module. It answers: what is the probability of event A given that event B has happened?

\[
P(A \mid B)
\]

This is central in social science because many relationships are conditional. The probability of completing a degree may depend on income level, school quality, or access to support services. The probability of voting for a party may depend on age, education, or political identity.

A useful interpretation is that conditional probability updates beliefs when new information becomes available. If a student knows that a respondent is a full-time commuter, that information may change the probability estimates for study patterns or attendance.
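A conditional probability can be read straight from a two-way table of counts. The sketch below uses invented counts for 200 students, cross-classified by commuter status and regular lecture attendance, and applies the standard definition P(A given B) = P(A and B) / P(B). Exact fractions are used so that no rounding obscures the result.

```python
from fractions import Fraction

# Hypothetical counts: (commuter status, attendance) -> number of students
counts = {
    ("commuter", "regular"): 30,
    ("commuter", "irregular"): 20,
    ("resident", "regular"): 120,
    ("resident", "irregular"): 30,
}
total = sum(counts.values())

# Unconditional probability of regular attendance
p_regular = Fraction(
    sum(n for (_, attendance), n in counts.items() if attendance == "regular"), total
)

# P(regular and commuter) and P(commuter)
p_regular_and_commuter = Fraction(counts[("commuter", "regular")], total)
p_commuter = Fraction(
    sum(n for (status, _), n in counts.items() if status == "commuter"), total
)

# Conditional probability: restrict attention to commuters only
p_regular_given_commuter = p_regular_and_commuter / p_commuter

print(f"P(regular) = {float(p_regular):.2f}")
print(f"P(regular | commuter) = {float(p_regular_given_commuter):.2f}")
```

In these made-up data, knowing that a student commutes lowers the probability of regular attendance from 0.75 to 0.60, which is what "updating on new information" means in practice.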

Sampling and the idea of a representative sample

A sample is useful only if it reflects the population reasonably well. In social-science research, sampling bias is a serious threat because convenience samples may not generalise. A group of volunteers responding to an online questionnaire may differ systematically from the members of the population who did not respond.

Important sample-related ideas include:

Random sampling: each member of the population has a known, non-zero chance of being selected (an equal chance, in simple random sampling).
Systematic sampling: selecting every kth case after a random start.
Stratified sampling: dividing the population into groups and sampling from each.
Cluster sampling: sampling groups rather than individuals.
Convenience sampling: selecting cases that are easiest to reach.

Random sampling is preferred because it supports generalisation. Stratified sampling is especially useful when the population has known subgroups that should be represented. For example, if a study of student wellbeing needs representation from different faculties, stratification can help ensure the sample reflects that diversity.
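The first three designs can be sketched with the standard library. Everything here is hypothetical: the faculty names and sizes, the sample size of 100, and the seed. The stratified version uses proportional allocation, which is one common choice rather than the only one.

```python
import random
from collections import Counter

rng = random.Random(186)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame of 1000 students: (student id, faculty)
faculty_sizes = {"Arts": 600, "Science": 300, "Education": 100}
frame = [(f"{fac}-{i}", fac) for fac, size in faculty_sizes.items() for i in range(size)]

def simple_random_sample(frame, n):
    """Every student has the same chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n):
    """Every kth student after a random start."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]

def stratified_sample(frame, n):
    """Sample within each faculty in proportion to its share of the population."""
    strata = {}
    for unit in frame:
        strata.setdefault(unit[1], []).append(unit)
    sample = []
    for members in strata.values():
        share = round(n * len(members) / len(frame))
        sample.extend(rng.sample(members, share))
    return sample

srs = simple_random_sample(frame, 100)
syst = systematic_sample(frame, 100)
strat = stratified_sample(frame, 100)

print("Simple random:", dict(Counter(fac for _, fac in srs)))
print("Stratified:   ", dict(Counter(fac for _, fac in strat)))
```

The simple random sample will usually come close to the 60/30/10 split but is not guaranteed to; the stratified sample reproduces it exactly by design.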

Sampling error and bias

A sample statistic will usually differ from the population parameter simply because the sample is not the whole population. This is sampling error. Sampling error is not the same as bias. Sampling error is natural and expected; bias is systematic and problematic.

Examples:

If 100 students are randomly selected, their average study time may differ from the true average for all students. That difference is sampling error.
If only volunteers from a psychology club are surveyed, the sample may overrepresent highly interested or high-performing students. That is bias.

A good statistical analysis aims to minimise bias and quantify sampling error.

Sampling distributions

The sampling distribution is the distribution of a statistic over many repeated samples of the same size from the same population. This is one of the most powerful ideas in inferential statistics because it tells us how much a sample statistic tends to vary.

For example, if many samples of 50 students are taken from the same population and the mean study time is computed each time, those sample means will vary. The pattern of that variation is the sampling distribution of the mean. The sample means centre on the population mean, and the standard deviation of the sampling distribution is called the standard error.

Standard error

The standard error measures how much a sample statistic is expected to fluctuate from sample to sample. It is not the same as the standard deviation.

Standard deviation: variability among individual observations.
Standard error: variability among sample statistics.

This distinction is often tested. A large sample can estimate the mean very precisely (small standard error) even when the individual observations vary widely (large standard deviation). Conversely, a small sample can produce an uncertain mean even when the individual observations are not widely spread.
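A small simulation makes the two quantities concrete. The population below is invented (a right-skewed set of weekly study hours), and the number of repeated samples is arbitrary. The sketch compares the observed spread of many sample means with the standard formula for the standard error of the mean, the population standard deviation divided by the square root of n.

```python
import random
from math import sqrt
from statistics import fmean, pstdev, stdev

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: weekly study hours, deliberately right-skewed
population = [rng.expovariate(1 / 12) for _ in range(20_000)]
sigma = pstdev(population)  # standard deviation of individual observations

n = 50  # size of each sample
sample_means = [fmean(rng.sample(population, n)) for _ in range(2_000)]

observed_se = stdev(sample_means)   # spread of the sample means
theoretical_se = sigma / sqrt(n)    # standard error formula

print(f"Population mean:        {fmean(population):.2f}")
print(f"Mean of sample means:   {fmean(sample_means):.2f}")
print(f"Population sd:          {sigma:.2f}")
print(f"Observed standard error:    {observed_se:.2f}")
print(f"Theoretical standard error: {theoretical_se:.2f}")
```

The individual observations are spread widely, yet the sample means are spread far less, and by roughly the amount the formula predicts.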

The central limit idea

Although the exact mathematical form may depend on the syllabus, the central point is that sample means tend to become more normally distributed as sample size increases, under broad conditions. This matters because it allows approximate inference about averages even when the original population is not perfectly normal.

The practical implication is that larger samples generally produce more stable estimates. In social-science research, this is one reason larger surveys are so valuable: they reduce the uncertainty around the estimated mean or proportion.
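The central limit idea can also be seen by simulation. This sketch draws from a strongly right-skewed population (an exponential distribution, chosen purely for illustration) and measures how skewed the sample means are for small and larger samples. The skewness measure used is the ordinary moment-based one, which is 0 for a symmetric distribution.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(7)  # fixed seed so the sketch is repeatable

def skewness(values):
    """Moment-based skewness: average cubed z-score (0 for a symmetric distribution)."""
    centre, spread = fmean(values), pstdev(values)
    return fmean([((v - centre) / spread) ** 3 for v in values])

def simulate_sample_means(n, reps=3_000):
    """Means of `reps` samples of size n drawn from a right-skewed population."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

skew_small = skewness(simulate_sample_means(n=2))
skew_large = skewness(simulate_sample_means(n=50))

print(f"Skewness of sample means, n = 2:  {skew_small:.2f}")
print(f"Skewness of sample means, n = 50: {skew_large:.2f}")
```

With n = 2 the sample means still inherit most of the population's skew; with n = 50 their distribution is much closer to symmetric, which is what justifies normal-based inference for means.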

Why inference is possible at all

Inference is possible because random sampling creates a known structure of error. If samples are random, the observed variation is not pure chaos; it follows probabilistic rules. That means researchers can calculate how likely their results are under a given assumption. This is the foundation of estimation and testing.

In exam language, it is helpful to distinguish:

population parameter: unknown but fixed;
sample statistic: observed and variable;
sampling distribution: theoretical distribution of the statistic.

Understanding these three layers prevents confusion later when confidence intervals and hypothesis tests appear.

4. Confidence intervals and hypothesis testing

Confidence intervals and hypothesis tests are the heart of inferential statistics in introductory social-science modules. They are two different ways of using sample information to make claims about a population. The confidence interval estimates a range of plausible values; the hypothesis test evaluates whether the sample provides enough evidence against a default claim.

Confidence intervals: estimating a plausible range

A confidence interval gives a lower and upper bound for an unknown population parameter. For example, a sample mean may be 72, with a 95% confidence interval from 68 to 76. This means the data are consistent with plausible population means in that range, given the model and method used.

The most important point is that a confidence interval expresses uncertainty. It is not a guarantee that the true value lies in the interval in this one sample. Rather, the procedure used has a long-run property: if repeated many times, 95% of such intervals would capture the true parameter.

How to interpret confidence intervals correctly

A common mistake is to say “there is a 95% probability that the population mean lies in this interval.” In the frequentist framework used in introductory courses, the population mean is fixed, so a particular computed interval either contains it or does not; the 95% describes the long-run performance of the method, not a single interval. A safer interpretation is:

We are 95% confident that the interval contains the true population parameter.
Or: Using this method, 95% of intervals constructed from repeated samples would contain the true value.
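The long-run reading can be checked by simulation. The population values below (mean 14, standard deviation 5, samples of 40) are hypothetical, and to keep the sketch short the population standard deviation is treated as known, so each interval is the sample mean plus or minus z* times sigma divided by the square root of n. In coursework the sample standard deviation and a t critical value would normally be used instead.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

TRUE_MEAN, SIGMA, N = 14.0, 5.0, 40    # hypothetical population and sample size
z_star = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence

def interval_from_new_sample():
    """Draw a fresh sample and return its 95% confidence interval for the mean."""
    sample = [rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    centre = fmean(sample)
    margin = z_star * SIGMA / sqrt(N)  # sigma treated as known (assumption)
    return centre - margin, centre + margin

intervals = [interval_from_new_sample() for _ in range(2_000)]

# Proportion of intervals that contain the fixed, true mean
coverage = sum(low <= TRUE_MEAN <= high for low, high in intervals) / len(intervals)

print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

Each individual interval either contains 14 or misses it; the 95% describes how often the procedure succeeds across repeated samples.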

The width of the interval matters:

narrow intervals suggest more precise estimates;
wide intervals suggest more uncertainty.

A confidence interval becomes narrower when:

the sample size increases;
variability decreases;
the confidence level decreases from 99% to 95%, for example.
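These three effects can be verified with a small calculator. This is a sketch of the large-sample (normal-based) interval for a mean, sample mean plus or minus z* times s divided by the square root of n; the sample values are hypothetical, and for small samples a t critical value would replace z*.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sample_sd, n, level=0.95):
    """Large-sample interval for a mean: mean ± z* · s / sqrt(n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

def width(interval):
    return interval[1] - interval[0]

base = mean_confidence_interval(72, 10, 25)                    # starting point
larger_n = mean_confidence_interval(72, 10, 100)               # four times the sample size
less_spread = mean_confidence_interval(72, 5, 25)              # half the variability
higher_level = mean_confidence_interval(72, 10, 25, level=0.99)

for label, ci in [("Base (n=25, s=10, 95%)", base), ("Larger sample (n=100)", larger_n),
                  ("Less variability (s=5)", less_spread), ("Higher confidence (99%)", higher_level)]:
    print(f"{label}: ({ci[0]:.2f}, {ci[1]:.2f}), width {width(ci):.2f}")
```

Note that quadrupling the sample size only halves the width, because the standard error depends on the square root of n.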

Hypothesis testing: logic and structure

Hypothesis testing begins with two competing statements:

Null hypothesis (\(H_0\)): the default assumption, usually no difference or no effect.
Alternative hypothesis (\(H_1\) or \(H_a\)): the claim that there is a difference, effect, or relationship.

For example, if we test whether average study time differs from 10 hours per week:

\(H_0: \mu = 10\)
\(H_1: \mu \neq 10\)

The null hypothesis is not usually “proven.” Instead, one either rejects it or fails to reject it, based on the evidence.

The p-value

The p-value is the probability of obtaining a result at least as extreme as the observed one, assuming the null hypothesis is true. This is a crucial but often misunderstood concept.

A small p-value means the observed data would be unusual if the null were true. This provides evidence against the null hypothesis.

A large p-value means the observed data are not surprising under the null, so there is not enough evidence to reject it.

Common significance levels are:

0.05
0.01
0.10

If \(p < 0.05\), the result is often described as statistically significant at the 5% level. This does not mean the result is important in a social sense. It means the observed pattern would be unlikely under the null hypothesis alone.
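A p-value for the study-time hypotheses above can be sketched as follows. The sample figures (mean 10.8 hours, standard deviation 4, n = 100) are invented, and a normal approximation is used in place of the t distribution to keep the example within the standard library; with a sample this large the two give very similar answers.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sample_sd, n):
    """Return the test statistic and two-sided p-value (normal approximation)."""
    standard_error = sample_sd / sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # Probability of a result at least this extreme, in either direction, if H0 is true
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical sample, testing H0: mu = 10 against H1: mu != 10
z, p = two_sided_p_value(sample_mean=10.8, null_mean=10, sample_sd=4, n=100)

print(f"z = {z:.2f}, p = {p:.4f}")
print("Reject H0 at the 5% level" if p < 0.05 else "Fail to reject H0 at the 5% level")
```

Here the sample mean is two standard errors above 10, which gives a p-value just under 0.05. A difference of 0.8 hours is statistically significant in this sample, but whether it matters substantively is a separate question.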

Type I and Type II errors

Hypothesis testing involves error risk.

Type I error: rejecting a true null hypothesis.
Type II error: failing to reject a false null hypothesis.

The significance level \(\alpha\) is the probability of a Type I error. If \(\alpha = 0.05\), then among tests in which the null hypothesis is actually true, about 5% will incorrectly reject it in the long run, provided the method's assumptions hold.

A lower \(\alpha\) reduces Type I error but can increase Type II error, making it harder to detect real effects. This trade-off matters in social sciences where false claims can be costly, but missed effects can also be important.
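Both error types, and the trade-off between them, can be illustrated by simulation. All values here are hypothetical: a null mean of 10 hours, a known population standard deviation of 4, samples of 30, and a "real" alternative mean of 11.5. A two-sided z-test with known sigma is used only to keep the sketch short.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

rng = random.Random(11)  # fixed seed so the sketch is repeatable

MU0, SIGMA, N = 10.0, 4.0, 30  # hypothetical null mean, population sd, sample size

def rejects_null(true_mean, alpha):
    """Draw one sample from a population with `true_mean` and test H0: mu = MU0."""
    sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
    z = (fmean(sample) - MU0) / (SIGMA / sqrt(N))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

def rejection_rate(true_mean, alpha, reps=4_000):
    return sum(rejects_null(true_mean, alpha) for _ in range(reps)) / reps

# H0 is true: every rejection is a Type I error
type_i_rate = rejection_rate(true_mean=MU0, alpha=0.05)

# H0 is false: every failure to reject is a Type II error
type_ii_default = 1 - rejection_rate(true_mean=11.5, alpha=0.05)
type_ii_strict = 1 - rejection_rate(true_mean=11.5, alpha=0.01)

print(f"Type I error rate at alpha = 0.05:  {type_i_rate:.3f}")
print(f"Type II error rate at alpha = 0.05: {type_ii_default:.3f}")
print(f"Type II error rate at alpha = 0.01: {type_ii_strict:.3f}")
```

The Type I rate sits close to the chosen alpha, while tightening alpha from 0.05 to 0.01 makes the test miss the real effect more often.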

One-tailed and two-tailed tests

A two-tailed test looks for any difference in either direction. A one-tailed test looks only for a difference in one specified direction.

Example:

Two-tailed: test whether mean anxiety differs between two groups.
One-tailed: test whether group A has higher anxiety than group B.

Two-tailed tests are more conservative and are often the safer default unless there is a strong theoretical reason to specify a direction in advance. In exam answers, students should state clearly whether the test is one-tailed or two-tailed and why.
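The practical difference shows up when the same test statistic is evaluated both ways. The statistic z = 1.8 below is an invented value, and a normal reference distribution is assumed.

```python
from statistics import NormalDist

def p_values(z):
    """One-tailed (upper) and two-tailed p-values for a z statistic."""
    upper_tail = 1 - NormalDist().cdf(z)             # H1: group A is higher
    two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))  # H1: the groups differ
    return upper_tail, two_tailed

one_tailed, two_tailed = p_values(1.8)  # hypothetical test statistic

print(f"One-tailed p = {one_tailed:.4f}")
print(f"Two-tailed p = {two_tailed:.4f}")
```

The same data are significant at the 5% level under the one-tailed test but not under the two-tailed test, which is why the direction must be justified before the data are seen rather than chosen afterwards.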

Common test procedures in introductory social science statistics

Depending on the exact syllabus, students may encounter:

One-sample t-test: compares a sample mean to a known or hypothesised value.
Independent samples t-test: compares means of two independent groups.
Paired samples t-test: compares two measurements from the same individuals or matched pairs.
Chi-square test: evaluates association between categorical variables or goodness-of-fit.
Correlation test: examines the strength and direction of linear relationship between two quantitative variables.

The key is not merely to know the names, but to know when each applies.

Interpreting significance in context

An exam response should connect the statistical result to the substantive question. Suppose an independent samples t-test shows a significant difference in exam scores between two study methods, with Method A producing a higher mean. The correct interpretation is not just “p < 0.05.” It should say that there is evidence of a difference in average exam performance between the methods, with Method A higher in the sample, and that the result is unlikely to be due to random sampling variation alone if the test assumptions are met.

But interpretation must remain cautious:

A significant difference does not prove causation unless the design supports causal inference.
A non-significant result does not prove no difference exists; it may simply mean the study lacked sufficient power or the effect is small.

Effect size and practical significance

Statistical significance is not the same as practical significance. A very small effect can become statistically significant in a large sample. Conversely, a meaningful effect may fail to reach significance in a small sample.

Effect size indicates the magnitude of a relationship or difference. In social sciences, magnitude matters because policy, education, and intervention decisions depend not just on whether an effect exists, but on whether it is large enough to matter in practice.

Students should therefore ask:

How big is the difference?
Is it meaningful in context?
Is the confidence interval narrow enough to support a clear conclusion?
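The gap between statistical and practical significance can be shown with one common effect-size measure, Cohen's d (the difference in means divided by the pooled standard deviation). The notes do not name a specific measure, so d is used here as an assumption, and the group figures are invented. The p-value uses a normal approximation for a two-sample comparison.

```python
from math import sqrt
from statistics import NormalDist

def cohens_d(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)

def two_sample_p(mean_a, mean_b, sd_a, sd_b, n_a, n_b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    standard_error = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = (mean_a - mean_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same 1-mark difference and same spread, but very different sample sizes
for n in (50, 5_000):
    d = cohens_d(66, 65, 10, 10, n, n)
    p = two_sample_p(66, 65, 10, 10, n, n)
    print(f"n = {n} per group: d = {d:.2f}, p = {p:.4g}")
```

The effect size is identical in both cases (one tenth of a standard deviation), but only the large study reaches significance. The p-value reflects sample size as well as magnitude; the effect size reflects magnitude alone.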

A worked example of interpretation

Imagine a survey of 120 students finds that the average weekly study time is 14 hours, with a 95% confidence interval from 13 to 15 hours. The interval is narrow, so the estimate is quite precise. Suppose a two-tailed hypothesis test compares the mean study time to 12 hours. Because 12 lies well outside the interval, the p-value must be far below 0.05 (for these figures, roughly \(p < 0.001\)), so the null hypothesis that the mean equals 12 hours is rejected at the 5% level. The conclusion would be that there is evidence that the average study time is different from 12 hours, and the estimated mean is higher in the sample. The combination of the confidence interval and the test result strengthens the interpretation: both suggest the true average is likely above 12 hours.

5. Relationships, comparing groups, and exam technique

The final major area in introductory statistics is the analysis of relationships and differences in social-science data. Much of the work here involves deciding whether two variables move together, whether one group differs from another, or whether a categorical distribution departs from expectation. This section also covers how to answer exam questions clearly and avoid common mistakes.

Correlation and association

Correlation measures the strength and direction of a linear relationship between two quantitative variables. The correlation coefficient, usually denoted r, ranges from -1 to +1.

r = +1: perfect positive linear relationship.
r = -1: perfect negative linear relationship.
r = 0: no linear relationship.

A positive correlation means that as one variable increases, the other tends to increase too. A negative correlation means that as one increases, the other tends to decrease.

Examples in social sciences:

study hours and exam marks may be positively correlated;
stress levels and sleep duration may be negatively correlated;
age and political preference may show a pattern, depending on context.
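
The coefficient can be computed directly from its definition. The sketch below is a minimal illustration using invented data for five students; the variable names and values are hypothetical, and the helper `pearson_r` is written out by hand rather than taken from any course software.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for five students.
study_hours = [2, 4, 6, 8, 10]
exam_marks = [50, 55, 65, 70, 80]
stress_score = [3, 4, 5, 7, 9]
sleep_hours = [8, 7, 7, 6, 5]

r_study = pearson_r(study_hours, exam_marks)     # expected to be positive
r_stress = pearson_r(stress_score, sleep_hours)  # expected to be negative

print(f"study hours vs exam marks: r = {r_study:.2f}")
print(f"stress vs sleep:           r = {r_stress:.2f}")
```

The sign of r gives the direction and its distance from zero gives the strength of the linear pattern. Neither says anything about which variable, if either, is the cause.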

Important limitations of correlation

Correlation does not imply causation. This is one of the most important rules in the module. Even a strong association does not tell us that one variable causes the other. There may be:

reverse causality;
a third variable causing both;
selection effects;
measurement issues.

For example, if students who sleep more perform better, it does not automatically mean sleep alone caused the performance difference. Better-prepared students may both sleep more and study more effectively, or lower stress may improve both sleep and performance. Correlation identifies a relationship; causal claims require stronger design and reasoning.

Scatterplots and interpretation

Scatterplots are essential for correlation because they show the raw relationship. A numerical coefficient alone can hide important features.

A good scatterplot interpretation should consider:

direction;
form;
strength;
outliers;
possible clusters.

For example, two subgroups may produce an overall correlation that looks strong, even though within each subgroup the relationship is weak. This can happen when a third variable, such as group membership, shifts both variables at once.
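
The subgroup effect described above can be reproduced with a small constructed dataset. The sketch below uses invented points for two hypothetical groups. Within each group the two variables are unrelated, but one group sits higher on both axes.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from deviations about the means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical (x, y) points: each group forms a square, so x and y are
# unrelated within a group, but group B is shifted up on both variables.
group_a = [(1, 1), (1, 3), (3, 1), (3, 3)]
group_b = [(11, 11), (11, 13), (13, 11), (13, 13)]

def r_for(points):
    xs, ys = zip(*points)
    return pearson_r(xs, ys)

r_a = r_for(group_a)
r_b = r_for(group_b)
r_pooled = r_for(group_a + group_b)

print(f"within group A: r = {r_a:.2f}")
print(f"within group B: r = {r_b:.2f}")
print(f"pooled data:    r = {r_pooled:.2f}")
```

A scatterplot of these points would show two separate clusters, which is exactly the feature a single pooled coefficient hides.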

Comparing two means

Comparisons between two groups are common in social-science research. Examples include:

males vs females;
intervention vs control;
urban vs rural;
first-year vs senior students.

The independent samples t-test is used when the two groups are separate. The paired samples t-test is used when the same people are measured twice, such as before and after an intervention.

Independent samples t-test

Use this when the observations in one group are not naturally linked to the other group. The central question is whether the two population means differ.
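
The test statistic behind this comparison can be sketched as follows. This version uses the Welch form, which does not assume equal variances. That choice is an assumption, since the notes do not say whether the course uses the pooled or the Welch version. The scores are invented.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return (t, df) for two independent groups without assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean.
    va = variance(group_a) / n_a
    vb = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df

# Hypothetical exam scores for two separate groups of students.
method_a = [62, 70, 68, 75, 66, 71]
method_b = [58, 64, 61, 69, 60, 63]

t_stat, df = welch_t(method_a, method_b)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The statistic measures the gap between the two sample means in units of its standard error. The p-value would then be read from a t-distribution with the computed degrees of freedom, usually via software or tables.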

Paired samples t-test

Use this when data are matched or repeated on the same individuals. The analysis focuses on the differences within pairs, which often reduces noise and makes the test more powerful.

A before-and-after example:

Students’ anxiety scores before a study-skills workshop.
The same students’ anxiety scores after the workshop.

The pairing matters because each person serves as their own control.
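
To see why pairing can make the test more powerful, the sketch below analyses the same invented before-and-after scores in two ways: correctly, as paired differences, and incorrectly, as if the two sets of scores came from separate groups. All values are hypothetical.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical anxiety scores for the same six students.
before = [30, 25, 28, 35, 22, 27]
after = [27, 23, 27, 31, 20, 24]

# Paired analysis: work with each student's own change.
diffs = [a - b for a, b in zip(after, before)]
se_paired = stdev(diffs) / math.sqrt(len(diffs))
t_paired = mean(diffs) / se_paired

# Unpaired analysis of the same numbers (wrong for this design): the large
# differences between students stay in the standard error as noise.
se_unpaired = math.sqrt(variance(before) / len(before) + variance(after) / len(after))
t_unpaired = (mean(after) - mean(before)) / se_unpaired

print(f"mean change = {mean(diffs):.2f}")
print(f"paired t    = {t_paired:.2f}")
print(f"unpaired t  = {t_unpaired:.2f}")
```

The mean change is the same in both analyses. The paired version gives a much larger test statistic because differences between students cancel out when each person is compared with themselves.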

Chi-square tests and categorical data

When the variables are categorical, chi-square methods become relevant.

Goodness-of-fit

A goodness-of-fit test checks whether the observed frequencies in categories differ from what would be expected under a specified distribution.

For example, if students are expected to be spread evenly across four preference categories but the observed counts are uneven, a goodness-of-fit test assesses whether the discrepancy is larger than random variation alone would plausibly produce.
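
The calculation for an even split across four categories can be sketched as follows. The observed counts are invented, and the 5% critical value for 3 degrees of freedom (7.815) is taken from a standard chi-square table rather than computed.

```python
# Hypothetical observed counts for four preference categories (n = 100).
observed = [40, 25, 20, 15]

total = sum(observed)
k = len(observed)
expected = [total / k] * k  # an even split: 25 per category

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

CRITICAL_5PCT_DF3 = 7.815  # from a standard chi-square table
reject_h0 = chi_square > CRITICAL_5PCT_DF3

print(f"chi-square = {chi_square:.1f}, df = {df}")
print(f"reject H0 at 5%: {reject_h0}")
```

Each category contributes according to how far its count is from the expected count, relative to that expected count. Categories close to expectation add almost nothing to the statistic.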

Test of independence

A chi-square test of independence examines whether two categorical variables are associated. For example:

faculty and preferred study method;
gender and response to a yes/no policy question;
residence status and attendance pattern.

The null hypothesis is that the variables are independent. A significant result suggests an association, but again not causation.
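
The expected counts under independence, and the resulting statistic, can be sketched for a 2 × 2 table. The counts and category labels below are invented, the 5% critical value for 1 degree of freedom (3.841) comes from a standard table, and no continuity correction is applied, although some software applies one by default for 2 × 2 tables.

```python
# Hypothetical counts: rows = residence status, columns = attendance pattern.
observed = [
    [30, 20],  # in residence: regular, irregular
    [20, 30],  # commuting:    regular, irregular
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Under independence: expected = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(row_totals))
    for j in range(len(col_totals))
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

CRITICAL_5PCT_DF1 = 3.841  # from a standard chi-square table
smallest_expected = min(min(row) for row in expected)

print(f"chi-square = {chi_square:.1f}, df = {df}")
print(f"reject H0 at 5%: {chi_square > CRITICAL_5PCT_DF1}")
print(f"smallest expected count: {smallest_expected:.0f}")
```

Checking the smallest expected count is a quick way to see whether the usual rule of thumb about adequate expected cell counts is met before trusting the result.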

Reading statistical output carefully

Students often lose marks by reading software output mechanically. A disciplined interpretation should proceed in this order:

Identify the variables and the question.
Determine the appropriate test.
State the null and alternative hypotheses.
Check the p-value and compare it to the significance level α.
Make the decision to reject or fail to reject H0.
Translate the decision into context.
Mention limitations or assumptions if relevant.
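
The decision step in this sequence is mechanical and can be written as a small rule. The sketch below is illustrative only. The function name `decide` is hypothetical, and it uses the convention of rejecting when the p-value is strictly below the significance level.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p is below alpha."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"

print(decide(0.03))              # p below 0.05
print(decide(0.20))              # p above 0.05
print(decide(0.03, alpha=0.01))  # same p, stricter significance level
```

The rule never returns "accept H0". The translation into context and the comment on assumptions cannot be automated in this way and still have to be written out.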

This structure works for most exam questions.

Assumptions and why they matter

Statistical tests rely on assumptions. These should not be memorised as empty labels. They exist because the formulas are derived under conditions that make probability statements valid.

Common assumptions include:

independence of observations;
approximate normality for certain tests;
homogeneity of variance for some group comparisons;
adequate expected cell counts for chi-square tests.

If assumptions are violated seriously, results may be misleading. For example, if a t-test is applied to highly skewed data with extreme outliers and a very small sample, the p-value may not be trustworthy. In an exam, it is wise to mention assumptions whenever they are relevant, even briefly.

Common mistakes students make

The following errors appear frequently in introductory statistics exams:

confusing standard deviation with standard error;
interpreting p-values as the probability that the null hypothesis is true;
treating correlation as causation;
using the mean for a strongly skewed variable with outliers without comment;
forgetting that a non-significant result does not prove no effect;
choosing the wrong test for the variable type;
misreading a 95% confidence interval as a 95% probability that the parameter lies in this one calculated interval;
ignoring whether groups are independent or paired.

Avoiding these errors often raises marks more than learning an additional formula.
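
The first error in the list is easy to avoid once the two quantities are computed side by side. The sketch below uses invented marks. The standard deviation describes the spread of individual scores, while the standard error describes the uncertainty in the sample mean.

```python
import math
from statistics import mean, stdev

# Hypothetical marks for eight students.
marks = [52, 61, 58, 70, 64, 55, 67, 73]

n = len(marks)
sd = stdev(marks)       # spread of individual marks around the mean
se = sd / math.sqrt(n)  # uncertainty in the sample mean: SD / sqrt(n)

# With the same spread but 100 students, the mean would be estimated
# more precisely, even though individual marks vary just as much.
se_if_n_100 = sd / math.sqrt(100)

print(f"mean = {mean(marks):.1f}")
print(f"standard deviation = {sd:.2f}")
print(f"standard error (n = {n}) = {se:.2f}")
print(f"standard error (n = 100) = {se_if_n_100:.2f}")
```

The standard error shrinks as the sample grows, while the standard deviation does not systematically change. That difference is the quickest way to tell the two apart in an exam.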

How to write strong exam answers

A strong answer is not just computationally correct. It is logically complete and context-sensitive. A good response often has the following features:

correct identification of the statistical method;
clear notation of hypotheses;
accurate interpretation of sample statistics;
correct decision based on the p-value or confidence interval;
explicit mention of the social-science meaning;
no overclaiming beyond the data.

A short example of good style:

The null hypothesis is that there is no difference in mean stress levels between the two groups. Since the p-value is 0.03, which is less than 0.05, the null hypothesis is rejected. There is evidence of a difference in mean stress levels, with the treatment group scoring lower in the sample. However, this result shows a difference between the groups, not necessarily a causal effect of the treatment, unless the design is a randomised experiment.

That answer earns credit because it is statistically accurate and carefully framed.

Final consolidation: what to remember for revision

For revision, the most important idea is that Statistics 186 is about reasoning under uncertainty. Every topic in the course supports that central goal.

Descriptive statistics organise data.
Probability quantifies uncertainty.
Sampling explains why estimates vary.
Confidence intervals quantify plausible values.
Hypothesis tests evaluate evidence against a default claim.
Correlation and group comparison help answer social-science questions.
Proper interpretation prevents overstatement.

A student who can connect these ideas will be well prepared for both calculations and theory questions. The exam usually rewards those who can move fluently from raw data to summary, from summary to inference, and from inference back to the research question. That is the real skill of introductory statistics in the social sciences: not just computing answers, but thinking clearly with data.

Compact revision table

Concept	Core question	Main output	Common interpretation
Descriptive statistics	What does the sample look like?	Mean, median, SD, graphs	Summarise centre, spread, shape
Probability	How likely is an event?	Probability value	Quantify uncertainty
Sampling distribution	How do statistics vary across samples?	Standard error	Explain inferential variability
Confidence interval	What values are plausible for the population parameter?	Lower and upper bounds	Give a range of plausible values
Hypothesis test	Is the sample evidence strong enough against H0?	p-value, test statistic	Decide whether to reject H0
Correlation	Do two quantitative variables move together?	r value	Describe direction and strength
Chi-square	Are categorical variables related?	Chi-square statistic, p-value	Assess association or fit

Last-minute exam checklist

Before submitting an answer, check:

Have the variables been identified correctly?
Is the test appropriate for the data type?
Are the hypotheses written in the right form?
Is the decision consistent with the p-value or confidence interval?
Is the interpretation stated in context?
Have you avoided saying “prove” when “suggest” or “provide evidence” is better?
Have you distinguished statistical significance from practical importance?
Have you avoided claiming causation without a design that supports it?

If these points are handled well, the answer is usually on solid ground.

Statistics 186 is one of the most valuable foundation modules in the social sciences because it trains students to be careful readers, careful thinkers, and careful interpreters of evidence. The same discipline that helps with exam questions also helps with later coursework, research projects, and any professional environment where data are used to support decisions.