Guide: How to Choose a Test

Which calculator should I use?

Answer the two questions below and follow the arrow to the right page.

1. What kind of variable are you analysing?

A count or category ("yes / no", "infected / not", "vaccinated / placebo")

Usually a 2×2 table (two variables, two categories each). Decide by sample size:

Large sample, all expected counts ≥ 5: Chi-Square Calculator for a single test, or Compare to see χ², Yates-corrected χ², Fisher's exact, and the Z-test for proportions side by side.
Small sample or sparse table: Compare. It flags a violation of Cochran's rule (an expected count below 5) and shows Fisher's exact as the recommended alternative.
More than 2×2 (e.g., 3×4): Chi-Square Calculator.
Diagnostic test (sensitivity, specificity, PPV/NPV) or cohort study (RR, OR, NNT): Epi 2×2.
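
If you want to check the "all expected counts ≥ 5" condition by hand before picking a page, the sketch below shows the arithmetic. It is a minimal illustration, not the app's own code; the function names and the second table are made up for the example.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def meets_cochran_rule(table, minimum=5.0):
    """True when every expected count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


sparse = [[8, 2], [1, 5]]      # the small-N example from walkthrough B
roomy = [[30, 20], [15, 35]]   # made-up counts, for illustration only

print(expected_counts(sparse))     # [[5.625, 4.375], [3.375, 2.625]]
print(meets_cochran_rule(sparse))  # False
print(meets_cochran_rule(roomy))   # True
```

Note that the rule is about expected counts, not observed ones: the sparse table has an observed 8 in one cell, yet three of its four expected counts fall below 5.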

A number ("systolic BP", "cholesterol", "body temperature")

Decide by your comparison structure:

One sample vs a known reference value (μ₀): t Calculator → one-sample mode. Outputs a 95% CI for the mean.
Paired observations (before/after on the same subjects): t Calculator → paired mode.
Two independent groups (treatment vs control): t Calculator → Welch's mode.
You want to verify the t-test's conclusion with a non-parametric method: Simulate → bootstrap CI for one sample or permutation p-value for two samples.
You need a left/right/two-tail probability for a Z-score or a tabled value: Z Calculator.
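
For reference, the left, right, and two-tail probabilities for a Z-score can be reproduced with Python's standard library. This is a sketch of the underlying calculation only; the helper name is hypothetical and says nothing about how the Z Calculator is implemented.

```python
from statistics import NormalDist


def z_tail_probabilities(z):
    """Left-tail, right-tail, and two-tail probabilities under the standard normal."""
    left = NormalDist().cdf(z)
    right = 1.0 - left
    two_tail = 2.0 * min(left, right)
    return left, right, two_tail


left, right, two_tail = z_tail_probabilities(1.96)
print(round(left, 4), round(right, 4), round(two_tail, 4))  # 0.975 0.025 0.05

# The reverse lookup (probability -> Z) uses the inverse CDF.
print(round(NormalDist().inv_cdf(0.975), 2))  # 1.96
```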

Not sure if your data are OK for a parametric test

Start at the Assumption Coach. It gives you a Q-Q plot, skewness + kurtosis, a Jarque-Bera normality test, and a traffic-light recommendation. Green = t-test fine; yellow = use caution; red = prefer a non-parametric or bootstrap approach.

You want to run a canonical biostats example

Open the Curated Datasets library. Every entry has one-click "Load into X" buttons that send the data to the right calculator with the context already explained.

Quick-start walkthroughs

A. One-sample t-test from raw data

Open the t Calculator.
Pick the mode "One-sample" at the top.
Enter the hypothesised mean μ₀ and paste your raw data (comma, space, or newline separated).
Click "Calculate t Test Results."
Read the result in this order:
1. The 95% CI for the mean (green banner at top).
2. Whether that CI includes μ₀ or not.
3. The p-values table, which should agree with the CI: a two-tailed p below 0.05 goes with a 95% CI that excludes μ₀.
4. (Optional) expand "Show calculation steps" for the LaTeX walk-through.

Try it: load the Body Temperature dataset, which tests whether 98.6°F really is the "normal" mean body temperature (spoiler: Mackowiak et al., 1992, say no).
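
The reading order above works because the CI and the two-tailed test are two views of the same calculation. The sketch below makes that concrete. The temperatures are made up (they are not the Body Temperature dataset), the function name is hypothetical, and the critical value 2.262 is the two-sided 95% value of t for 9 degrees of freedom taken from a standard table, since Python's standard library has no t distribution.

```python
import math
from statistics import mean, stdev


def one_sample_t(data, mu0, t_crit):
    """Return the t statistic and the CI implied by the critical value t_crit."""
    n = len(data)
    m = mean(data)
    se = stdev(data) / math.sqrt(n)   # standard error of the mean
    t_stat = (m - mu0) / se
    ci = (m - t_crit * se, m + t_crit * se)
    return t_stat, ci


# Made-up readings in degrees Fahrenheit, for illustration only.
temps = [98.0, 98.4, 97.9, 98.2, 98.6, 98.1, 98.3, 97.8, 98.5, 98.2]
MU0 = 98.6
T_CRIT_DF9 = 2.262  # two-sided 95% critical value for df = n - 1 = 9

t_stat, (lo, hi) = one_sample_t(temps, MU0, T_CRIT_DF9)
rejects_h0 = abs(t_stat) > T_CRIT_DF9
ci_excludes_mu0 = not (lo <= MU0 <= hi)

print(f"t = {t_stat:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
print(rejects_h0, ci_excludes_mu0)  # the two answers always match
```

Both checks compare the same quantity, the distance between the sample mean and μ₀ in standard-error units, against the same critical value, which is why steps 2 and 3 cannot disagree.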

B. Comparing three tests on a 2×2 table

Open Compare.
Enter your 2×2 counts and your α level.
Click "Run All Three Tests."
Read the cards:
- χ² (Pearson): the classical large-sample test.
- Yates-corrected χ²: more conservative for small tables.
- Fisher's exact: the exact reference for small or sparse tables.
- Z for two proportions: z² matches the uncorrected χ² exactly.
The "Agreement & Divergence" panel below the cards flags anything worth discussing.

Try it: the "small-N example" button loads [[8, 2], [1, 5]], a case where Yates says "fail to reject" while χ², Fisher's, and Z all say "reject" — the exact pedagogical moment this page is designed around.
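
You can reproduce the small-N example outside the app with the sketch below, which uses only the standard library. It assumes the textbook formulas: Pearson's χ², Yates' continuity correction, a pooled two-proportion Z, and a two-sided Fisher's exact p defined as the sum of all table probabilities no larger than the observed one. The app may define the two-sided Fisher p differently, and the function names here are hypothetical.

```python
import math


def chi2_p_df1(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def compare_2x2(a, b, c, d):
    """Return statistics and p-values for four tests on the table [[a, b], [c, d]]."""
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    denom = r1 * r2 * c1 * c2
    diff = a * d - b * c

    chi2 = n * diff ** 2 / denom
    yates = n * max(abs(diff) - n / 2.0, 0.0) ** 2 / denom

    # Z-test for two proportions with a pooled standard error.
    pooled = c1 / n
    z = (a / r1 - c / r2) / math.sqrt(pooled * (1 - pooled) * (1 / r1 + 1 / r2))

    # Fisher's exact (two-sided): sum the hypergeometric probabilities of every
    # table with the same margins that is no more likely than the observed one.
    def prob(k):
        return math.comb(r1, k) * math.comb(r2, c1 - k) / math.comb(n, c1)

    observed = prob(a)
    support = range(max(0, c1 - r2), min(r1, c1) + 1)
    fisher = sum(prob(k) for k in support if prob(k) <= observed * (1 + 1e-9))

    return {
        "chi2": chi2,
        "z": z,
        "p_chi2": chi2_p_df1(chi2),
        "p_yates": chi2_p_df1(yates),
        "p_fisher": fisher,
        "p_z": math.erfc(abs(z) / math.sqrt(2.0)),
    }


result = compare_2x2(8, 2, 1, 5)
for name in ("p_chi2", "p_yates", "p_fisher", "p_z"):
    verdict = "reject" if result[name] < 0.05 else "fail to reject"
    print(name, round(result[name], 4), verdict)
```

With these formulas the uncorrected χ² and Z give p of about 0.013, Fisher's exact about 0.035, and Yates about 0.051, so only Yates lands on the "fail to reject" side of α = 0.05. The value of z squared also equals the uncorrected χ² up to floating-point rounding.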

C. Simulation-based inference

Open Simulate.
Choose a mode:
- Bootstrap for a CI for the mean of a single sample.
- Permutation for a p-value comparing two samples' means.
Paste your raw data and click "Run Simulation." Expect ~40 ms for 10,000 resamples.
Read both the simulation result AND the agreement panel — this is where the app shows whether the formula-based t-test and the simulation-based result agree. Divergence is informative about your data.

Try it: the "Cholesterol" dataset has a clear two-group difference. Run it in permutation mode and compare the permutation p to Welch's p on the same data.
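
To see what the two modes compute, here is a minimal sketch of a percentile bootstrap CI and a two-sided permutation test on the difference in means. The choice of the percentile method, the function names, the seeds, and the two small groups are all assumptions made for illustration; the Simulate page may use a different bootstrap variant, and the numbers are not the Cholesterol dataset.

```python
import random
from statistics import fmean


def bootstrap_ci(data, resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap CI for the mean of one sample."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, record each resample's mean, then sort.
    means = sorted(fmean(rng.choices(data, k=n)) for _ in range(resamples))
    alpha = 1.0 - level
    lo = means[round(alpha / 2 * resamples)]
    hi = means[round((1 - alpha / 2) * resamples) - 1]
    return lo, hi


def permutation_p(x, y, resamples=10_000, seed=0):
    """Two-sided permutation p-value for the difference in means."""
    rng = random.Random(seed)
    observed = abs(fmean(x) - fmean(y))
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(resamples):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(fmean(pooled[:len(x)]) - fmean(pooled[len(x):]))
        if diff >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (resamples + 1)


# Made-up values for two clearly separated groups, for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.2, 5.7]
group_b = [6.4, 6.9, 6.1, 7.2, 6.8, 6.5, 7.0, 6.6]

print(bootstrap_ci(group_a))
print(permutation_p(group_a, group_b))
```

Fixing the seed makes a run repeatable, which is handy when students need to compare answers; change it to see how much the simulated result moves from run to run.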

D. Check assumptions before a t-test

Open the Assumption Coach.
Paste the same raw data you would feed into the t-calculator.
Click "Check My Data."
Read the verdict card:
- Green: t-test is fine.
- Yellow: t-test may still work; cross-check with Simulate for a robustness read.
- Red: prefer a non-parametric test (Wilcoxon in jamovi/SPSS) or use Simulate instead.
Regardless of colour, look at the Q-Q plot. The Jarque-Bera p-value rests on a large-sample approximation, so it is unreliable for small n; the plot rarely lies.
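
For the curious, the Jarque-Bera statistic combines skewness and excess kurtosis into one number and refers it to a χ² distribution with 2 degrees of freedom. The sketch below shows the standard formula; the function name and sample values are made up, and the app's traffic-light thresholds are not reproduced here.

```python
import math


def jarque_bera(data):
    """Return (skewness, excess kurtosis, JB statistic, asymptotic p-value)."""
    n = len(data)
    m = sum(data) / n
    # Central moments with divisor n, as in the usual JB definition.
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    m4 = sum((x - m) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Upper tail of chi-square with 2 degrees of freedom is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return skew, excess_kurt, jb, p_value


symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]      # made-up, perfectly symmetric
right_skewed = [1, 1, 2, 2, 3, 3, 4, 5, 12]  # made-up, one large value

print(jarque_bera(symmetric))
print(jarque_bera(right_skewed))
```

The p-value here is the asymptotic one, which is exactly the approximation that becomes shaky at small n.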

Instructor notes

Pre-loading a dataset in an assignment

Every "Load into X" click from the Curated Datasets page works by writing the dataset to the browser's session storage and then navigating. That's great for self-study, but if you want to send students directly to a pre-populated calculator in an assignment, use this pattern:

Open the datasets page yourself.
Open the browser console and run ZtChi.datasets.loadInto('body-temperature', 't') (replace the two strings with the dataset id and target calculator).
You land on the calculator with the dataset loaded. To get a link you can share, click "Preview report text" under any Copy APA button and copy the URL from your address bar. The URL identifies the calculator page but does not carry the data, because the data lives in sessionStorage, which is scoped to the tab.

Roadmap note: a true "permalink with data baked in" is planned; until then, share the dataset id and have students click through once.

Embedding a calculator in an LMS (Canvas, Blackboard)

Append ?embed=1 to any calculator URL to hide the nav and footer. The page then fits cleanly in an LMS iframe:

<iframe src="https://your-hosted-url/t_calculator.html?embed=1"
        title="t Calculator"
        width="100%" height="900" style="border: 0;"></iframe>

Students can still interact with the calculator normally; they just don't see the site chrome.

Learning Mode

The checkbox labelled "Learning Mode" at the top of the t-calculator and chi-square calculator turns on an active-learning cue: clicking Calculate prompts students to commit to a prediction ("reject H₀" or "fail to reject") before they see the result. This is a lighter instance of the pretesting (attempt-before-feedback) paradigm studied by Kornell, Hays & Bjork (2009) and Richland, Kornell & Kao (2009); their protocols used generative retrieval (recalling a word pair, for example) rather than a binary guess, so treat this as plausibly helpful rather than empirically proven for this specific form. Encourage students to turn it on in the first few weeks.

Print-friendly homework submissions

Every calculator has a print stylesheet that hides the buttons, self-check cards, predict dialogs, and site chrome. Students who print their result page to PDF get a clean one-page homework deliverable with the show-work steps rendered.

Reference card: what each page does

Page	What it does	Best for
Z Calculator	Tail probabilities under the standard normal; probability → Z lookup	Standard normal table replacement
t Calculator	One-sample / paired / Welch's / direct-stat t-test with 95% CIs	Any t-test against a raw-data sample
Chi-Square Calculator	χ² test of independence for r×c tables	A single χ² test, including tables larger than 2×2
Compare	χ², Yates, Fisher, Z-for-proportions all at once on a 2×2	Teaching the "which test when" decision
Simulate	Bootstrap CIs and permutation p-values in a Web Worker	Verifying a formula-based result or handling non-normal data
Epi 2×2	Sens/spec/PPV/NPV/LR±/RR/OR/NNT with CIs	Diagnostic tests or cohort studies
Datasets	Library of 6 classic/illustrative biostats datasets	Lecture demos and assignment seed data
Assumption Coach	Q-Q plot, skewness/kurtosis, JB test, outliers, traffic-light verdict	Deciding whether a t-test is appropriate
Guide	This page.	Finding the right tool
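
As a quick reference for the abbreviations in the Epi 2×2 row, the sketch below computes the point estimates from raw counts using the standard definitions. The cell layout, function names, and counts are assumptions for illustration; the confidence intervals and likelihood ratios that the page also reports are left out.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Point estimates for a diagnostic test from true/false positives/negatives."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased subjects who test positive
        "specificity": tn / (tn + fp),  # healthy subjects who test negative
        "ppv": tp / (tp + fp),          # positive tests that are true cases
        "npv": tn / (tn + fn),          # negative tests that are truly healthy
    }


def cohort_measures(exp_events, exp_none, unexp_events, unexp_none):
    """Risk ratio, odds ratio, and NNT for exposed vs unexposed groups."""
    risk_exposed = exp_events / (exp_events + exp_none)
    risk_unexposed = unexp_events / (unexp_events + unexp_none)
    risk_difference = risk_exposed - risk_unexposed
    return {
        "rr": risk_exposed / risk_unexposed,
        "or": (exp_events * unexp_none) / (exp_none * unexp_events),
        # NNT is undefined when the two risks are equal (division by zero).
        "nnt": 1.0 / abs(risk_difference),
    }


# Made-up counts, for illustration only.
print(diagnostic_measures(tp=90, fp=20, fn=10, tn=180))
print(cohort_measures(exp_events=20, exp_none=80, unexp_events=40, unexp_none=60))
```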

Troubleshooting

The math in "Show calculation steps" shows as raw LaTeX (e.g., \frac{a}{b}) instead of rendered math.

MathJax loads asynchronously from a CDN. If the CDN is blocked (corporate firewall, no network connection) or slow to load, the raw LaTeX shows. Refresh the page once the network is available. The Subresource Integrity (SRI) hash on the <script> tag ensures the browser runs MathJax only if the downloaded file matches the expected hash.

I just updated the page but still see the old behaviour.

Your browser may be serving a cached JS file. Hard-reload: Ctrl+F5 (Windows/Linux) or Cmd+Shift+R (Mac), or open the page in a private/incognito window.

I loaded a dataset but the calculator shows the default data.

The dataset handoff uses sessionStorage, which is scoped to a single tab. If you opened the calculator in a new tab instead of letting the Datasets page navigate you, the stored data does not carry over. Go back to Datasets and click "Load into X"; it navigates in the same tab.

The Compare page says "tests disagree about significance at α = 0.05." How do I decide which to report?

Follow this priority order:

If any expected cell is below 5 (Cochran, 1954): prefer Fisher's exact.
If N is large (> 40) and all expected counts ≥ 5: report the uncorrected χ² or the Z-test for proportions (they're mathematically equivalent for a 2×2).
Yates' correction is the most conservative; it's a safe choice but can make a real effect look non-significant.

Whatever you pick, state it plainly in your write-up: "We used Fisher's exact test because the smallest expected cell count was X (< 5)."
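
The priority order can be written down as a small decision function. This is a sketch of the rule of thumb above, not logic taken from the Compare page. The function name is hypothetical, and the final fallback (all expected counts ≥ 5 but N ≤ 40) is an assumption, since the list above does not assign that case to a specific test.

```python
def recommend_test(table, n_threshold=40):
    """Return (test name, reason) for a 2x2 table, following the priority order."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    min_expected = min(r * c / n for r in row_totals for c in col_totals)

    if min_expected < 5:
        return ("Fisher's exact",
                f"smallest expected cell count was {min_expected:.3f} (< 5)")
    if n > n_threshold:
        return ("uncorrected chi-square or Z for proportions",
                f"N = {n} and all expected counts are at least 5")
    # Assumption: for a modest N with adequate expected counts,
    # fall back to the conservative choice.
    return ("Yates-corrected chi-square",
            f"N = {n} is not above {n_threshold}; conservative choice")


print(recommend_test([[8, 2], [1, 5]]))
print(recommend_test([[30, 20], [15, 35]]))
```

The second element of the result is worded so it can be dropped straight into the write-up sentence.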

The Simulate page shows "permutation p ≠ Welch's t p." Is the simulator wrong?

Almost always it's the data. The two tests answer subtly different questions: the permutation test asks "are the two distributions identical?" while Welch's t-test asks "are the means equal, possibly with different variances?" For approximately normal, similar-variance data they agree closely. For skewed data they can diverge, and the permutation p-value is usually the more robust of the two. For very different variances they can also diverge, but there a small permutation p may reflect unequal spread rather than unequal means, so read it alongside Welch's result. Check the Assumption Coach on each group's data if you're unsure.
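
To see where "possibly with different variances" enters Welch's test, the sketch below computes the Welch statistic and the Welch-Satterthwaite degrees of freedom. The function name and data are made up for illustration. The p-value step is omitted because Python's standard library has no t distribution.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Each group contributes its own variance-over-n term; nothing is pooled.
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    t_stat = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t_stat, df


# Made-up groups: the first pair has equal spread, the second does not.
print(welch_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]))   # (-2.0, 8.0)
print(welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # df drops below 8
```

With equal variances and equal group sizes the degrees of freedom equal n₁ + n₂ − 2, the same as the pooled t-test. As the variances move apart the degrees of freedom shrink, which is how Welch's test accounts for unequal spread.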

A self-check card flagged my answer "Not quite" but I'm sure it's right.

Each card cites a source. If you disagree after reading the cited reference, file it as a bug. The self-check bank is curated from ASA (2016), Haller & Krauss (2002), and Cumming (2014) — so the cards reflect the current consensus interpretation, not every possible nuance. If your field or textbook uses a different convention, the card may simply not match.
