t-tests for a single sample (Lab 5)

Published

February 29, 2024

Objectives

Today, we’ll combine some plotting and some t-tests. As we’ve already discussed in class, it turns out that however complicated the math is for doing a t-test by hand, it’s infinitely simpler in Jamovi or in another software program. We’ll also talk a bit about doing some z-tests for a sample.

Remember that a z-test is one where we compare a sample mean to a population where we know the population mean and variance, whereas a one-sample t-test is one where we compare a sample mean to a population with known population mean, but unknown population variance.

You’ll turn in an “answer sheet” on Brightspace. Please be sure to turn that in by the end of the weekend.

The data

Today you’ll be looking at the friends dataset again. The friends dataset is the same as last week, available on Brightspace, here, or for download here. It is unchanged from the data you downloaded in the last lab, if you’ve got that saved. (But it is different from the first time we used the friends data!)

Playing with this dataset in Jamovi

  1. Under the Data ribbon menu, use a Filter function to filter out people with only one sibling listed in the column called siblings (i.e., remove people who have only one sibling, but people with zero siblings should be included.) Said another way: you want everyone with 0 siblings or more than one, but not one sibling.

    If you’re not sure how to say “not” in the filter: “not” is written as an exclamation point (!), as it is in many programming languages. Thus, writing != is like saying “not equal to”. If you instead want to use “or”, Jamovi expects you to type the word or. Note that <> does not work in Jamovi as a way to write “not equal”.

    To recap (using x here instead of a variable; you could imagine replacing it with siblings):

  • “is x equal to 5?” – x == 5

  • “is x NOT equal to 5?” – x != 5

  • “is x more than 5?” – x > 5

  • “is x less than or equal to 5?” – x <= 5

  • “is x more than 5 or less than 2?” – x > 5 or x < 2

    In your answer sheet, write the code you used for filtering only rows where siblings isn’t 1. This is #1 for your answer sheet.

  1. Find the mean of siblings when folks who have one sibling aren’t included. (You should probably use Analyses: Exploration: Descriptives.) Then find the mean of siblings when nothing is filtered out. Which one is larger? These answers are #2 on your answer sheet.
  1. Using Jamovi (or, if you prefer, Excel/Sheets), make a bar plot of gram.followers split by siblings. Your final plot should not have anyone filtered out, and should have 6 separate bars. See more than that? Try checking that gram.followers is marked as a continuous (rather than nominal) variable. Screenshot (or just save) this plot and include it as #3 in your answer sheet. Note that you can save a plot by right-clicking on it in Jamovi and exporting it.

    You may also want to create a histogram of the Instagram follower data when it is not split by siblings, just to see what the data look like. Are they normally distributed?

  2. Look at the plot below. For #4 on your answer sheet, explain what is different in this plot compared to the bar plot you created in Jamovi. I certainly don’t object to obvious details (e.g., this shows points instead of bars), but which plot do you think better allows you to make conclusions about this relationship?
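The filter-then-compare logic from the first two steps above can also be sketched outside Jamovi. This is only a minimal Python illustration: the sibling counts are made up (they are not the friends data), and the variable names are assumptions chosen for the example.

```python
from statistics import mean

# Hypothetical sibling counts, NOT the real friends data.
siblings = [0, 1, 2, 1, 3, 0, 1, 4, 2, 5]

# Same logic as a "not equal to 1" filter: keep 0 and anything above 1.
filtered = [s for s in siblings if s != 1]

# An equivalent way to say it with "or": zero siblings, or more than one.
filtered_with_or = [s for s in siblings if s == 0 or s > 1]

mean_all = mean(siblings)
mean_filtered = mean(filtered)

print(f"all rows:      n = {len(siblings)}, mean = {mean_all:.2f}")
print(f"filtered rows: n = {len(filtered)}, mean = {mean_filtered:.2f}")
```

Dropping rows changes both n and the mean, which is why the lab asks you to compare the two versions.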

z-tests for a single sample

Suppose we are interested in whether our sample has, on average, the same age at first job as people in general. We could say our research hypothesis is that our sample has a different mean age of a first job, but the null hypothesis is that it is the same. (Step 1!)

You might say then that:

\[H_0: \mu_{our~data}=\mu_{people~in~general}\]

\[H_1: \mu_{our~data}\neq{}\mu_{people~in~general}\]

Now, how do we define people in general? This is pretty tricky in most cases—which is why we’re already about to stop using z-tests. However, we can choose a guess here—perhaps 14. Is our sample different from that number?

Because we’re still doing a z-test here, let’s also imagine a population standard deviation of, say, 2. Remember that we need this information when we’re doing a one-sample z-test. (But that it is removed in the one-sample t-test!)

Step 2: Determine the characteristics of the comparison distribution

Okay, let’s get the information about our sample and therefore define the comparison distribution. We’ll use this info for calculating z.

  1. Get the mean age for our sample’s firstjobage in Jamovi (from Descriptives) and write it down. This is #5a.

We don’t need to calculate the standard deviation—we’ve got it (even if we’ve made it up). (That is, we’re not using the SD of the sample here.) In essence, we have the information about the population already. We know that \(\mu=14\) (because I told you) and that \(\sigma=2\) (again, because I told you). But now we need to define the comparison distribution.

We know that, based on the central limit theorem, \(\mu_M=\mu\). So the comparison distribution’s mean is the same as the population distribution’s mean. Write the comparison distribution’s mean value as #5b.

The comparison distribution is the sampling distribution of the mean, and its standard deviation is the standard error of the mean: \(\sigma_M=SEM=\frac{\sigma}{\sqrt{n}}\). We have an n: it’s the number of participants with data for firstjobage, and Jamovi probably gave it to you when you found the mean. Use a calculator, your phone, Google, or a spreadsheet (whatever works) to calculate this. Write the comparison distribution’s SD down as #5c.
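To get a feel for the SEM formula, here is a small Python sketch that uses the lab's assumed \(\sigma=2\) with a few hypothetical sample sizes. None of these n values is the real n for firstjobage; they only show how the standard error shrinks as n grows.

```python
from math import sqrt

sigma = 2  # population SD assumed in this lab

# Hypothetical sample sizes, chosen only for illustration.
sem_by_n = {n: sigma / sqrt(n) for n in (4, 25, 100)}

for n, sem in sem_by_n.items():
    print(f"n = {n:3d}  ->  SEM = {sem:.2f}")
```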

So, we can define our comparison distribution as follows: it is a z-distribution based on the sampling distribution of the mean, which has a mean and SD defined as you wrote above.

Step 3: Determine the sample cutoff score to reject the null hypothesis

We’re still using the same cutoffs as in all our z-tests here, which correspond to a two-tailed test at a significance level of .05 (\(p < .05\)). That means that our “extreme” scores are those less than -1.96 or more than +1.96.
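If you are curious where 1.96 comes from, it is the z-score that leaves 2.5% in each tail of the standard normal distribution. A minimal sketch using Python's standard library (the variable names are just for illustration):

```python
from statistics import NormalDist

alpha = 0.05  # two-tailed significance level

# Put alpha / 2 in each tail, so look up the 97.5th percentile.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

print(f"cutoffs: -{z_crit:.2f} and +{z_crit:.2f}")
```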

Step 4: Determine your sample’s score

  1. Okay, let’s find z. Use the equation below, where \(M_{sample}\) is the sample mean, \(\mu_M\) is the mean of the sampling distribution, and \(\sigma_M\) is the SEM (the standard deviation of the sampling distribution). (You’re essentially going to be plugging in the answers from #5.) Find z and write it as answer #6.

\[z=\frac{M_{sample}-\mu_{M}}{\sigma_M}\]
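As a sketch of how the pieces plug into this equation, here is the calculation in Python. The population values (14 and 2) are the ones assumed in this lab, but the sample mean and n below are hypothetical, so your own answer will differ.

```python
from math import sqrt

mu = 14          # population mean assumed in this lab (also mu_M)
sigma = 2        # population SD assumed in this lab
m_sample = 15.0  # hypothetical sample mean, not the real one
n = 25           # hypothetical sample size, not the real one

sigma_m = sigma / sqrt(n)        # standard error of the mean
z = (m_sample - mu) / sigma_m    # distance from mu_M in SEM units

print(f"sigma_M = {sigma_m:.2f}, z = {z:.2f}")
```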

Step 5: Decide whether or not to reject the null hypothesis

Compare your z-score to the cutoff score. If it’s positive: is it larger than +1.96? If it’s negative: is it smaller than -1.96?

Yes, we can reject the null. The z-value you found should have been larger than the cutoff of +1.96, meaning that you can conclude that our sample mean is significantly different from the population mean of 14.
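The decision rule can be written out as a comparison, and the same z-score can be turned into a two-tailed p-value. This is a sketch with a hypothetical z of 2.5, not the value from the friends data.

```python
from statistics import NormalDist

z = 2.5        # hypothetical z-score, not the real answer to #6
z_crit = 1.96  # two-tailed cutoff at the .05 level

# Reject the null if z is more extreme than the cutoff in either direction.
reject_null = abs(z) > z_crit

# Two-tailed p-value: probability of a z at least this extreme under the null.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"reject the null? {reject_null}")
print(f"p = {p_two_tailed:.3f}")  # about 0.012
```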

Switching from z to t

Now we’ll try to do this whole thing again, with a slight difference: looking at the most basic kind of t-score. A one-sample t-test has one primary difference from the basic z-test: rather than assuming that we know the population’s standard deviation, we instead accept that we do not. All we know in this instance is the population mean. So, here, let’s repeat this test. Only this time, we’re going to calculate the standard deviation ourselves. (We’ll still imagine a population mean of 14.)

What do we need to do to find an estimate of the population standard deviation? We can actually do this just by looking back at the Descriptives in Jamovi. Write down the standard deviation for our sample—the dispersion for how much ages of first jobs vary. (Add this as #5d.)

Step 1 hasn’t changed. \(H_0: \mu_{our~data}=\mu_{people~in~general}\) and \(H_1: \mu_{our~data}\neq{}\mu_{people~in~general}\)

Step 2

We’re describing the t-distribution for comparison.

Our steps for doing this (after noting that its shape is that of a t distribution) are:

A. Use the sample data to estimate the population variance, and then take its square root to get the estimated standard deviation

B. Estimate the standard deviation for the distribution of means—the standard error based on the SD for the sample and its sample size

  1. We just did A. To do B, remember that \(S_M=\frac{S}{\sqrt{n}}\), where \(S_M\) is the standard deviation of the comparison distribution, S is the standard deviation estimated from the sample, and n is the sample size.

    Calculate \(S_M\). This is answer #7.

We have now defined our comparison distribution!

It’s the t-distribution with \(df=n-1\), a mean of \(\mu_M=\mu\), and a standard deviation (the standard error of the mean) that you just found in #7.
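Steps A and B can be sketched from raw scores in Python. The ages below are hypothetical (eight made-up values, not the firstjobage column); `statistics.variance` and `statistics.stdev` both divide by n - 1, which is the estimate of the population value that a t-test uses.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical ages at first job, NOT the real data.
ages = [14, 16, 15, 13, 17, 15, 16, 14]

n = len(ages)
df = n - 1

s_squared = variance(ages)  # A: estimated population variance (n - 1 denominator)
s = stdev(ages)             # A: estimated population SD
s_m = s / sqrt(n)           # B: standard error of the mean

print(f"n = {n}, df = {df}")
print(f"S^2 = {s_squared:.3f}, S = {s:.3f}, S_M = {s_m:.3f}")
```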

Step 3

  1. Pull out the t-table on Brightspace or the one in your book (or one you find online). Look up the critical t-value at the df we’re using for this test. (Remember, our n and df are only based on the participants who have data for the firstjobage variable.)

    You can write it like the following: \(t_{crit}(df)=\pm?.??\) – for example, if your df was 4, you’d write \(t_{crit}(4)=\pm2.78\). Add the critical t-value you find as #8.
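Looking up a critical value is just a table lookup by df. The sketch below hard-codes the first ten rows of a standard two-tailed t-table at the .05 level; the dictionary and function names are made up for illustration, and for the df in this lab you would still need to read the value from your own table.

```python
# Two-tailed critical t-values at alpha = .05, as printed in standard t-tables.
# Only df 1 through 10 are included here; extend from your own table as needed.
T_CRIT_05_TWO_TAILED = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def format_t_crit(df: int) -> str:
    """Return the critical value written the way the lab asks for it."""
    value = T_CRIT_05_TWO_TAILED[df]  # raises KeyError if df is not in the table
    return f"t_crit({df}) = +/-{value:.2f}"


print(format_t_crit(4))  # t_crit(4) = +/-2.78
```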

Step 4: Determine your sample’s score

Okay, now we can calculate t. It looks quite similar to z, with the exception (to repeat) that the only thing we’re claiming to “know” at this point is the population mean—the \(S_M\) is coming from our estimate of the population standard deviation, \(S\), where \(S_M=\frac{S}{\sqrt{n}}\):

\[t=\frac{M-\mu_M}{S_M}\]

  1. Calculate t. This is answer #9.
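Here is the same arithmetic as a short Python sketch, working from summary values the way you would from the Descriptives table. The population mean of 14 is the one assumed in this lab; the sample mean, SD, and n are hypothetical.

```python
from math import sqrt

mu = 14   # population mean assumed in this lab (also mu_M)
m = 15.0  # hypothetical sample mean
s = 2.5   # hypothetical sample SD, as reported by Descriptives
n = 25    # hypothetical number of participants with data

s_m = s / sqrt(n)    # estimated standard error of the mean
t = (m - mu) / s_m   # t-score for the sample mean
df = n - 1

print(f"S_M = {s_m:.2f}")
print(f"t({df}) = {t:.2f}")  # t(24) = 2.00
```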

Step 5: Decide whether or not to reject the null hypothesis

  1. Is the value you found in #9 more extreme than the critical value? Compare it to your cutoff. Write your conclusion as #10. Be specific: do you reject the null hypothesis for this test? Then write your conclusions, including the results of the test.

    For example, if you were going to reject the null, you might write (but would not use these numbers): We reject the null. People in our sample had a different first job age than people on average, \(t(4)=3.10, p<.05\). OR, if you were unable to reject the null hypothesis, you might write (but also would not use these numbers): We fail to reject the null. People in our sample did not have a significantly different first job age than people on average, \(t(4)=0.30, p>.05\).

    Remember that \(p<.05\) is something we write when we have rejected the null, because it means we’re concluding that a result this extreme would be unlikely if the null hypothesis were true.

Run an actual t-test

Okay, now we get to do the easy stuff. In Jamovi, under Analyses, select T-Tests, then One Sample T-Test. Put firstjobage into Dependent Variables.

Under Hypothesis, put our guess at the population mean (\(\mu=14\)) into the “Test Value” box.

You should see the same Statistic (that’s the t-value) and df as you found! The t might be rounded slightly differently, though. What you’ll also see is an exact p-value. If you got something different than the t-value Jamovi gives you, talk to me or a classmate.

  1. Are this one-sample t-test’s conclusions different from the results of the z-test? (Note that I said z here.) Answer this as #11. Also, looking back at the answers in #5, and thinking about how they’re used in calculating z and t, can you speculate as to why they’re the same (or different)?

  2. Lastly, under Additional Statistics (still in the t-test pane), click on “Mean difference” and “Confidence Interval”. You’ll see two new columns pop up to the right of your t-test results. What might those be? What’s the confidence interval showing?

The mean difference is our sample mean MINUS the “population mean” that we gave it (14). The confidence interval shows the range of plausible values for that difference, and here the range is wide. It’s possible the real mean difference is as high as the upper bound (which would be a pretty large difference!), but it’s also possible it’s negative. If the confidence interval includes 0, that is a good sign that the test isn’t statistically significant.
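To see how a mean difference and its confidence interval relate to the t-test, here is a sketch in Python. It assumes a 95% interval built as the mean difference plus or minus the critical t times the standard error; the sample values are hypothetical, and the critical value 2.064 is the standard two-tailed .05 table value for df = 24.

```python
m_sample = 15.0  # hypothetical sample mean
mu = 14          # population mean assumed in this lab
s_m = 0.5        # hypothetical standard error (S / sqrt(n)) with n = 25
t_crit = 2.064   # two-tailed .05 critical t for df = 24, from a t-table

mean_diff = m_sample - mu
ci_lower = mean_diff - t_crit * s_m
ci_upper = mean_diff + t_crit * s_m

# If 0 is inside the interval, the two-tailed test at .05 is not significant.
includes_zero = ci_lower <= 0 <= ci_upper

print(f"mean difference = {mean_diff:.2f}")
print(f"95% CI = [{ci_lower:.3f}, {ci_upper:.3f}]")
print(f"interval includes 0? {includes_zero}")
```

With these made-up numbers, t would be 2.00, just short of the cutoff of 2.064, and the interval correspondingly dips just below 0.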

Great! You now know how to run a one-sample t-test in Jamovi.

Try one out on your own

Can you tell me if our sample has a different number of covidtimes (i.e., how many times they reported having had covid) compared to a population mean of 0.3? (That seems low because it is, but bear with me on trying it.)

Decide what test you’re doing, then follow the five steps of null hypothesis significance testing (NHST) to find your answer. As answer #12, write up your results (like above) to tell me whether our sample has a different number of times they’ve had covid compared to that supposed population mean. You don’t need to share your steps; just give me the conclusion. (If you’re not sure, you can show your work.) Use Jamovi—or do it by hand—it’s up to you.

Citation

BibTeX citation:
@online{dainer-best2024,
  author = {Dainer-Best, Justin},
  title = {T-Tests for a Single Sample (Lab 5)},
  date = {2024-02-29},
  url = {https://faculty.bard.edu/jdainerbest/stats/labs//posts/05-one-sample-t-tests},
  langid = {en}
}
For attribution, please cite this work as:
Dainer-Best, Justin. 2024. “T-Tests for a Single Sample (Lab 5).” February 29, 2024. https://faculty.bard.edu/jdainerbest/stats/labs//posts/05-one-sample-t-tests.