Date | Month | Year | Army | Disease | Wounds | Other | Disease.rate | Wounds.rate | Other.rate |
---|---|---|---|---|---|---|---|---|---|
1854-04-01 | Apr | 1854 | 8571 | 1 | 0 | 5 | 1.4 | 0.0 | 7.0 |
1854-05-01 | May | 1854 | 23333 | 12 | 0 | 9 | 6.2 | 0.0 | 4.6 |
1854-06-01 | Jun | 1854 | 28333 | 11 | 0 | 6 | 4.7 | 0.0 | 2.5 |
1854-07-01 | Jul | 1854 | 28722 | 359 | 0 | 23 | 150.0 | 0.0 | 9.6 |
1854-08-01 | Aug | 1854 | 30246 | 828 | 1 | 30 | 328.5 | 0.4 | 11.9 |
1854-09-01 | Sep | 1854 | 30290 | 788 | 81 | 70 | 312.2 | 32.1 | 27.7 |
1854-10-01 | Oct | 1854 | 30643 | 503 | 132 | 128 | 197.0 | 51.7 | 50.1 |
1854-11-01 | Nov | 1854 | 29736 | 844 | 287 | 106 | 340.6 | 115.8 | 42.8 |
1854-12-01 | Dec | 1854 | 32779 | 1725 | 114 | 131 | 631.5 | 41.7 | 48.0 |
1855-01-01 | Jan | 1855 | 32393 | 2761 | 83 | 324 | 1022.8 | 30.7 | 120.0 |
1855-02-01 | Feb | 1855 | 30919 | 2120 | 42 | 361 | 822.8 | 16.3 | 140.1 |
1855-03-01 | Mar | 1855 | 30107 | 1205 | 32 | 172 | 480.3 | 12.8 | 68.6 |
1855-04-01 | Apr | 1855 | 32252 | 477 | 48 | 57 | 177.5 | 17.9 | 21.2 |
1855-05-01 | May | 1855 | 35473 | 508 | 49 | 37 | 171.8 | 16.6 | 12.5 |
1855-06-01 | Jun | 1855 | 38863 | 802 | 209 | 31 | 247.6 | 64.5 | 9.6 |
1855-07-01 | Jul | 1855 | 42647 | 382 | 134 | 33 | 107.5 | 37.7 | 9.3 |
1855-08-01 | Aug | 1855 | 44614 | 483 | 164 | 25 | 129.9 | 44.1 | 6.7 |
1855-09-01 | Sep | 1855 | 47751 | 189 | 276 | 20 | 47.5 | 69.4 | 5.0 |
1855-10-01 | Oct | 1855 | 46852 | 128 | 53 | 18 | 32.8 | 13.6 | 4.6 |
1855-11-01 | Nov | 1855 | 37853 | 178 | 33 | 32 | 56.4 | 10.5 | 10.1 |
1855-12-01 | Dec | 1855 | 43217 | 91 | 18 | 28 | 25.3 | 5.0 | 7.8 |
1856-01-01 | Jan | 1856 | 44212 | 42 | 2 | 48 | 11.4 | 0.5 | 13.0 |
1856-02-01 | Feb | 1856 | 43485 | 24 | 0 | 19 | 6.6 | 0.0 | 5.2 |
1856-03-01 | Mar | 1856 | 46140 | 15 | 0 | 35 | 3.9 | 0.0 | 9.1 |
Today, you’ll be making some graphs in Jamovi and Excel/Sheets. You’ll also be playing around a bit with three different datasets. Again, you’ll turn in an “answer sheet” on Brightspace. Please turn that in by the end of the weekend. You needn’t turn in your data. Just the answer sheet.
Data
You might want to start by downloading two of the datasets.
Nightingale data
Let’s start by looking at the Nightingale data (nightingale.csv), which we also discussed in class. This is the data from Florence Nightingale’s research in the 1850s on causes of death during the Crimean War. Load it into Jamovi.
Modern graphing software can try to reproduce the Nightingale coxcomb plot we saw in class (see it on Wikipedia here), but you’ll see that it doesn’t quite look as nice as hers.
Jamovi can’t do a plot like this, but what it can do quite easily is create a histogram.
- Create a histogram using the full Disease data. Note that this histogram should only involve the disease data—a histogram shows frequencies of how often you get a certain response. Is the data normally distributed? How do you know? Answer these last two questions on your answer sheet, #1. Your plot should look something like this (although it might not have the title):
Create a histogram of the deaths from Wounds in the Nightingale data. Is that one normally distributed? (You don’t need to answer on the answer sheet.)
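If you’re curious what a histogram is doing under the hood, here is a minimal Python sketch (not part of the lab, and not how Jamovi does it internally). It counts how many months fall into each range of Disease deaths, using the values from the table above. The bin width of 500 and the function name `histogram_counts` are my own assumptions for illustration; Jamovi picks its own bins, so your plot may be divided differently.

```python
# Monthly deaths from disease, copied from the Disease column of the table above.
disease = [1, 12, 11, 359, 828, 788, 503, 844, 1725, 2761, 2120, 1205,
           477, 508, 802, 382, 483, 189, 128, 178, 91, 42, 24, 15]

BIN_WIDTH = 500  # assumed bin width for illustration; Jamovi chooses its own


def histogram_counts(values, bin_width):
    """Count how many values fall in each bin [k*width, (k+1)*width)."""
    n_bins = max(values) // bin_width + 1
    counts = [0] * n_bins
    for v in values:
        counts[v // bin_width] += 1
    return counts


counts = histogram_counts(disease, BIN_WIDTH)

# Print a rough text histogram: one '#' per month that falls in the bin.
for k, count in enumerate(counts):
    low, high = k * BIN_WIDTH, (k + 1) * BIN_WIDTH - 1
    print(f"{low:>4}-{high:<4} | {'#' * count} ({count})")
```

Each row of the printout is one bar of the histogram: the height of the bar is just the number of months in that range. You can swap in the Wounds column to try the second histogram the same way.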
Okay, let’s make a scatterplot. This kind of plot compares two variables to one another by plotting one on the x-axis and the other on the y-axis. There are a few ways to do this in Jamovi, but we’ll use one that’s straightforward to carry out. Select “scatr” under Analyses: Exploration. (If you don’t have it, install it under Modules; let me know if you need help.)
Plot deaths from Wounds against those from Disease. It’s up to you which is on the x-axis and which on the y. Add Year into the Group box. Can you draw any conclusions from this?
Suppose you wanted to plot cause of death over time for the entirety of the data we have… Line graphs like this are more challenging in Jamovi, but you could actually make it in Sheets/Excel. In this case, I’ll just include the plot below. Take a look.
- What conclusions do you draw from this figure? How does it compare to the Nightingale coxcomb diagram above? Include this answer as #2 in your answer sheet.
Teaching and Learning Research
Fiorella & Mayer (2013) hypothesized that students would learn course material better if they thought they were going to later be asked to teach the material to the rest of the class. To test this, the researchers divided students into three groups. All groups read a short excerpt about the Doppler effect and were later given a 10-question quiz. The control group studied the excerpt and then immediately took the quiz. The preparation group was instructed that they would later teach the material to a group of students. This group studied the excerpt and then immediately took the quiz. Finally, the teaching group was instructed that they would later teach the material to a group of students. This group studied the excerpt, taught it to a group of students, and then took the quiz. Fiorella & Mayer reported the following results:
Group | n | Comprehension score | |
---|---|---|---|
 | | M | SD |
Control | 31 | 6.2 | 3.3 |
Preparation | 32 | 7.9* | 2.4 |
Teaching | 30 | 8.7* | 2.8 |
* Significantly different from control group at p < .05
We’re going to plot these in a bar graph. Open Excel or Google Sheets and copy these data into a table. Before doing anything else, delete the asterisks in your copied data. We want Sheets/Excel to recognize these as numbers.
Move the n (sample size) values to the far right column, and then delete the empty column that remains. Now, column A should be group, column B should be means, column C should be SD, and column D should be sample size.
Calculate the standard error of the mean (SEM) in column E for each group. Remember that \(\textrm{SEM}=\frac{SD}{\sqrt{n}}\). In Sheets or Excel (S/E), remember that a formula starts with = and then refers to the cell names. To take a square root, use SQRT().
In cell E7, calculate the average of your SEMs using the =AVERAGE() formula. Your answer should be approximately 0.5094.
Select the cells representing the names of the groups (i.e., Control, Preparation, Teaching) and the means (6.2, 7.9, 8.7). This should be cells A3:B5.
In Excel, go to the Insert menu, then Chart, then Column. In Sheets, go to Insert, then click Chart. Then from the dropdown menu at the top be sure that “Column Chart” is selected. Both should make a chart that compares the means and add labels on the x-axis. Do they?
This part I want you to figure out how to do on your own: give the graph a title, label the y-axis, and explore other possible settings. You can probably get to settings for it by double-clicking on it or right-clicking.
Add the error bars we calculated as SEM: this works correctly in Excel, but in Sheets we’ll need to only do it halfway.
In Excel: go to Add Chart Element: Error Bars: More Error Bars Options. Click on the picture of a column chart in the menu, and change “Error Amount” to Custom. Specify the values in E3:E5 as both positive and negative error values.
In Google Sheets: After double-clicking on the chart, drop down the “Series” menu. At the bottom of it, check the Error Bars checkbox. Change Type from percent to Constant. Set the value to the average value of the SEM that we calculated above. (Google Sheets won’t let you have different error bars for different bars.)
Submit the plot as part of your Brightspace answer sheet, #3. Chat with a neighbor about what is “lost” by doing this in Sheets vs. Excel. Please copy the chart into your answers or take a screenshot; I’m happy to help you figure that out.
Draw a conclusion from the graph, using the error bars for information. Which method of instruction results in the best scores? What information from the graph makes you feel more confident in that conclusion? Write this answer as #4 on your answer sheet.
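If you’d like to double-check the SEM values from your spreadsheet, here is a short Python sketch of the same arithmetic. It is only a cross-check, not something you need to turn in. The numbers come from the Fiorella & Mayer table above; the names `groups`, `sem`, and `average_sem` are my own, chosen for illustration.

```python
import math

# (n, mean, SD) for each group, from the Fiorella & Mayer (2013) table above.
groups = {
    "Control": (31, 6.2, 3.3),
    "Preparation": (32, 7.9, 2.4),
    "Teaching": (30, 8.7, 2.8),
}


def sem(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


# One SEM per group (what column E holds in the spreadsheet).
sems = {name: sem(sd, n) for name, (n, mean, sd) in groups.items()}

# The single averaged value (what cell E7 holds in the spreadsheet).
average_sem = sum(sems.values()) / len(sems)

for name, (n, mean, sd) in groups.items():
    print(f"{name:<12} mean={mean:.1f}  SEM={sems[name]:.5f}")
print(f"Average SEM = {average_sem:.5f}")
```

The per-group values correspond to the custom error bars you can set in Excel, and the average corresponds to the single constant that Sheets uses. Comparing the two printouts is one way to start thinking about what gets “lost” in Sheets.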
Friends data
Okay, now let’s talk about the data from your class (and a few friends). Here’s what the beginning of the data looks like (note that you can scroll to the right):
StartDate | EndDate | Status | Progress | Duration (in seconds) | Finished | RecordedDate | ResponseId | DistributionChannel | UserLanguage | socialmedia | gender | siblings | smed.hrs | gram.followers | fbfriends | tiktokFollow | tvhours | haircolor | belief.in.god | liveoncampus | numclasses | hs_students | firstjobage | covidtimes | eatmeat | operas | cigarettes | like.dance | shakespeare | voting | expectedoutcome_1 | expectedoutcome_5 | expectedoutcome_2 | expectedoutcome_3 | expectedoutcome_4 | majordiv | hrs.sleep | height.unvalidated | shootingdrills | handedness |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2024-01-31 10:22:14 | 2024-01-31 10:25:14 | 0 | 100 | 179 | 1 | 2024-01-31 10:25:14 | R_6PmJR5uqHSbAkK2 | anonymous | EN | 42 | female | 0 | 4.0 | NA | 20 | 235 | 10 | 2 | 1 | 1 | 5 | 600 | 15 | 1 | 1 | 1 | 2 | 21 | 2 | 1 | 100 | 29 | 50 | 50 | 100 | 3 | 9.5 | 66 | 1 | 1 |
2024-01-31 10:22:28 | 2024-01-31 10:25:36 | 0 | 100 | 188 | 1 | 2024-01-31 10:25:37 | R_6rV09pDZ2KdNlEe | anonymous | EN | NA | Non-binary | 0 | 1.0 | NA | NA | NA | 1 | 1 | 3 | 1 | 4 | 2000 | 15 | 2 | 1 | 0 | 2 | 15 | 3 | 1 | 65 | 50 | 82 | 83 | 100 | 3 | 8.0 | 62 | 1 | 5 |
2024-01-31 10:22:18 | 2024-01-31 10:30:48 | 0 | 100 | 510 | 1 | 2024-01-31 10:30:49 | R_5ta18rQRoooKcUW | anonymous | EN | 142 | female | 2 | 6.0 | 1900 | 32 | 50 | 0 | 2 | 1 | 1 | 4 | 1000 | 17 | 0 | 1 | 0 | 2 | 19 | 3 | 1 | 10 | 50 | NA | NA | 90 | 3 | 7.0 | 61 | 1 | 4 |
2024-01-31 10:25:26 | 2024-01-31 10:32:05 | 0 | 100 | 399 | 1 | 2024-01-31 10:32:05 | R_3KYGqB6L1xXkpKN | anonymous | EN | 14 | Woman | 0 | 3.0 | 717 | NA | 265 | 10 | 1 | 3 | 1 | 4 | 2000 | 13 | 1 | 1 | 3 | 2 | 19 | 8 | 1 | 60 | 40 | 50 | 50 | 100 | 2 | 8.0 | 69 | NA | 5 |
2024-01-31 10:22:17 | 2024-01-31 10:38:51 | 0 | 100 | 993 | 1 | 2024-01-31 10:38:52 | R_6kh7ebTNsC0lHPO | anonymous | EN | 14 | Non-binary | 1 | 1.5 | 346 | NA | 561 | 2 | 1 | 2 | 1 | 4 | 1075 | 15 | 0 | 1 | 0 | 2 | 18 | 1 | 3 | 50 | 35 | 50 | 50 | 75 | 3 | 2.0 | 62 | 1 | 4 |
2024-01-31 10:35:40 | 2024-01-31 10:40:29 | 0 | 100 | 288 | 1 | 2024-01-31 10:40:30 | R_7Kjqt7AqMU392HD | anonymous | EN | 1 | Female | 1 | 4.0 | 307 | NA | NA | 10 | 1 | 3 | 1 | 5 | 4000 | 15 | 1 | 1 | 1 | 2 | 17 | 2 | 1 | 80 | 56 | 77 | 66 | 100 | 3 | 8.0 | 64 | 1 | 4 |
Many of you have collected data in Qualtrics before; this is what the data from Qualtrics look like. What you’ll see now is that it isn’t immediately usable.
Go ahead and open the friends data in Jamovi. (If you didn’t download it before, it’s available under Data above.) I haven’t edited any of this data yet. Explore the data a bit.
Check to make sure that your data has the values you’d expect. How many rows does it have? You’ll note that the first two rows are extra information. Delete them. How many rows are left? Enter this value on Brightspace as #5 on your answer sheet.
Because these data were imported with text at the top, Jamovi doesn’t automatically know that some are numbers. You’ll have to tell it. Pick two variables you think should be numbers. Under Data or Variables, switch each one to be continuous (or, possibly, ordinal). Then create a scatterplot, as we did above.
If we want to create a new variable that’s a rough combination of number of Instagram followers with number of hours spent on social media, we might create it as a new variable where we divide the first (gram.followers) by the second (smed.hrs). First, turn both of those into continuous variables. Then, under the Data menu, use the Compute button to do this. Call the new variable something like social.quotient. (Remember that you can use / for division.)
Find the range of how many students went to respondents’ high schools (hs_students). Enter it as answer #6.
How many different answers for gender are in the data? For the moment, just answer “from a computer’s perspective”. Add this as answer #7. From a human’s perspective: well, it depends a bit on how you categorize them. There are some automated ways of dealing with this, but not (unfortunately) ones I know of in Jamovi or Excel/Sheets. Take some time to recode: put everything in lower case. Do you put “woman” and “female” together? Probably. There are also two responses that are intentionally not binary without describing themselves as such. This is where some subjectivity comes into data analysis. (Happy to discuss what it means to collect data on gender at another point; I have many thoughts.)
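To make the last few steps concrete, here is a rough Python sketch of the same ideas: dividing one column by another while skipping missing values, finding a minimum and maximum, and seeing how a computer counts distinct text labels. It uses only the six preview rows shown in the table above, so its numbers are not the answers to #6 or #7. The helper name `social_quotient` and the `MERGE` mapping are assumptions of mine; in particular, merging “woman” into “female” is one possible recoding choice, not the only defensible one.

```python
# Four columns from the six preview rows above; None stands in for NA.
rows = [
    {"gender": "female",     "smed.hrs": 4.0, "gram.followers": None, "hs_students": 600},
    {"gender": "Non-binary", "smed.hrs": 1.0, "gram.followers": None, "hs_students": 2000},
    {"gender": "female",     "smed.hrs": 6.0, "gram.followers": 1900, "hs_students": 1000},
    {"gender": "Woman",      "smed.hrs": 3.0, "gram.followers": 717,  "hs_students": 2000},
    {"gender": "Non-binary", "smed.hrs": 1.5, "gram.followers": 346,  "hs_students": 1075},
    {"gender": "Female",     "smed.hrs": 4.0, "gram.followers": 307,  "hs_students": 4000},
]


def social_quotient(row):
    """Followers divided by hours; None if either value is missing or hours is 0."""
    followers, hours = row["gram.followers"], row["smed.hrs"]
    if followers is None or hours is None or hours == 0:
        return None
    return followers / hours


quotients = [social_quotient(r) for r in rows]

# Smallest and largest high school size among the preview rows only.
hs = [r["hs_students"] for r in rows if r["hs_students"] is not None]
hs_min, hs_max = min(hs), max(hs)

# "From a computer's perspective", labels differing only in case are different.
raw_labels = {r["gender"] for r in rows}
lowered = {g.lower() for g in raw_labels}

# One possible human recoding (an assumption): treat "woman" as "female".
MERGE = {"woman": "female"}
merged = {MERGE.get(g, g) for g in lowered}

print("social.quotient:", quotients)
print("hs_students min/max:", hs_min, hs_max)
print("raw:", len(raw_labels), "lowercased:", len(lowered), "merged:", len(merged))
```

Notice that the count of gender labels changes at each step even though the underlying responses don’t. That is the subjectivity described above: every recoding rule is a decision someone made.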
Okay, finally: go back to Analyses: Exploration: Descriptives. Put one of your numeric variables in Variables, and put the cleaned-up gender variable in the Split by box. Turn on a box plot, and check the checkbox to add the Data onto it. Submit this plot as #8.
Play around some more, if you like. Then save your data for yourself, and submit the answer sheet. (I don’t need your data, though!)
Citation
@online{dainer-best2024,
author = {Dainer-Best, Justin},
title = {Visual {Displays} of {Information} {(Lab} 3)},
date = {2024-02-15},
url = {https://faculty.bard.edu/jdainerbest/stats/labs//posts/03-visualizations},
langid = {en}
}