Today’s focus is on practicing things in Jamovi and in Excel.
You will turn something in from today’s lab on Brightspace (here). You should plan to turn it in at the end of the lab, but it may be turned in up next week. Make a document in whatever program you would like, write your name at the top, and label it Lab 2. You can just paste the results into the textbox in Brightspace, or upload a PDF/word doc. (Pages files don’t load well in Brightspace, so please save as a PDF if you are using Pages.) I do not need your spreadsheet or data. Just these answers.
As you go through this lab, feel free to discuss with your classmates.
Import data. Find the pizzaplace.csv dataset on Brightspace or here. Load it into Jamovi. Read about it more here.
Start by going through the variables, either by clicking on the Variables menu at the top or by going to Data and double-clicking on a variable name (e.g., ID). What are they? Do they make sense compared to the brief description you just read?
Open the same dataset up in Excel (if you have access to it) or in Google Sheets. (If you have it, use Excel over Sheets, as it has more functionality. You can probably do this in Numbers, if you have that and would prefer to use it.) In either program, you can go to the File menu and then click Open, and find the file on your computer. In Google Sheets, you’ll need to click “Upload” once you access the Open menu. (In Excel, you can probably just drag the file onto the program icon in your dock [Mac] or taskbar [PC]. Or you can use File: Open to get it in.)
- Now that you have the data open in both programs, what seems different in Excel/Sheets compared to Jamovi? What’s the same? (Hopefully the data look identical!)
Jamovi gives control of types of data, but only by column (variable). On the other hand, S/E allow you to set data types by cell—cell A1 could be a word, whereas cell A2 is a number.
(See more info on Google Sheets here.)
The other major important thing is that Jamovi “understands” that you have named variables. E/S don’t know that. Instead, their columns are all letters and their rows all numbers. When sorting data, for example, you have to let them know that your data has a header row. Jamovi, on the other hand, assumes that the variable names are the first row. (Which they are, and should be most of the time you have data!)
- Use both Jamovi and Sheets/Excel to find the average price of a pizza. Use the
=AVERAGE()
function in S/E; use Analyses/Exploration/Descriptives in Jamovi. What difference do you see between the scores on Jamovi vs. S/E? Add the answers here as item #1 in your answers to turn in. If you’re having trouble here, please click to expand the note below.
In S/E, you should click on some cell that is currently empty, for example cell I3. Copy the function above (=AVERAGE()
) or type it into the cell. Sheets will automatically suggest applying it to the full column of price. Excel will not (or at least my copy does not). Instead, you will have to indicate that column. You can type out the range of cells from top (G2) to bottom (G49575). You use the colon :
to indicate the range.
In Sheets, you can click the \(\sum\) button and click on “AVERAGE”, rather than typing it out. In Excel, you can click on the \(fx\) button (functions) and find the average function.
In Jamovi, you can look back at our lab from last week. In #7, we found averages. Today, you’ll do the same thing:
- Click on the Analyses tab of the main menu ribbon
- Click on “Exploration”, with the small image of a histogram, all the way to the left.
- Then click “Descriptives”.
- Drag
price
into the Variables window. - Under “Statistics”, below, make sure that “Mean” is checked under the sub-heading of Central Tendency.
Identify the median price for all of the pizzas in S/E, using the
=MEDIAN()
function. Does this one look the same as the one in Jamovi? Why or why not? Add this median price to the answer sheet as item #2. What does it signify that the median is so close/so far from the mean?Imagine that you’re the owner of this pizza place, and you want to figure out which pizzas are different from the normal prices. How would you do that? Well, you might look for outliers! With your Exploration/Descriptives tab up in Jamovi, and
price
in the Variables: find the plots tab (below Statistics) and turn on a Boxplot. Ask it to label outliers. Recall from our class on the median absolute deviation (MAD) that this is a method for identifying outliers… how can we make that happen? (You do not need to add this to your answer sheet, since you haven’t done it yet. Answering this is the next question.)
When you look at this plot this is how you should interpret it (from here): the thick line in the middle of the box is the median. The box itself spans the range from the 25th percentile to the 75th percentile; and the “whiskers” go out to the most extreme data point that doesn’t exceed a certain bound. By default, this value is 1.5 times the interquartile range (IQR), calculated as 25th percentile - (1.5 * IQR) for the lower boundary, and 75th percentile + (1.5 * IQR) for the upper boundary. Any observation whose value falls outside this range is plotted as a circle or dot instead of being covered by the whiskers, and is commonly referred to as an outlier.
We’re going to calculate the MAD ourselves in Google Sheets or Excel. I think this has some utility, and is conceptually worth doing. For this task, you’ll use the same data. I’m going to assume that you have
price
in column G of your spreadsheet (as it was when you opened the data). Let column H be blank (clear it if needed), and make sure the median is in a cell you can refer to (not in H) or written down.I’ve calculated the median of
price
; it is in cell J5. I would recommend that you do the same. In column H, row 1 (cell H1), write the name “abs.deviation” or something like that. In row 2, write the following formula:=ABS(G2-$J$5)
. (If you want to use your written-down median value, replace$J$5
with the median value.) What’s the formula we just wrote doing? From the outside in: the=
sign tells S/E that this is a formula.ABS()
asks for the absolute value (i.e., removes negatives).G2
is the first price (a Hawaiian medium classic pizza for $13.25). The term$J$5
refers to the cell with our median in it. The $ signs tell E/S not to change this cell when we do what we’re doing next (fill), below.Select cells H2:H7 using your mouse or the Shift & arrow keys. Then either press CMD+D on a Mac (CTRL+D on a PC), or (in Excel) go to Edit: Fill: Down. (In Sheets, you will need to use the key sequence.) What happens? Your first three cells should now read 3.25, 0.5, 0.5.
Use your keyboard (or mouse—but this will take some scrolling) to select all the way to the bottom of the data in row H. Fill all the way down in H. You may see a loading bar at the top right for a second or two in Sheets. In an empty cell to the right, find the median of cells H2:H49575. This is the median absolute deviation (MAD)! If we want to find values that are 3 MAD above the median of the data, and 3 MAD below the median of the data, what would the range be? Add this range to the answer sheet as answer #3. Feel free to discuss, check your notes or the slides from class, or ask me or an assistant.
There’s a neat function in S/E called
=COUNTIF()
. If you’re feeling like you’re comfortable with what we’ve been doing thus far, try to see if you can figure out how to use this=COUNTIF()
function to count how many of the prices in column G are more than 3 MADs over the median, and how many are smaller than 3 MADs under the median. Once you start typing it, S/E will give you a bit of an idea what should go first and second in the parentheses. If you do this question, it is answer #4. It is not required, and I will discuss it in class. Even if you don’t do this question, look over the first 20–30 prices. Do any of these seem like outliers to you? What about a price of $30 for a pizza from this shop?In Jamovi, turn on the Histogram under plots for these data. Look at the histogram. Does it seem normally distributed (i.e., does it look like a normal distribution)? Answer why / why not in your answer sheet as answer #5.
In Jamovi, turn on the Q-Q plot under plots. In a Q-Q plot, datapoints are plotted along a diagonal line. The closer they are to falling on the line (especialy at the tails), the more normally-distributed the data. Does this confirm your decision on #10 (answer 5)? Feel free to add to your answer.
In Jamovi, uncheck all of the descriptives. Also, uncheck all of the plots except for boxplot + outliers. At the top of the Descriptives menu, add the variable/column size into the “Split by” section. The sizes will show up in alphabetical order in the plot. Does price align with size?
Go to the Data tab in Jamovi, and double click on the size variable. Click into the section that says Levels, and put them in the correct order (i.e., S, then M, then L, then XL, then XXL). Then go back to the Analyses -> Exploration -> Descriptives (you should be able to click on the plot in the right pane), and take a look at the plot from question #12. Does it make more sense now?
In the Split By menu, replace size with type. This isn’t an ordered variable (except maybe by your preference). Which type of pizza has outliers in the boxplot? This is answer #6 on your answer sheet. We’re done with Jamovi for today. You can close it if you’d like.
Back to S/E. We’re going to calculate z-scores for some of our values. Make sure column
I
(I
as in igloo) is blank—you can insert a blank column if you’ve been typing in it. In row 1 of columnI
, label itz-scores
. In another empty cell (probably off to the right, maybe in column J or K), calculate the standard deviation of all of the prices in column G using the population SD function:=STDEV.P()
.
This function calculates the STandard DEViation of the population. There’s a separate formula for standard deviation of just a sample—but right now, we’re pretending we have all of the info, i.e., that we have the population.
Using the mean (average) which we calculated in #5, and the standard deviation from #15, let’s calculate a z-score. As you’ll recall from class, \(z=\frac{(X-M)}{SD}\) where \(X\) is an individual value, \(M\) is the mean of the population, and \(SD\) is the standard deviation of the population. (For the moment, the pizza prices are our population.) Can you figure out how to calculate that by yourself, in Excel? It will be quite similar to what we did in question 8. Don’t forget to use parentheses—you can understand what’s going on without them on paper, but you will need them in S/E. Write the equation you use in cell
I2
to do this as answer #7. Your answer for cellI2
should be, rounded, -0.90 (to three decimals, it’s -0.897).Also note that you can use a “relative formula”, which refers to a specific cell, or just enter the values for SD and M. Your choice.
Fill down for the first 20 or so values. Are any of them more then 2 standard deviations away from the mean? (That is, are they more than 2 z-scores from 0?) This is your final answer #8.
Reuse
Citation
@online{dainer-best2025,
author = {Dainer-Best, Justin},
title = {Finding {Your} {Jamovi} {Footing} {(Lab} 2)},
date = {2025-09-11},
url = {https://faculty.bard.edu/jdainerbest/stats/labs/posts/02-finding-your-jamovi-feet/},
langid = {en}
}