Week 1: Data Handling

This week will serve as a refresher on Jamovi. You will use Jamovi throughout your degree for statistical analysis, so it is very important to refresh your memory on this!

A guide to installing Jamovi on your personal computer can be found here (input link here)

Learning Objectives

	Quantitative Methods
	Recoding scores into groups
	Calculating total scores
	Calculating means

	Data Skills
	Working with the Jamovi editor
	Setting up data files for between- and within-participants designs
	Computing descriptive statistics in Jamovi

	Open Science
	Working with openly available research data

Today’s session

Today you will re-familiarise yourself with the Jamovi interface, its layout and its basic functions, and learn how to set up data for different statistical designs.

1. Setting up data files and entering data

Data are entered into Jamovi using a ‘spreadsheet’ format of rows and columns, which you can view in the Data tab. The rows are cases (i.e. participants: each participant gets their own row) and the columns are variables (i.e. whatever was measured for each participant gets a column). In the Variables tab, each row gives the details of one variable (i.e. one row = one variable).
Entering the data in the Data tab is straightforward: you simply type a value in the appropriate box, or ‘cell’.

The data files for use in the computer practicals are available on the RMC/NM1 Canvas ‘Course Pack’ module. In the text of these pages, the data files will be referred to by file name. You can open different types of data file directly in Jamovi (e.g. .sav or .csv). How to do this is described on pp. 9-10.

2. Data entry for within- and between-participants or mixed designs

In a within-participants (or ‘repeated measures’) design the dependent variable (DV) is measured more than once under different conditions or at different times. Therefore, one participant will have two or more scores (e.g. before and after an intervention). This means we need more than one column to record the relevant scores/measurements of the DV.
- For example, in the Data tab, column 1 could be the participant number, column 2 could be their first score and column 3 could be their second score. In the Variables tab you would therefore need three rows to define these variables.
- Participants could provide more than two scores in a repeated measures design.
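As a rough illustration outside Jamovi, the layout described above can be sketched in Python. The participant numbers, the column names (`score_before`, `score_after`) and the scores are all invented for this example and do not come from any course data file.

```python
# Hypothetical within-participants data: one row per participant,
# one column per measurement of the DV (all values are made up).
within_rows = [
    {"participant": 1, "score_before": 12, "score_after": 15},
    {"participant": 2, "score_before": 9, "score_after": 14},
    {"participant": 3, "score_before": 11, "score_after": 11},
]

# Each participant contributes exactly one row, however many scores they have.
participants = [row["participant"] for row in within_rows]
one_row_each = len(participants) == len(set(participants))

# The three columns correspond to the three rows you would define
# in the Variables tab.
columns = list(within_rows[0].keys())
print(columns, one_row_each)
```

A third measurement would simply be a fourth column; the number of rows would stay the same.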
In a between-participants design each participant is assigned to a single condition or group. Scores/measurements from two different people can NEVER go in the same row because one row = one person. This means that to identify which condition or group a participant belongs to, we need to assign a label to that participant, which identifies them by their group membership. This is called a grouping variable.
- For example, in the Data tab, column 1 could be the participant number, column 2 could encode which group they were in (e.g. 1 = experimental group, 2 = control group) and column 3 could be their score. In this example, we would need to set up three variables in the Variables tab. For the second row (the group they were in), you would need to define the grouping variable so that you later know what the numbers refer to. You could add this information to the description of the variable in the Variables tab.
- Participants could be categorised as belonging to more than one group, e.g. they could be classified according to whether they are studying psychology or neuroscience AND whether they feel confident about using Jamovi or not.
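The idea of a grouping variable can be sketched in Python as follows. This is only an illustration: the rows, the scores and the name `group_labels` are invented, and the dictionary simply plays the role of the description you would type into the Variables tab.

```python
# Hypothetical between-participants data: one row per participant,
# with a numeric grouping variable (all values are made up).
group_labels = {1: "Experimental group", 2: "Control group"}

between_rows = [
    {"participant": 1, "group": 1, "score": 15},
    {"participant": 2, "group": 2, "score": 10},
    {"participant": 3, "group": 1, "score": 13},
    {"participant": 4, "group": 2, "score": 12},
]

# Translate each numeric code into its label, as the variable
# description would allow a reader to do.
labelled = [(row["participant"], group_labels[row["group"]]) for row in between_rows]
print(labelled)
```

Without the label mapping, the codes 1 and 2 would be meaningless to anyone opening the file later.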
In some studies, participants belonging to different groups provide more than one score each. Such designs are called mixed designs.
- For example, we might wish to study if a particular intervention helps students feel more confident with their use of Jamovi. This study might record Jamovi-confidence before the intervention and after the intervention. But we might also want to compare the size of the impact of the intervention for students who started off with high confidence to those who started off with low confidence.
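A mixed design combines both layouts: a grouping column plus one column per repeated measurement. The sketch below uses invented confidence ratings and invented column names (`start_confidence`, `before`, `after`) to show how the change in confidence could be compared across the two groups.

```python
from statistics import mean

# Hypothetical mixed-design data: a grouping variable (starting confidence)
# plus two repeated measurements per participant (all values are made up).
mixed_rows = [
    {"participant": 1, "start_confidence": "low", "before": 2, "after": 5},
    {"participant": 2, "start_confidence": "low", "before": 3, "after": 5},
    {"participant": 3, "start_confidence": "high", "before": 6, "after": 7},
    {"participant": 4, "start_confidence": "high", "before": 7, "after": 7},
]

# Mean change (after minus before) for each group.
mean_change = {}
for group in ("low", "high"):
    changes = [
        row["after"] - row["before"]
        for row in mixed_rows
        if row["start_confidence"] == group
    ]
    mean_change[group] = mean(changes)

print(mean_change)
```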

3. Activity 1

Download RMC_NM1 Wksh1_Activity1.sav from the Canvas module for the Course Pack.
Load the data into Jamovi:
Open Jamovi and click the three horizontal lines in the top left.

Click Open, then Browse.

Navigate to where you saved the data and click to open. Note: when you download the file from the Canvas page it will automatically be saved into your ‘Downloads’ folder so look here if you are struggling to find it.

Look at the raw data (Data tab) and the variable labels (Variables tab) and answer the following questions.

Check Your Understanding - Activity 1 Questions

Question 1

What is the repeated (within-subjects) variable and how many times is it measured?

The repeated (within-subjects) variable is ____ and is measured ____ times.

Hint: Look at the description in the Variables tab

Question 2

What is the grouping (between-subjects) variable and what are the different groups?

The grouping (between-subjects) variable is ____ and the groups are ____ and ____.

Hint: Look at the description in the Data tab

4. Activity 2 - Understanding the data

Once the data are entered, we need to look at what they can tell us. Jamovi has several tools we can use to better understand the data.

Visual inspection of the data

Before doing any statistical analysis, we can look at the data, which gives us an idea of what we are dealing with, e.g. How many participants? How many variables? What is the range of scores? Are there any values that seem out of line with the rest?
When satisfied that the raw data look okay, we continue our inspection of the data by using descriptive statistics, such as frequency counts, and measures of central tendency (such as mean, median and mode) and measures of spread (such as range or standard deviation).
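To make the measures named above concrete, here is a minimal Python sketch using the standard `statistics` module. The five scores are invented for the example, and the standard deviation shown is the sample standard deviation (dividing by n - 1).

```python
from statistics import mean, median, mode, stdev

# Hypothetical scores for one variable (values are made up).
scores = [4, 7, 7, 8, 9]

summary = {
    "mean": mean(scores),                # arithmetic average
    "median": median(scores),            # middle value when sorted
    "mode": mode(scores),                # most frequent value
    "range": max(scores) - min(scores),  # highest minus lowest score
    "sd": round(stdev(scores), 2),       # sample standard deviation (n - 1)
}
print(summary)
```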
Download RMC_NM1_example_output_Explore.sav and open it in Jamovi.
Look at the raw data (Data tab) and information about the variables (Variables tab) and answer the following questions.

Activity 2 Q1 - How many participants and how many variables were there?

177 participants and 2 variables

Activity 2 Q2 - Are there any missing data or any odd-looking cells?

Yes

Activity 2 Q3 - Give an example (identify the example by the Jamovi row number)

Rows 48 and 175

Activity 2 Q4 - What did you notice about these rows of data? Describe it in a few words.

One of the rows is blank and the other has 999 in grey inside it
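The same kind of check can be sketched in Python. This is an illustration only: the scores are invented, `None` stands in for a blank cell, and treating 999 as a missing-value code is an assumption made for the example.

```python
# Hypothetical column of scores: None stands for a blank cell and 999 is
# assumed to be a missing-value code (all values are made up).
MISSING_CODE = 999
scores = [23, 31, None, 27, 999, 30]

# Jamovi numbers rows from 1, so start counting at 1 as well.
missing_rows = [
    row_number
    for row_number, value in enumerate(scores, start=1)
    if value is None or value == MISSING_CODE
]

# Only the remaining values should feed into any later statistics.
valid_scores = [v for v in scores if v is not None and v != MISSING_CODE]

print(missing_rows, len(valid_scores))
```

If the code 999 were left in as a real score, it would badly distort any mean or range calculated from the column.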

5. Activity 3a - Frequency counts

For nominal and ordinal data, the next step in getting to know the data is to determine the frequency counts for each possible score/value/option, i.e. how often each value occurs (e.g. the number of people who answered “yes” to the question “Did you take Psychology at A-Level?”).
For example, in the dataset RMC_NM1_example_output_Explore.sav, the variable about whether or not a participant took Psychology ‘A’-Level can be examined using this method.
Go to the Analyses menu and choose Exploration
From the submenu, choose Descriptives

In the descriptives box, choose the variable for which you want to calculate frequency counts and enter it into the box on the right-hand side. Click the tick box next to ‘Frequency tables’ to display the frequency table.
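For comparison, the logic behind a frequency table can be sketched in a few lines of Python. The answers below are invented, with `None` standing in for a missing response; the sketch is not meant to reproduce the Jamovi output exactly.

```python
from collections import Counter

# Hypothetical answers to "Did you take Psychology at A-Level?";
# None stands for a missing answer (all values are made up).
answers = ["yes", "no", "yes", None, "yes", "no", "yes", "no"]

# Missing answers are excluded before counting.
valid = [a for a in answers if a is not None]
counts = Counter(valid)

# Percentage of the valid cases accounted for by each answer.
percentages = {answer: round(100 * n / len(valid), 1) for answer, n in counts.items()}
most_common_answer, most_common_count = counts.most_common(1)[0]

print(len(valid), dict(counts), percentages, most_common_answer)
```

Note that the percentages are calculated from the valid cases only, which is why the number of valid cases matters.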

Answer the following questions based on the Jamovi output

Check Your Understanding - Activity 3a Questions

Question 1

How many valid cases were processed? Was this correct?

____ cases were processed.

Hint: Compare the number of cases with your answer to Activity 2, Q1

Question 2

Look at the number of times each option occurred. Which was the most common outcome? What percentage of the valid data did this account for?

The most common outcome was ____ and this accounted for ____% of the data.

Crosstabs: looking at frequencies in combinations of groups

Sometimes we want to look at combinations of categories, e.g. how many people studied Psychology at ‘A’-level, how many studied Biology at ‘A’-level, how many studied both, and how many studied neither subject.
For example, in the dataset RMC_NM1_example_output_Crosstabs.sav, the variable about whether or not a participant took Psychology ‘A’-Level and the variable about whether or not a participant took Biology ‘A’-level can be examined using this method.

Activity 3b

Using RMC_NM1_example_output_Crosstabs.sav in Jamovi:
- Go to the Analyses menu and choose Frequencies.
- From the submenu, choose Independent Samples (this is under Contingency Tables).

In the Contingency Tables box, put one categorical variable into the Rows box and the other into the Columns box. (If you have more categorical variables that you want to examine, add a new Layer for each further variable.)
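What a contingency table counts can be sketched in Python as follows. The records are invented (they are not taken from the Crosstabs data file), and each pair records whether a hypothetical participant took Psychology and whether they took Biology.

```python
from collections import Counter

# Hypothetical records of (took Psychology, took Biology) per participant
# (all values are made up).
records = [
    ("yes", "yes"), ("yes", "no"), ("no", "no"),
    ("yes", "yes"), ("no", "yes"), ("yes", "yes"),
]

# Count how often each combination of the two categories occurs.
cell_counts = Counter(records)

# Print a 2 x 2 table: rows = Psychology, columns = Biology (yes, no).
for psych in ("yes", "no"):
    row = [cell_counts[(psych, bio)] for bio in ("yes", "no")]
    print(f"Psychology {psych}: {row}")

both = cell_counts[("yes", "yes")]
most_common_combination = cell_counts.most_common(1)[0][0]
```

Each cell of the table is simply the number of participants who fall into that particular combination of categories.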

Check Your Understanding - Activity 3b Questions

Question 1

How many students studied both Psychology and Biology at A-level?

Question 2

Which was the most common combination of subjects studied?

The most common combination of subjects studied was ____.

6. Activity 4 - Exploring the data

Scale data (interval or ratio data) are not always suitable for analysis by frequency counts. Instead, we want to look at measures of central tendency (e.g. means) and measures of spread (e.g. standard deviation). A good way to examine scale data is to ‘explore’ them.
The Exploration module in Jamovi gives a range of statistics including charts to help us see whether the data are normally distributed, whether there are any outliers, and also to get an idea of the spread of the data and their typical value.
While we should always carry out these explorative analyses, we tend not to report them all.
- For example, we do not usually include histograms or stem-and-leaf plots in a results write-up. A good rule of thumb in terms of what we need to report is to think of the exploratory data analyses as being a way in which to check whether our data meet the assumptions underlying successful statistical analysis.
- If they do not meet the assumptions, we need to report how they fail to meet them and state what this means in terms of how we then analyse the data.
The main assumptions which should be met for all parametric tests are:
- The data are at least interval level of measurement.
- The data are normally distributed.
- The variance is homogeneous (similar) between different groups or conditions.
We always have to describe what the data are like in order to carry out a successful statistical analysis. Hence, we routinely report the means and standard deviations of scale data when we carry out statistical tests. (In RMC/NM1, this will be referred to as ‘reporting the descriptives’).
Using RMC_NM1_example_output_Explore.sav in Jamovi:
- Go to the Analyses menu and choose the Exploration module.
- From the submenu, choose Descriptives.

In the Descriptives box, choose the variable(s) you want to explore and enter it (them) into the Variables box on the right-hand side.
- If you want to compare the score on one variable for two or more groups, enter the variable that represents your independent or grouping variable into the Split by box.
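Splitting descriptives by a grouping variable amounts to calculating the same statistics separately for each group. The Python sketch below illustrates this with invented scores and invented group names; the standard deviation is the sample standard deviation (dividing by n - 1).

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical scores with a grouping variable (all values are made up).
rows = [
    {"group": "psychology", "score": 10},
    {"group": "psychology", "score": 14},
    {"group": "psychology", "score": 12},
    {"group": "neuroscience", "score": 8},
    {"group": "neuroscience", "score": 9},
    {"group": "neuroscience", "score": 13},
]

# Collect the scores for each group, then describe each group separately.
scores_by_group = defaultdict(list)
for row in rows:
    scores_by_group[row["group"]].append(row["score"])

descriptives = {
    group: {"n": len(s), "mean": mean(s), "sd": round(stdev(s), 2)}
    for group, s in scores_by_group.items()
}
print(descriptives)
```

Comparing the two standard deviations also gives a first, informal impression of whether the spread is similar in the two groups.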