Oct 13 – 16, 2024
MPI for Human Development
Europe/Berlin timezone

Accounting for cultural factors in PISA – An exploration using moderated non-linear factor analysis (MNLFA)

Oct 15, 2024, 10:50 AM
1h 30m

Speaker

Gillian (Rujun) Xu

Description

Background/Context:
The Programme for International Student Assessment (PISA) is a triennial global education assessment comprising tests that measure students' ability to solve academic problems in different domains, as well as surveys covering a large battery of psychological/socio-emotional constructs such as instrumental motivation and domain-specific self-concept (Schleicher & OECD, 2018). PISA is globally influential in education and national education policymaking because it allows for cross-national comparison of both academic and affective outcomes, which can be combined to shape education policy. Through these cross-national comparisons, PISA provides feedback on whether a country's education system is becoming more effective over the years, which not only reflects a country's past educational achievement but also shapes policymakers' future decisions. Many countries have reformed their policies in line with PISA's recommendations (Gorur & Wu, 2015; Yang & Fan, 2019); it is therefore crucial to ensure that PISA's cross-national comparisons are valid, which requires items to perform consistently for students from different backgrounds. In practice, however, factors such as translation across languages and differences in cultural background can lead to different interpretations of the same item and thus to biased results (Asil & Brown, 2016). Detecting and accounting for such sources of bias is therefore important and could promote the equity and effectiveness of global education policymaking.
To account for cultural differences and other sources of bias that can lead to inconsistencies in measurement, the typical approach is Measurement Invariance (MI) analysis, which examines whether a measure functions consistently regardless of participants' group membership (e.g., cultural background; Bauer, 2017). The traditional method for examining MI is Multiple Group Confirmatory Factor Analysis (MGCFA). However, MGCFA can only account for one categorical factor at a time, which falls far short of fully teasing out the potentially non-invariant factors in real-world data. Moreover, various studies exploring PISA datasets with MGCFA have failed to establish MI (Gungor & Atalay Kabasakal, 2020; Odell et al., 2021; Segeritz & Pant, 2013; Yildirim & Aybek, 2019). MNLFA, on the other hand, is more flexible: it can address multiple covariates in a single analysis and allows for continuous covariates as well as interactions between them. Not only can MNLFA detect sources of non-invariance, it can also account for them by assigning items different weights for people from different cultural backgrounds. Studies have shown that MNLFA is effective in addressing non-invariance in various psychological measures (Bauer, 2017; Pacheco-Colon et al., 2019; Rose et al., 2018).
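For context, the general MNLFA specification (a sketch following Bauer, 2017, with notation simplified here) lets the factor mean and variance, as well as each item's intercept and loading, depend on observed covariates x_i:

```latex
% Factor mean and variance moderated by covariates x_i
% (the intercept of \alpha_i is fixed at 0 for identification):
\eta_i \sim N(\alpha_i, \psi_i), \qquad
\alpha_i = \boldsymbol{\beta}^{\top}\mathbf{x}_i, \qquad
\psi_i = \psi_0 \exp\!\left(\boldsymbol{\omega}^{\top}\mathbf{x}_i\right)

% Item intercepts and loadings moderated by the same covariates;
% a nonzero \kappa_j or \delta_j indicates non-invariance for item j:
\nu_{ij} = \nu_{0j} + \boldsymbol{\kappa}_j^{\top}\mathbf{x}_i, \qquad
\lambda_{ij} = \lambda_{0j} + \boldsymbol{\delta}_j^{\top}\mathbf{x}_i
```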

Purpose/Objective/Research Question:
The purpose of this study is to investigate and account for potential non-invariance in the affective and motivational PISA science scales with MNLFA, given that the majority of previous studies have focused on psychological constructs in mathematics. Following previous studies, we control for student gender, socioeconomic status, and immigration status and analyze whether the science motivation and science self-efficacy items demonstrate measurement non-invariance. Whereas most previous studies included country as the constraint covariate, we used test language instead, so as to also reflect potential translation issues. We are also interested in identifying which items and language groups cause non-invariance. After assigning the non-invariant items different weights and establishing a full model that accounts for the non-invariance, we aim to correct the individual scores. The ultimate goal is to compare the original CFA scores with the corrected MNLFA scores to investigate how much the scores change when non-invariance in the scales is accounted for.

Methods:
Sample
The study sample of N = 130,164 was drawn from the PISA 2015 dataset. A total of 16 countries with 15 different test languages were selected. Given the strong relationship between learning motivation and test performance, the 16 countries were chosen on the basis of their science literacy performance to ensure variance in country-level science literacy outcomes, so that the sample covers the full range of country-level science scores (below 450, between 450 and 500, and above 500). Only students who completed the survey instruments, consisting of eight items on self-efficacy and four on learning motivation, were included.
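As an illustration of this banding (a minimal sketch; the country labels and mean scores below are invented, not the actual PISA 2015 results):

```python
import pandas as pd

# Hypothetical country-level mean science scores; the real values come from
# the PISA 2015 results. Countries are binned into the three performance
# bands used to ensure country-level variance in science literacy.
means = pd.Series({"Country A": 432.0, "Country B": 471.0, "Country C": 524.0})
bands = pd.cut(means, bins=[-float("inf"), 450, 500, float("inf")],
               labels=["below 450", "450-500", "above 500"])
print(bands)
```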

Measures
All of the measures used in this study are part of the publicly available PISA 2015 dataset. They include item-level data for the survey instruments as well as the variables used as moderators. The PISA 2015 dataset contains four scales that measure socioeconomic status from different perspectives. We conducted a principal component analysis of these four scales and used the factor scores as our socioeconomic status moderator. All other moderators were dummy coded.
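A minimal sketch of how such a composite could be derived, assuming the four SES scales sit in columns of a DataFrame (the column names and data here are placeholders, not the actual PISA variable names):

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the four PISA 2015 SES scales; in the
# actual analysis these columns would come from the student questionnaire file.
rng = np.random.default_rng(0)
ses = pd.DataFrame(rng.normal(size=(1000, 4)),
                   columns=["ses_scale_1", "ses_scale_2",
                            "ses_scale_3", "ses_scale_4"])

# Standardize the four scales, then use the first principal component
# as the continuous socioeconomic-status moderator.
X = StandardScaler().fit_transform(ses)
ses_moderator = PCA(n_components=1).fit_transform(X).ravel()
```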
A four-item self-reported science learning scale was used to measure students' perceived motivation to study science, with a four-point Likert scale ranging from "Strongly agree" to "Strongly disagree". The science self-efficacy scale was also self-reported and contained eight items indicating how easy it would be for students to perform various tasks in science, with a four-point Likert scale ranging from "I could do this easily" to "I couldn't do this" (OECD, 2017b: 38).

Analysis:
We followed the four-step MNLFA procedure outlined by Bauer (2017). First, we ran a baseline model with no constraint variables to obtain baseline parameters. Second, we re-ran the model with the baseline parameters as starting values and added all the constraint variables. Third, starting from the constrained parameters, we allowed each item in turn to be moderated by the covariates; that is, we examined non-invariance item by item. Finally, we combined these item-by-item analyses into a complete non-invariance model. With the full model, we then re-estimated the science motivation and self-efficacy scores to determine whether the adjusted scores differ from the original scores.
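The abstract does not specify how item-level non-invariance was tested in the third step; one common choice is a likelihood-ratio test of the moderated model against the constrained one, sketched below with made-up log-likelihood values:

```python
from scipy.stats import chi2

def lr_test(loglik_constrained: float, loglik_moderated: float, df_diff: int):
    """Likelihood-ratio test comparing the constrained model to the model in
    which one item's loading and thresholds are moderated by the covariates."""
    stat = 2.0 * (loglik_moderated - loglik_constrained)
    return stat, chi2.sf(stat, df_diff)

# Made-up log-likelihood values, purely for illustration.
stat, p = lr_test(-52310.4, -52291.7, df_diff=6)
print(f"LR = {stat:.2f}, p = {p:.4f}")
```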

Findings & Conclusions:
Our results indicated that all items in the self-efficacy and learning motivation scales had significant predictors for both loadings and thresholds; that is, all items demonstrated non-invariance to varying extents. Converting thresholds into probabilities, Figure 1 shows that students from different cultural backgrounds had very different probabilities of endorsing the same item, and this held for most items in the self-efficacy scale. In contrast, the endorsement probabilities in the learning motivation scale changed little across student backgrounds (Figure 3). Item weights, calculated from the loadings, varied across cultural backgrounds in the learning motivation scale but much less so in the self-efficacy scale (Figures 2 and 4). Comparing individual scores from the regular CFA with the MNLFA scores, we found that the two were highly correlated, but the MNLFA scores were better at differentiating students at the extremes of the score distribution (Figure 5).
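For reference, under a graded (ordinal) response model with a logistic link — an assumption here, since the abstract does not state the link function — a moderated threshold translates into an endorsement probability as:

```latex
% Probability of responding at or above category c on item j for student i,
% with moderated loading \lambda_{ij} and threshold \tau_{ijc}:
P(y_{ij} \ge c \mid \eta_i, \mathbf{x}_i)
  = \frac{1}{1 + \exp\!\left[-\left(\lambda_{ij}\,\eta_i - \tau_{ijc}\right)\right]}
```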
