In this scenario, we evaluate the match advantage along with the experiment platforms (on site vs. online). Based on the coefficients from the real data analysis, we estimate how many participants and items the replication studies will require. The required numbers of participants and items come from simulation-based power analysis. We use mixedpower
(Kumle, Võ, and Draschkow, 2021) to compute the achieved power. This page follows their notebook to estimate how many participants and items are required in the next study. For each simulation, the code estimates power based on the artificial data and on the smallest effect size of interest (SESOI).
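mixedpower is distributed through GitHub rather than CRAN. A minimal setup sketch (the repository path and the helper packages are our assumptions; this chunk is not part of the original analysis):
## Install and load the packages used on this page (assumed setup)
# install.packages("devtools")
devtools::install_github("DejanDraschkow/mixedpower")
library(mixedpower) # mixedpower()
library(lme4)       # lmer()
library(simr)       # makeLmer()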
The response times collected from OSWeb appeared to be longer than those collected from OpenSesame on site. The mixed-effects model below showed a weak match advantage when the platform (OpenSesame vs. OSWeb) was treated as a fixed effect. We also extracted the coefficients for the simulations from this model.
## Fit the mixed-effects model to the real data
EN_OS_model <- lmer(response_time ~ Match*opensesame_codename + (1|Subject) + (1|Target),
                    data = SP_V_en_mutate)
## mixed effect model summary
summary(EN_OS_model)
Linear mixed model fit by REML ['lmerMod']
Formula:
response_time ~ Match * opensesame_codename + (1 | Subject) +
(1 | Target)
Data: SP_V_en_mutate
REML criterion at convergence: 421667.9
Scaled residuals:
Min 1Q Median 3Q Max
-0.284 -0.122 -0.043 0.004 85.348
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 2479 49.79
Target (Intercept) 23764 154.16
Residual 30302998 5504.82
Number of obs: 21017, groups: Subject, 991; Target, 48
Fixed effects:
Estimate Std. Error t value
(Intercept) 651.541 87.368 7.457
MatchY -4.069 119.055 -0.034
opensesame_codenameosweb 559.782 109.684 5.104
MatchY:opensesame_codenameosweb 88.170 154.592 0.570
Correlation of Fixed Effects:
(Intr) MatchY opnss_
MatchY -0.686
opnssm_cdnm -0.745 0.546
MtchY:pnss_ 0.528 -0.770 -0.709
The summary showed a weak match (orientation) effect and a large difference between the study platforms.
We extracted the required columns from the tidy data. To keep the analysis pipeline consistent, participants' and targets' IDs were converted to numeric codes.
## Retrieve the required columns
partial_real_data <- SP_V_en %>% select(PSA_ID, subject_nr, Target, Match, opensesame_codename) %>%
mutate(Subject = paste0(PSA_ID,"_",subject_nr))
## Build the artificial data
artificial_data <- partial_real_data %>%
mutate(Subject = as.numeric(as.factor(Subject)),
Target = as.numeric(as.factor(Target)))
In total, we converted the trials of 991 participants into the artificial data.
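The counts can be verified against the model summary above, which reported 991 subjects and 48 targets. A quick sanity check (our own hypothetical snippet, not in the original):
## Verify the converted IDs (hypothetical check)
length(unique(artificial_data$Subject)) # expect 991
length(unique(artificial_data$Target))  # expect 48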
In the mixed-effects model fitted to the artificial data, all coefficients followed the original results except the residual standard deviation, which we set to 1/100 of the original estimate. We shrank the residual term under the assumption that researchers would replicate this study in a single data-collection workflow, where the variance in performance would be better controlled.
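The objects EN_OS_model_fef, EN_OS_model_ref, and EN_OS_model_res used below hold the fixed effects, random-effect variances, and residual standard deviation of the real-data model. Their extraction is not shown on this page; a minimal sketch with the standard lme4 accessors would be:
## Extract the real-data coefficients for the simulation
## (assumed extraction; the original chunk is not shown)
EN_OS_model_fef <- fixef(EN_OS_model)    # fixed-effect estimates
EN_OS_model_ref <- VarCorr(EN_OS_model)  # random-effect variances
EN_OS_model_res <- sigma(EN_OS_model)    # residual standard deviation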
# ------------------------------------------ #
# formula for the LMER
formula_lmer <- RT ~ Match*opensesame_codename + (1 | Subject) + (1 | Target)
# ------------------------------------------ #
# CREATE LMER
artificial_lmer <- makeLmer(formula_lmer,
                            fixef = EN_OS_model_fef, VarCorr = EN_OS_model_ref,
                            sigma = EN_OS_model_res/100, ## Minimize residual std
                            data = artificial_data)
# let's have a look!
summary(artificial_lmer)
Linear mixed model fit by REML ['lmerMod']
Formula:
RT ~ Match * opensesame_codename + (1 | Subject) + (1 | Target)
Data: artificial_data
REML criterion at convergence: 333640.9
Scaled residuals:
Min 1Q Median 3Q Max
-10.1141 -1.7823 0.0032 1.8248 12.6499
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 2479 49.79
Target (Intercept) 23764 154.16
Residual 3030 55.05
Number of obs: 21017, groups: Subject, 991; Target, 48
Fixed effects:
Estimate Std. Error t value
(Intercept) 651.541 22.398 29.089
MatchY -4.069 1.192 -3.412
opensesame_codenameosweb 559.782 3.165 176.881
MatchY:opensesame_codenameosweb 88.170 1.550 56.896
Correlation of Fixed Effects:
(Intr) MatchY opnss_
MatchY -0.027
opnssm_cdnm -0.087 0.190
MtchY:pnss_ 0.021 -0.769 -0.247
The estimated fixed effects were identical to those from the real-data analysis, but the standard errors were smaller.
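This can be confirmed directly; a one-line check of our own (not in the original):
## The artificial model reproduces the real fixed effects exactly (hypothetical check)
all.equal(fixef(artificial_lmer), EN_OS_model_fef) # should be TRUE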
How many participants will we require?
We chose five sample sizes (50, 75, 100, 125, and 150 participants) and accumulated 1,000 simulated results for each. Because the simulation takes a long time to run, the chunk below shows the code only.
model <- artificial_lmer # which model do we want to simulate power for?
data <- artificial_data # data used to fit the model
fixed_effects <- c("Match", "opensesame_codename") # all fixed effects specified in artificial_glmer
simvar_subj <- "Subject" # which random variable do we want to vary in the simulation?
simvar_item <- "Target"
# ------------------------------------------ #
# SIMULATION PARAMETERS
steps_subj <- c(50,75,100,125,150) # which sample sizes do we want to look at?
steps_item <- c(24,48,72,96,120) # which sample sizes do we want to look at?
critical_value <- 2 # which t/z value do we want to use to test for significance?
n_sim <- 1000 # how many single simulations should be used to estimate power?
# ------------------------------------------ #
# INCLUDE SESOI SIMULATION
SESOI <- EN_OS_model_fef*.15 # specify SESOI (betas shrunk to 15% of the original estimates)
# ------------------------------------------ #
# RUN SIMULATION WITH MIXEDPOWER
power_subj <- mixedpower(model = model, data = data,
fixed_effects = fixed_effects,
simvar = simvar_subj, steps = steps_subj,
critical_value = critical_value, n_sim = n_sim,
SESOI = SESOI)
The simulated power indicated that increasing the number of participants barely improves the power to detect the match effect. We would learn very little from a mega-study that collects thousands of participants.
| effect | mode | 50 | 75 | 100 | 125 | 150 |
|---|---|---|---|---|---|---|
| MatchY | databased | 0.123 | 0.141 | 0.184 | 0.221 | 0.266 |
| opensesame_codenameosweb | databased | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| MatchY:opensesame_codenameosweb | databased | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| MatchY | SESOI | 0.056 | 0.052 | 0.046 | 0.046 | 0.052 |
| opensesame_codenameosweb | SESOI | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| MatchY:opensesame_codenameosweb | SESOI | 0.442 | 0.648 | 0.740 | 0.833 | 0.922 |
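For a visual overview of these curves, a minimal sketch using the power_subj object created above (assuming mixedpower's multiplotPower() helper, which plots power across the simulated sample sizes):
## Plot simulated power across sample sizes (assumed helper usage)
multiplotPower(power_subj)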
How many items will we require?
We chose five sample sizes (24, 48, 72, 96, and 120 items) and accumulated 1,000 simulated results for each. In the above chunk, we replaced the arguments simvar and steps with simvar_item and steps_item, as sketched below. Because the simulations take a long time to run, we completed them before creating this page.
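A sketch of the item-level call (the object name power_item is our own; every other parameter is defined in the chunk above):
## RUN ITEM SIMULATION WITH MIXEDPOWER (sketch; result name power_item assumed)
power_item <- mixedpower(model = model, data = data,
                         fixed_effects = fixed_effects,
                         simvar = simvar_item, steps = steps_item,
                         critical_value = critical_value, n_sim = n_sim,
                         SESOI = SESOI)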
| effect | mode | 24 | 48 | 72 | 96 | 120 |
|---|---|---|---|---|---|---|
| MatchY | databased | 0.655 | 0.918 | 0.946 | 0.999 | 0.998 |
| opensesame_codenameosweb | databased | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| MatchY:opensesame_codenameosweb | databased | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| MatchY | SESOI | 0.053 | 0.084 | 0.038 | 0.044 | 0.047 |
| opensesame_codenameosweb | SESOI | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| MatchY:opensesame_codenameosweb | SESOI | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
The results showed that increasing the number of items in a study improves the chance of detecting the orientation effect. However, if we hypothesize that the true effect is smaller than what this project measured (see the SESOI rows), adding items helps little to detect the orientation effect.
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/SCgeeker/PSA002_report, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".