Here, pred1 is correlated r = 0.5 to the DV, and pred2 is correlated 0.0 to the DV, and pred1 and pred2 are correlated r = -0.2 to each other.
If the continuous predictors are within-subjects (e.g., dv and predictor are measured at pre- and post-test), you can set it up as shown below.
The correlation matrix can start getting tricky, so I usually map out the upper right triangle of the correlation matrix separately. Here, the dv and predictor are correlated 0.0 in the pre-test and 0.5 in the post-test. The dv is correlated 0.8 between pre- and post-test and the predictor is correlated 0.3 between pre- and post-test. There is no correlation between the pre-test predictor and the post-test dv, but I’m not sure what values are possible for the correlation between the post-test predictor and pre-test dv, so I can set that to NA and use the pos_def_limits function to determine the range of possible correlations (given the existing correlation structure). Those range from -0.08 to 0.88, so I’ll set the value to the mean of the two limits (0.4).
#       pre_pred, post_dv, post_pred
r <- c(     0.0,     0.8,        NA, # pre_dv
                     0.0,       0.3, # pre_pred
                                0.5) # post_dv

# range of values for the NA cell that keep the matrix positive definite
lim <- faux::pos_def_limits(r)
# fill only the missing cell, leaving the other correlations untouched
r[is.na(r)] <- mean(c(lim$min, lim$max))

dat <- sim_design(within = list(time = c("pre", "post"),
                                vars = c("dv", "pred")),
                  mu = list(pre_dv = 100, pre_pred = 0,
                            post_dv = 110, post_pred = 0.1),
                  sd = list(pre_dv = 10, pre_pred = 1,
                            post_dv = 10, post_pred = 1),
                  r = r, plot = FALSE)
You have to make this sort of dataset in wide format and then manually convert it to long. I prefer spread, but I’m trying to learn the new pivot functions, so I’ll use them here.
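As a rough sketch of that wide-to-long conversion, the example below uses tidyr's pivot_longer on a tiny hand-made data frame. The data frame wide_dat and its values are made up for illustration; only the column naming pattern (pre_dv, pre_pred, post_dv, post_pred) follows the design above. This is one possible approach, not necessarily the exact code I would use on the simulated data.

```r
library(tidyr)

# Toy wide data standing in for the simulated dataset (values are invented)
wide_dat <- data.frame(
  id        = c("S1", "S2"),
  pre_dv    = c(95, 104),
  pre_pred  = c(-0.3, 0.8),
  post_dv   = c(108, 112),
  post_pred = c(0.1, 0.5)
)

# Split each column name at "_": the first part becomes the `time` column,
# the second part (".value") becomes the name of an output column (dv, pred)
long_dat <- pivot_longer(
  wide_dat,
  cols      = -id,
  names_to  = c("time", ".value"),
  names_sep = "_"
)

long_dat
```

The result has one row per subject per time point, with dv and pred as separate columns, which is the shape most mixed-model functions expect.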
In this design, the DV is 10 higher for group B than group A and the correlation between the predictor and DV is 0.5 for group A and 0.0 for group B.
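A design like the one just described could be sketched as follows. The DV means (100 and 110) and the SDs are assumptions chosen only to give a 10-point group difference; the per-group correlations (0.5 for A, 0.0 for B) come from the description above.

```r
library(faux)

dat <- sim_design(
  between = list(group = c("A", "B")),
  within  = list(vars = c("dv", "pred")),
  # assumed means: group B's dv is 10 higher than group A's
  mu = list(A = c(dv = 100, pred = 0),
            B = c(dv = 110, pred = 0)),
  # assumed SDs
  sd = list(A = c(dv = 10, pred = 1),
            B = c(dv = 10, pred = 1)),
  # dv-pred correlation, set separately for each group
  r = list(A = 0.5, B = 0),
  plot = FALSE
)

head(dat)
```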
If you already have a dataset and want to add a continuous predictor, you can make a new column with a specified mean, SD and correlation to one other column.
First, let’s make a simple dataset with one between-subject factor.
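A minimal sketch of such a dataset is below. The group means (100 and 120) and the SD of 10 are assumptions picked for illustration; with no within-subject factors, sim_design names the outcome column y by default.

```r
library(faux)

# One between-subject factor with two levels; default is 100 subjects per group
dat <- sim_design(
  between = list(group = c("A", "B")),
  mu = list(A = 100, B = 120),  # assumed group means
  sd = 10,                      # assumed common SD
  plot = FALSE
)

head(dat)
```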
Now we can add a continuous predictor with rnorm_pre by specifying the vector it should be correlated with, the mean, and the SD. By default, this produces values sampled from a population with that mean, SD and r. If you set empirical to TRUE, the resulting vector will have that sample mean, SD and r.
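To make the difference between the two modes concrete, here is a small sketch. The vector y is a stand-in generated with rnorm rather than a column from a real dataset, and the target values (mean 0, SD 1, r = 0.5) are assumptions for illustration.

```r
library(faux)

set.seed(1)
y <- rnorm(200, mean = 100, sd = 10)  # stand-in for an existing column

# Default: sampled from a population with these parameters,
# so the sample statistics will vary around the targets
pred_pop <- rnorm_pre(y, mu = 0, sd = 1, r = 0.5)

# empirical = TRUE: the sample statistics match the targets
pred_emp <- rnorm_pre(y, mu = 0, sd = 1, r = 0.5, empirical = TRUE)

c(mean = mean(pred_pop), sd = sd(pred_pop), r = cor(y, pred_pop))
c(mean = mean(pred_emp), sd = sd(pred_emp), r = cor(y, pred_emp))
```

The empirical version is handy for demonstrations where you want the reported statistics to be exact; the default is usually what you want for power simulations, since it preserves sampling variability.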
If you want to set a different mean, SD or r for the between-subject groups, you can split and re-merge the dataset (or use your data wrangling skills to devise a more elegant way using purrr).
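The split-and-re-merge approach might look something like this. The data frame here is a base R stand-in with a group column and a y column, and the per-group correlations (0.5 and -0.5) are arbitrary values chosen for illustration; empirical = TRUE is used only so the result is easy to check.

```r
library(faux)
library(dplyr)

set.seed(1)
# Stand-in for an existing dataset with one between-subject factor
dat <- data.frame(
  id    = 1:200,
  group = rep(c("A", "B"), each = 100),
  y     = rnorm(200, mean = 100, sd = 10)
)

# Split by group and add a predictor with group-specific parameters
A <- dat %>%
  filter(group == "A") %>%
  mutate(pred = rnorm_pre(y, mu = 0, sd = 1, r = 0.5, empirical = TRUE))

B <- dat %>%
  filter(group == "B") %>%
  mutate(pred = rnorm_pre(y, mu = 0, sd = 1, r = -0.5, empirical = TRUE))

# Re-merge the two halves
dat_pred <- bind_rows(A, B)

# Check the correlation within each group
dat_pred %>%
  group_by(group) %>%
  summarise(r = cor(y, pred))
```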