The `rnorm_multi()` function makes multiple normally distributed vectors with specified parameters and relationships.

For example, the following creates a sample that has 100 observations of 3 variables, drawn from a population where A has a mean of 0 and SD of 1, while B and C have means of 20 and SDs of 5. A correlates with B and C with r = 0.5, and B and C correlate with r = 0.25.

```
dat <- rnorm_multi(n = 100,
                   mu = c(0, 20, 20),
                   sd = c(1, 5, 5),
                   r = c(0.5, 0.5, 0.25),
                   varnames = c("A", "B", "C"),
                   empirical = FALSE)
#> The number of variables (vars) was guessed from the input to be 3
```

n | var | A | B | C | mean | sd |
---|---|---|---|---|---|---|
100 | A | 1.00 | 0.49 | 0.51 | -0.04 | 1.04 |
100 | B | 0.49 | 1.00 | 0.19 | 19.95 | 4.91 |
100 | C | 0.51 | 0.19 | 1.00 | 19.64 | 4.61 |
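
To make the link between the means, SDs, and correlations concrete, here is a minimal sketch of the same population built by hand with `MASS::mvrnorm()`. It is an illustration of the general technique (covariance = SD matrix × correlation matrix × SD matrix), not a statement of how `rnorm_multi()` works internally. The names `R`, `Sigma`, and `sketch` and the seed are assumptions made for this example.

```r
library(MASS)  # provides mvrnorm(); MASS ships with standard R installations

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

mu  <- c(A = 0, B = 20, C = 20)
sds <- c(1, 5, 5)

# Correlation matrix: r(A,B) = 0.5, r(A,C) = 0.5, r(B,C) = 0.25
R <- matrix(c(1,   0.5,  0.5,
              0.5, 1,    0.25,
              0.5, 0.25, 1), nrow = 3)

# Covariance matrix: D %*% R %*% D, where D is the diagonal matrix of SDs
Sigma <- diag(sds) %*% R %*% diag(sds)

# Draw 100 observations; column names are taken from names(mu)
sketch <- as.data.frame(mvrnorm(n = 100, mu = mu, Sigma = Sigma))

# Sample statistics vary around the population values
round(cor(sketch), 2)
round(colMeans(sketch), 2)
round(sapply(sketch, sd), 2)
```

As in the table above, the sample correlations, means, and SDs from such a draw land near the requested values rather than exactly on them.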

You can specify the correlations in one of four ways:

- A single r for all pairs
- A vars by vars matrix
- A vars*vars length vector
- A vars*(vars-1)/2 length vector

If you want all the pairs to have the same correlation, just specify a single number.
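
The call that produced the table below is not shown in this section. A call along the following lines would generate a comparable sample, assuming five variables named `a` to `e`, means of 0, SDs of 1, and a common correlation of 0.3. These values are inferred from the table and from the `empirical = TRUE` example later on, so treat them as assumptions.

```r
library(faux)  # provides rnorm_multi()

# Assumed arguments, in the positional order used elsewhere in this
# document: n, vars, mu, sd, r
bvn <- rnorm_multi(100, 5, 0, 1, .3,
                   varnames = letters[1:5])

# Every pair was requested with r = .3; sample values vary around it
round(cor(bvn), 2)
```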

n | var | a | b | c | d | e | mean | sd |
---|---|---|---|---|---|---|---|---|
100 | a | 1.00 | 0.18 | 0.29 | 0.33 | 0.31 | 0.04 | 1.03 |
100 | b | 0.18 | 1.00 | 0.18 | 0.33 | 0.30 | 0.13 | 1.06 |
100 | c | 0.29 | 0.18 | 1.00 | 0.14 | 0.20 | 0.07 | 0.99 |
100 | d | 0.33 | 0.33 | 0.14 | 1.00 | 0.28 | 0.15 | 1.06 |
100 | e | 0.31 | 0.30 | 0.20 | 0.28 | 1.00 | 0.03 | 1.03 |

If you already have a correlation matrix, such as the output of `cor()`, you can specify the simulated data with that.
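
The code for this case is not shown here either. As a sketch, assuming the matrix comes from the four numeric columns of the built-in `iris` dataset (which matches the variable names in the table below) and that the means and SDs are 0 and 1:

```r
library(faux)  # provides rnorm_multi()

# Correlation matrix of the four numeric iris columns
cmat <- cor(iris[, 1:4])

# Assumed arguments: 100 observations, 4 variables, mean 0, SD 1
bvn <- rnorm_multi(100, 4, 0, 1, cmat,
                   varnames = colnames(cmat))

# Compare the simulated correlations with the source matrix
round(cor(bvn), 2)
round(cmat, 2)
```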

n | var | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | mean | sd |
---|---|---|---|---|---|---|---|
100 | Petal.Length | 0.87 | -0.58 | 1.00 | 0.96 | 0.04 | 1.03 |
100 | Petal.Width | 0.82 | -0.52 | 0.96 | 1.00 | 0.05 | 1.04 |
100 | Sepal.Length | 1.00 | -0.24 | 0.87 | 0.82 | 0.09 | 0.98 |
100 | Sepal.Width | -0.24 | 1.00 | -0.58 | -0.52 | 0.07 | 1.08 |

You can specify your correlation matrix by hand as a vars*vars length vector, which will include the correlations of 1 down the diagonal.

```
cmat <- c(1, .3, .5,
          .3, 1, 0,
          .5, 0, 1)
bvn <- rnorm_multi(100, 3, 0, 1, cmat,
                   varnames = c("first", "second", "third"))
```

n | var | first | second | third | mean | sd |
---|---|---|---|---|---|---|
100 | first | 1.00 | 0.31 | 0.48 | 0.05 | 1.02 |
100 | second | 0.31 | 1.00 | 0.01 | -0.14 | 0.86 |
100 | third | 0.48 | 0.01 | 1.00 | 0.02 | 1.12 |

You can specify your correlation matrix by hand as a vars*(vars-1)/2 length vector, skipping the diagonal and the duplicate values in the lower-left triangle.

```
rho1_2 <- .3
rho1_3 <- .5
rho1_4 <- .5
rho2_3 <- .2
rho2_4 <- 0
rho3_4 <- -.3
cmat <- c(rho1_2, rho1_3, rho1_4, rho2_3, rho2_4, rho3_4)
bvn <- rnorm_multi(100, 4, 0, 1, cmat,
                   varnames = letters[1:4])
```

n | var | a | b | c | d | mean | sd |
---|---|---|---|---|---|---|---|
100 | a | 1.00 | 0.29 | 0.61 | 0.41 | -0.10 | 1.06 |
100 | b | 0.29 | 1.00 | 0.23 | -0.03 | 0.09 | 1.14 |
100 | c | 0.61 | 0.23 | 1.00 | -0.28 | 0.08 | 1.17 |
100 | d | 0.41 | -0.03 | -0.28 | 1.00 | -0.12 | 0.97 |
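
To see which cell each of the six values fills, the short vector can be expanded into the full matrix by hand. The base R sketch below assumes the values are read row by row across the upper triangle (1-2, 1-3, 1-4, 2-3, 2-4, 3-4), which is the order implied by the variable names in the example. The names `tri` and `full` are made up for this illustration.

```r
# The six pairwise correlations, in row-wise upper-triangle order
tri <- c(.3, .5, .5, .2, 0, -.3)
vars <- 4

full <- diag(vars)  # start with 1s on the diagonal

# lower.tri() fills column by column, which visits the mirrored cells
# (2,1), (3,1), (4,1), (3,2), (4,2), (4,3) in the same order as tri
full[lower.tri(full)] <- tri

# Mirror the lower triangle into the upper one; subtracting the identity
# removes the doubled diagonal
full <- full + t(full) - diag(vars)

dimnames(full) <- list(letters[1:4], letters[1:4])
full
```

The resulting matrix is symmetric, and its off-diagonal cells line up with the correlations requested for each pair in the table above.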

If you want your samples to have the *exact* correlations, means, and SDs you entered, set `empirical` to `TRUE`.
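
The call behind the table below is not shown. A sketch consistent with that table, assuming five variables with means of 0, SDs of 1, and a common correlation of 0.3:

```r
library(faux)  # provides rnorm_multi()

# Assumed arguments, inferred from the table: n = 100, 5 variables,
# mean 0, SD 1, r = .3 for every pair
bvn <- rnorm_multi(100, 5, 0, 1, .3,
                   varnames = letters[1:5],
                   empirical = TRUE)

# With empirical = TRUE the sample statistics match the requested
# values (up to floating-point error), not just on average
round(cor(bvn), 2)
round(colMeans(bvn), 2)
round(sapply(bvn, sd), 2)
```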

n | var | a | b | c | d | e | mean | sd |
---|---|---|---|---|---|---|---|---|
100 | a | 1.0 | 0.3 | 0.3 | 0.3 | 0.3 | 0 | 1 |
100 | b | 0.3 | 1.0 | 0.3 | 0.3 | 0.3 | 0 | 1 |
100 | c | 0.3 | 0.3 | 1.0 | 0.3 | 0.3 | 0 | 1 |
100 | d | 0.3 | 0.3 | 0.3 | 1.0 | 0.3 | 0 | 1 |
100 | e | 0.3 | 0.3 | 0.3 | 0.3 | 1.0 | 0 | 1 |

Use `rnorm_pre()` to create a vector with a specified correlation to a pre-existing variable. The following code creates two vectors, `sl.5.v1` and `sl.5.v2`, each with a mean of 10, an SD of 2, and a correlation of r = 0.5 to the `Sepal.Length` column in the built-in dataset `iris`.

```
sl <- iris$Sepal.Length
sl.5.v1 <- rnorm_pre(sl, mu = 10, sd = 2, r = 0.5)
sl.5.v2 <- rnorm_pre(sl, mu = 10, sd = 2, r = 0.5)
```

n | var | sl | sl.5.v1 | sl.5.v2 | mean | sd |
---|---|---|---|---|---|---|
150 | sl | 1.00 | 0.45 | 0.49 | 5.84 | 0.83 |
150 | sl.5.v1 | 0.45 | 1.00 | 0.17 | 10.12 | 2.27 |
150 | sl.5.v2 | 0.49 | 0.17 | 1.00 | 10.05 | 2.05 |

Set `empirical = TRUE` to return a vector with the **exact** specified parameters.

```
sl.5.v1 <- rnorm_pre(sl, mu = 10, sd = 2, r = 0.5, empirical = TRUE)
sl.5.v2 <- rnorm_pre(sl, mu = 10, sd = 2, r = 0.5, empirical = TRUE)
```

n | var | sl | sl.5.v1 | sl.5.v2 | mean | sd |
---|---|---|---|---|---|---|
150 | sl | 1.0 | 0.5 | 0.5 | 5.84 | 0.83 |
150 | sl.5.v1 | 0.5 | 1.0 | 0.3 | 10.00 | 2.00 |
150 | sl.5.v2 | 0.5 | 0.3 | 1.0 | 10.00 | 2.00 |
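
For intuition about how a new vector can hit an exact correlation with an existing one, here is a minimal base R sketch. It mixes the standardized existing variable with noise that has been made orthogonal to it. The helper `exact_cor_vector()` is hypothetical and written only for this illustration; it is not claimed to be how `rnorm_pre()` is implemented.

```r
# Hypothetical helper, not part of any package
exact_cor_vector <- function(x, mu = 0, sd = 1, r = 0) {
  z <- as.vector(scale(x))                  # x standardized: mean 0, SD 1
  e <- residuals(lm(rnorm(length(x)) ~ x))  # random noise, orthogonal to x
  e <- as.vector(scale(e))                  # noise standardized: mean 0, SD 1
  y <- r * z + sqrt(1 - r^2) * e            # SD 1 and cor(x, y) = r by construction
  mu + sd * y                               # shift and rescale to the target mean and SD
}

set.seed(8675309)  # arbitrary seed, only so the sketch is reproducible
sl <- iris$Sepal.Length
sl.5 <- exact_cor_vector(sl, mu = 10, sd = 2, r = 0.5)

c(r = cor(sl, sl.5), mean = mean(sl.5), sd = sd(sl.5))
```

Because the noise is uncorrelated with `sl` in the sample, the weights `r` and `sqrt(1 - r^2)` fix the sample correlation exactly. Two vectors built this way still differ from each other, since each uses its own random noise.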