# studyStrap Package

The $$\texttt{studyStrap}$$ package implements multi-study learning algorithms such as Merging, Study-Specific Ensembling (Trained-on-Observed-Studies Ensemble), the Study Strap, and the Covariate-Matched Study Strap. It calculates and applies Covariate Profile Similarity and Stacking weights. Because models are trained within the $$\texttt{caret}$$ ecosystem, the package can flexibly apply different methods (e.g., random forests, linear regression, neural networks) as single-study learners within the multi-study ensembling framework. The package allows for multiple single-study learners per study as well as custom functions for Covariate Profile Similarity weighting and for the accept/reject step used in the Covariate-Matched Study Strap. The prediction function applies this framework without requiring the user to manually ensemble and weight model predictions.

Below we offer a few basic examples using the core functions of the package. We begin by simulating a multi-study prediction setting.

## Generate data and import packages

set.seed(1)
library(studyStrap)
# create half of training dataset from 1 distribution
X1 <- matrix(rnorm(2000), ncol = 2) # design matrix - 2 covariates
B1 <- c(5, 10, 15) # true beta coefficients
y1 <- cbind(1, X1) %*% B1

# create 2nd half of training dataset from another distribution
X2 <- matrix(rnorm(2000, 1,2), ncol = 2) # design matrix - 2 covariates
B2 <- c(10, 5, 0) # true beta coefficients
y2 <- cbind(1, X2) %*% B2

X <- rbind(X1, X2)
y <- c(y1, y2)

study <- sample.int(10, 2000, replace = TRUE) # 10 studies
data <- data.frame( Study = study, Y = y, V1 = X[,1], V2 = X[,2] )

# create target study design matrix for covariate profile similarity weighting and
# accept/reject algorithm (Covariate-matched study strap)
target <- matrix(rnorm(1000, 3, 5), ncol = 2) # design matrix
colnames(target) <- c("V1", "V2")

## Structure of Data

We have 10 studies (combined into a single dataframe), each with an outcome vector $$\mathbf{Y}$$ and two covariates $$V1$$ and $$V2$$.

head(data)
##   Study          Y         V1          V2
## 1     6  15.759938 -0.6264538  1.13496509
## 2     1  23.515411  0.1836433  1.11193185
## 3    10 -16.417951 -0.8356286 -0.87077763
## 4     6  24.113782  1.5952808  0.21073159
## 5     7   9.336012  0.3295078  0.06939565
## 6     6 -28.144417 -0.8204684 -1.66264885

## Study-Specific Ensemble (Trained-on-Observed-Studies Ensemble)

We begin with the basic ensembling setting (the Study-Specific Ensemble or Trained-on-Observed-Studies Ensemble) where we train one or more models on each study and then ensemble the models.
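To make the idea concrete, here is a minimal base-R sketch of study-specific ensembling with a simple average. It does not call the package; the toy data, the choice of `lm` as the learner, and all object names (`toy`, `fits`, `pred_mat`, `avg_pred`) are assumptions made for illustration only.

```r
# Hypothetical toy data: 3 studies, outcome Y, covariates V1 and V2
set.seed(1)
toy <- data.frame(Study = rep(1:3, each = 50),
                  V1 = rnorm(150),
                  V2 = rnorm(150))
toy$Y <- 5 + 10 * toy$V1 + 15 * toy$V2 + rnorm(150)

# Fit one single-study learner (here lm) on each study separately
fits <- lapply(split(toy, toy$Study),
               function(d) lm(Y ~ V1 + V2, data = d))

# Predict new observations with every model, then average across models
new_x <- data.frame(V1 = c(0, 1), V2 = c(0, 1))
pred_mat <- sapply(fits, predict, newdata = new_x)  # rows: observations, columns: studies
avg_pred <- rowMeans(pred_mat)                      # simple-average ensemble
```

The weighting schemes discussed in this document replace the equal weights implied by `rowMeans` with study-specific weights.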

### Study-Specific Ensemble with 1 SSL: Principal Component Regression

Here we use just one single-study learner (SSL): PCR. We assume the user has already tuned the model and specifies the tuning parameters as they would in caret. We also show a custom function used for Covariate Profile Similarity weighting, although this is optional.

Moreover, we specify a target study to allow for Covariate Profile Similarity weighting. This is also optional, and we show an example without a target study below.

# custom function
fn1 <- function(x1,x2){
return( abs( cor( colMeans(x1), colMeans(x2) )) )
}

sseMod1 <- sse(formula = Y ~.,
data = data,
target.study = target,
ssl.method = list("pcr"),
ssl.tuneGrid = list(data.frame("ncomp" = 1)),
customFNs = list(fn1) )

### Make predictions with Study-Specific Ensemble (Trained-on-Observed-Studies Ensemble)

preds <- studyStrap.predict(sseMod1, target)
head(preds)[1:3,]
FALSE             Avg standard_Stacking customFn_1
FALSE [1,]  0.1774518        -13.653802  0.1774518
FALSE [2,]  5.9290711         -4.472652  5.9290711
FALSE [3,] 30.8083742         35.331259 30.8083742

The predictions are a matrix here since we have default Covariate Profile Similarity measures, stacking weights and the custom weighting function we used. Notice that the custom weights are identical to those of the “Mean Corr” weights by design. The first column is a simple average of the predictions from all of the models.
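In the output above, the customFn_1 column also coincides with Avg. One likely explanation is specific to this toy example: with only two covariates, `colMeans` returns vectors of length two, and the correlation between any two non-constant vectors of length two is exactly 1 or -1. The base-R sketch below illustrates this on hypothetical data; normalizing the similarities so that the weights sum to one is an assumption about how weights are formed, not something confirmed from the package source.

```r
set.seed(1)
fn1 <- function(x1, x2) abs(cor(colMeans(x1), colMeans(x2)))

# Hypothetical target and three studies with very different covariate distributions
target_x <- matrix(rnorm(200, 3, 5), ncol = 2)
studies  <- list(matrix(rnorm(200), ncol = 2),
                 matrix(rnorm(200, 1, 2), ncol = 2),
                 matrix(rnorm(200, -4, 1), ncol = 2))

# Similarity of each study to the target: always 1 with two covariates
sims <- sapply(studies, fn1, x2 = target_x)

# Assumed normalization: weights sum to one, so all studies get equal weight
weights <- sims / sum(sims)
```

With three or more covariates the similarities would generally differ across studies, and the weighted predictions would no longer match the simple average.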

### Study-Specific Ensemble (Trained-on-Observed-Studies Ensemble) with 2 SSLs

As above, we run the same algorithm, but for each study we now train two models: a linear regression and a PCR.

# custom function
fn1 <- function(x1,x2){
return( abs( cor( colMeans(x1), colMeans(x2) )) )
}

sseMod2 <- sse(formula = Y ~.,
data = data,
target.study = target,
ssl.method = list("lm","pcr"),
ssl.tuneGrid = list(NA, data.frame("ncomp" = 2)),
customFNs = list(fn1) )

### Make predictions with Study-Specific Ensemble (Trained-on-Observed-Studies Ensemble) with 2 SSLs

Making predictions works identically and produces output with the same structure. The function automatically accounts for the fact that each study has both a linear regression model and a PCR model. Covariate Profile Similarity weights handle this by giving equal weight to the two models trained on the same data.

preds <- studyStrap.predict(sseMod2, target)
head(preds)[1:3,]
FALSE             Avg standard_Stacking customFn_1
FALSE [1,] -13.995010        -14.066971 -13.995010
FALSE [2,]  -4.730832         -4.792895  -4.730832
FALSE [3,]  35.433599         35.415095  35.433599

Now let us assume we do not have a target study to generate Covariate Profile Similarity weights.

sseMod3 <- sse(formula = Y ~.,
data = data,
ssl.method = list("pcr"),
ssl.tuneGrid = list(data.frame("ncomp" = 1)),
sim.mets = FALSE)

preds <- studyStrap.predict(sseMod3, target)
head(preds)[1:3,]
FALSE             Avg standard_Stacking
FALSE [1,]  0.1774518        -13.653802
FALSE [2,]  5.9290711         -4.472652
FALSE [3,] 30.8083742         35.331259

Since we do not have a target study we cannot generate Covariate Profile Similarity weights and predictions are only for stacking and simple averaging.
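For readers unfamiliar with stacking, the sketch below shows one common formulation: regress the outcome on the per-model predictions with an intercept and non-negative model weights. This is an illustration under stated assumptions, not the package's implementation; the data, the use of `optim` with box constraints, and the names `P`, `loss`, and `stack_coefs` are all hypothetical.

```r
set.seed(1)
n <- 200
y <- rnorm(n)

# Hypothetical predictions from three models: two informative, one pure noise
P <- cbind(m1 = y + rnorm(n, sd = 0.5),
           m2 = y + rnorm(n, sd = 1),
           m3 = rnorm(n))

# Least-squares loss for intercept b[1] and model weights b[-1]
loss <- function(b) sum((y - b[1] - P %*% b[-1])^2)

# Box constraints: intercept unconstrained, model weights >= 0
fit <- optim(par = c(0, rep(1 / 3, 3)), fn = loss, method = "L-BFGS-B",
             lower = c(-Inf, 0, 0, 0))
stack_coefs <- fit$par

# Stacked prediction: intercept plus weighted sum of model predictions
stacked_pred <- drop(stack_coefs[1] + P %*% stack_coefs[-1])
```

The more accurate model receives the larger weight, and an uninformative model is pushed toward a weight of zero.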

Now let us move on to another standard multi-study learning method, Merging:

## Merged Approach

### Merged with 1 SSL and 2 SSLs

# 1 SSL
mrgMod1 <- merged(formula = Y ~.,
data = data,
ssl.method = list("pcr"),
ssl.tuneGrid = list( data.frame("ncomp" = 2))
)

# 2 SSLs
mrgMod2 <- merged(formula = Y ~.,
data = data,
ssl.method = list("lm","pcr"),
ssl.tuneGrid = list(NA, data.frame("ncomp" = 2))
)

Merged models produce a single vector of predictions, listed under Avg; the stacking column is NA.

preds <- studyStrap.predict(mrgMod2, target)
head(preds)
FALSE             Avg NA_Stacking
FALSE [1,] -14.066971          NA
FALSE [2,]  -4.792895          NA
FALSE [3,]  35.415095          NA
FALSE [4,]  53.204946          NA
FALSE [5,]  60.384128          NA
FALSE [6,]  29.725637          NA
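Conceptually, merging pools the rows of all studies and fits a single model per learner. A minimal base-R sketch with `lm` on hypothetical toy data (object names are illustrative and not part of the package):

```r
set.seed(1)
toy <- data.frame(Study = rep(1:3, each = 50),
                  V1 = rnorm(150),
                  V2 = rnorm(150))
toy$Y <- 5 + 10 * toy$V1 + 15 * toy$V2 + rnorm(150)

# Merging: drop the study label and fit one model on the pooled rows
merged_fit <- lm(Y ~ V1 + V2, data = toy[, names(toy) != "Study"])

# A single model means a single vector of predictions
new_x <- data.frame(V1 = c(0, 1), V2 = c(0, 1))
merged_pred <- predict(merged_fit, newdata = new_x)
```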

## Study Strap

We now demonstrate the use of the Study Strap with 10 straps and all available weighting schemes.
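As a rough illustration of what a single study strap might look like, the sketch below draws a bag of studies with replacement and then subsamples rows from each drawn study in proportion to how often it appears in the bag. This sampling rule is an assumption made for illustration, and `draw_strap` is a hypothetical helper, not a function from the package.

```r
set.seed(1)
toy <- data.frame(Study = sample.int(4, 200, replace = TRUE),
                  V1 = rnorm(200),
                  V2 = rnorm(200))

# Hypothetical helper: row indices forming one study strap (pseudo-study)
draw_strap <- function(study, bag.size) {
  ids <- unique(study)
  # Draw bag.size studies with replacement
  bag <- ids[sample.int(length(ids), bag.size, replace = TRUE)]
  rows <- lapply(ids, function(k) {
    idx <- which(study == k)
    # Share of this study's rows proportional to its count in the bag
    m <- round(length(idx) * sum(bag == k) / bag.size)
    idx[sample.int(length(idx), m)]  # subsample without replacement
  })
  sort(unlist(rows))
}

strap_rows <- draw_strap(toy$Study, bag.size = 4)
pseudo_study <- toy[strap_rows, ]
```

Repeating this draw gives one pseudo-study per strap, and a single-study learner is then trained on each.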

### Study Strap with 1 and 2 SSLs

# custom function
fn1 <- function(x1,x2){
return( abs( cor( colMeans(x1), colMeans(x2) )) )
}

# 1 SSL
ssMod1 <- ss(formula = Y ~.,
data = data,
target.study = target,
bag.size = length(unique(data$Study)),
straps = 10,
stack = "standard",
sim.covs = NA,
ssl.method = list("pcr"),
ssl.tuneGrid = list(data.frame("ncomp" = 2)),
sim.mets = TRUE,
model = TRUE,
customFNs = list( fn1 ) )

# 2 SSLs
ssMod2 <- ss(formula = Y ~.,
data = data,
target.study = target,
bag.size = length(unique(data$Study)),
straps = 10,
stack = "standard",
sim.covs = NA,
ssl.method = list("lm","pcr"),
ssl.tuneGrid = list(NA, data.frame("ncomp" = 2)),
sim.mets = TRUE,
model = TRUE,
customFNs = list( fn1 ) )

Predictions have the same structure as the Study-Specific Ensemble.

preds <- studyStrap.predict(ssMod2, target)
head(preds)[1:3,]
FALSE             Avg standard_Stacking Matcor Diag Matcor Sum Matcor Sum Abs
FALSE [1,] -14.627272        -14.066971  -14.284018 -14.452172      -14.33067
FALSE [2,]  -5.138826         -4.792895   -4.828864  -4.823478       -4.86358
FALSE [3,]  35.998061         35.415095   36.161704  36.918680       36.17842
FALSE           |rho|     rho sq  UV rho sq  UV cov sq     UV rho     UV cov
FALSE [1,] -14.211107 -14.099586 -14.740194 -14.822641 -14.956572 -15.330968
FALSE [2,]  -4.776769  -4.663318  -5.269537  -5.334426  -5.438017  -5.734821
FALSE [3,]  36.123186  36.243555  35.789701  35.801652  35.830304  35.874078
FALSE      diag UV rho sq diag UV cov diag UV cov sq  Mean Corr        SMI         RV
FALSE [1,]     -14.740194  -14.828585     -14.638752 -14.627272 -14.112789 -14.197656
FALSE [2,]      -5.269537   -5.320772      -5.189034  -5.138826  -4.673056  -4.735643
FALSE [3,]      35.789701   35.900237      35.777924  35.998061  36.248897  36.283369
FALSE             RV2      RVadj        PSI         r1         r2         r3
FALSE [1,] -15.360730 -15.353452 -14.239038 -13.350311 -13.117915 -13.273435
FALSE [2,]  -5.662882  -5.660354  -4.799337  -3.997512  -3.938409  -3.938728
FALSE [3,]  36.383972  36.366180  36.124056  36.543014  35.852250  36.522881
FALSE              r4        GCD customFn_1
FALSE [1,] -13.018795 -14.112789 -14.627272
FALSE [2,]  -3.861005  -4.673056  -5.138826
FALSE [3,]  35.834974  36.248897  35.998061

Now let’s say we do not want to use the default similarity measures. We can turn these off (sim.mets = FALSE), which significantly reduces the time it takes to fit the models and alters the structure of the prediction output; the custom function is still applied. We also specify the bag size. It defaults to the number of training studies, but it should be tuned for optimal performance.

# custom function
fn1 <- function(x1,x2){
return( abs( cor( colMeans(x1), colMeans(x2) )) )
}

ssMod3 <- ss(formula = Y ~.,
data = data,
target.study = target,
bag.size = length(unique(data$Study)),
straps = 10,
sim.covs = NA,
ssl.method = list("pcr"),
ssl.tuneGrid = list(data.frame("ncomp" = 2)),
sim.mets = FALSE,
customFNs = list( fn1 ) )

preds <- studyStrap.predict(ssMod3, target)
head(preds)[1:3,]
FALSE            Avg standard_Stacking customFn_1
FALSE [1,] -12.97376        -14.066971  -12.97376
FALSE [2,]  -4.03119         -4.792895   -4.03119
FALSE [3,]  34.73606         35.415095   34.73606

Now, let’s deal with the case when we do not have a target study at all. We can simply remove this argument and our predictions will be limited to a simple average and stacking weights.

ssMod4 <- ss(formula = Y ~.,
data = data,
bag.size = length(unique(data$Study)),
straps = 10,
sim.covs = NA,
ssl.method = list("pcr"),
ssl.tuneGrid = list(data.frame("ncomp" = 2)),
sim.mets = FALSE)

preds <- studyStrap.predict(ssMod4, target)
head(preds)[1:3,]
FALSE            Avg standard_Stacking
FALSE [1,] -12.27087        -14.066971
FALSE [2,]  -3.45521         -4.792895
FALSE [3,]  34.75931         35.415095

## Covariate-Matched Study Strap (Accept/Reject)

Now we turn to the accept/reject algorithm. Here we must specify a target study. We also need to specify the number of paths (we recommend 5) and the convergence limit (the number of consecutive rejected study straps required to meet the convergence criterion). The right convergence limit depends on computational cost, but we recommend at least 1000, and the more the better. Here we choose a low number for demonstration purposes. We can supply a custom function (sim.fn) for the accept/reject step or use the default of $$|cor(\bar{x}^{(r)}, \bar{x}_{target}) |$$. Similarly, we can provide custom functions for weighting as above. We also specify the maximum total number of study straps allowed, in case many are accepted without convergence. We recommend 50 straps per path to be safe, but this is application specific and depends on the distribution of the covariates.

We can use 1 SSL or multiple SSLs as above. As in the Study Strap algorithm, we specify the bag size. It defaults to the number of training studies, but it should be tuned for optimal performance.
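The roles of the convergence limit and the maximum number of straps can be illustrated with a schematic accept/reject loop for a single path. This is one plausible reading of the description above, not the package's code: the acceptance rule (accept a candidate only if it is more similar to the target than the last accepted one), the helper `ar_path`, and the similarity function `neg_dist` are all assumptions. A negated squared distance is used instead of the default correlation because, with two covariates, the correlation of column means cannot distinguish candidates.

```r
# Hypothetical sketch of one accept/reject path
ar_path <- function(draw_candidate, sim_fn, target, converge.lim, max.straps) {
  accepted <- list()
  best <- -Inf
  rejects <- 0
  # Stop after converge.lim consecutive rejections or max.straps acceptances
  while (rejects < converge.lim && length(accepted) < max.straps) {
    cand <- draw_candidate()
    s <- sim_fn(cand, target)
    if (s > best) {                 # assumed rule: accept only if similarity improves
      accepted[[length(accepted) + 1]] <- cand
      best <- s
      rejects <- 0                  # an acceptance resets the rejection counter
    } else {
      rejects <- rejects + 1
    }
  }
  accepted
}

set.seed(1)
X <- matrix(rnorm(400), ncol = 2)                # pooled training covariates (toy)
target_x <- matrix(rnorm(100, 0.5, 1), ncol = 2) # target covariates (toy)

# Larger values mean the candidate's covariate means are closer to the target's
neg_dist <- function(x1, x2) -sum((colMeans(x1) - colMeans(x2))^2)

# Stand-in for drawing a study strap: a random subset of rows
draw_candidate <- function() X[sample.int(nrow(X), 50), , drop = FALSE]

path <- ar_path(draw_candidate, neg_dist, target_x,
                converge.lim = 5, max.straps = 50)
path_sims <- sapply(path, neg_dist, x2 = target_x)
```

A larger convergence limit makes the path search longer before stopping, which is why it trades off against computational cost.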

### Covariate-Matched Study Strap with 1 and 2 SSLs

# 1 SSL
arMod1 <-  cmss(formula = Y ~.,
data = data,
target.study = target,
converge.lim = 2,
bag.size = length(unique(data$Study)),
max.straps = 50,
paths = 2,
ssl.method = list("pcr"),
ssl.tuneGrid = list(data.frame("ncomp" = 2))
)

# 2 SSLs
arMod2 <-  cmss(formula = Y ~.,
data = data,
target.study = target,
converge.lim = 2,
bag.size = length(unique(data$Study)),
max.straps = 50,
paths = 2,
ssl.method = list("lm","pcr"),
ssl.tuneGrid = list(NA, data.frame("ncomp" = 2))
)

preds <- studyStrap.predict(arMod2, target)
head(preds)[1:3,]
FALSE             Avg standard_Stacking Matcor Diag Matcor Sum Matcor Sum Abs
FALSE [1,] -14.040566        -13.760921  -13.920651 -13.889937     -14.145602
FALSE [2,]  -4.813971         -4.555755   -4.730099  -4.683699      -4.897403
FALSE [3,]  35.192579         35.352613   35.119938  35.232755      35.203636
FALSE           |rho|     rho sq  UV rho sq  UV cov sq     UV rho     UV cov
FALSE [1,] -14.169412 -14.249983 -14.442813 -14.443724 -14.407886 -13.747953
FALSE [2,]  -4.914306  -4.980559  -5.146644  -5.147101  -5.113404  -4.595961
FALSE [3,]  35.216743  35.213340  35.165515  35.167016  35.190825  35.085728
FALSE      diag UV rho sq diag UV cov diag UV cov sq  Mean Corr        SMI         RV
FALSE [1,]     -14.442813  -13.719055     -14.001783 -14.040566 -14.253161 -14.245844
FALSE [2,]      -5.146644   -4.615958      -4.836169  -4.813971  -4.984143  -4.974931
FALSE [3,]      35.165515   34.856149      34.908938  35.192579  35.208081  35.225245
FALSE             RV2      RVadj        PSI         r1         r2         r3
FALSE [1,] -14.882410 -14.652337 -14.160104 -13.704759 -13.743827 -13.709580
FALSE [2,]  -5.425419  -5.242989  -4.905941  -4.572621  -4.592698  -4.573953
FALSE [3,]  35.583131  35.557210  35.220889  35.023320  35.085220  35.037006
FALSE              r4        GCD
FALSE [1,] -13.744843 -14.253161
FALSE [2,]  -4.591946  -4.984143
FALSE [3,]  35.093549  35.208081

Now let us use the accept/reject step based upon our own custom function (sim.fn). We turn off the default Covariate Profile Similarity weights to speed up runtime (sim.mets = FALSE) but provide 2 of our own custom functions for Covariate Profile Similarity weights.

### Covariate-Matched Study Strap with Custom Function for Accept/Reject Step

# 1 SSL

# custom function for CPS
fn1 <- function(x1,x2){
return( abs( cor( colMeans(x1), colMeans(x2) )) )
}

# custom function for Accept/Reject step criteria
fn2 <- function(x1,x2){
return( sum ( ( colMeans(x1) - colMeans(x2) )^2 ) )
}

arMod3 <-  cmss(formula = Y ~.,
data = data,
target.study = target,
converge.lim = 2,
bag.size = length(unique(data$Study)),
max.straps = 50,
paths = 2,
ssl.method = list("pcr"),
ssl.tuneGrid = list(data.frame("ncomp" = 2)),
sim.mets = FALSE,
sim.fn = fn2,
customFNs = list( fn1, fn2 ) )

preds <- studyStrap.predict(arMod3, target)
head(preds)[1:3,]
FALSE             Avg standard_Stacking customFn_1 customFn_2
FALSE [1,] -13.963306        -14.066971 -13.963306 -14.026422
FALSE [2,]  -4.777185         -4.792895  -4.777185  -4.816016
FALSE [3,]  35.052555         35.415095  35.052555  35.118968

## Model Object Structure

Now that we understand how to fit models, let us take a second to explore the model object that the package produces. The model objects are S3 classes. That is, they are functionally a list.

sseMod1
## $models
## $models[[1]]
## $models[[1]][[1]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[2]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[3]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[4]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[5]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[6]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[7]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[8]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[9]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## $models[[1]][[10]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
##
##
## $data
## NULL
##
## [[3]]
## NULL
##
## [[4]]
## list()
##
## $dataInfo
## $dataInfo$studyNames
##  [1]  6  1 10  7  4  8  3  2  9  5
##
## $dataInfo$sampleSizes
##  [1] 2000 2000 2000 2000 2000 2000 2000 2000 2000 2000
##
##
## $modelInfo
## $modelInfo$sampling
## [1] "studySpecificEnsemble"
##
## $modelInfo$numStraps
## [1] 10
##
## $modelInfo$SSL
## $modelInfo$SSL[[1]]
## [1] "pcr"
##
##
## $modelInfo$ssl.tuneGrid
## NULL
##
## $modelInfo$numPaths
## [1] NA
##
## $modelInfo$convg.vec
## NULL
##
## $modelInfo$convgCritera
## [1] NA
##
## $modelInfo$meanSamp
## [1] NA
##
## $modelInfo$stack.type
## [1] "standard"
##
## $modelInfo$custFNs
## $modelInfo$custFNs[[1]]
## function(x1,x2){
##     return( abs( cor( colMeans(x1), colMeans(x2) )) )
##     }
## <bytecode: 0x7fb8361996e0>
##
##
## $modelInfo$bagSize
## [1] 1
##
##
## [[7]]
## NULL
##
## $simMat
##       customFn_1
##  [1,]        0.1
##  [2,]        0.1
##  [3,]        0.1
##  [4,]        0.1
##  [5,]        0.1
##  [6,]        0.1
##  [7,]        0.1
##  [8,]        0.1
##  [9,]        0.1
## [10,]        0.1
##
## $stack.coefs
##  [1] 0.00000000 0.00000000 0.00000000 0.00000000 0.84196123 0.00000000
##  [7] 0.00000000 0.09382132 0.00000000 0.00000000 0.00000000
##
## $strapRows
## $strapRows$length
##   [1]    1    4    6   10   22   43   49   53   55   87  100  106  108  128  140
##  [16]  149  154  176  180  190  191  203  214  250  253  259  262  273  290  308
##  [31]  315  321  329  332  344  355  368  374  396  399  409  414  415  419  429
##  [46]  433  443  444  464  466  473  485  491  492  495  503  534  558  561  598
##  [61]  605  619  633  642  646  650  656  667  671  684  697  713  740  752  753
##  [76]  776  791  810  811  825  842  845  848  857  863  866  921  936  948  974
##  [91]  979  993  996 1014 1023 1038 1041 1048 1052 1064 1065 1067 1080 1086 1097
## [106] 1120 1130 1137 1159 1193 1199 1208 1214 1220 1225 1232 1234 1242 1250 1253
## [121] 1275 1286 1304 1305 1358 1360 1362 1368 1378 1406 1428 1430 1433 1453 1469
## [136] 1488 1508 1509 1519 1537 1539 1553 1573 1582 1587 1593 1600 1602 1625 1643
## [151] 1649 1651 1685 1696 1707 1720 1724 1737 1756 1779 1818 1836 1878 1887 1888
## [166] 1913 1917 1919 1934 1946 1956 1969 1974 1975 1976 1978
##
## $strapRows[[2]]
##   [1]    2   13   21   29   33   34   44   47   52   57   59   61   76  104  109
##  [16]  118  121  123  135  137  158  179  200  217  233  237  256  274  279  280
##  [31]  288  314  317  324  328  333  349  354  366  370  375  376  389  397  398
##  [46]  400  436  452  467  493  508  516  520  523  535  549  554  567  584  586
##  [61]  592  595  604  625  661  665  685  687  692  696  701  703  722  734  745
##  [76]  748  757  764  768  778  781  789  815  818  833  838  840  876  888  889
##  [91]  911  916  917  919  930  931  951  953  957  964  969  981  985  999 1000
## [106] 1007 1021 1037 1049 1070 1071 1077 1091 1092 1096 1099 1100 1105 1124 1125
## [121] 1148 1162 1165 1167 1170 1182 1186 1198 1200 1210 1222 1244 1247 1251 1254
## [136] 1264 1280 1326 1331 1347 1353 1359 1366 1367 1372 1376 1391 1398 1400 1405
## [151] 1414 1422 1429 1441 1449 1451 1459 1474 1483 1484 1493 1498 1500 1501 1504
## [166] 1510 1533 1556 1561 1571 1576 1577 1588 1603 1627 1641 1642 1670 1671 1682
## [181] 1683 1689 1691 1706 1709 1712 1715 1717 1722 1740 1759 1762 1787 1792 1809
## [196] 1813 1821 1826 1835 1843 1845 1846 1852 1855 1866 1869 1891 1898 1900 1905
## [211] 1943 1945 1948 1953 1967 1968 1998 1999
##
## $strapRows[[3]]
##   [1]    3   11   16   18   35   51   71   75   92   95   99  116  117  127  133
##  [16]  165  168  170  178  215  236  245  248  254  272  316  320  323  331  360
##  [31]  363  379  388  431  437  446  454  461  471  483  515  518  521  530  551
##  [46]  559  560  563  575  577  581  589  623  629  635  652  653  664  678  682
##  [61]  702  712  731  732  733  756  761  779  828  852  855  868  874  904  906
##  [76]  908  925  926  937  939  942  965  991  994 1001 1002 1024 1028 1029 1030
##  [91] 1036 1047 1055 1076 1082 1085 1088 1108 1112 1134 1142 1161 1180 1212 1213
## [106] 1233 1241 1246 1252 1255 1262 1283 1292 1332 1343 1351 1357 1375 1384 1403
## [121] 1418 1423 1432 1447 1456 1495 1514 1518 1522 1525 1527 1530 1542 1549 1552
## [136] 1575 1580 1585 1586 1598 1610 1619 1621 1628 1653 1657 1666 1693 1708 1728
## [151] 1738 1751 1757 1764 1766 1767 1775 1776 1777 1780 1782 1786 1808 1825 1830
## [166] 1834 1838 1851 1854 1858 1868 1870 1871 1880 1894 1901 1910 1918 1922 1923
## [181] 1925 1933 1937 1942 1957 1960 1965 1979 1980 1991
##
## $strapRows[[4]]
##   [1]    5   41   42   62   67   68   74  102  105  111  113  120  124  148  157
##  [16]  160  166  171  184  185  195  207  209  211  213  230  232  234  252  255
##  [31]  265  289  291  306  325  338  345  353  356  382  387  391  392  394  402
##  [46]  405  418  430  432  435  440  445  449  450  469  472  476  477  480  489
##  [61]  538  540  541  544  553  569  585  639  647  648  663  666  670  672  673
##  [76]  695  708  715  718  723  728  738  741  762  772  775  777  783  792  794
##  [91]  795  797  800  803  808  823  824  841  860  865  872  885  890  894  899
## [106]  905  910  912  918  924  933  945  977  984 1013 1015 1017 1019 1022 1032
## [121] 1033 1035 1039 1051 1059 1069 1079 1087 1113 1135 1143 1145 1158 1189 1201
## [136] 1205 1209 1219 1223 1231 1235 1267 1268 1272 1306 1307 1316 1320 1325 1330
## [151] 1335 1365 1385 1402 1411 1412 1439 1448 1452 1457 1468 1478 1487 1489 1497
## [166] 1499 1505 1507 1538 1543 1566 1569 1570 1599 1601 1611 1624 1631 1634 1636
## [181] 1645 1648 1654 1655 1660 1672 1694 1700 1704 1729 1731 1749 1771 1781 1785
## [196] 1804 1806 1814 1817 1837 1847 1856 1862 1865 1875 1882 1885 1897 1899 1911
## [211] 1926 1931 1947 1950 1951 1963 1977 1981 1986 1994 2000
##
## $strapRows[[5]]
##   [1]    7   23   26   31   50   54   66   72   73   83   90   91   93   98  115
##  [16]  136  142  159  162  163  169  187  198  205  218  226  227  239  257  261
##  [31]  263  266  281  285  301  327  336  343  347  352  365  377  381  383  393
##  [46]  406  412  413  427  434  439  478  484  488  497  502  504  510  525  533
##  [61]  555  571  579  580  583  596  597  601  607  608  614  618  624  626  627
##  [76]  630  640  660  674  679  691  693  694  707  709  714  729  739  742  750
##  [91]  755  769  774  782  784  786  788  796  831  846  850  877  882  896  900
## [106]  909  920  922  932  941  943  950  954  961  972  992 1016 1026 1034 1040
## [121] 1043 1058 1068 1101 1109 1110 1117 1121 1131 1147 1173 1175 1185 1204 1207
## [136] 1211 1216 1227 1228 1257 1273 1295 1296 1301 1311 1312 1322 1338 1381 1382
## [151] 1386 1394 1396 1401 1413 1417 1435 1467 1475 1476 1494 1502 1512 1520 1521
## [166] 1528 1534 1544 1545 1546 1594 1617 1626 1632 1635 1647 1650 1652 1658 1687
## [181] 1697 1711 1714 1719 1733 1746 1753 1774 1788 1791 1795 1805 1857 1859 1883
## [196] 1912 1914 1915 1916 1924 1927 1938 1939 1966 1970 1972 1983 1984 1996
##
## $strapRows[[6]]
##   [1]    8   12   40   65   69   89   96  119  131  161  173  193  199  223  224
##  [16]  225  228  240  244  283  293  297  300  304  311  348  373  384  395  403
##  [31]  428  438  441  442  451  456  459  462  470  475  499  511  528  536  543
##  [46]  548  556  557  582  599  600  606  609  610  613  655  669  686  705  743
##  [61]  758  759  780  787  802  805  809  812  814  820  834  839  843  851  853
##  [76]  856  870  887  895  914  923  927  928  947  952  968  975  976  990  997
##  [91] 1003 1004 1009 1020 1027 1031 1042 1054 1074 1083 1093 1094 1104 1166 1169
## [106] 1172 1179 1183 1188 1192 1243 1245 1256 1259 1261 1266 1269 1271 1278 1288
## [121] 1294 1297 1302 1308 1310 1323 1324 1349 1352 1355 1369 1374 1380 1388 1389
## [136] 1392 1409 1416 1420 1424 1426 1437 1443 1444 1446 1479 1513 1515 1532 1540
## [151] 1541 1554 1558 1560 1574 1589 1604 1608 1612 1620 1623 1638 1639 1640 1644
## [166] 1662 1676 1681 1692 1702 1703 1705 1725 1734 1736 1739 1754 1755 1760 1769
## [181] 1810 1816 1819 1822 1831 1839 1853 1861 1867 1874 1892 1902 1909 1940 1941
## [196] 1955 1971 1995
##
## $strapRows[[7]]
##   [1]    9   27   28   45   46   56   82   86  103  126  129  138  147  156  164
##  [16]  172  189  202  206  208  210  220  221  229  235  258  268  270  275  276
##  [31]  307  310  313  318  335  337  341  390  404  407  424  425  460  463  481
##  [46]  490  501  522  524  532  539  542  588  591  615  616  621  628  631  662
##  [61]  706  716  717  746  766  771  785  790  798  804  827  847  849  858  873
##  [76]  875  884  902  944  946  960  967  995  998 1006 1018 1025 1046 1050 1063
##  [91] 1098 1102 1114 1122 1127 1129 1139 1141 1149 1153 1155 1156 1160 1163 1184
## [106] 1190 1191 1195 1218 1226 1239 1240 1263 1265 1274 1276 1285 1298 1303 1313
## [121] 1328 1333 1336 1337 1340 1345 1346 1356 1371 1373 1379 1390 1395 1404 1434
## [136] 1442 1460 1461 1463 1466 1472 1473 1481 1503 1516 1517 1548 1564 1567 1578
## [151] 1579 1595 1606 1607 1618 1622 1675 1679 1695 1713 1718 1721 1723 1726 1745
## [166] 1750 1758 1793 1820 1829 1840 1842 1848 1864 1872 1873 1877 1881 1893 1895
## [181] 1921 1928 1932 1952 1987 1989 1993 1997
##
## $strapRows[[8]]
##   [1]   14   24   36   48   60   70   79   80   85   88   97  101  112  122  146
##  [16]  174  181  182  183  192  194  196  222  238  251  271  277  278  286  292
##  [31]  298  299  305  350  357  358  359  367  372  378  380  411  417  420  423
##  [46]  448  482  486  487  527  531  547  550  552  572  573  593  603  612  620
##  [61]  632  641  644  651  654  658  668  675  677  681  683  688  689  720  724
##  [76]  727  736  754  760  767  770  806  816  821  826  832  862  869  883  901
##  [91]  903  929  934  935  955  971  978  980  982  988 1010 1011 1044 1045 1056
## [106] 1073 1081 1089 1090 1107 1111 1115 1118 1119 1126 1128 1132 1151 1152 1168
## [121] 1177 1202 1203 1279 1281 1289 1291 1293 1300 1309 1319 1327 1329 1341 1348
## [136] 1350 1354 1364 1399 1415 1419 1421 1440 1458 1471 1482 1485 1486 1490 1491
## [151] 1511 1531 1535 1547 1551 1559 1565 1581 1592 1616 1630 1633 1659 1667 1668
## [166] 1669 1684 1698 1744 1761 1790 1815 1827 1860 1886 1889 1908 1935 1954 1959
## [181] 1962 1982 1990 1992
##
## $strapRows[[9]]
##   [1]   15   17   19   20   30   37   58   63   64   77   78   81   94  107  114
##  [16]  130  141  143  144  145  150  151  152  153  167  177  197  201  204  219
##  [31]  231  243  247  249  260  282  287  294  295  296  302  303  309  312  319
##  [46]  326  330  342  346  362  371  401  410  416  426  447  465  474  479  494
##  [61]  498  500  505  506  507  509  512  517  519  526  529  537  546  562  564
##  [76]  565  568  587  590  594  602  617  622  637  643  645  657  676  680  690
##  [91]  700  704  711  725  726  730  735  737  744  747  763  765  773  793  801
## [106]  819  822  829  830  836  837  844  854  861  867  871  879  893  897  949
## [121]  958  959  970  973  986  987 1005 1008 1053 1066 1078 1103 1116 1133 1144
## [136] 1146 1174 1176 1181 1196 1206 1215 1229 1230 1260 1277 1290 1299 1315 1317
## [151] 1318 1321 1334 1339 1361 1383 1436 1445 1454 1464 1465 1477 1492 1496 1506
## [166] 1524 1529 1550 1562 1563 1583 1584 1591 1614 1615 1646 1664 1673 1678 1680
## [181] 1701 1710 1727 1735 1741 1743 1747 1763 1765 1768 1772 1784 1797 1798 1800
## [196] 1803 1807 1812 1824 1833 1841 1844 1863 1879 1890 1896 1904 1906 1907 1936
## [211] 1944 1949 1958 1964
##
## $strapRows[[10]]
##   [1]   25   32   38   39   84  110  125  132  134  139  155  175  186  188  212
##  [16]  216  241  242  246  264  267  269  284  322  334  339  340  351  361  364
##  [31]  369  385  386  408  421  422  453  455  457  458  468  496  513  514  545
##  [46]  566  570  574  576  578  611  634  636  638  649  659  698  699  710  719
##  [61]  721  749  751  799  807  813  817  835  859  864  878  880  881  886  891
##  [76]  892  898  907  913  915  938  940  956  962  963  966  983  989 1012 1057
##  [91] 1060 1061 1062 1072 1075 1084 1095 1106 1123 1136 1138 1140 1150 1154 1157
## [106] 1164 1171 1178 1187 1194 1197 1217 1221 1224 1236 1237 1238 1248 1249 1258
## [121] 1270 1282 1284 1287 1314 1342 1344 1363 1370 1377 1387 1393 1397 1407 1408
## [136] 1410 1425 1427 1431 1438 1450 1455 1462 1470 1480 1523 1526 1536 1555 1557
## [151] 1568 1572 1590 1596 1597 1605 1609 1613 1629 1637 1656 1661 1663 1665 1674
## [166] 1677 1686 1688 1690 1699 1716 1730 1732 1742 1748 1752 1770 1773 1778 1783
## [181] 1789 1794 1796 1799 1801 1802 1811 1823 1828 1832 1849 1850 1876 1884 1903
## [196] 1920 1929 1930 1961 1973 1985 1988
##
##
## attr(,"class")
## [1] "ss"
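Because the object is functionally a list, its components can be inspected with ordinary list tools. The sketch below builds a small mock object with a similar top-level layout and shows how to index into it. The mock holds placeholder strings rather than fitted models, and its contents are hypothetical; only the component names and the class are taken from the printout above.

```r
# Mock object mimicking the printed layout (placeholder values, not real fits)
mock <- structure(
  list(models    = list(list("fit_study1", "fit_study2", "fit_study3")),
       modelInfo = list(sampling = "studySpecificEnsemble", SSL = list("pcr")),
       dataInfo  = list(studyNames = c(6, 1, 10))),
  class = "ss")

top_level   <- names(mock)              # names of the top-level components
n_learners  <- length(mock$models)      # number of single-study learners
n_models    <- length(mock$models[[1]]) # models trained with the first learner
second_fit  <- mock$models[[1]][[2]]    # learner 1, study (or strap) 2
is_ss       <- inherits(mock, "ss")     # S3 class check
```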

Let us begin by exploring how the models are stored.

sseMod1$models
## [[1]]
## [[1]][[1]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[2]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[3]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[4]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[5]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[6]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[7]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[8]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[9]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None
##
## [[1]][[10]]
## Principal Component Analysis
##
## No pre-processing
## Resampling: None

Models are organized as a list of lists. Each element of the outer list is itself a list of all the models trained with one single-study learner (e.g., lm, random forests). Each element of that inner list is a model trained on one study or study strap. Here we have only one single-study learner (PCR), so the outer list has length 1 and its single inner list has length 10.

### Model Info

Model Info provides information about how the models were fit. These are stored based upon user input when fitting the model.

names(sseMod1$modelInfo)
##  [1] "sampling"     "numStraps"    "SSL"          "ssl.tuneGrid" "numPaths"
##  [6] "convg.vec"    "convgCritera" "meanSamp"     "stack.type"   "custFNs"
## [11] "bagSize"

### Data Info

Data Info provides information about the raw data that was fed to the model fitting functions. Original data is stored if “model = TRUE” is specified.

names(sseMod1$dataInfo)
## [1] "studyNames"  "sampleSizes"

### Similarity Matrix

simMat provides the similarity matrix that is used for Covariate Profile Similarity weights.