Taylor, J. and Verbyla, A. (2011) R package **wgaim**: QTL Analysis in Bi-Parental Populations using Linear Mixed Models, *Journal of Statistical Software*, **40**(7).

*Note: The QTL analysis functions in **wgaim** explicitly use and build upon the functionality provided by the linear mixed modelling **ASReml-R** package (currently version 4). This is a commercial package available from VSNi at https://vsni.co.uk/software/asreml-r/ with pricing dependent on the institution. Users will require a fully licensed version of **ASReml-R** to use the QTL analysis functionality of the **wgaim** package and to run the code in this vignette. Users should consult the **ASReml-R** documentation for thorough details on the model syntax and extensive peripheral features of the package.*

This introductory vignette presents the workflow of a **wgaim** QTL analysis. More in-depth analyses can be found in an upcoming sister vignette, "A deeper look at the **wgaim** functionality."

The analysis workflow can be summarised in three steps:

1. With phenotypic data, build a base linear mixed model using the functionality of **ASReml-R**.
2. Construct a genotypic linkage map, store it as a **qtl** cross object and convert it to a **wgaim** interval object.
3. Use the base model from step 1 and the interval genotype object from step 2 to conduct a **wgaim** QTL analysis.

**Package restrictions**: The current version of **wgaim** provides functionality for QTL analysis of Doubled Haploid, Backcross, Advanced Recombinant Inbred and F2 populations.

## Package data

The **wgaim** package contains several pre-packaged phenotypic data sets with matching genetic linkage maps ready for QTL analysis.

``` text
data(package = "wgaim")
```

The same data are also stored as CSV files in the package's `extdata` directory so that they can be read in manually.

``` text
wgpath <- system.file("extdata", package = "wgaim")
list.files(wgpath)
```

```
## [1] "genoCxR.csv" "genoRxK.csv" "genoSxT.csv" "phenoCxR.csv" "phenoRxK.csv" "phenoSxT.csv"
```

## Example: RAC875 x Kukri phenotypic and genotypic data

### Phenotypic data and base model

This example consists of phenotypic and genotypic data sets involving a Doubled Haploid (DH) population derived from the crossing of wheat varieties RAC875 and Kukri [@bon12]. The main goal of the experiment was to find causal links between grain yield related traits and genetic markers associated with the population.

``` text
data(phenoRxK, package = "wgaim")
head(phenoRxK)
```

```
##     Genotype Type Row Range Rep    yld  tgw  lrow lrange
## 1    DH_R003   DH   1     1   1 2.2384 33.4 -12.5   -9.5
## 40   DH_R055   DH   2     1   1 1.1576 31.6 -11.5   -9.5
## 41   DH_R056   DH   3     1   1 1.6424 48.3 -10.5   -9.5
## 80   DH_R111   DH   4     1   1 2.3991 31.6  -9.5   -9.5
## 81   DH_R112   DH   5     1   1 1.9744 33.4  -8.5   -9.5
## 120  DH_R170   DH   6     1   1 1.2741 26.3  -7.5   -9.5
```

The RAC875 x Kukri phenotypic data relates to a field trial consisting of 520 plots. Two replicates of 256 DH lines (`Genotype`) from the RAC875 x Kukri population were allocated to a 20 `Row` by 26 `Range` layout using a randomized complete block design with two blocks (`Rep`). The additional plots remaining in each block were filled with one of each of the parents and controls (ATIL, SOKOLL, WEEBILL). A `Type` factor is included to distinguish the set of DH lines from each of the parents and controls. `lrow` and `lrange` are numerically encoded, zero-centred row and range covariates. A number of yield-related trait measurements were collected, including grain yield in t/ha (`yld`) and thousand grain weight (`tgw`). The analysis in this vignette concentrates on grain yield (`yld`).

Before using the QTL analysis functions in **wgaim**, an appropriate initial base **ASReml-R** linear mixed model needs to be built and fitted.

``` text
rkyld.asi <- asreml::asreml(yld ~ Type, random = ~ Genotype + Rep,
                            residual = ~ ar1(Range):ar1(Row), data = phenoRxK)
```

```
## Online License checked out Sun Aug 25 16:54:36 2024
## Model fitted using the gamma parameterization.
## ASReml 4.1.0 Sun Aug 25 16:54:36 2024
##           LogLik        Sigma2     DF     wall    cpu
##  1       128.285      0.202517    514 16:54:36    0.0
##  2       178.124      0.126231    514 16:54:36    0.0
##  3       211.555      0.086862    514 16:54:36    0.0
##  4       221.240      0.074148    514 16:54:36    0.0
##  5       222.515      0.071463    514 16:54:36    0.0
##  6       222.595      0.072380    514 16:54:36    0.0
##  7       222.606      0.072730    514 16:54:36    0.0
##  8       222.607      0.072856    514 16:54:36    0.0
```

The focus of this model is the accurate estimation of the genetic variance of the DH progeny using `Genotype`. This accuracy is improved by adding terms that account for extraneous variation arising from the experimental design (random term `Rep`) as well as potential correlation of the observations due to the similarity of neighbouring field trial plots (separable residual correlation structure `ar1(Range):ar1(Row)`) [@ver07; @gil07]. Additionally, the inclusion of a `Type` factor as a fixed effect ensures the random `Genotype` factor only contains non-zero effects for the DH progeny.

A summary of the model's variance parameter estimates shows a moderate estimated autocorrelation for the `Row` term (about 0.51) and a smaller one for the `Range` term (about 0.24).

``` text
summary(rkyld.asi)$varcomp
```

```
##                       component   std.error    z.ratio bound %ch
## Rep                 0.001733554 0.003934291  0.4406269     P 0.7
## Genotype            0.167952916 0.017092886  9.8258958     P 0.0
## Range:Row!R         0.072856232 0.007130535 10.2174994     P 0.0
## Range:Row!Range!cor 0.240159289 0.068828138  3.4892603     U 0.3
## Range:Row!Row!cor   0.506495872 0.048864188 10.3653798     U 0.1
```

**ASReml-R** provides functionality for diagnostically checking the linear mixed model residuals.
The variogram of the residuals indicates there are potential trends in the row and range directions of the experimental layout.

``` text
plot(asreml::varioGram.asreml(rkyld.asi))
```
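
To make concrete what a residual variogram summarises, the following base R sketch computes an empirical semivariance for residuals arranged on a complete row by range grid. It is only an illustration of the general idea and is not the **ASReml-R** implementation. The helper `sample_variogram()` and the simulated residuals are hypothetical, and for simplicity the sketch only pairs plots displaced in the positive row and range directions.

```r
# Hypothetical helper (not part of ASReml-R or wgaim): empirical semivariance
# of residuals laid out as a matrix with rows = field rows, columns = ranges.
sample_variogram <- function(res_mat, max_row_lag = 5, max_range_lag = 5) {
  n_row <- nrow(res_mat)
  n_range <- ncol(res_mat)
  stopifnot(max_row_lag < n_row, max_range_lag < n_range)
  out <- expand.grid(row_lag = 0:max_row_lag, range_lag = 0:max_range_lag)
  out$gamma <- NA_real_
  for (k in seq_len(nrow(out))) {
    i <- out$row_lag[k]
    j <- out$range_lag[k]
    # Pair every plot with the plot i rows and j ranges further along
    a <- res_mat[seq_len(n_row - i), seq_len(n_range - j), drop = FALSE]
    b <- res_mat[i + seq_len(n_row - i), j + seq_len(n_range - j), drop = FALSE]
    # Semivariance: half the mean squared difference over all such pairs
    out$gamma[k] <- 0.5 * mean((a - b)^2, na.rm = TRUE)
  }
  out
}

set.seed(1)
# Simulated residuals on a 20 row by 26 range grid (the layout of this trial),
# with an artificial linear trend of 0.1 per row added down the rows
res_mat <- matrix(rnorm(20 * 26, sd = 0.25), nrow = 20, ncol = 26)
res_mat <- res_mat + 0.1 * (seq_len(20) - 10.5)

vg <- sample_variogram(res_mat)
head(vg)
```

In this simulated example the semivariance should climb with increasing row lag because of the imposed trend, which is the kind of pattern a variogram plot is used to detect. For independent residuals with no trend the surface would be expected to stay roughly flat away from the zero lag.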