Introduction

SNMoE (Skew-Normal Mixture-of-Experts) provides a flexible modelling framework for heterogeneous data with possibly skewed distributions, generalizing the standard normal mixture-of-experts model. SNMoE consists of a mixture of K skew-normal expert regressors (polynomial regressions of degree p) gated by a softmax gating network (of degree q), and is parameterized by:

• The gating network parameters: the alpha's of the softmax network.
• The expert network parameters: the location parameters (regression coefficients) beta's, the scale parameters sigma's, and the skewness parameters lambda's.

SNMoE thus generalises mixtures of (normal or skew-normal) distributions and mixtures of regressions with these distributions. For example, when $q=0$, we retrieve mixtures of (skew-normal or normal) regressions, and when both $p=0$ and $q=0$, we retrieve a mixture of (skew-normal or normal) distributions. The model also reduces to a single (normal or skew-normal) distribution when only one expert is used ($K=1$).
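
To make the parameterization concrete, the following base-R sketch evaluates the SNMoE conditional density for given parameters. It is an illustration only: the function names (`dskewnorm`, `gating_probs`, `dsnmoe`) are hypothetical and not part of meteorits, and the convention that the last expert is the reference class of the softmax (gating coefficients fixed at zero) is an assumption of this sketch.

```r
# Skew-normal density with location mu, scale sigma and skewness lambda
dskewnorm <- function(y, mu, sigma, lambda) {
  u <- (y - mu) / sigma
  2 / sigma * dnorm(u) * pnorm(lambda * u)
}

# Softmax gating probabilities for a polynomial of degree q in x.
# alphak is a (q + 1) x (K - 1) matrix; expert K is the reference class
# (assumed convention for this sketch).
gating_probs <- function(x, alphak) {
  q <- nrow(alphak) - 1
  X <- outer(x, 0:q, `^`)              # polynomial design matrix
  eta <- cbind(X %*% alphak, 0)        # linear predictors, reference = 0
  eta <- eta - apply(eta, 1, max)      # subtract row maxima for stability
  exp(eta) / rowSums(exp(eta))
}

# Conditional density of y given x under the mixture of experts.
# betak is a (p + 1) x K matrix, one column of coefficients per expert.
dsnmoe <- function(y, x, alphak, betak, sigmak, lambdak) {
  p <- nrow(betak) - 1
  mu <- outer(x, 0:p, `^`) %*% betak   # n x K matrix of expert locations
  pik <- gating_probs(x, alphak)
  dens <- sapply(seq_along(sigmak), function(k) {
    dskewnorm(y, mu[, k], sigmak[k], lambdak[k])
  })
  rowSums(pik * dens)
}

alphak  <- matrix(c(0, 8), ncol = 1)
betak   <- matrix(c(0, -2.5, 0, 2.5), ncol = 2)
sigmak  <- c(1, 1)
lambdak <- c(3, 5)

# With zero skewness the skew-normal density coincides with the normal density
ys <- seq(-3, 3, by = 0.5)
same_as_normal <- all.equal(dskewnorm(ys, mu = 0, sigma = 1, lambda = 0), dnorm(ys))

# For a fixed input x, the mixture density integrates to one over y
total_mass <- integrate(
  function(y) dsnmoe(y, rep(0.3, length(y)), alphak, betak, sigmak, lambdak),
  lower = -Inf, upper = Inf
)$value
```

Setting every `lambdak` to zero in `dsnmoe` gives the density of the standard normal mixture of experts, which is the reduction described above.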

Model estimation/learning is performed by a dedicated expectation conditional maximization (ECM) algorithm that maximizes the observed-data log-likelihood. We provide simulated examples to illustrate the use of the model in model-based clustering of heterogeneous regression data and in fitting non-linear regression functions.

This vignette was written in R Markdown, using the knitr package for production.

See help(package="meteorits") for further details and references provided by citation("meteorits").

Application to a simulated dataset

Generate sample

library(meteorits) # Provides sampleUnivSNMoE() and emSNMoE()

n <- 500 # Size of the sample
alphak <- matrix(c(0, 8), ncol = 1) # Parameters of the gating network
betak <- matrix(c(0, -2.5, 0, 2.5), ncol = 2) # Regression coefficients of the experts
lambdak <- c(3, 5) # Skewness parameters of the experts
sigmak <- c(1, 1) # Standard deviations of the experts
x <- seq.int(from = -1, to = 1, length.out = n) # Inputs (predictors)

# Generate sample of size n
sample <- sampleUnivSNMoE(alphak = alphak, betak = betak, sigmak = sigmak,
                          lambdak = lambdak, x = x)
y <- sample$y # Simulated responses
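
For intuition about what these parameters control, here is a base-R sketch of how one could draw from a two-expert SNMoE. It is not the implementation of `sampleUnivSNMoE`: the helper `simulate_snmoe` is hypothetical, and two conventions are assumed, namely that expert 2 is the reference class of the logistic gate and that column k of `betak` holds the coefficients of expert k. The skew-normal draw uses the standard stochastic representation with delta = lambda / sqrt(1 + lambda^2).

```r
set.seed(1) # Reproducibility of this sketch only

# Hypothetical helper (not part of meteorits): one draw per input value
# from a two-expert skew-normal mixture of experts with linear experts.
simulate_snmoe <- function(x, alphak, betak, sigmak, lambdak) {
  n <- length(x)
  X <- cbind(1, x)                            # intercept and slope columns

  # Gating: probability of expert 1 is logistic in x (expert 2 = reference)
  pi1 <- 1 / (1 + exp(-as.vector(X %*% alphak)))
  z <- ifelse(runif(n) < pi1, 1L, 2L)         # sampled expert labels

  # Location of the selected expert for each observation
  mu <- (X %*% betak)[cbind(seq_len(n), z)]

  # Skew-normal noise: delta * |U0| + sqrt(1 - delta^2) * U1, U0, U1 ~ N(0, 1)
  delta <- lambdak[z] / sqrt(1 + lambdak[z]^2)
  noise <- delta * abs(rnorm(n)) + sqrt(1 - delta^2) * rnorm(n)

  list(y = mu + sigmak[z] * noise, z = z)
}

n_sim <- 500
x_sim <- seq(-1, 1, length.out = n_sim)
draw <- simulate_snmoe(
  x_sim,
  alphak  = matrix(c(0, 8), ncol = 1),
  betak   = matrix(c(0, -2.5, 0, 2.5), ncol = 2),
  sigmak  = c(1, 1),
  lambdak = c(3, 5)
)
table(draw$z) # Number of observations generated by each expert
```

With a gating slope of 8, the gate switches sharply around x = 0, so under these assumptions each expert generates roughly half of the observations.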
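
Set up SNMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters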

n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
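
The `threshold` value controls when the iterations stop. One common convention, assumed here rather than taken from the meteorits source, is to stop once the relative change in log-likelihood between two consecutive iterations falls below the threshold. The helper `has_converged` is hypothetical; the log-likelihood values are the last three reported in the trace of the Estimation step of this example.

```r
# Hypothetical helper: relative-change stopping rule (an assumption of this
# sketch, not necessarily the exact rule used inside emSNMoE)
has_converged <- function(prev_loglik, loglik, threshold = 1e-6) {
  abs((loglik - prev_loglik) / prev_loglik) <= threshold
}

# Log-likelihood at iterations 95, 96 and 97 of the trace
loglik_trace <- c(-485.576948142351, -485.576456396579, -485.575974064756)

step_96 <- has_converged(loglik_trace[1], loglik_trace[2]) # FALSE, change is about 1.01e-6
step_97 <- has_converged(loglik_trace[2], loglik_trace[3]) # TRUE, change is about 9.9e-7
```

Under this rule the run stops at iteration 97, which agrees with the last line of the trace.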

Estimation

snmoe <- emSNMoE(X = x, Y = y, K, p, q, n_tries, max_iter, threshold, verbose, verbose_IRLS)
## EM - SNMoE: Iteration: 1 | log-likelihood: -527.287937164066
## EM - SNMoE: Iteration: 2 | log-likelihood: -488.149669819772
## EM - SNMoE: Iteration: 3 | log-likelihood: -486.613979894615
## ...
## EM - SNMoE: Iteration: 90 | log-likelihood: -485.579645636856
## EM - SNMoE: Iteration: 91 | log-likelihood: -485.579071362399
## EM - SNMoE: Iteration: 92 | log-likelihood: -485.578512662018
## EM - SNMoE: Iteration: 93 | log-likelihood: -485.577973190244
## EM - SNMoE: Iteration: 94 | log-likelihood: -485.577452194271
## EM - SNMoE: Iteration: 95 | log-likelihood: -485.576948142351
## EM - SNMoE: Iteration: 96 | log-likelihood: -485.576456396579
## EM - SNMoE: Iteration: 97 | log-likelihood: -485.575974064756

Summary

snmoe$summary()
## -----------------------------------------------
## Fitted Skew-Normal Mixture-of-Experts model
## -----------------------------------------------
##
## SNMoE model with K = 2 experts:
##
##  log-likelihood df      AIC      BIC       ICL
##        -485.576 10 -495.576 -516.649 -516.6574
##
## Clustering table (Number of observations in each expert):
##
##   1   2
## 249 251
##
## Regression coefficients:
##
##     Beta(k = 1) Beta(k = 2)
## 1      1.051904    1.013374
## X^1    3.004689   -2.778066
##
## Variances:
##
##  Sigma2(k = 1) Sigma2(k = 2)
##      0.3738266     0.4534028
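
The information criteria in the summary can be reproduced by hand. The printed values are consistent with the penalized log-likelihood convention `AIC = loglik - df` and `BIC = loglik - df * log(n) / 2`, where larger values are better. This convention is inferred from the numbers above, and the parameter count in the sketch assumes p = 1 and q = 1 with (q + 1)(K - 1) free gating coefficients, which matches the reported df = 10.

```r
K <- 2; p <- 1; q <- 1   # Model orders assumed for the simulated example
n_obs <- 500             # Sample size of the simulated example
loglik <- -485.576       # Rounded log-likelihood from the summary

# Free parameters: gating coefficients, regression coefficients,
# scale parameters and skewness parameters
df <- (q + 1) * (K - 1) + K * (p + 1) + K + K

aic <- loglik - df
bic <- loglik - df * log(n_obs) / 2

round(c(df = df, AIC = aic, BIC = bic), 3)
```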

Application to a real dataset

data("tempanomalies")
x <- tempanomalies$Year
y <- tempanomalies$AnnualAnomaly

Set up SNMoE model parameters

K <- 2 # Number of regressors/experts
p <- 1 # Order of the polynomial regression (regressors/experts)
q <- 1 # Order of the logistic regression (gating network)

Set up EM parameters

n_tries <- 1
max_iter <- 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE

Estimation

snmoe <- emSNMoE(X = x, Y = y, K, p, q, n_tries, max_iter,
                 threshold, verbose, verbose_IRLS)

Summary

snmoe$summary()

Log-likelihood

snmoe$plot(what = "loglikelihood")