GLM usually refers to the generalized linear model rather than the general linear model, and GLMM stands for generalized linear mixed model.

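The generalized linear model (GLM) is straightforward. As an example, the effect of a drug depends on the dose taken. That dependence can take many forms. It may be a simple linear relationship (one tablet lowers a fever by 0.1 degrees, two tablets by 0.2 degrees, and so on; this case is the general linear model), or something more complex, such as an exponential relationship (one tablet lowers the fever by 0.1 degrees, two tablets by 0.4 degrees), a logarithmic relationship, and so on. Such relationships can usually be turned into linear ones by a mathematical transformation (in GLM terms, a link function applied to the expected response), and models of this kind are collectively called generalized linear models.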
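To make the "transform it into a linear relationship" step concrete, here is a minimal sketch in plain Python. The data are hypothetical and simply continue the exponential example (each extra tablet quadruples the effect), and `fit_line` is a helper written for this illustration. Note that this log-transforms the response itself, which is the intuition the paragraph describes; a GLM proper applies the link function to the expected value and pairs it with an error distribution.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical doses (tablets) and fever reduction (degrees):
# 0.1, 0.4, 1.6, 6.4, i.e. an exponential dose-response curve.
doses = [1, 2, 3, 4]
effects = [0.1 * 4 ** (d - 1) for d in doses]

# On the log scale the relationship is a straight line in the dose.
log_effects = [math.log(e) for e in effects]
intercept, slope = fit_line(doses, log_effects)

# Back-transform: each extra tablet multiplies the effect by exp(slope).
multiplier = math.exp(slope)
print(round(multiplier, 6))
```

The fitted slope is on the log scale, so it is read as a multiplicative effect per tablet after exponentiating.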

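The generalized linear mixed model (GLMM) is more involved. A GLM treats the scatter of the observations around the mean as unstructured noise. A GLMM additionally allows for random effects: sources of variation that are not fixed constants but are assumed to follow a distribution. As an example, suppose the effect of a drug is related to the time of day at which it is taken, but not in the simple sense that the effect itself shifts with time. More plausibly, the amount of random fluctuation in the effect depends on the time. At 10:00 in the morning nearly everyone is in a similar, half-full state, so the same dose has much the same effect in everyone. At noon, some people have not eaten yet, some have, some have had alcohol, which reacts with the drug, and some have had vinegar, which reacts with it in yet another way. Taking the drug at noon therefore makes the random error in its effect much larger. When it is the random variation in the effect (rather than the effect itself) that changes with time and follows some distribution, a generalized linear mixed model is the appropriate tool.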
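The idea of variation that "follows a distribution" can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses the Gaussian, identity-link special case (a linear mixed model with a random intercept), all constants are hypothetical, and the variance components are recovered with simple moment estimates rather than with the ML/REML fitting that mixed-model software uses.

```python
import random
import statistics

random.seed(1)

N_GROUPS = 500       # e.g. patients (hypothetical)
N_PER_GROUP = 20     # repeated measurements per patient
BASELINE = 1.0       # fixed effect: the average response
SD_GROUP = 1.0       # spread of the random intercepts
SD_NOISE = 1.0       # spread of the residual noise

groups = []
for _ in range(N_GROUPS):
    shift = random.gauss(0.0, SD_GROUP)  # one random effect per group
    groups.append([BASELINE + shift + random.gauss(0.0, SD_NOISE)
                   for _ in range(N_PER_GROUP)])

# The average within-group variance estimates the residual variance.
var_within = statistics.mean(statistics.variance(g) for g in groups)

# The variance of the group means is var_group + var_noise / n,
# so the noise share is subtracted to isolate the group variance.
group_means = [statistics.mean(g) for g in groups]
var_between = statistics.variance(group_means) - var_within / N_PER_GROUP

# Intraclass correlation: share of the variance due to the grouping.
icc = var_between / (var_between + var_within)
print(round(var_within, 2), round(var_between, 2), round(icc, 2))
```

With both standard deviations set to 1, about half of the total variance should be attributed to the grouping, which is what makes measurements from the same group correlated.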

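The generalized linear mixed model (GLMM) extends both the generalized linear model (GLM) and the linear mixed model (LMM), and was proposed in the 1990s. Because it borrows the idea of mixed models, the GLMM is considered to have particular advantages for longitudinal data (repeated measures). Beyond repeated measures, it can be used for data with any hierarchical structure, since it is in essence a multilevel model.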

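A GLMM can be viewed as an extension of the LMM in which the response variable no longer has to be normally distributed, or as an extension of the GLM that includes both fixed effects and random effects.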
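The two views above can be written compactly. The notation below is a conventional sketch, not something taken from the original text: the symbols $g$, $\beta$, $u_j$, $x_{ij}$, $z_{ij}$ and $G$ are assumptions introduced here for illustration.

```latex
\begin{aligned}
g\bigl(\mathrm{E}[\, y_{ij} \mid u_j \,]\bigr) &= x_{ij}^{\top}\beta + z_{ij}^{\top}u_j, \\
u_j &\sim \mathcal{N}(0,\, G).
\end{aligned}
```

Here $y_{ij}$ is observation $i$ in group $j$, assumed to follow an exponential-family distribution (normal, binomial, Poisson, ...) given $u_j$; $g$ is the link function, $\beta$ holds the fixed effects and $u_j$ the random effects of group $j$. Dropping the $u_j$ term leaves a GLM, while an identity link with a normal response gives an LMM.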

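Software packages for GLMMs

  • nlme (R): lme() fits linear mixed models and nlme() fits nonlinear mixed models. Complex variance structures can be specified, but GLMMs are not supported. Nested random factors are easy to specify; crossed random factors are difficult.

  • lme4 (R): developed as a successor to nlme. Supports GLMMs, handles crossed random factors well, and runs faster than nlme.

  • MCMCglmm (R): fits models by Markov chain Monte Carlo (MCMC) with Bayesian priors. Some complex variance structures can be specified (heterogeneous yes, AR1 no).

  • PROC GLIMMIX (SAS): supports the Laplace approximation and adaptive Gaussian quadrature, but still uses PQL for complex models.

  • PROC MIXED (SAS): fits linear mixed models (LMMs).

  • ASReml and ASReml-R: ASReml-R is the R version of the ASReml software. It is fast, supports complex models (the G matrix for random factors and the R matrix for residuals), handles pedigree information and multi-trait analysis, and is widely used in animal, crop, forest-tree and aquaculture breeding and research.

  • GenStat: provides GLMM models.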

Packages

For now, this page is only covering "basic" mixed modeling packages (although the line is admittedly somewhat blurry): see the list of packages on the main page for packages covering additive mixed models, Cox regression, etc.

In a nutshell

R packages

  • MCMCglmm. Uses MCMC instead of ML to fit the model. Bayesian priors can be included. Some complex variance structures (heterogeneous yes, AR1 no).
  • nlme. One of the first widely used mixed-model packages for S-Plus, later ported to R. Nested random effects are easily modeled; crossed random effects are difficult. Stable (maintenance mode). Multiple functions (lme for linear, nlme for nonlinear, gls for no random terms). Complex (and custom) variance structures possible. No GLMMs.
  • lme4. Under active development, especially for GLMMs. No complex variance structures. Uses sparse matrix algebra, handles crossed random effects well. Much faster than nlme.
  • glmmADMB. Interface to ADMB (see below); flexible, but slower than other R packages.

non-R

  • ADMB. Automatic Differentiation Model Builder. Mostly used in forestry, fisheries and wildlife research. Started out as a commercial product, but is now open source. Handles non-linear models. ADMB-RE implements random effects in non-linear models via Laplace, importance sampling, and GHQ in some cases.
  • SAS. Commercial. Full-featured.
      • PROC MIXED implements modern LMMs; it is very widely used with lots of examples, but can be very slow.
      • PROC GLIMMIX added generalized models; it now incorporates Laplace approximation and adaptive Gaussian quadrature, but falls back to PQL for models with complex correlation structures. It also has other features such as simpler syntax to request predictable functions of random effects.
      • HPMIXED is "High Performance" to address the slow speed of MIXED, but low-featured.
      • PROC NLMIXED is for non-linear and linear models (i.e. models that cannot be fitted in PROC MIXED/GLIMMIX, such as those with unusual variance-covariance structures or variances that are functions of fixed or random predictors). It also fits GLMMs via Laplace/GHQ (but not crossed effects). Multiple denominator degrees of freedom methods (Kenward-Roger, Satterthwaite, Containment).
  • ASREML. Commercial; free licenses available for academic and developing-country use. Available as a standalone program, as an R package (ASREML-R), or in Genstat. Uses sparse matrices and Average Information for speed. Widely used in plant and animal breeding. Numerous error structures supported. Splines well integrated. Generalized models: PQL only, with warnings in the documentation. Wald-type tests. Constraints on parameters allowed.
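Several entries above distinguish nested from crossed random effects. As a rough illustration of the difference, the sketch below checks whether one grouping factor is nested in another: a factor is nested when each of its levels occurs within exactly one level of the other factor. The helper `is_nested` and the example data are hypothetical and are not part of any of the packages listed.

```python
from collections import defaultdict

def is_nested(inner, outer):
    """Return True if every level of `inner` occurs within exactly one level of `outer`."""
    parents = defaultdict(set)
    for inner_level, outer_level in zip(inner, outer):
        parents[inner_level].add(outer_level)
    return all(len(levels) == 1 for levels in parents.values())

# Hypothetical data: each pupil belongs to a single school (nested) ...
pupil = ["p1", "p2", "p3", "p4"]
school = ["s1", "s1", "s2", "s2"]

# ... whereas every subject responds to every item (crossed).
subject = ["a", "a", "b", "b"]
item = ["x", "y", "x", "y"]

print(is_nested(pupil, school))   # pupils within schools
print(is_nested(item, subject))   # items are shared across subjects
```

A check like this depends on levels being labeled uniquely: if pupils were numbered 1, 2 within each school, the labels would look crossed even though the design is nested.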

(To add: npmlreg, regress (from Gabor Grothendieck))


Linear mixed models

| package | function | estimation | inference (tests) | inference (confidence intervals) | random effects (G structure) | residuals (R structure) | other |
|---|---|---|---|---|---|---|---|
| nlme | lme | ML, REML | Wald (summary), likelihood ratio test (anova), sequential and marginal conditional F tests (anova) | Wald intervals on fixed and RE parameters (intervals) | multiple (nested) random effects; diagonal, blocked structures (pdClasses); crossed possible, but slow | spatial and temporal correlations (corStruct), continuous and discrete heteroscedasticity (varStruct) | |
| lme4 | lmer | ML, REML | F statistics (sans denominator df: summary), likelihood ratio test (anova), post-hoc MCMC (mcmcsamp) | post-hoc MCMC (mcmcsamp) | nested and crossed RE, diagonal or block diagonal | none | |
| lme4a | lmer | as above | as above | as above + likelihood profiles, fast parametric bootstrapping (bootMer) | as above | none | |
| lmm | ecmeml.lmm | ML (ECME algorithm) | | | | | |
| lmm | fastml.lmm | ML (rapidly converging algorithm) | | | | | |
| asreml | asreml | sparse matrix, Average Information REML | Wald anova | standard errors | multiple crossed/nested/blocked/splines | (blocked) AR1xAR1, Matern, factor analytic, heteroskedastic | |
| statmod | mixedModel2 | REML | | | | | |
| SAS | PROC MIXED | REML, ML, MIVQUE0, or Type1–Type3 (method= option) | Wald t and F tests | | multiple, complex (the covariance structure is defined by the type option in the random statement) | | |
| SAS | PROC GLIMMIX | pseudo-likelihood (default), Laplace, GHQ, REML, PQL | Wald, LRT (COVTEST statement), Type III test for fixed effects | Wald (default), LRT | multiple, nested or crossed | | |
| SAS | HPMIXED | REML | Wald t, F test, Type III test and chi-square test | Wald intervals on fixed and random effects (CL option) | multiple, complex | | |
| HLM | HLM | REML, FML | | | multilevel, nested and/or crossed random effects | | |
| MLWiN | | ML, MCMC | | | multilevel, nested/crossed random effects | | |
| Stata? | xtmixed / xtreg (random-intercept model) | REML, ML | Wald, LR test (with ML) | Wald | multilevel, nested/crossed, 4 types of covariance structure; diagonal-blocked structures, heteroskedastic random effects | heteroskedastic (residuals()); independent/exchangeable/unstructured/banded/exponential | |


GLMMs

| package | function | estimation | inference (tests) | inference (confidence intervals) | families | random effects | other |
|---|---|---|---|---|---|---|---|
| lme4 | glmer | Laplace, AGHQ | Wald (summary), LRT (anova), simulation tests of simple random effects (RLRsim package) | Wald (by hand) | Poisson, binomial | multiple: nested, crossed | |
| lme4a | glmer | Laplace, AGHQ | Wald (summary), LRT (anova) | Wald (by hand); eventually, likelihood profiles | Poisson, binomial | multiple: nested, crossed | |
| glmmML | glmmML | Laplace, AGHQ | Wald | | Poisson, binomial [logit, cloglog] | single | |
| glmmAK | logpoissonRE | MCMC | Wald | | Poisson | single (normal or G-spline) | |
| MCMCglmm | MCMCglmm | MCMC | 'Bayesian p-value' | credible intervals (coda::HPDinterval) | Gaussian, Poisson, categorical, multinomial, exponential, geometric, various zero-inflated/altered | multiple, complex | |
| MASS | glmmPQL | PQL | Wald (summary) | Wald | binomial, Poisson, Gamma, … (see ?family) | | spatial/temporal correlation structures (?nlme::corClasses) |
| gamlss.mx | glmmNP | GHQ/Expectation-maximization | | | many (see gamlss.family in the gamlss.dist package) | single ("two-level") | |
| glmmBUGS | glmmBUGS | MCMC | | | Poisson, binomial | spatial effects | |
| hglm | hglm or hglm2 | hierarchical likelihood | Wald (summary) | | see ?family | | |
| HGLMMM | HGLMfit | hierarchical likelihood, first-order Laplace ? | Wald (summary), LRT (HGLMLRTest()) | | binomial (logit), Poisson (log), normal (identity), Gamma (log, inverse) | complex, multiple | profile (LapFix=TRUE) |
| bernor | bnlogl | Monte Carlo sampling | | | Bernoulli (logit link) | | |
| glmmADMB | glmm.admb | Laplace | Wald (summary), LRT (anova), MCMC | | Poisson, negative binomial, Bernoulli (+ zero-inflation) | single (multiple under development) | profiles |
| repeated | glmm | GHQ | Wald (summary) | Wald (by hand) | see ?family | single | |
| R-INLA | inla | nested Laplace | | | Poisson, binomial [logit, probit, cloglog], negative binomial … | | spatial and temporal correlation models |
| SAS PROC GLIMMIX | PROC GLIMMIX | pseudo-likelihood (default), Laplace, GHQ, REML, PQL | Wald, LRT (COVTEST statement), Type III test for fixed effects | Wald (default), LRT | binomial, Poisson, Gamma (check the Dist option) | multiple, nested and crossed | profile or non-profile |
| SAS PROC NLMIXED | PROC NLMIXED | GHQ, first-order method… (check "method=" option), Laplace (QPOINTS=1 option) | Wald, LRT | Wald | normal, binomial, Poisson, binary, Gamma, negative binomial, general (custom defined), zero-inflated | number of random effects < 5 | limited to only 2 levels |
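The estimation column above repeatedly lists Laplace and Gauss-Hermite quadrature (GHQ). Both approximate the integral over the random effects that defines a GLMM's marginal likelihood. The sketch below shows the core of non-adaptive GHQ with a fixed three-point rule, applied to the marginal success probability in a random-intercept logistic model. It is an illustration under assumptions, not the implementation of any package in the table: `normal_expectation`, `expit` and `marginal_prob` are hypothetical helpers, and real software typically uses more nodes and, for AGHQ, recenters and rescales them around the mode of the integrand for each group.

```python
import math

# Three-point Gauss-Hermite rule for integrals of f(x) * exp(-x**2).
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (math.sqrt(math.pi) / 6,
           2 * math.sqrt(math.pi) / 3,
           math.sqrt(math.pi) / 6)

def normal_expectation(func, sd):
    """Approximate E[func(b)] for b ~ Normal(0, sd**2) by Gauss-Hermite quadrature."""
    # The substitution b = sqrt(2) * sd * x maps the normal density onto exp(-x**2).
    total = sum(w * func(math.sqrt(2.0) * sd * x)
                for x, w in zip(NODES, WEIGHTS))
    return total / math.sqrt(math.pi)

def expit(eta):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def marginal_prob(intercept, sd):
    """Marginal P(y = 1) when the logit is intercept + b, with b ~ Normal(0, sd**2)."""
    return normal_expectation(lambda b: expit(intercept + b), sd)

print(marginal_prob(1.0, 1.0), expit(1.0))
```

Averaging over the random intercept pulls the marginal probability toward 0.5 relative to the conditional probability at b = 0, which is why the two printed values differ.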

NLMMs and other extensions


| package | function | estimation | inference (tests) | inference (confidence intervals) | families | random effects | other |
|---|---|---|---|---|---|---|---|
| nlme | nlme | ML or REML | Wald t (summary), Wald F (anova) | use intervals() | no specific family required ? | nested | |
| lme4 | nlmer | Laplace or PQL (method option) | Wald (summary) | Wald (by hand?) | no family required | nested or crossed | |
| nlmixr | nlmixr | fo, foi, foce, focei, saem (est option) | Wald | Wald | no family required | nested | |


Accessors

Accessor methods are compared for lme (nlme), glmmPQL (MASS), [g]lmer (lme4), [g]lmer (lme4a), MCMCglmm and glmm.admb.

| function | summary reports |
|---|---|
| lme (nlme) | estimate, std err, t, df, p |
| glmmPQL (MASS) | estimate, std err, t, df, p |
| [g]lmer (lme4) | lmer: estimate, std err, t; glmer: est, std err, Z, p (Wald/asymptotic) |
| [g]lmer (lme4a) | like lme4 |
| MCMCglmm | post.mean, CI, eff.sample |
| glmm.admb | estimate, std.error, z values, p |

| accessor | returns | package-specific notes |
|---|---|---|
| coef | all coefficients (predicted values for each group) | |
| fixef | fixed effect parameters | |
| ranef | random effect estimates | |
| logLik | (marginal) log-likelihood | |
| AIC | marginal AIC | |
| confint | confidence intervals | |
| intervals | confidence intervals | |
| plot | diagnostic plots | ✓ (not diagnostic plots) |
| predict | predicted values, allowing new data | |
| simulate | simulated values from fitted model | ✓ (for lmer) |
| fitted | fitted values | |
| update | update model | |
| residuals | | |
| VarCorr | variance-covariance matrices of random effects | |
| coefplot | plot of coefficients and confidence/credible intervals | |
| anova | | ✓ (no p-values); ✓ (compare two models) |
| drop1 | | ✓ (no LRT); ✓ (no p-values) |