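GLM usually stands for generalized linear model, not general linear model. GLMM stands for generalized linear mixed model.
The idea behind a generalized linear model (GLM) is simple. As an example, the effect of a drug depends on the dose taken. That dependence can take many forms. It may be a simple linear relationship (during a fever, one pill lowers the temperature by 0.1 degrees, two pills by 0.2 degrees, and so on; this case is the general linear model), or a more complicated one, such as an exponential relationship (one pill lowers the temperature by 0.1 degrees, two pills by 0.4 degrees), a logarithmic relationship, and so on. Such relationships can usually be turned into a linear one by a suitable mathematical transformation, which is why they are collectively called generalized linear models. In a GLM this transformation is the link function applied to the expected response, and the response may follow a non-normal distribution such as the binomial or Poisson.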
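The exponential dose-response example above can be made concrete with a small sketch. It assumes a log link and a Poisson-type variance function, neither of which is specified in the text, and fits the model by iteratively reweighted least squares using only the Python standard library. The function name `fit_poisson_glm` and the dose/response numbers are hypothetical.

```python
import math

def fit_poisson_glm(x, y, n_iter=50, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x by iteratively reweighted least squares."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]      # linear predictor
        mu = [math.exp(e) for e in eta]       # inverse log link
        # Working response and weights for the log link with Poisson variance.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        # Solve the 2x2 weighted normal equations.
        det = sw * swxx - swx ** 2
        new_b0 = (swxx * swz - swx * swxz) / det
        new_b1 = (sw * swxz - swx * swz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Hypothetical data: the mean response grows exponentially with dose.
dose = [0, 1, 2, 3, 4]
response = [math.exp(0.5 + 0.3 * d) for d in dose]
b0, b1 = fit_poisson_glm(dose, response)
print(round(b0, 4), round(b1, 4))
```

The point of the design is that the curve is never straightened by transforming the data; only the expected response is passed through the link, so the coefficients stay on a linear scale.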
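A generalized linear mixed model (GLMM) is more involved. A GLM treats all effects as fixed and assumes independent observations, whereas a GLMM also includes random effects: part of the variation is not treated as pure noise but is assumed to follow a distribution of its own. As an example, we may believe that efficacy is related to the time at which the drug is taken, but not simply in the sense that efficacy changes with the time of dosing. More likely, the size of the random fluctuation in efficacy depends on the time of dosing. At 10:00 in the morning, say, nearly everyone is half full, so the same dose has about the same effect in everyone. At noon, however, some people have not eaten yet, some have eaten, some have drunk alcohol, which reacts with the drug, and some have had vinegar, which reacts with the drug in yet another way. Clearly, taking the drug at noon makes the random error in efficacy very large. When the random variation in efficacy (rather than efficacy itself) changes with time and follows some distribution, a generalized linear mixed model is needed.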
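The generalized linear mixed model (GLMM) extends both the generalized linear model (GLM) and the linear mixed model (LMM), and was proposed in the 1990s. Because it borrows the ideas of mixed models, it is regarded as particularly well suited to longitudinal data (repeated measures). GLMMs are not limited to repeated measures; they can be applied to any hierarchically structured data, since they are in essence multilevel models.
A GLMM can be viewed as an extension of the LMM in which the response variable no longer has to be normally distributed, or as an extension of the GLM in which fixed effects and random effects can be included together.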
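In symbols, the two views can be written as one formula. The notation here is an assumption, since the text gives no formulas: $y_{ij}$ is observation $j$ in group $i$, $g$ is the link function, $\beta$ holds the fixed effects, and $b_i$ holds the random effects of group $i$ with covariance matrix $G$.

```latex
g\bigl(\mathbb{E}[\,y_{ij} \mid b_i\,]\bigr) = x_{ij}^{\top}\beta + z_{ij}^{\top} b_i,
\qquad b_i \sim \mathcal{N}(0,\, G)
```

Dropping the term $z_{ij}^{\top} b_i$ gives a GLM, while choosing the identity link with a normally distributed response gives an LMM.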
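Software packages for GLMMs
nlme in R: lme() fits linear mixed models and nlme() fits nonlinear mixed models. Complex variance structures can be specified, but generalized linear mixed models (GLMMs) are not supported. Nested random factors are easy to specify, whereas crossed random factors are difficult;
lme4 in R: a further development of nlme that supports GLMMs, handles crossed random factors well, and runs faster than nlme;
MCMCglmm in R: fits models by Markov chain Monte Carlo (MCMC) and allows Bayesian prior distributions; some complex variance structures can be specified (heterogeneous yes, AR1 no);
PROC GLIMMIX in SAS supports Laplace approximation and adaptive Gaussian quadrature, but still uses PQL for complex models;
PROC MIXED in SAS fits linear mixed models (LMMs);
ASReml
The ASReml-R package: the R version of the ASReml software. It is fast, supports complex models (the G matrix for the random factors and the R matrix for the residuals), supports pedigree information and multi-trait analysis, and is widely used in animal, crop, forest tree, and aquaculture breeding and research.
GLMM models in the GenStat software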
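Gaussian quadrature, named in the list above, approximates the integral over the random effects that defines a GLMM likelihood. The sketch below is a plain, non-adaptive three-point Gauss-Hermite rule for one cluster with a random intercept and a logit link. It is an illustration under those assumptions, not how any of the listed packages is implemented; adaptive variants additionally recenter and rescale the nodes for each cluster. The function names are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2).
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (math.sqrt(math.pi) / 6,
              2 * math.sqrt(math.pi) / 3,
              math.sqrt(math.pi) / 6)

def normal_expectation(f, sigma):
    """Approximate E[f(b)] for b ~ N(0, sigma^2) by Gauss-Hermite quadrature."""
    total = sum(w * f(math.sqrt(2.0) * sigma * x)
                for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)

def cluster_likelihood(y, eta_fixed, sigma):
    """Marginal likelihood of the binary responses y of one cluster.

    eta_fixed holds the fixed-effect part of the linear predictor for each
    observation; the random intercept b ~ N(0, sigma^2) is integrated out.
    """
    def conditional(b):
        lik = 1.0
        for yi, eta in zip(y, eta_fixed):
            p = 1.0 / (1.0 + math.exp(-(eta + b)))  # inverse logit
            lik *= p if yi == 1 else 1.0 - p
        return lik
    return normal_expectation(conditional, sigma)

# Hypothetical cluster with three binary observations.
print(cluster_likelihood([1, 0, 1], [0.2, -0.1, 0.4], sigma=1.0))
```

Three nodes keep the example short; more nodes trade speed for accuracy.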
Packages
For now, this page is only covering "basic" mixed modeling packages (although the line is admittedly somewhat blurry): see the list of packages on the main page for packages covering additive mixed models, Cox regression, etc.
In a nutshell
R packages
- MCMCglmm. Uses MCMC instead of ML to fit the model. Bayesian priors can be included. Some complex variance structures (heterogeneous yes, AR1 no).
- nlme. One of the first widely used mixed-model packages for S-Plus, later ported from S-Plus to R. Nested random effects easily modeled. Crossed random effects difficult. Stable (maintenance mode). Multiple functions (lme for linear, nlme for nonlinear, gls for no random terms). Complex (and custom) variance structures possible. No GLMMs.
- lme4. Under active development, especially for GLMMs. No complex variance structures. Uses sparse matrix algebra, handles crossed random effects well. Much faster than nlme.
- glmmADMB. Interface to ADMB (see below); flexible, but slower than other R packages.
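The distinction between nested and crossed random effects, which separates nlme from lme4 in the list above, can be checked directly on the grouping labels. The helper below is a minimal sketch with hypothetical names and data. It assumes that labels of the inner factor are unique across the whole data set; if the same label is reused in different outer groups, the check reports crossing even when the design is nested.

```python
from collections import defaultdict

def is_nested(inner, outer):
    """Return True if every level of `inner` occurs in exactly one level of `outer`."""
    parents = defaultdict(set)
    for i, o in zip(inner, outer):
        parents[i].add(o)
    return all(len(p) == 1 for p in parents.values())

# Hypothetical data: pupils belong to a single school (nested) ...
pupil = ["p1", "p2", "p3", "p4"]
school = ["s1", "s1", "s2", "s2"]
# ... whereas every subject responds to every item (crossed).
subject = ["a", "a", "b", "b"]
item = ["i1", "i2", "i1", "i2"]

print(is_nested(pupil, school))   # True
print(is_nested(item, subject))   # False
```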
non-R
- ADMB. Automatic Differentiation Model Builder. Mostly used in Forestry/Fish/Wildlife. Started out as a commercial product, but now open-source. Non-linear models handled. ADMB-RE, implements random effects in non-linear models via Laplace, importance sampling, GHQ in some cases.
- SAS. Commercial. Full-featured.
  - PROC MIXED implements modern LMMs; it is very widely used with lots of examples, but can be very slow.
  - PROC GLIMMIX added generalized models; it now incorporates Laplace approximation and adaptive Gaussian quadrature, but falls back to PQL for models with complex correlation structures. It also has other features such as simpler syntax to request predictable functions of random effects.
  - HPMIXED is "High Performance" to address the slow speed of MIXED, but low-featured.
  - PROC NLMIXED is for non-linear and linear models (i.e. models that cannot be fitted in PROC MIXED/GLIMMIX, such as those with unusual variance-covariance structures or variances that are functions of fixed or random predictors). It also fits GLMMs via Laplace/GHQ (but not crossed effects). Multiple denominator degrees of freedom methods (Kenward-Roger, Satterthwaite, Containment).
- ASREML. Commercial; free licenses available for academic and developing-country use. Available as a standalone program, as an R package (ASREML-R), or in Genstat. Uses sparse matrices and Average Information for speed. Widely used in plant and animal breeding. Numerous error structures supported. Splines well-integrated. Generalized models: PQL only, warnings in documentation. Wald-type tests. Constraints on parameters allowed.
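The Laplace approximation mentioned for ADMB-RE and PROC GLIMMIX replaces the integral over a random effect by a Gaussian integral centered at the mode of the integrand. The sketch below handles a single scalar random effect and uses finite-difference derivatives as an illustrative shortcut; it is not the algorithm of any listed package. The function names and the example cluster are hypothetical.

```python
import math

def laplace_integral(h, b0=0.0, step=1e-4, n_iter=50):
    """Approximate the integral of exp(h(b)) over the real line.

    h must be smooth with a single interior maximum. Derivatives are taken
    by central finite differences and the mode is found by Newton's method.
    """
    b = b0
    for _ in range(n_iter):
        d1 = (h(b + step) - h(b - step)) / (2 * step)
        d2 = (h(b + step) - 2 * h(b) + h(b - step)) / step ** 2
        delta = d1 / d2
        b -= delta                       # Newton step towards the mode
        if abs(delta) < 1e-10:
            break
    d2 = (h(b + step) - 2 * h(b) + h(b - step)) / step ** 2
    # Gaussian integral with the curvature of h at the mode.
    return math.exp(h(b)) * math.sqrt(2 * math.pi / -d2)

def log_joint(b, y=(1, 0, 1), sigma=1.0):
    """Log of (conditional likelihood x random-intercept density) for one cluster."""
    value = -0.5 * math.log(2 * math.pi * sigma ** 2) - b * b / (2 * sigma ** 2)
    for yi in y:
        p = 1.0 / (1.0 + math.exp(-b))   # logit link, no fixed effects
        value += math.log(p if yi == 1 else 1.0 - p)
    return value

print(laplace_integral(log_joint))       # approximate marginal likelihood
```

When the integrand is exactly Gaussian the approximation is exact, which gives a convenient sanity check.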
(To add: npmlreg, regress (from Gabor Grothendieck))
- Linear mixed models
- Generalized linear mixed models
- Nonlinear mixed models and other extensions
- Interfaces from R to other systems
- Accessor methods within R
Linear mixed models
package | function | estimation | inference (tests) | inference (confidence intervals) | random effects (G structure) | residuals (R structure) | other |
nlme | lme | ML, REML | Wald (summary), likelihood ratio test (anova), sequential and marginal conditional F tests (anova) | Wald intervals on fixed and RE parameters (intervals) | multiple (nested) random effects; diagonal, blocked structures (pdClasses); crossed possible, but slow | spatial and temporal correlations (corStruct), continuous and discrete heteroscedasticity (varStruct) | |
lme4 | lmer | ML, REML | F statistics (sans denominator df: summary), likelihood ratio test (anova), post-hoc MCMC (mcmcsamp) | post-hoc MCMC (mcmcsamp) | nested and crossed RE, diagonal or block diagonal | none | |
lme4a | lmer | as above | as above | as above + likelihood profiles, fast parametric bootstrapping (bootMer) | as above | none | |
lmm | ecmeml.lmm | ML (ECME algorithm) | | | | | |
lmm | fastml.lmm | ML (rapidly converging algorithm) | | | | | |
asreml | asreml | Sparse matrix, Average Information REML | Wald anova | Standard errors | Multiple crossed/nested/blocked/splines | (Blocked) AR1xAR1, Matern, Factor Analytic, Heteroskedastic | |
statmod | mixedModel2 | REML | | | | | |
SAS | PROC MIXED | REML, ML, MIVQUE0, or Type1–Type3 (method= option) | Wald t and F tests | | multiple, complex (the covariance structure can be defined with the type option in the random statement) | | |
SAS | PROC GLIMMIX | pseudo-likelihood (default), Laplace, GHQ, REML, PQL | Wald, LRT (COVTEST statement), Type III test for fixed effects | Wald (default), LRT | multiple, nested or crossed | | |
SAS | HPMIXED | REML | Wald t, F test, Type III test and chi-square test | Wald intervals on fixed and random effects (CL option) | multiple, complex | | |
HLM | HLM | REML, FML | | | multilevel, nested and/or crossed random effects | | |
MLWiN | | ML, MCMC | | | multilevel, nested/crossed random effects | | |
Stata? | xtmixed / xtreg (random-intercept model) | REML, ML | Wald, LR test (with ML) | Wald | multilevel, nested/crossed, 4 types of covariance structure, diagonal-blocked structures, heteroskedastic random effects | heteroskedastic (residuals()); independent/exchangeable/unstructured/banded/exponential | |
GLMMs
package | function | estimation | inference (tests) | inference (confidence intervals) | families | random effects | other |
lme4 | glmer | Laplace, AGHQ | Wald (summary), LRT (anova), simulation tests of simple random effects (RLRsim package) | Wald (by hand) | Poisson, binomial | multiple: nested, crossed | |
lme4a | glmer | Laplace, AGHQ | Wald (summary), LRT (anova) | Wald (by hand); eventually, likelihood profiles | Poisson, binomial | multiple: nested, crossed | |
glmmML | glmmML | Laplace, AGHQ | Wald | | Poisson, binomial [logit, cloglog] | single | |
glmmAK | logpoissonRE | MCMC | Wald | | Poisson | single (normal or G-spline) | |
MCMCglmm | MCMCglmm | MCMC | 'Bayesian p-value' | credible intervals (coda::HPDinterval) | Gaussian, Poisson, categorical, multinomial, exponential, geometric, various zero-inflated/altered | multiple, complex | |
MASS | glmmPQL | PQL | Wald (summary) | Wald | binomial, Poisson, Gamma, … (see ?family) | spatial/temporal correlation structures (?nlme::corClasses) | |
glmmNP | | GHQ/Expectation-maximization | | | many (see gamlss.family in the gamlss.dist package) | single ("two-level") | |
glmmBUGS | glmmBUGS | MCMC | | | Poisson, binomial | spatial effects | |
hglm | hglm or hglm2 | hierarchical likelihood | Wald (summary) | | see ?family | | |
HGLMMM | HGLMfit | hierarchical likelihood, first-order Laplace ? | Wald (summary), LRT (HGLMLRTest()) | | binomial (logit), Poisson (log), normal (identity), Gamma (log, inverse) | complex, multiple | profile(LapFix=TRUE) |
bnlogl | | Monte Carlo sampling | | | Bernoulli (logit link) | | |
glmmADMB | glmm.admb | Laplace | Wald (summary), LRT (anova), MCMC | | Poisson, negative binomial, Bernoulli (+ zero-inflation) | single (multiple under development) | profiles |
glmm | | GHQ | Wald (summary) | Wald (by hand) | see ?family | single | |
inla | | nested Laplace | | | Poisson, binomial [logit, probit, cloglog], negative binomial, … | spatial and temporal correlation models | |
SAS PROC GLIMMIX | PROC GLIMMIX | pseudo-likelihood (default), Laplace, GHQ, REML, PQL | Wald, LRT (COVTEST statement), Type III test for fixed effects | Wald (default), LRT | binomial, Poisson, Gamma (check the Dist option) | multiple, nested and crossed | profile or non-profile |
SAS PROC NLMIXED | PROC NLMIXED | GHQ, first-order method, … (check the "method=" option), Laplace (QPOINTS=1 option) | Wald, LRT | Wald | normal, binomial, Poisson, binary, Gamma, negative binomial, general (custom defined), zero-inflated | number of random effects < 5; limited to only 2 levels | |
NLMMs and other extensions
package | function | estimation | inference (tests) | inference (confidence intervals) | families | random effects | other |
nlme | nlme | ML or REML | Wald t (summary), Wald F (anova) | use intervals() | no specific family required ? | nested | |
lme4 | nlmer | Laplace or PQL (method option) | Wald (summary) | Wald (by hand?) | no family required | nested or crossed | |
nlmixr | nlmixr | fo, foi, foce, focei, saem (est option) | Wald | Wald | no family required | nested | |
Accessors
accessor | description | lme (nlme) | glmmPQL (MASS) | [g]lmer (lme4) | [g]lmer (lme4a) | MCMCglmm | glmm.admb |
summary | | estimate, std err, t, df, p | estimate, std err, t, df, p | lmer: estimate, std err, t; glmer: est, std err, Z, p (Wald/asymptotic) | like lme4 | post.mean, CI, eff.sample | estimate, std.error, z values, p |
coef | all coefficients (predicted values for each group) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
fixef | fixed effect parameters | ✓ | ✓ | ✓ | ✓ | ✓ | |
ranef | random effect estimates | ✓ | ✓ | ✓ | ✓ | ✓ | |
logLik | (marginal) log-likelihood | ✓ | ✓ | ✓ | ✓ | ✓ | |
AIC | marginal AIC | ✓ | ✓ | ✓ | ✓ | ✓ | |
confint | confidence intervals | ✓ | ✓ | ✓ | |||
intervals | confidence intervals | ✓ | ✓ | ||||
plot | diagnostic plots | ✓ | ✓ (not diagnostic plots) | ✓ | |||
predict | predicted values, allowing new data | ✓ | ✓ | ✓ | ✓ | ✓ | |
simulate | simulated values from fitted model | ✓ | ✓ | ✓ (for lmer) | ✓ | ||
fitted | fitted values | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
update | update model | ✓ | ✓ | ✓ | ✓ | ✓ | |
residuals | | ✓ | ✓ | ✓ | ✓ | | |
VarCorr | variance-covariance matrices of random effects | ✓ | ✓ | ✓ | ✓ | ✓ | |
coefplot | plot of coefficients and confidence/credible intervals | ✓ | ✓ | ✓ | ✓ | ||
anova | | ✓ | ✓ | ✓ (no p-values) | ✓ (compare two models) | | |
drop1 | | ✓ (no LRT) | ✓ | ✓ (no p-values) | | | |