太阳总会升起，只是需要冉冉: R中glm模型的实现（总结）

（1）因变量>20&<20
>data("plasma", package = "HSAUR")
>plasma_glm_1 <- glm(ESR ~ fibrinogen, data = plasma,family = binomial())
>plasma_glm_2 <- glm(ESR ~ fibrinogen + globulin, data = plasma,family = binomial())

画阴影图：
> data("plasma", package = "HSAUR")
> layout(matrix(1:2, ncol = 2))
> cdplot(ESR ~ fibrinogen, data = plasma)
> cdplot(ESR ~ globulin, data = plasma)
比较两个模型：anova(plasma_glm_1, plasma_glm_2, test = "Chisq")
比较1：confint(plasma_glm_1, parm = "fibrinogen")
2.exp(coef(plasma_glm_1)["fibrinogen"])
3.exp(confint(plasma_glm_1, parm = "fibrinogen"))

泡泡图：
> prob <- predict(plasma_glm_2, type = "response")
画点> plot(globulin ~ fibrinogen, data = plasma, xlim = c(2, 6),ylim = c(25, 55), pch = ".")
画圈> symbols(plasma$fibrinogen, plasma$globulin, circles = prob,add = TRUE)

(2)women's role ----fm1 <- cbind(agree, disagree) ~ sex + education
一共有两种情况，（1）是只做sex和education的主效应，（2）是主效应再加上两者的交互作用
> data("womensrole", package = "HSAUR")
> fm1 <- cbind(agree, disagree) ~ sex + education
> womensrole_glm_1 <- glm(fm1, data = womensrole,family = binomial())
> fm2 <- cbind(agree,disagree) ~ sex * education
> womensrole_glm_2 <- glm(fm2, data = womensrole,family = binomial())
*做预测曲线
（1）> role.fitted1 <- predict(womensrole_glm_1, type = "response")
（2）> role.fitted2 <- predict(womensrole_glm_2, type = "response")

先做一个语句>myplot <- function(role.fitted) {
f <- womensrole$sex == "Female"
plot(womensrole$education, role.fitted, type = "n",
ylab = "Probability of agreeing",
xlab = "Education", ylim = c(0,1))
lines(womensrole$education[!f], role.fitted[!f], lty = 1)
lines(womensrole$education[f], role.fitted[f], lty = 2)
lgtxt <- c("Fitted (Males)", "Fitted (Females)")
legend("topright", lgtxt, lty = 1:2, bty = "n")
y <- womensrole$agree / (womensrole$agree +
womensrole$disagree)
text(womensrole$education, y, ifelse(f, "\\VE", "\\MA"),
family = "HersheySerif", cex = 1.25)
}
再输出图形
（1）> myplot(role.fitted1)
（2）> myplot(role.fitted2)
*做残差散点图
> res <- residuals(womensrole_glm_2, type = "deviance")
> plot(predict(womensrole_glm_2), res,xlab="Fitted values", ylab = "Residuals",ylim = max(abs(res)) * c(-1,1))
> abline(h = 0, lty = 2)

(3)number ~ treat + age,
计数数据的两个问题：都是正值，不是正态分布。考虑用Poisson回归的GML,这种标准化线性模型，即对数函数来保证都是正值，且保证 Poisson error distribution另外的方法就是用quasi-likelihood
3.1Poisson回归
> data("polyps", package = "HSAUR")
> polyps_glm_1 <- glm(number ~ treat + age, data = polyps,family = poisson())
> summary(polyps_glm_1)
3.2quasi-likelihood回归
> data("polyps", package = "HSAUR")
> polyps_glm_2 <- glm(number ~ treat + age, data = polyps, family = quasipoisson())
> summary(polyps_glm_2)

**3.1比3.2的要更加显著，虽然两者都是显著的

太阳总会升起，只是需要冉冉

2008年11月27日星期四

R中glm模型的实现（总结）

2 条评论:

心理统计

博客归档

我的简介

我的博客列表