# R in Action reading notes (6) - Chapter 7: Basic statistical analysis (part 2)

2015/04/20 22:28

7.2 Frequency tables and contingency tables

> library(vcd)
> head(Arthritis)
  ID Treatment  Sex Age Improved
1 57   Treated Male  27     Some
2 46   Treated Male  29     None
3 77   Treated Male  30     None
4 17   Treated Male  32   Marked
5 36   Treated Male  46   Marked
6 23   Treated Male  58   Marked

7.2.1 Generating frequency tables

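table(var1, var2, ..., varN) Creates an N-way contingency table from N categorical variables (factors)

xtabs(formula, data) Creates an N-way contingency table from a formula and a matrix or data frame

prop.table(table, margins) Expresses table entries as fractions of the marginal table defined by margins

margin.table(table, margins) Computes the sum of table entries for the marginal table defined by margins

ftable(table) Creates a compact "flat" contingency table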
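
As a quick illustration of the functions listed above, here is a minimal sketch. It uses the built-in mtcars data rather than Arthritis, which is an assumption made only so the snippet runs without the vcd package; the variable names (cars, t1, t2, t3) are likewise made up for this example.

```r
# Built-in data, used here only so the example needs no extra packages.
cars <- transform(mtcars,
                  cyl  = factor(cyl),
                  gear = factor(gear),
                  am   = factor(am))

# Two equivalent ways to cross-tabulate two factors.
t1 <- table(cars$cyl, cars$gear)
t2 <- xtabs(~ cyl + gear, data = cars)

# Same counts either way; xtabs() also keeps the variable names as labels.
same_counts <- all(as.vector(t1) == as.vector(t2))

# Row totals and row proportions of the two-way table.
row_totals <- margin.table(t2, 1)
row_props  <- prop.table(t2, 1)

# A three-way table, flattened into a compact layout for printing.
t3   <- xtabs(~ cyl + gear + am, data = cars)
flat <- ftable(t3)

print(t2)
print(round(row_props, 3))
print(flat)
```

The formula interface of xtabs() tends to be more convenient once there are more than two variables, since the data frame is named only once.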

1. One-way tables

> mytable<-with(Arthritis,table(Improved))
> mytable
Improved
 None   Some Marked 
 42     14     28 

> prop.table(mytable)
Improved
 None      Some    Marked 
0.5000000 0.1666667 0.3333333 

2. Two-way tables

> mytable<-xtabs(~Treatment+Improved,data=Arthritis)
> mytable
 Improved
Treatment None Some Marked
 Placebo   29    7      7
 Treated   13    7     21

> margin.table(mytable,1)
Treatment
Placebo Treated 
 43      41 
> prop.table(mytable,1)
 Improved
Treatment   None      Some    Marked
 Placebo 0.6744186 0.1627907 0.1627907
 Treated 0.3170732 0.1707317 0.5121951

> margin.table(mytable,2)
Improved
 None   Some Marked 
 42     14     28 
> prop.table(mytable,2)
 Improved
Treatment      None      Some    Marked
 Placebo 0.6904762 0.5000000 0.2500000
 Treated 0.3095238 0.5000000 0.7500000

> prop.table(mytable)
 Improved
Treatment       None       Some     Marked
 Placebo 0.34523810 0.08333333 0.08333333
 Treated 0.15476190 0.08333333 0.25000000

> addmargins(mytable)
 Improved
Treatment None Some Marked Sum
 Placebo   29    7      7  43
 Treated   13    7     21  41
 Sum       42   14     28  84
> addmargins(prop.table(mytable))
 Improved
Treatment       None       Some     Marked        Sum
 Placebo 0.34523810 0.08333333 0.08333333 0.51190476
 Treated 0.15476190 0.08333333 0.25000000 0.48809524
 Sum     0.50000000 0.16666667 0.33333333 1.00000000

> addmargins(prop.table(mytable,1),2) # adds only the sum of each row
 Improved
Treatment      None      Some    Marked       Sum
 Placebo 0.6744186 0.1627907 0.1627907 1.0000000
 Treated 0.3170732 0.1707317 0.5121951 1.0000000
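
The effect of the margin argument can be checked directly. The sketch below rebuilds the Treatment x Improved table from the counts printed above (so it does not need vcd) and confirms which totals each call normalizes by; the object names are chosen here for illustration only.

```r
# Counts copied from the Treatment x Improved table in these notes.
counts <- matrix(c(29, 7,  7,
                   13, 7, 21),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Treatment = c("Placebo", "Treated"),
                                 Improved  = c("None", "Some", "Marked")))
tab <- as.table(counts)

cell_props <- prop.table(tab)      # each cell divided by the grand total
row_props  <- prop.table(tab, 1)   # each cell divided by its row total
col_props  <- prop.table(tab, 2)   # each cell divided by its column total

# addmargins(x, 2) sums across the columns, so it appends one "Sum" column
# and no "Sum" row.
with_row_sums <- addmargins(row_props, 2)
print(round(with_row_sums, 4))
```

In short, margin 1 refers to rows and margin 2 to columns, for both prop.table() and addmargins().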

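> library(gmodels)
> CrossTable(Arthritis$Treatment,Arthritis$Improved)

The CrossTable() function (from the gmodels package) has many options and can do many things, including computing row, column, and cell percentages.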

3. Multidimensional tables

> mytable<-xtabs(~Treatment+Sex+Improved,data=Arthritis)
> mytable
, , Improved = None
 Sex
Treatment Female Male
 Placebo     19   10
 Treated      6    7
, , Improved = Some
 Sex
Treatment Female Male
 Placebo      7    0
 Treated      5    2
, , Improved = Marked
 Sex
Treatment Female Male
 Placebo      6    1
 Treated     16    5
 
> ftable(mytable)
                 Improved None Some Marked
Treatment Sex                             
Placebo   Female            19    7      6
          Male              10    0      1
Treated   Female             6    5     16
          Male               7    2      5

> margin.table(mytable,c(1,3)) # marginal frequencies for Treatment x Improved, summed over Sex

 Improved
Treatment None Some Marked
 Placebo   29    7      7
 Treated   13    7     21
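
To see what margin.table() does with a three-way table, the following sketch rebuilds the Treatment x Sex x Improved array from the counts printed above and collapses it over Sex. It is a base-R illustration that does not need vcd; the names arr, tab3, and by_hand are made up for this example.

```r
# Counts copied from the three-way table printed above
# (Treatment x Sex x Improved); array() fills the first dimension fastest.
arr <- array(c(19,  6, 10, 7,    # Improved = None
                7,  5,  0, 2,    # Improved = Some
                6, 16,  1, 5),   # Improved = Marked
             dim = c(2, 2, 3),
             dimnames = list(Treatment = c("Placebo", "Treated"),
                             Sex       = c("Female", "Male"),
                             Improved  = c("None", "Some", "Marked")))
tab3 <- as.table(arr)

# Keep dimensions 1 and 3, summing over Sex (dimension 2).
treat_by_improved <- margin.table(tab3, c(1, 3))
print(treat_by_improved)

# apply() performs the same collapse and makes the summation explicit.
by_hand <- apply(arr, c(1, 3), sum)
```

The indices passed to margin.table() are the dimensions that are kept, not the ones that are summed away.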

7.2.2 Tests of independence

1. Chi-square test of independence

> library(vcd)
> mytable<-xtabs(~Treatment+Improved,data=Arthritis)
> chisq.test(mytable)
 Pearson's Chi-squared test
data:  mytable
X-squared = 13.055, df = 2, p-value = 0.001463 # p < 0.01, so independence of Treatment and Improved is rejected
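
The statistic reported by chisq.test() can be reproduced by hand, which may help make clear what is being tested. This is a minimal sketch using the counts from the two-way table above; the variable names are chosen here for illustration.

```r
# Counts copied from the Treatment x Improved table in these notes.
counts <- matrix(c(29, 7,  7,
                   13, 7, 21),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Treatment = c("Placebo", "Treated"),
                                 Improved  = c("None", "Some", "Marked")))

# Expected count under independence: row total * column total / grand total.
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson statistic, degrees of freedom, and upper-tail p-value.
x2  <- sum((counts - expected)^2 / expected)
dof <- (nrow(counts) - 1) * (ncol(counts) - 1)
p   <- pchisq(x2, dof, lower.tail = FALSE)

# chisq.test() should agree; the continuity correction only applies to 2 x 2 tables.
res <- chisq.test(counts)
cat(sprintf("X-squared = %.3f, df = %d, p-value = %.6f\n", x2, dof, p))
```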

2. Fisher's exact test

> fisher.test(mytable)
 Fisher's Exact Test for Count Data
data:  mytable
p-value = 0.001393
alternative hypothesis: two.sided

3. Cochran-Mantel-Haenszel test

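The mantelhaen.test() function performs the Cochran-Mantel-Haenszel chi-square test. Its null hypothesis is that two nominal variables are conditionally independent within each stratum of a third variable.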

> mytable<-xtabs(~Treatment+Improved+Sex,data=Arthritis)
> mantelhaen.test(mytable)
 Cochran-Mantel-Haenszel test
data:  mytable
Cochran-Mantel-Haenszel M^2 = 14.6323, df = 2,
p-value = 0.0006647
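
mantelhaen.test() needs a three-dimensional table whose last dimension is the stratifying variable. The sketch below shows one way to arrange that in base R, starting from the Treatment x Sex x Improved counts printed earlier in these notes; the names arr and by_sex are made up for this example.

```r
# Counts copied from the Treatment x Sex x Improved table in these notes;
# array() fills the first dimension fastest.
arr <- array(c(19,  6, 10, 7,    # Improved = None
                7,  5,  0, 2,    # Improved = Some
                6, 16,  1, 5),   # Improved = Marked
             dim = c(2, 2, 3),
             dimnames = list(Treatment = c("Placebo", "Treated"),
                             Sex       = c("Female", "Male"),
                             Improved  = c("None", "Some", "Marked")))

# Reorder to Treatment x Improved x Sex so that Sex becomes the strata.
by_sex <- aperm(arr, c(1, 3, 2))

# Tests whether Treatment and Improved are independent within each level of Sex.
res <- mantelhaen.test(by_sex)
print(res)
```

Passing a two-way table instead would fail, because the function has no strata to condition on.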

7.2.3 Measures of association

> mytable<-xtabs(~Treatment+Improved,data=Arthritis)
> assocstats(mytable)
                    X^2 df  P(> X^2)
Likelihood Ratio 13.530  2 0.0011536
Pearson          13.055  2 0.0014626
Phi-Coefficient   : 0.394 
Contingency Coeff.: 0.367 
Cramer's V        : 0.394 
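
The coefficients reported by assocstats() are simple functions of the Pearson chi-square statistic. As a rough check in base R, using the usual textbook formulas (an assumption about the definitions, not a reading of the vcd source code):

```r
# Counts copied from the Treatment x Improved table in these notes.
counts <- matrix(c(29, 7,  7,
                   13, 7, 21),
                 nrow = 2, byrow = TRUE)

n  <- sum(counts)
x2 <- unname(chisq.test(counts)$statistic)   # Pearson chi-square

phi         <- sqrt(x2 / n)                  # phi coefficient
contingency <- sqrt(x2 / (x2 + n))           # contingency coefficient
# Cramer's V rescales by the smaller table dimension minus one.
cramers_v   <- sqrt(x2 / (n * (min(dim(counts)) - 1)))

print(round(c(phi = phi, contingency = contingency, cramers_v = cramers_v), 3))
```

Because the smaller dimension of this table is 2, Cramer's V and the phi coefficient coincide here.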

7.2.5 Converting tables to a flat format

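> table2flat<-function(mytable){
+   df<-as.data.frame(mytable)          # one row per cell, counts in Freq (last column)
+   rows<-dim(df)[1]
+   cols<-dim(df)[2]
+   x<-NULL
+   for(i in 1:rows){
+     # Freq may be character or factor if the data frame was built with cbind()
+     n<-as.integer(as.character(df$Freq[i]))
+     for(j in seq_len(n)){             # seq_len() skips cells with a zero count
+       row<-df[i,c(1:(cols-1))]
+       x<-rbind(x,row)
+     }
+   }
+   row.names(x)<-c(1:dim(x)[1])
+   return(x)
+ }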

> treatment<-rep(c("Placebo","Treated"),times=3)
> improved<-rep(c("None","Some","Marked"),each=2)
> Freq<-c(29,13,7,17,7,21)
> mytable<-as.data.frame(cbind(treatment,improved,Freq))
> mydata<-table2flat(mytable)
> head(mydata)
  treatment improved
1   Placebo     None
2   Placebo     None
3   Placebo     None
4   Placebo     None
5   Placebo     None
6   Placebo     None
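
The same expansion can be written without explicit loops. The following is an alternative sketch, not the function from the book: it repeats each row index Freq times with rep(). The names freq_df, idx, and flat are made up for this example.

```r
treatment <- rep(c("Placebo", "Treated"), times = 3)
improved  <- rep(c("None", "Some", "Marked"), each = 2)
Freq      <- c(29, 13, 7, 17, 7, 21)

# data.frame() keeps Freq numeric, unlike as.data.frame(cbind(...)),
# which converts every column to character.
freq_df <- data.frame(treatment, improved, Freq)

# Repeat each row index Freq times, then drop the Freq column.
idx  <- rep(seq_len(nrow(freq_df)), times = freq_df$Freq)
flat <- freq_df[idx, c("treatment", "improved")]
row.names(flat) <- NULL

print(head(flat))
print(nrow(flat))
```

Rows with a zero count simply disappear, since rep() repeats them zero times, and growing a data frame with rbind() inside a loop is avoided.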
