[R] 반복측정 분산분석 및 사후분석 (tukey, bonferroni)

반복측정 분산분석이란?

반복측정 분산분석은 서로 연관이 있는 셋 이상의 집단비교에 사용됩니다. 예를 들어봅시다. 피험자 30명을 모집하고 세 종류의 초콜릿을 먹고 각 초콜릿의 점수를 0~100점 사이로 매기라고 하였습니다. 이때 초콜릿 점수의 차이가 있는지를 알고 싶을 경우 반복측정 분산분석을 사용합니다.

세가지 초콜릿을 먹은 피험자 집단이 같기 때문에 세 초콜릿의 점수 데이터는 서로 독립이 아닙니다. 그룹이 2개라면 paired t-test를 해야하는 상황입니다. 이런 경우에 사용하는 분석방법이 반복측정 분산분석입니다. 영

어로는 repeated measure ANOVA, 줄여서 RM ANOVA 라고 부릅니다.

반복측정 분산분석의 조건

아래 세가지 조건을 만족해야 합니다.

1. 정규성
2. 등분산성
3. 구형성

예시

1) 구형성 가정을 만족하는 경우

#1. 데이터 생성
choco_A=sample(30:100,30,replace=TRUE)
choco_B=sample(0:70,30,replace=TRUE)
choco_C=sample(10:80,30,replace=TRUE)

#2. 데이터를 stack 형태의 데이터프레임으로 변형
classA=data.frame(id=1:30,rep("A",length(choco_A)),choco_A)
classB=data.frame(id=1:30,rep("B",length(choco_B)),choco_B)
classC=data.frame(id=1:30,rep("C",length(choco_C)),choco_C)

colnames(classA)=c("id","x","y")
colnames(classB)=c("id","x","y")
colnames(classC)=c("id","x","y")

md=rbind(classA,classB,classC)
md$id=factor(md$id) #as factor
md$x=factor(md$x) #as factor


#3. 정규성검정

shapiro.test(choco_A)
shapiro.test(choco_B)
shapiro.test(choco_C)


#3. 등분산검정
library(lawstat)
levene.test(md$y,md$x,location="mean")


#4. 구형성 검정 (Mauchly's test)

library(rstatix)
res1=anova_test(data=md,dv=y,wid=id,within=x) 
get_anova_table(res1)


#3. 반복측정 분산분석
res=aov(y~x+Error(id/x),md)
summary(res) #p-value


#4. 사후분석

##bonferroni
with(md,pairwise.t.test(y,x,p.adjust.method="bonferroni",paired=T))


##tukey
library("nlme")
res2=lme(y~x, random =~1|id/x,md)

library("multcomp")
summary(glht(res2,linfct=mcp(x="Tukey")))

aov 결과

> summary(res) #p-value

Error: id
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals 29  13142   453.2               

Error: id:x
          Df Sum Sq Mean Sq F value  Pr(>F)    
x          2  15805    7902   19.73 2.9e-07 ***
Residuals 58  23226     400                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

본페로니 사후분석 결과

> with(md,pairwise.t.test(y,x,p.adjust.method="bonferroni",paired=T))

	Pairwise comparisons using paired t tests 

data:  y and x 

  A      B     
B 3e-06  -     
C 0.0016 0.0985

P value adjustment method: bonferroni

튜키 사후분석 결과

> summary(glht(res2,linfct=mcp(x="Tukey")))

	 Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts


Fit: lme.formula(fixed = y ~ x, data = md, random = ~1 | id/x)

Linear Hypotheses:
           Estimate Std. Error z value Pr(>|z|)    
B - A == 0  -31.967      5.167  -6.187   <0.001 ***
C - A == 0  -20.867      5.167  -4.039   <0.001 ***
C - B == 0   11.100      5.167   2.148   0.0804 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)

tukey 사후검정을 하려면 aov 함수를 써야하는데, aov 함수의 경우 구형성 가정이 깨지면 교정이 불가능하다. rstatix 패키지의 anova_test 를 이용하면 구형성에 대한 교정이 자동으로 이루어지는데 tukey 사후검정이 없다.

2) 구형성 가정을 만족하지 않는 경우

#1. 데이터 생성
choco_A=sample(30:100,30,replace=TRUE)
choco_B=sample(0:70,30,replace=TRUE)
choco_C=sample(10:80,30,replace=TRUE)

#2. 데이터를 stack 형태의 데이터프레임으로 변형
classA=data.frame(id=1:30,rep("A",length(choco_A)),choco_A)
classB=data.frame(id=1:30,rep("B",length(choco_B)),choco_B)
classC=data.frame(id=1:30,rep("C",length(choco_C)),choco_C)

colnames(classA)=c("id","x","y")
colnames(classB)=c("id","x","y")
colnames(classC)=c("id","x","y")

md=rbind(classA,classB,classC)
md$id=factor(md$id) #as factor
md$x=factor(md$x) #as factor


#3. 정규성검정

shapiro.test(choco_A)
shapiro.test(choco_B)
shapiro.test(choco_C)


#4. 등분산검정
library(lawstat)
levene.test(md$y,md$x,location="mean")


#5. 구형성 검정 (Mauchly's test) correction and anove

res1=anova_test(data=md,dv=y,wid=id,within=x) 


#6. 사후분석

##bonferroni
pairwise_t_test(md,y~x,paired=TRUE, p.adjust.method = "bonferroni" )

aov 결과

> res1
ANOVA Table (type III tests)

$ANOVA
  Effect DFn DFd      F       p p<.05   ges
1      x   2  58 19.734 2.9e-07     * 0.303

$`Mauchly's Test for Sphericity`
  Effect     W     p p<.05
1      x 0.992 0.893      

$`Sphericity Corrections`
  Effect   GGe      DF[GG]    p[GG] p[GG]<.05   HFe      DF[HF]   p[HF] p[HF]<.05
1      x 0.992 1.98, 57.54 3.19e-07         * 1.065 2.13, 61.74 2.9e-07         *

본페로니 사후분석 결과

> pairwise_t_test(md,y~x,paired=TRUE, p.adjust.method = "bonferroni" )
# A tibble: 3 x 10
  .y.   group1 group2    n1    n2 statistic    df          p      p.adj p.adj.signif
* <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>      <dbl>      <dbl> <chr>       
1 y     A      B         30    30      6.16    29 0.00000101 0.00000303 ****        
2 y     A      C         30    30      3.90    29 0.000529   0.002      **          
3 y     B      C         30    30     -2.24    29 0.033      0.098      ns

저작자표시 비영리 변경금지

내 블로그 - 관리자 홈 전환	`Q` `Q`
새 글 쓰기	`W` `W`

글 수정 (권한 있는 경우)	`E` `E`
댓글 영역으로 이동	`C` `C`

이 페이지의 URL 복사	`S` `S`
맨 위로 이동	`T` `T`
티스토리 홈 이동	`H` `H`
단축키 안내	`Shift` + `/` `⇧` + `/`

R 통계분석

[R] 반복측정 분산분석 및 사후분석 (tukey, bonferroni)

반복측정 분산분석이란?

반복측정 분산분석의 조건

예시

1) 구형성 가정을 만족하는 경우

2) 구형성 가정을 만족하지 않는 경우

댓글

티스토리툴바

개인정보

단축키

내 블로그

블로그 게시글

모든 영역