DATOR


iris data analysis II

Comparing rpart(), tree::tree(), and ctree()

The input variables are doubles,
the target variable is a factor.
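The variable types stated above can be checked directly on the built-in iris data frame. This is a minimal base-R sketch; the names input_types and target_is_factor are illustrative and do not appear in the original post.

```r
# Storage type of each of the four input columns
input_types <- sapply(iris[, 1:4], typeof)
print(input_types)

# TRUE if the target column is a factor
target_is_factor <- is.factor(iris$Species)
print(target_is_factor)
print(levels(iris$Species))
```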


(1) tree()
> library(tree)
> ir.tr <- tree(Species ~ ., iris)
> ir.tr

node), split, n, deviance, yval, (yprob)
      * denotes terminal node

 1) root 150 329.600 setosa ( 0.33333 0.33333 0.33333 )
   2) Petal.Length < 2.45 50   0.000 setosa ( 1.00000 0.00000 0.00000 ) *
   3) Petal.Length > 2.45 100 138.600 versicolor ( 0.00000 0.50000 0.50000 )
     6) Petal.Width < 1.75 54  33.320 versicolor ( 0.00000 0.90741 0.09259 )
      12) Petal.Length < 4.95 48   9.721 versicolor ( 0.00000 0.97917 0.02083 )
        24) Sepal.Length < 5.15 5   5.004 versicolor ( 0.00000 0.80000 0.20000 ) *
        25) Sepal.Length > 5.15 43   0.000 versicolor ( 0.00000 1.00000 0.00000 ) *
      13) Petal.Length > 4.95 6   7.638 virginica ( 0.00000 0.33333 0.66667 ) *
     7) Petal.Width > 1.75 46   9.635 virginica ( 0.00000 0.02174 0.97826 )
      14) Petal.Length < 4.95 6   5.407 virginica ( 0.00000 0.16667 0.83333 ) *
      15) Petal.Length > 4.95 40   0.000 virginica ( 0.00000 0.00000 1.00000 ) *

 

> summary(ir.tr)

Classification tree:
tree(formula = Species ~ ., data = iris)
Variables actually used in tree construction:
[1] "Petal.Length" "Petal.Width" "Sepal.Length"
Number of terminal nodes: 6
Residual mean deviance: 0.1253 = 18.05 / 144
Misclassification error rate: 0.02667 = 4 / 150
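The deviance figures in the output above can be reproduced by hand. The sketch below assumes the multinomial deviance used for classification trees, -2 * sum(n_k * log(p_k)) over the classes present in a node. node_deviance is a hypothetical helper written for this illustration, and the class counts and leaf deviances are read off the printed tree.

```r
# Hypothetical helper: multinomial deviance of one node from its class counts
node_deviance <- function(counts) {
  counts <- counts[counts > 0]      # empty classes contribute nothing
  p <- counts / sum(counts)
  -2 * sum(counts * log(p))
}

# Root node: 50 rows of each species
root_dev <- node_deviance(c(50, 50, 50))
print(round(root_dev, 1))           # compare with 329.600 in the printout

# Node 24: 5 rows, 4 versicolor and 1 virginica
node24_dev <- node_deviance(c(4, 1))
print(round(node24_dev, 3))         # compare with 5.004 in the printout

# Residual mean deviance: sum of the six leaf deviances
# divided by (number of rows - number of terminal nodes)
leaf_dev <- c(0, 5.004, 0, 7.638, 5.407, 0)
resid_mean_dev <- sum(leaf_dev) / (150 - 6)
print(round(resid_mean_dev, 4))     # compare with 0.1253 in summary()
```

The denominator 144 in the summary is therefore 150 rows minus 6 terminal nodes.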


> plot(ir.tr); text(ir.tr)

 


(2) rpart(), rpart.plot::prp()
 
> library(rpart)
> (m <- rpart(Species ~ ., data = iris))
n= 150

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 150 100 setosa (0.33333333 0.33333333 0.33333333)
  2) Petal.Length< 2.45 50   0 setosa (1.00000000 0.00000000 0.00000000) *
  3) Petal.Length>=2.45 100  50 versicolor (0.00000000 0.50000000 0.50000000)
    6) Petal.Width< 1.75 54   5 versicolor (0.00000000 0.90740741 0.09259259) *
    7) Petal.Width>=1.75 46   1 virginica (0.00000000 0.02173913 0.97826087) *
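The loss column counts the rows in each node that do not belong to the predicted class, so the three leaves above imply 0 + 5 + 1 = 6 training errors. A minimal sketch to confirm this with predict(), assuming the same default rpart() fit as in the post; the names pred, conf and n_wrong are illustrative.

```r
library(rpart)

# Same default fit as in the post
m <- rpart(Species ~ ., data = iris)

# Predicted class for every training row
pred <- predict(m, iris, type = "class")

# Confusion matrix: rows are predictions, columns are true species
conf <- table(predicted = pred, actual = iris$Species)
print(conf)

# Number of misclassified training rows
n_wrong <- sum(pred != iris$Species)
print(n_wrong)
```

Compared with tree(), which reported 4 errors with 6 leaves, the default rpart() fit stops at 3 leaves and accepts a slightly higher training error.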
 
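compress = TRUE draws the tree more compactly;
margin adds white space around the plot, and cex in text() sets the character size.
> plot(m, compress = TRUE, margin = .2)

> text(m, cex = 1.5)
# Petal.Length < 2.45: the node is setosa;
# 50 of its 50 rows are setosa.
# Petal.Length >= 2.45 and Petal.Width < 1.8 (1.75 in the printed tree): the node is versicolor; 49 of its 54 rows are versicolor.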
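The node counts in the two notes above can be verified without any tree package by applying the split rules to iris directly. This is a base-R sketch; the thresholds are taken from the printed rpart output and the variable names left_rows and mid_rows are illustrative.

```r
# Rows that go left at the root split
left_rows <- iris$Species[iris$Petal.Length < 2.45]
print(table(left_rows))

# Rows that go right at the root and then left on Petal.Width
mid_rows <- iris$Species[iris$Petal.Length >= 2.45 & iris$Petal.Width < 1.75]
print(table(mid_rows))
```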
> par(mfrow=c(2,2))
> library(rpart.plot)
> prp(m, type = 4, extra = 2, main = "extra=2")
> prp(m, type = 4, extra = 1, main = "extra=1")

# extra=4, class models: probability per class of observations in the node
> prp(m, type = 4, extra = 4, main = "extra=4")
> prp(m, type = 4, extra = 104)


(3) ctree(): iris data, with all rows used as training data
> library(party)
Loading required package: grid
Loading required package: mvtnorm
Loading required package: modeltools
Loading required package: stats4
Loading required package: strucchange
Loading required package: zoo

Attaching package: ‘zoo’

The following objects are masked from ‘package:base’:

    as.Date, as.Date.numeric

Loading required package: sandwich
 
> (m <- ctree(Species~., data = iris ))
 
Conditional inference tree with 4 terminal nodes
 
Response: Species
Inputs: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
Number of observations: 150
 
1) Petal.Length <= 1.9; criterion = 1, statistic = 140.264
  2)* weights = 50
1) Petal.Length > 1.9
  3) Petal.Width <= 1.7; criterion = 1, statistic = 67.894
    4) Petal.Length <= 4.8; criterion = 0.999, statistic = 13.865
      5)* weights = 46
    4) Petal.Length > 4.8
      6)* weights = 8
  3) Petal.Width > 1.7
    7)* weights = 46
 
> plot(m)
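The weights printed for the four terminal nodes are the numbers of training rows that reach each leaf. As a rough check that does not need the party package, the sketch below applies the printed split rules by hand in base R. The node labels and the names leaf, leaf_sizes and species_by_leaf are illustrative, and the thresholds are copied from the ctree() output above.

```r
# Assign every row to a terminal node using the printed ctree() splits
leaf <- with(iris,
  ifelse(Petal.Length <= 1.9, "node 2",
  ifelse(Petal.Width  >  1.7, "node 7",
  ifelse(Petal.Length <= 4.8, "node 5", "node 6"))))

# Leaf sizes, to compare with the weights in the printout
leaf_sizes <- table(leaf)
print(leaf_sizes)

# Species composition of each leaf
species_by_leaf <- table(leaf, iris$Species)
print(species_by_leaf)
```

Unlike tree() and rpart(), which place a cut point midway between observed values (2.45, 1.75), ctree() reports splits at observed values (1.9, 1.7, 4.8), which is why the thresholds differ even where the partitions agree.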

