In order to illustrate the problem of choosing a classification model, consider some simulated data,

> n = 500
> set.seed(1)
> X = rnorm(n)
> ma = 10-(X+1.5)^2*2
> mb = -10+(X-1.5)^2*2
> M = cbind(ma,mb)
> set.seed(1)
> Z = sample(1:2,size=n,replace=TRUE)
> Y = ma*(Z==1)+mb*(Z==2)+rnorm(n)*5
> df = data.frame(Z=as.factor(Z),X,Y)

A first strategy is to split the dataset into two parts, a training dataset and a testing dataset.

> df1 = training = df[1:300,]
> df2 = testing = df[301:500,]

The Holdout Method: … Continue reading Choosing a Classifier →
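The excerpt stops before the holdout evaluation itself. As a rough sketch of how the training/testing split can be used, the code below rebuilds the simulated data, fits a classifier on the first 300 rows, and measures its misclassification rate on the remaining 200. The choice of a logistic regression (`glm` with `family = binomial`), the 0.5 cutoff, and the names `fit`, `p_hat`, `Z_hat`, `confusion` and `error_rate` are assumptions made for illustration; they are not taken from the original post, which may use a different model.

```r
# Simulated data, reproduced from the excerpt
n <- 500
set.seed(1)
X <- rnorm(n)
ma <- 10 - (X + 1.5)^2 * 2
mb <- -10 + (X - 1.5)^2 * 2
set.seed(1)
Z <- sample(1:2, size = n, replace = TRUE)
Y <- ma * (Z == 1) + mb * (Z == 2) + rnorm(n) * 5
df <- data.frame(Z = as.factor(Z), X, Y)

# Holdout split: first 300 rows for training, last 200 for testing
training <- df[1:300, ]
testing <- df[301:500, ]

# Assumption: a plain logistic regression stands in for "a classifier".
# With a two-level factor response, glm models the probability of the
# second level, here Z == "2".
fit <- glm(Z ~ X + Y, data = training, family = binomial)

# Predicted probability of class "2" on data the model has not seen
p_hat <- predict(fit, newdata = testing, type = "response")

# Assumption: classify as "2" when the predicted probability exceeds 0.5
Z_hat <- ifelse(p_hat > 0.5, "2", "1")

# Confusion table and holdout misclassification rate
confusion <- table(observed = testing$Z, predicted = Z_hat)
error_rate <- mean(Z_hat != as.character(testing$Z))

print(confusion)
print(error_rate)
```

Because the error rate is computed only on rows the model never saw during fitting, it is less optimistic than the in-sample error. Any other classifier could be swapped in for `glm` and compared on the same `testing` rows.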
from R-bloggers http://ift.tt/1TPDOne
via IFTTT