In this post we will look at one of the simplest yet effective supervised machine learning algorithms: k-nearest neighbours (KNN), implemented here with the caret package in R. KNN is called a lazy learner because it has no explicit training phase; the model simply stores the training data and defers all computation until prediction time.
# Load package
library(caret)

# Attach data
data(iris)
attach(iris)

# Split into train and test sets
# NOTE: KNN is a lazy algorithm, so strictly speaking we do not need a
# separate training phase, but a held-out test set lets us estimate accuracy.
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.7, 0.3))
trainData <- iris[ind == 1, ]
testData <- iris[ind == 2, ]

# Build model
model_knn3 <- knn3(Species ~ ., k = 5, data = trainData)
model_knn3

# Predict the best class for each test observation
predict(model_knn3, testData, type = 'class')

# Compare predictions with the true classes (confusion table)
table(true = testData$Species, predicted = predict(model_knn3, testData, type = 'class'))

Using the above code we can classify the iris data set correctly about 99% of the time.
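To turn the confusion table into a single accuracy figure, you can divide the diagonal (correct predictions) by the total number of test observations. A minimal sketch, assuming the `model_knn3` and `testData` objects created in the code above:

```r
# Assumes model_knn3 and testData exist from the code above
pred <- predict(model_knn3, testData, type = "class")
conf <- table(true = testData$Species, predicted = pred)

# Accuracy = correctly classified cases / total test cases
accuracy <- sum(diag(conf)) / sum(conf)
accuracy
```

Because the train/test split uses `sample()` without a fixed seed, the exact accuracy will vary slightly from run to run; calling `set.seed()` before the split makes the result reproducible.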