In this post we will look at one of the simplest yet effective supervised machine learning algorithms: k-nearest neighbours (KNN), implemented here with the caret package in R. KNN is called a lazy learner because it has no explicit training phase; the model simply stores the training data and defers all computation until prediction time.
# Load package
library(caret)

# Attach data
data(iris)
attach(iris)

# Split into train and test sets
# NOTE: KNN is a lazy algorithm, so strictly speaking we do not need a
# separate training phase, but a held-out test set lets us estimate accuracy.
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.7, 0.3))
trainData <- iris[ind == 1, ]
testData <- iris[ind == 2, ]

# Build model
model_knn3 <- knn3(Species ~ ., k = 5, data = trainData)
model_knn3

# Predict the best class for each test observation
predict(model_knn3, testData, type = 'class')

# Compare predictions with the true classes (confusion table)
table(true = testData$Species, predicted = predict(model_knn3, testData, type = 'class'))

Using the above code we can classify the iris data set correctly about 99% of the time.
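To turn the confusion table into a single accuracy figure, you can divide the diagonal (correct predictions) by the total number of test observations. A minimal sketch, assuming the `model_knn3` and `testData` objects created in the code above:

```r
# Assumes model_knn3 and testData exist from the code above
pred <- predict(model_knn3, testData, type = "class")
conf <- table(true = testData$Species, predicted = pred)

# Accuracy = correctly classified cases / total test cases
accuracy <- sum(diag(conf)) / sum(conf)
accuracy
```

Because the train/test split uses `sample()` without a fixed seed, the exact accuracy will vary slightly from run to run; calling `set.seed()` before the split makes the result reproducible.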