A4 – Machine Learning

Description

  • In Assignment 4 you shall implement the Naïve Bayes machine learning algorithm and apply it to two datasets
  • It can be implemented in any programming language you like
  • You can work alone or in a group of two students
  • You shall present your application and code at an oral examination
  • Note that you are not required to build a REST web service for this assignment

 

Requirements

Grade E
  • Implement the Naïve Bayes algorithm using the code structure below (you may add more classes and methods if needed)
  • Train the model on the Iris and Banknote authentication datasets (see the Datasets page)
  • Calculate the classification accuracy for both datasets (use all data for both training and testing)
Grade C-D
  • Implement code for generating confusion matrices, using the code structure below
Grade A-B
  • Implement code for n-fold cross-validation, using the code structure below
  • It shall be possible to use 3, 5 or 10 folds (it is fine if your implementation also supports other fold counts)
  • Calculate the accuracy score for 5-fold cross-validation on both datasets

 

Code structure requirements

NaiveBayes class
void fit ( X:float[][], y:int[] )
    Trains the model on input examples X and labels y
preds:int[] predict ( X:float[][] )
    Classifies examples X and returns a list of predictions
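
As a rough illustration of the fit/predict structure above, here is a minimal Python sketch. It assumes a Gaussian likelihood for each feature, which the assignment text does not state explicitly but which is a common choice for numeric datasets such as Iris. The attribute names, the smoothing constant 1e-9 and the toy data are illustrative assumptions, not part of the required structure.

```python
import math


class NaiveBayes:
    """Gaussian Naive Bayes sketch; the Gaussian likelihood is an assumption."""

    def fit(self, X, y):
        # Group the training rows by class label.
        rows_by_label = {}
        for row, label in zip(X, y):
            rows_by_label.setdefault(label, []).append(row)

        self.labels = sorted(rows_by_label)
        self.log_priors = {}
        self.means = {}
        self.variances = {}
        for label, rows in rows_by_label.items():
            n = len(rows)
            self.log_priors[label] = math.log(n / len(X))
            columns = list(zip(*rows))
            means = [sum(col) / n for col in columns]
            # Population variance plus a small constant (assumed value) so a
            # constant feature never causes a division by zero.
            variances = [
                sum((v - m) ** 2 for v in col) / n + 1e-9
                for col, m in zip(columns, means)
            ]
            self.means[label] = means
            self.variances[label] = variances

    def _log_posterior(self, row, label):
        # log P(label) plus the sum of per-feature Gaussian log densities.
        total = self.log_priors[label]
        for x, m, var in zip(row, self.means[label], self.variances[label]):
            total += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        return total

    def predict(self, X):
        # Pick the label with the highest log posterior for each example.
        return [
            max(self.labels, key=lambda label: self._log_posterior(row, label))
            for row in X
        ]


# Toy data (made up for illustration, not taken from Iris or Banknote).
X = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [5.0, 6.0], [5.2, 5.9], [4.8, 6.1]]
y = [0, 0, 0, 1, 1, 1]
model = NaiveBayes()
model.fit(X, y)
preds = model.predict([[1.1, 2.0], [5.1, 6.0]])
```

Working with log probabilities instead of multiplying raw densities avoids numerical underflow when there are many features.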

Other methods
accuracy:float accuracy_score ( preds:int[], y:int[] )
    Calculates accuracy score for a list of predictions
conf_matrix:int[][] confusion_matrix ( preds:int[], y:int[] )
    Generates a confusion matrix and returns the result as an integer matrix
preds:int[] crossval_predict ( X:float[][], y:int[], folds:int )
    Runs n-fold cross-validation and returns a list of predictions
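
The three helper methods above could be sketched in Python roughly as follows. Several details are assumptions rather than requirements from the assignment: labels are taken to be integers 0..k-1, the confusion matrix uses rows for true labels and columns for predicted labels, and fold membership is assigned by index modulo the number of folds without shuffling. MajorityClassifier is a hypothetical stand-in that only keeps the sketch self-contained; in your solution the NaiveBayes class takes its place.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical stand-in so this sketch runs alone; use NaiveBayes instead."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]

    def predict(self, X):
        return [self.label for _ in X]


def accuracy_score(preds, y):
    # Fraction of predictions that match the true labels.
    correct = sum(1 for p, t in zip(preds, y) if p == t)
    return correct / len(y)


def confusion_matrix(preds, y):
    # Assumed layout: rows are true labels, columns are predicted labels,
    # with labels numbered 0..k-1.
    k = max(max(preds), max(y)) + 1
    matrix = [[0] * k for _ in range(k)]
    for p, t in zip(preds, y):
        matrix[t][p] += 1
    return matrix


def crossval_predict(X, y, folds):
    # Assumed split: example i belongs to fold i % folds (no shuffling).
    preds = [None] * len(y)
    for fold in range(folds):
        test_idx = [i for i in range(len(y)) if i % folds == fold]
        train_idx = [i for i in range(len(y)) if i % folds != fold]
        model = MajorityClassifier()  # replace with NaiveBayes()
        model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_preds = model.predict([X[i] for i in test_idx])
        # Store each prediction at the position of its original example.
        for i, p in zip(test_idx, fold_preds):
            preds[i] = p
    return preds
```

Because every example lands in exactly one test fold, the returned list has one prediction per example and can be passed directly to accuracy_score or confusion_matrix together with y.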

 

Test cases

You can verify your results against those from Web ML Experimenter. The Iris dataset is built into Web ML (click the Try Iris dataset button), and the Banknote authentication dataset can be uploaded from the csv file. Note that the cross-validation results can differ slightly because the data may be split into folds differently, but your accuracy should be close to the accuracy reported by Web ML.
