In assignment 4 you shall classify some data sets using different machine learning algorithms. You can work alone or in a group of two students. You shall present your application and code at an oral examination.
In this assignment you can choose to use either Java and the WEKA library, or Python and Scikit-learn. For the highest grades you need to use Python. For Python code it is recommended to use Jupyter Notebook.
- Classify the Spiral data set using a Linear and a Neural Network classifier
- You can either write Java code and use the Weka library, or Python code and use the Scikit-learn framework
- In Weka you shall use Logistic and MultilayerPerceptron, and in Scikit-learn SGDClassifier and MLPClassifier
- For the Neural Network classifier you shall use a single hidden layer with 72 nodes
- The Spiral data set is two-dimensional with three classes
- Plot the data set in a scatterplot, using for example JavaFX Charts in Java or Matplotlib in Python (Excel or similar tools are not allowed)
- Why is the accuracy of the Linear classifier very low compared to the Neural Network classifier?
- Write code to show the confusion matrix for both the Linear and Neural Network classifier
- How shall the confusion matrix be interpreted? What does it tell us that the accuracy metric doesn’t?
- Classify the MNIST handwritten digits data set using Keras
- Use both a Linear and a Convolutional Neural Network (ConvNet) classifier
- The data set is available in Keras using the mnist.load_data() function call
- Use the training data for training the classifier and the test data to evaluate accuracy
- Train the Linear classifier for 15 epochs and the ConvNet classifier for 10 epochs
- Measure the training times for both algorithms and check how much longer it takes to train the ConvNet classifier
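As background for the confusion matrix task, the following is a minimal sketch in plain Python of what a confusion matrix contains and how accuracy and per-class recall can be read from it. The helper names and the label lists are made up for illustration and are not results from the Spiral data set. In a real solution you would use the library's own support, such as confusion_matrix in sklearn.metrics. The layout here follows the same convention as Scikit-learn: rows are true classes and columns are predicted classes.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where row = true class and column = predicted class."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of all examples that lie on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """For each true class, the fraction of its examples predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Made-up labels for three classes (illustration only, not real results).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2, 1]

cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
for row in cm:
    print(row)
print("accuracy:", accuracy(cm))
print("recall per class:", per_class_recall(cm))
```

In this made-up example the overall accuracy is a single number, while the matrix rows show that class 0 is always predicted correctly and that classes 1 and 2 are confused with each other.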
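For the training time comparison, one possible approach is to wrap each training call in a small timing helper and report the ratio of the two times. The sketch below uses only the Python standard library. The helper names timed and slowdown are hypothetical, and the two fake training functions are stand-ins that only sleep, so the numbers they produce say nothing about real training times.

```python
import time


def timed(train_fn):
    """Call train_fn() once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = train_fn()
    return result, time.perf_counter() - start


def slowdown(fast_seconds, slow_seconds):
    """How many times longer the slower run took compared to the faster one."""
    return slow_seconds / fast_seconds


# Stand-ins for the real training calls (assumption: they only sleep).
def fake_linear_training():
    time.sleep(0.01)
    return "linear"


def fake_convnet_training():
    time.sleep(0.05)
    return "convnet"


linear_result, linear_s = timed(fake_linear_training)
convnet_result, convnet_s = timed(fake_convnet_training)

print(f"linear:  {linear_s:.3f} s")
print(f"convnet: {convnet_s:.3f} s")
print(f"convnet took {slowdown(linear_s, convnet_s):.1f}x as long")
```

With Keras, the stand-in functions would be replaced by the actual training call, for example timed(lambda: model.fit(x_train, y_train, epochs=15)), where model, x_train and y_train are whatever names your own code uses.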