COMP6321_ML_Assignment1

School

Concordia University

Course

6321

Subject

Computer Science

Date

Jan 9, 2024

Type

pdf

Pages

7

COMP6321 Machine Learning (Fall 2023)
Major Assignment #1
Due: 11:59 PM, October 26th, 2023

Note
You will be submitting two separate files for this assignment, as follows:
(a) One (1) .pdf file containing your answers to the questions as well as the reported results from the code you develop. Include snapshots of the code you developed in the appendix.
(b) One (1) .zip folder containing all developed Python code, including a README.txt file explaining how to run your code.
Theoretical Questions

Question 1
Answer, in a detailed manner, each of the following questions:
(a) Define the Turing Test.
(b) What is the difference between Classification and Regression in Machine Learning?
(c) What are the basic components of Machine Learning? Give a clear explanation for each component.
(d) What is the difference between Supervised and Unsupervised learning? Give examples for each type.
(e) What is the difference between Overfitting and Underfitting?
(f) What is the learning rate when training an ML model, and how does it affect the learning process?
(g) What is the difference between Gradient Descent and Stochastic Gradient Descent?
(h) Explain the fundamental building blocks of a neural network. What are neurons, weights, biases, and activation functions? How do they work together to make predictions?
(i) What is the purpose of the activation function in a neural network? Provide examples of commonly used activation functions and describe their characteristics.
(j) Define the Vanishing and Exploding Gradient problems.

Question 2
(a) Consider a linear regression problem with the absolute error (or L1 error) function. The error associated with a single training sample with input $x$ and target value $y$ is given as:

$$J(w) = \left| y - w^T x \right| = \left| y - (w_0 x_0 + \dots + w_i x_i + \dots + w_n x_n) \right| \tag{1}$$

You are tasked with developing a gradient-descent learning rule for the above objective function. Your rule should be in the form:

$$w_i \leftarrow w_i - \eta \; ??? \tag{2}$$

(b) Assume you have a problem with a dataset of two data points $x_1 = [0.5, 0.3, 0.8, 0.9]$ and $x_2 = [0.9, 0.6, 0.3, 0.4]$ and targets $y_1 = 0.6$ and $y_2 = 0.2$. You aim to train a linear regression model using the absolute error function, with the initial weights $w = [0.5, 0.5, 0.5, 0.5]$ and a learning rate of 0.1. Assume that the weights are updated after processing each data point, and that your model is trained for two epochs (meaning that your weights are updated 4 times). In each step of the training process, compute the output, the error, and the updated weights. Show your work.
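A small script can help check hand-computed steps of the kind Question 2(b) asks for. The sketch below is one possible approach, not a reference solution: it assumes the subgradient of $|y - w^T x|$ with respect to $w_i$ is $-\mathrm{sign}(y - w^T x)\,x_i$, and it picks 0 as the subgradient when the residual is exactly zero. The function names `sign` and `l1_sgd_step` and the toy values are hypothetical and chosen only for illustration.

```python
def sign(v):
    """Return -1, 0, or +1 (0 is one valid subgradient choice at v == 0)."""
    return (v > 0) - (v < 0)


def l1_sgd_step(w, x, y, lr):
    """One per-sample update for the absolute error J(w) = |y - w^T x|.

    Returns the updated weights, the model output, and the absolute error
    measured before the update.
    """
    pred = sum(wi * xi for wi, xi in zip(w, x))
    residual = y - pred
    # Assumed subgradient: dJ/dw_i = -sign(residual) * x_i,
    # so the descent step is w_i <- w_i + lr * sign(residual) * x_i.
    new_w = [wi + lr * sign(residual) * xi for wi, xi in zip(w, x)]
    return new_w, pred, abs(residual)


# Toy values (hypothetical, not the assignment data).
toy_w, toy_pred, toy_err = l1_sgd_step([0.0, 0.0], [1.0, 2.0], 3.0, lr=0.1)
print(toy_w, toy_pred, toy_err)
```

Calling `l1_sgd_step` once per data point, and feeding the returned weights into the next call, reproduces the "update after each data point" schedule described in part (b).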
Question 3
Consider a mini-dataset of two data points $x_1 = [2, 3, 5, 1]$ and $x_2 = [1, 0, 1, 2]$ and targets $y_1 = 0$ and $y_2 = 1$. Your goal is to train a logistic regression model with the binary cross-entropy (BCE) loss function, a learning rate of 0.1, and all initial weights having a value of 0. You are to consider biases in your calculations.
(a) Given input $x_i$, what is the output function of a logistic regression model? What is the BCE loss function? What is the gradient of the BCE function?
(b) Train a logistic regression model using the provided dataset. Use a batch of size 2 to train your model for 2 epochs (i.e. each weight is updated twice). Show your work.

Question 4
Consider the following three neural networks:
where $\sigma$ is the sigmoid activation function.
(a) For each neural network, derive an expression for the output of the neural network in terms of the input and the weights.
(b) Assume that you are using the following loss function:

$$l = \sum_{i}^{n} \frac{1}{2} (y_i - t_i)^2 \tag{3}$$

where $t$ is the target and $n$ is the number of outputs for a given input. For each neural network, compute $\frac{\partial l}{\partial w_i}$ for each $w_i$ (i.e. $\frac{\partial l}{\partial w_1}$, $\frac{\partial l}{\partial w_2}$, etc.). (Hint: use the chain rule.)
(c) Given the table below of inputs and weights, compute the output of each network. Show your work.

            Input             Weights
Network 1   x = 0.5           w = [0.3, 0.6]
Network 2   x = 0.7           w = [0.5, 0.2, 0]
Network 3   x = [0.5, 0.9]    w = [1, 0.8, 0.2, 0, 0, 0.1]

Question 5
Consider the 3rd network given in Question 4. The aim in this question is to perform backpropagation. Assume that the weights are initialized as ($w_1 = 0.5$, $w_2 = 0.9$, $w_3 = 1.0$, $w_4 = 1.2$, $w_5 = 0.3$, $w_6 = 0.1$), and that you are using sigmoid as the activation function at each neuron, with a learning rate of $\eta = 0.1$.
(a) Perform a forward pass through the network, showing the final output as well as the output at each hidden unit.
(b) For each output unit $k$, the error term $\delta_k$ is calculated as $\delta_k = g'(x_k) \times \mathrm{Err}_k$, where $g$ is the activation function and $g'$ its derivative, $x_k$ is the input to that unit, and $\mathrm{Err}_k$ is the error between the output $O_k$ and the target $T_k$, given as $\mathrm{Err}_k = O_k - T_k$. Derive an expression for $\delta_k$ in terms of $O_k$ and $T_k$, and use this expression to compute $\delta_k$ for the output of your network.
(c) For each hidden unit $h$, the error term $\delta_h$ is calculated as $\delta_h = g'(x_h) \times \mathrm{Err}_h$, where $g$ is the activation function and $g'$ its derivative, $x_h$ is the input to that unit, and $\mathrm{Err}_h$ is the error of the output $O_h$, given as $\mathrm{Err}_h = \sum_{k \in \text{outputs}} w_{hk} \delta_k$. Derive an expression for $\delta_h$ in terms of $O_h$, $w_{hk}$, and $\delta_k$, and use this expression to compute $\delta_h$ for each of the hidden units.
(d) Each weight $w_{ij}$ connecting nodes $i$ and $j$ is updated as $w_{ij} \leftarrow w_{ij} + \Delta w_{ij}$, where $\Delta w_{ij} = \eta \delta_j O_i$. Update all the weights in your network.
(e) Explain, in your own words, what would change in the process above if you were to use a ReLU activation function instead of sigmoid.

Question 6
For each of the following 2D maps (images), apply convolution using the given filter with the specified parameters (assume no padding):
(a) Stride = 1
(b) Stride = 2
(c) MaxPooling with kernel size = 2

Question 7
For each of the architectures given below, write down the output dimensions (in $H \times W \times N$ format) of each layer. Refer to the PyTorch documentation on how to handle cases with fractions. (Note: in this question, pooling refers to a Max Pooling layer with a 2x2 kernel and a stride of 2.)
(a)
(b)

Question 8
Design a Convolutional Neural Network (CNN) for image classification with the following requirements:
  • Input image size is $224 \times 224 \times 3$.
  • The CNN encoder contains exactly four (4) convolutional layers.
  • You are allowed a maximum of two MaxPool layers, each with a $2 \times 2$ kernel and a stride of 2.
  • The output dimensions before flattening are $7 \times 7 \times 256$.
  • For the convolutional layers, filters (kernels) have to be of odd dimensions with a maximum size of $7 \times 7$.
  • The maximum padding allowed is 2 and the maximum stride allowed is 2.
  • The classifier contains one Fully-Connected (FC) layer with 5 output prediction classes.
You are required to draw/sketch the whole CNN pipeline, describing the details of each layer in the network (kernel size, padding, stride, input/output dimensions).

Implementation Questions

Question 1
You are tired of paying exorbitant health insurance premiums every year. Your goal is to train a machine learning model that can accurately predict health insurance prices for individuals based on attributes such as age, sex, region, etc. You can use this dataset to train your model.
(a) Use statistical methods and graphs/plots to describe your dataset.
(b) Split your dataset into train and test sets with a 7:3 ratio. Use the train_test_split tool from scikit-learn.
(c) Build and train a Linear Regression model using scikit-learn. Explore the parameters of the model in scikit-learn, and aim for higher accuracy.
(d) Evaluate the performance of your model on both the train and test sets (separately). You can use scikit-learn's mean squared error tool.

Question 2
You are tasked with developing an ML model for lung cancer prediction. Given information about the patient, such as their sex, age, allergies, etc., your model should predict whether or not they have lung cancer. You can use this dataset to train your model.
(a) Use statistical methods and graphs/plots to describe your dataset.
(b) Split your dataset into train and test sets with a 7:3 ratio. Use the train_test_split tool from scikit-learn.
(c) Build and train a Logistic Regression model using scikit-learn. Explore the parameters of the model in scikit-learn, and aim for higher classification accuracy.
(d) Report and discuss the performance of your developed model on both the train and test sets (separately). You can use scikit-learn's classification report tool.
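The split, fit, and report steps in Question 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: the data is a synthetic stand-in generated with `make_classification` (the actual lung cancer dataset is not reproduced here), and the choices `random_state=0`, `stratify=y`, and `max_iter=1000` are arbitrary, not requirements of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 200 samples, 6 numeric features.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# 7:3 split; stratify keeps the class balance similar in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# max_iter is raised from the default only to make convergence more likely.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Report train and test performance separately, as part (d) asks.
for name, X_part, y_part in [("train", X_train, y_train), ("test", X_test, y_test)]:
    print(name)
    print(classification_report(y_part, model.predict(X_part)))
```

The same skeleton should carry over to Question 1 by swapping in `LinearRegression` and `mean_squared_error`; a real dataset with categorical columns such as sex or region would need encoding before fitting.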
Question 3
Design and train two neural networks to tackle the regression and classification tasks in Question 1 and Question 2. Use the same datasets and train-test split ratio. Build and train the neural networks using PyTorch. Report on the performance of the NN models, and compare them with the models developed using Linear Regression and Logistic Regression.

Question 4
Design and implement a CNN to be used in a Medical Image Classification task. Given an image of an MRI/CT/X-ray scan, your model is to predict the body part being scanned. Some sample images are shown in the figure below. You should download and use the dataset from Kaggle, which you can reduce to 1000 data points per class. You are required to split the data into train and test sets with a 7:3 ratio. You should design and build a simple CNN using PyTorch, and train it on the given dataset for 10 epochs. Report on your design, your choice of hyperparameters, as well as the training accuracy/loss plots. You can use scikit-learn's classification report tool for numerical analysis.
You can use the code segment below to guide you in loading the data (documentation: ImageFolder, ToTensor, torch.utils.data). Make sure that your dataset contains folders corresponding to the different classes. Refer to the PyTorch documentation for the required libraries to be imported, and for examples on training a CNN.
===========================================
from torch.utils.data import DataLoader, random_split
from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor
path = "path/to/dataset"  # placeholder: root folder with one sub-folder per class
dataset = ImageFolder(path, ToTensor())  # loads dataset from path
train_set, test_set = random_split(dataset, [0.7, 0.3])  # splits dataset into specified ratios
train_loader = DataLoader(train_set, shuffle=True, batch_size=16)  # create train loader
test_loader = DataLoader(test_set, batch_size=16)  # create test loader
===========================================
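When designing the CNN, it can help to trace the spatial size of the feature map through each layer, for example to size the input of the final fully-connected layer. The sketch below uses the standard output-size formula for convolution and max pooling without dilation, floor((size + 2 × padding − kernel) / stride) + 1. The helper names `out_size` and `trace`, the 224-pixel input, and the layer stack are hypothetical choices for illustration, not a prescribed design.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or max-pool layer (fractions are floored)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace(size, layers):
    """Return (layer name, output size) for each layer applied in order."""
    sizes = []
    for name, kernel, stride, padding in layers:
        size = out_size(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


# Hypothetical stack (assumption): entries are (name, kernel, stride, padding).
layers = [
    ("conv", 3, 1, 1),
    ("pool", 2, 2, 0),
    ("conv", 5, 1, 2),
    ("pool", 2, 2, 0),
]

sizes = trace(224, layers)
for name, size in sizes:
    print(f"{name}: {size} x {size}")
```

Multiplying the last spatial size squared by the number of output channels of the last convolution gives the number of features after flattening.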