homework6_4322.docx
University of Houston, MATH 4322 (Mathematics)
Feb 20, 2024 | 9 pages | Uploaded by hongyumei411
1 Homework 6 - MATH 4322 Instructions 1. Due date: November 28, 2023 2. Answer the questions fully for full credit. 3. Scan or Type your answers and submit only one file. (If you submit several files only the recent one uploaded will be graded). 4. Preferably save your file as PDF before uploading. 5. Submit in Canvas. 6. These questions are based on the Neural Networks lectures. 7. The information in the gray boxes are R code that you can use to answer the questions. Problem 1 You are given: 𝑛 data samples x 𝑖 = ( 𝑥 1, 𝑖 , , 𝑥 , 𝑝 𝑖 ), 𝑖 = 1, … , 𝑛 𝑛 corresponding to true responses (or labels) 𝑦 𝑖 , 𝑖 = 1, … , 𝑛. and asked to train a single linear neuron “network” to approximate function (.) 𝑓 such that ( 𝑓 𝑥 𝑖 ) = 𝑦 𝑖 , 𝑖 = 1, … , 𝑛. Provided the train steps for your “network” by answering the following questions. a) What is the formula to calculate an output 𝑦 𝑖 ̂ from an input x 𝑖 ? What are the model parameters in that formula? y = i = 1 n w i x i + b b) What criteria do we need to optimize in order to estimate the model parameters? The criterion to optimize in order to estimate the model parameters is typically a loss function. A common choice for regression problems is the MSE loss function. MSE = 1 n i = 1 n ( ^ y i y i ) 2 c) What is the name of the method used to optimize this criteria in case you do not have access to an analytically solution?
The method used to optimize the criterion (loss function) when an analytical solution is not available is numerical optimization; the most common approach within this category is gradient descent.
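As an illustration (not part of the assignment), gradient descent for a single linear neuron under the MSE loss can be sketched in R. The simulated data, learning rate, and iteration count below are arbitrary choices:

```r
# Minimal gradient-descent sketch for one linear neuron (illustrative only)
set.seed(1)
n <- 100; p <- 3
X <- matrix(rnorm(n * p), n, p)
y <- X %*% c(2, -1, 0.5) + 3 + rnorm(n, sd = 0.1)  # simulated responses

w  <- rep(0, p); b <- 0        # initialize parameters
lr <- 0.05                     # learning rate (arbitrary choice)
for (step in 1:500) {
  y_hat <- X %*% w + b                    # forward pass: sum(w_j x_j) + b
  err   <- y_hat - y
  w <- w - lr * (2 / n) * t(X) %*% err    # gradient of MSE w.r.t. weights
  b <- b - lr * (2 / n) * sum(err)        # gradient of MSE w.r.t. bias
}
round(c(w, b), 2)  # approaches the true parameters (2, -1, 0.5, 3)
```

Each iteration moves the parameters a small step against the gradient of the MSE, which is why the loss decreases until the weights settle near the values that generated the data.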
Problem 2

Presume that for a single linear neuron model with input variables x1, ..., x5, you are given the following parameter values:

weights: w1 = 0.2, w2 = -0.54, w3 = -0.21, w4 = -0.1, w5 = 0.33
bias: b = 0.14

a) Draw a mathematical model of this linear neuron that takes an arbitrary input vector x = (x1, x2, x3, x4, x5).

ŷ = Σ_{i=1}^{5} w_i x_i + b

b) Calculate the linear neuron output for the case of x1 = 4, x2 = -3, x3 = 7, x4 = 5, x5 = -1. Show your work.

ŷ = Σ w_i x_i + b
  = 0.2(4) + (-0.54)(-3) + (-0.21)(7) + (-0.1)(5) + 0.33(-1) + 0.14
  = 0.8 + 1.62 - 1.47 - 0.5 - 0.33 + 0.14
  = 0.26

Problem 3

You are given an artificial neural network (ANN) of linear neurons with:

- Input layer of two neurons: x1, x2
- Fully-connected hidden layer of three neurons: h1, h2, h3
- One output neuron, y.

The following weight matrices are provided:

1) Between input & hidden layer:

                    Hidden
                h1      h2      h3
1 (bias)       -0.3     0.5     0.5
Input  x1       0.6    -0.4     0.5
       x2      -0.7    -0.3     0.2
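The arithmetic in Problem 2(b) can be checked with a one-line R computation; the vectors below simply restate the given weights and inputs:

```r
w <- c(0.2, -0.54, -0.21, -0.1, 0.33)   # given weights
x <- c(4, -3, 7, 5, -1)                 # given inputs
b <- 0.14                               # given bias
sum(w * x) + b                          # linear neuron output: 0.26
```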
2) Between hidden & output layer:

                Output y
1 (bias)          0.2
Hidden  h1       -0.3
        h2        0.5
        h3       -0.7

a) Draw this ANN as was done in the lecture slides.

b) Calculate the output of this ANN for the case of x1 = 10, x2 = -5. Show your work.

h1 = 0.6(10) + (-0.7)(-5) + (-0.3) = 6 + 3.5 - 0.3 = 9.2
h2 = (-0.4)(10) + (-0.3)(-5) + 0.5 = -4 + 1.5 + 0.5 = -2
h3 = 0.5(10) + 0.2(-5) + 0.5 = 5 - 1 + 0.5 = 4.5
ŷ = (-0.3)(9.2) + 0.5(-2) + (-0.7)(4.5) + 0.2 = -2.76 - 1 - 3.15 + 0.2 = -6.71
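The same forward pass can be verified in R with matrix arithmetic; the matrices below just transcribe the two weight tables:

```r
x  <- c(10, -5)                        # inputs x1, x2
W1 <- rbind(c( 0.6, -0.4, 0.5),        # weights from x1 to h1, h2, h3
            c(-0.7, -0.3, 0.2))        # weights from x2 to h1, h2, h3
b1 <- c(-0.3, 0.5, 0.5)                # hidden-layer biases
h  <- as.vector(x %*% W1) + b1         # hidden outputs: 9.2 -2.0 4.5

w2 <- c(-0.3, 0.5, -0.7)               # hidden-to-output weights
b2 <- 0.2                              # output bias
sum(w2 * h) + b2                       # network output: -6.71
```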
Problem 4

We want to predict the medv value based on the input of the other thirteen variables. We will fit a regression neural network to the Boston data set, splitting the data into training/testing with a 70/30 split.

a) Type and run the following in R.

library(neuralnet)
library(MASS)
data = Boston   # renaming the Boston data set to "data"
summary(data)

What is the mean of age? What is the mean of ptratio?

Mean of age = 68.57, mean of ptratio = 18.46

b) Normalizing data

It is recommended to normalize (or scale, or standardize; either works) the features so that all the variables are on the same scale. Normalization removes the data's units, allowing you to easily compare variables measured on different scales, and it avoids misleading results and difficult training caused by algorithm convergence problems.

There are different methods for scaling the data:

- z-normalization: x_scaled = (x - x̄) / s
- the min-max scale: x_scaled = (x - min) / (max - min)
- and so forth.

The corresponding function in R is scale(x, center = , scale = ).

For this example we will use the min-max method to get all the scaled data into the range [0, 1]. In order to scale, we need to find the minimum and maximum value of each column in the data set. To do this we use the apply function, which returns a vector, array, or list of values obtained by applying a function to the margins of an array or matrix.
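As a quick illustration with toy numbers (not part of the assignment), min-max scaling maps the smallest value to 0 and the largest to 1:

```r
x <- c(2, 5, 11)                        # toy data
(x - min(x)) / (max(x) - min(x))        # 0.0000000 0.3333333 1.0000000
```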
Type and run the following:

max_data = apply(data, 2, max)   # 2 = columns; get the maximum value of each column
min_data = apply(data, 2, min)
data_scaled = scale(data, center = min_data, scale = max_data - min_data)
head(data_scaled)

What is the scaled value of the first observation for medv?

0.4222222

c) Now we can split the data into training and testing data sets. We will use the 70/30 split.

set.seed(10)
index = sample(1:nrow(data), round(0.7 * nrow(data)))
train_data = as.data.frame(data_scaled[index,])
test_data = as.data.frame(data_scaled[-index,])
dim(train_data)

How many observations do we have in the training data set?

354 observations

d) Type and run the following:

set.seed(1)
net_data = neuralnet(medv ~ ., data = train_data, hidden = 10, linear.output = TRUE)
plot(net_data)
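The scaled medv value asked about in part (b) can be sanity-checked by hand: the first medv observation in Boston is 24.0, and medv ranges from 5 to 50, so applying the min-max formula directly gives:

```r
(24.0 - 5) / (50 - 5)   # 0.4222222, matching the scaled value above
```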
Apply the test data set to determine the MSE:

predict_net = predict(net_data, test_data)
predict_net_start = predict_net * (max(data$medv) - min(data$medv)) + min(data$medv)
test_data_start = test_data$medv * (max(data$medv) - min(data$medv)) + min(data$medv)
sum((predict_net_start - test_data_start)^2) / nrow(test_data)

What is the test MSE for this model?

15.37819

e) Let us compare this test MSE to that of a linear regression model. Type and run the following:

lm.boston = lm(medv ~ ., data = data, subset = index)
summary(lm.boston)
test = data[-index,]
predict_lm = predict(lm.boston, test)
sum((predict_lm - test$medv)^2) / nrow(test)

What is the test MSE for the linear model? (Note that the code evaluates the fitted model on the held-out test observations.)

17.7737
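Because the neural network's predictions were converted back to the original medv units before computing the MSE, the two test MSEs are directly comparable: the network (15.38) outperforms the linear model (17.77) on the held-out data. The repeated un-scaling expression can also be wrapped in a small helper; this function is a convenience for illustration, not part of the original code:

```r
# Hypothetical helper: invert min-max scaling back to original units
unscale <- function(x, lo, hi) x * (hi - lo) + lo

unscale(0.4222222, 5, 50)   # recovers the first medv value, about 24.0
```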