two_layer_nn.pdf

School: Georgia Institute of Technology


Course: 7641
Subject: Computer Science
Date: Feb 20, 2024
Type: pdf
Pages: 4

# Do not use packages that are not in standard distribution of python
import numpy as np

from ._base_network import _baseNetwork

np.random.seed(1024)


class TwoLayerNet(_baseNetwork):
    def __init__(self, input_size=28 * 28, num_classes=10, hidden_size=128):
        super().__init__(input_size, num_classes)
        self.hidden_size = hidden_size
        self._weight_init()

    def _weight_init(self):
        """
        Initialize weights of the network.
        :return: None; self.weights is filled based on method
            - W1: The weight matrix of the first layer of shape (num_features, hidden_size)
            - b1: The bias term of the first layer of shape (hidden_size,)
            - W2: The weight matrix of the second layer of shape (hidden_size, num_classes)
            - b2: The bias term of the second layer of shape (num_classes,)
        """
        # initialize weights
        self.weights['b1'] = np.zeros(self.hidden_size)
        self.weights['b2'] = np.zeros(self.num_classes)
        np.random.seed(1024)
        self.weights['W1'] = 0.001 * np.random.randn(self.input_size, self.hidden_size)
        np.random.seed(1024)
        self.weights['W2'] = 0.001 * np.random.randn(self.hidden_size, self.num_classes)

        # initialize gradients to zeros
        self.gradients['W1'] = np.zeros((self.input_size, self.hidden_size))
        self.gradients['b1'] = np.zeros(self.hidden_size)
        self.gradients['W2'] = np.zeros((self.hidden_size, self.num_classes))
        self.gradients['b2'] = np.zeros(self.num_classes)

    def forward(self, X, y, mode='train'):
        """
        The forward pass of the two-layer net. The activation function used
        between the two layers is sigmoid, which is to be implemented in
        self.sigmoid. The method forward should compute the loss of input
        batch X and the gradients of each weight. Further, it should also
        compute the accuracy of the given batch. The loss and accuracy are
        returned by the method, and the gradients are stored in self.gradients.

        :param X: a batch of images (N, input_size)
        :param y: labels of images in the batch (N,)
        :param mode: if mode is 'train', compute and update gradients; else,
            just return the loss and accuracy
        :return:
            loss: the loss associated with the batch
            accuracy: the accuracy of the batch
            self.gradients: gradients are not explicitly returned but rather
                updated in the class member self.gradients
        """
        loss = None
        accuracy = None
        #############################################################################
        # TODO:                                                                     #
        # 1) Implement the forward process:                                         #
        #    1) Call sigmoid function between the two layers for non-linearity      #
        #    2) The output of the second layer should be passed to softmax          #
        #       function before computing the cross entropy loss                    #
        # 2) Compute Cross-Entropy Loss and batch accuracy based on network         #
        #    outputs                                                                #
        #############################################################################
        # First layer: affine transform followed by the sigmoid activation
        first_layer = self.sigmoid(np.dot(X, self.weights['W1']) + self.weights['b1'])
        # Second layer: affine transform (bias added outside the dot product)
        second_layer = np.dot(first_layer, self.weights['W2']) + self.weights['b2']
        # Softmax converts the second-layer scores to class probabilities
        softmax = self.softmax(second_layer)
        loss = self.cross_entropy_loss(softmax, y)
        accuracy = self.compute_accuracy(softmax, y)
        #############################################################################
        #                              END OF YOUR CODE                             #
        #############################################################################
        #############################################################################
        # TODO:                                                                     #
        # 1) Implement the backward process:                                        #
        #    1) Compute gradients of each weight and bias by chain rule             #
        #    2) Store the gradients in self.gradients                               #
        # HINT: You will need to compute gradients backwards, i.e., compute         #
        #       gradients of W2 and b2 first, then compute it for W1 and b1.        #
        #       You may also want to implement the analytical derivative of         #
        #       the sigmoid function in self.sigmoid_dev first.                     #
        #############################################################################
        # Gradients are only needed in training mode (see the docstring above)
        if mode != 'train':
            return loss, accuracy

        N = y.shape[0]
        # One-hot encode the labels so that (softmax - one_hot) is the gradient
        # of the cross-entropy loss with respect to the second-layer scores
        one_hot_encode = np.zeros(softmax.shape)
        one_hot_encode[range(N), y] = 1.0

        # Gradients of the second layer (averaged over the batch)
        self.gradients['W2'] = np.dot(first_layer.T, softmax - one_hot_encode) / N
        self.gradients['b2'] = np.sum(softmax - one_hot_encode, axis=0) / N

        # Backpropagate through the sigmoid to the first layer
        first_layer_dv = self.sigmoid_dev(np.dot(X, self.weights['W1']) + self.weights['b1']) \
            * np.dot(softmax - one_hot_encode, self.weights['W2'].T)
        self.gradients['W1'] = np.dot(X.T, first_layer_dv) / N
        self.gradients['b1'] = np.sum(first_layer_dv, axis=0) / N
        #############################################################################
        #                              END OF YOUR CODE                             #
        #############################################################################
        return loss, accuracy
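
TwoLayerNet relies on helper methods inherited from _baseNetwork (sigmoid, sigmoid_dev, softmax, cross_entropy_loss, compute_accuracy) that are not part of this file. The sketch below is a minimal, hypothetical stand-in inferred only from how those helpers are called above; the assignment's actual base class may be implemented differently.

# Hypothetical stand-in for the assignment's _baseNetwork (assumption, not the
# original implementation); inferred from how TwoLayerNet calls its helpers.
import numpy as np


class _baseNetwork:
    def __init__(self, input_size, num_classes):
        self.input_size = input_size
        self.num_classes = num_classes
        self.weights = {}
        self.gradients = {}

    def sigmoid(self, x):
        # Element-wise logistic function: 1 / (1 + exp(-x))
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_dev(self, x):
        # Derivative of the sigmoid evaluated at the pre-activation x
        s = self.sigmoid(x)
        return s * (1.0 - s)

    def softmax(self, scores):
        # Row-wise softmax with max subtraction for numerical stability
        shifted = scores - np.max(scores, axis=1, keepdims=True)
        exp_scores = np.exp(shifted)
        return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

    def cross_entropy_loss(self, x_pred, y):
        # Mean negative log-likelihood of the correct classes
        N = y.shape[0]
        return -np.mean(np.log(x_pred[np.arange(N), y] + 1e-12))

    def compute_accuracy(self, x_pred, y):
        # Fraction of samples whose highest-probability class matches the label
        return np.mean(np.argmax(x_pred, axis=1) == y)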
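
The analytical gradients computed in forward can be sanity-checked against a numerical estimate on a small random batch. The snippet below is a hedged sketch, not part of the assignment file: it assumes TwoLayerNet is defined alongside the stand-in base class above (the original file uses a relative package import) and that numpy is imported as np. It perturbs a single entry of W2 and compares a central-difference estimate of the loss gradient against self.gradients['W2'].

# Hypothetical gradient check (assumption: TwoLayerNet and numpy are in scope)
np.random.seed(0)
net = TwoLayerNet(input_size=20, num_classes=5, hidden_size=8)
X = np.random.randn(16, 20)
y = np.random.randint(0, 5, size=16)

# Forward pass in train mode fills net.gradients
loss, acc = net.forward(X, y, mode='train')
analytic = net.gradients['W2'][0, 0]

# Central-difference estimate for the same W2 entry
eps = 1e-5
net.weights['W2'][0, 0] += eps
loss_plus, _ = net.forward(X, y, mode='valid')
net.weights['W2'][0, 0] -= 2 * eps
loss_minus, _ = net.forward(X, y, mode='valid')
net.weights['W2'][0, 0] += eps  # restore the original weight
numeric = (loss_plus - loss_minus) / (2 * eps)

print('analytic:', analytic, 'numeric:', numeric)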