Midterm Previous Questions

Concordia University Saint Paul · Computer Science 517 · Jan 9, 2024

1) Suppose you already have an RNN model that takes a time series and predicts the next output. One simple strategy to predict several steps ahead is to feed the predicted output back into the RNN as the new input, so that it can continue predicting the next value. What is the drawback of this strategy?
a) This will cause the vanishing gradient problem, since the more steps ahead we predict, the more likely the predictions are to be the same.
b) This will cause the exploding gradient problem, since the model gradually forgets the early values in the time series.
c) Since we are using the predicted output as the new input, errors may accumulate, so the further ahead we predict, the less accurate it becomes.
d) This causes the input to grow, so the model runs slower at later time steps.
Answer: c

2) Which of the following is FALSE about training GANs?
a) The root cause of the difficulty of training GANs is that there are multiple Nash equilibria, so GANs cannot converge to the optimal one.
b) Once the GAN becomes somewhat good at one category, it gradually switches to another category; it never becomes really good at any of them.
c) The training process is unstable because the generator and the discriminator are constantly pushing against each other.
d) Increasing the number of epochs usually won't help GANs generate more realistic data.
Answer: b

3) Which of the following is FALSE about PCA and autoencoders?
a) Both PCA and autoencoders are dimensionality-reduction techniques.
b) PCA works well with non-linear data, while autoencoders are best suited for linear data.
c) PCA finds orthogonal dimensions, while an autoencoder doesn't have to.
d) An autoencoder has the advantage of being able to handle large datasets with many features.
Answer: b
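A minimal sketch of the recursive forecasting strategy from question 1, assuming a trained Keras RNN; the function name `forecast_ahead` and the input shapes are illustrative, not from the original questions:

```python
import numpy as np

def forecast_ahead(model, window, n_ahead):
    """Predict n_ahead future values by recursive one-step prediction.

    window: array of shape (n_steps, 1) holding the latest observations.
    Each prediction is fed back in as an input, so errors compound:
    the further ahead we go, the less accurate the forecast becomes.
    """
    window = window.copy()
    preds = []
    for _ in range(n_ahead):
        # Predict one step from the current window (batch of size 1).
        y = model.predict(window[np.newaxis, :, :], verbose=0)[0, 0]
        preds.append(y)
        # Slide the window: drop the oldest value, append the prediction.
        window = np.concatenate([window[1:], [[y]]], axis=0)
    return np.array(preds)
```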
4) True or False: The goal of an autoencoder is to generate an output as realistic as the input.
Answer: False

5) Which of the following statements is the best description of early stopping?
a) Add a momentum term to the weight update in the Generalized Delta Rule, so that training converges more quickly.
b) Train the network until a local minimum of the error function is reached.
c) Use a faster version of backpropagation, such as the "Quickprop" algorithm.
d) Evaluate the network on a test dataset after every epoch of training, and stop training when the generalization error starts to increase.
Answer: d
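A minimal sketch of early stopping as described in option d of question 5, using Keras's `EarlyStopping` callback; the patience value and the commented-out `fit` call are illustrative:

```python
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",         # the generalization-error proxy to watch
    patience=5,                 # tolerate a few noisy epochs before stopping
    restore_best_weights=True)  # roll back to the best epoch seen

# model.fit(X_train, y_train,
#           validation_data=(X_valid, y_valid),
#           epochs=100,
#           callbacks=[early_stop])
```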
6) Which of the following is NOT true about image data augmentation?
a) We don't need extra storage to store the generated images.
b) The data augmentation process can be part of the model.
c) We should apply every possible transformation to the original image to maximize the training set.
d) Image data augmentation doesn't always improve the performance of the model if the generated images are not the kind we expect to see in real life.
Answer: c

7) For a classification task, suppose that instead of using random weight initialization in a neural network, we set all the weights to zero. Which of the following statements is true?
a) There will not be any problem, and the neural network will train properly.
b) The neural network will not train, as there is no net gradient change.
c) None of these.
d) The neural network will train, but all the neurons will end up recognizing the same thing.
Answer: d (CT: b)

8) Which of the following activation functions are non-saturating? Choose all that apply.
a) Exponential Linear Unit (ELU)
b) Leaky ReLU
c) Sigmoid
d) Tanh
Answer: a and b

9) Which of the following is FALSE about "tying weights" when training autoencoders?
a) The weights in the coding layers are the same as those in the corresponding decoding layers.
b) Tying weights can reduce the risk of overfitting.
c) The biases in the coding layers and decoding layers are independent.
d) Tying weights will make training faster.
Answer: d
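A minimal sketch of one way to tie weights in a Keras autoencoder (question 9). `DenseTranspose` is a custom layer, not a Keras built-in: the decoder reuses the transpose of the encoder layer's kernel and trains only its own bias, so the biases stay independent while the weight count is halved. Layer sizes are illustrative.

```python
import tensorflow as tf
from tensorflow import keras

class DenseTranspose(keras.layers.Layer):
    """Decoder layer that reuses the transpose of a Dense layer's kernel."""
    def __init__(self, dense, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.dense = dense
        self.activation = keras.activations.get(activation)

    def build(self, batch_input_shape):
        # Only the bias is a new trainable variable; the kernel is shared.
        self.biases = self.add_weight(name="bias",
                                      shape=[self.dense.input_shape[-1]],
                                      initializer="zeros")
        super().build(batch_input_shape)

    def call(self, inputs):
        z = tf.matmul(inputs, self.dense.weights[0], transpose_b=True)
        return self.activation(z + self.biases)

dense_1 = keras.layers.Dense(100, activation="relu")
dense_2 = keras.layers.Dense(30, activation="relu")

tied_encoder = keras.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    dense_1, dense_2])
tied_decoder = keras.Sequential([
    DenseTranspose(dense_2, activation="relu"),
    DenseTranspose(dense_1, activation="sigmoid"),
    keras.layers.Reshape([28, 28])])
tied_autoencoder = keras.Sequential([tied_encoder, tied_decoder])
```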
10) Which of the following is FALSE about zero ("same") padding?
a) It is used to preserve the edge information of the image.
b) If stride = 1, the output will be the same size as the input.
c) It will discard some pixels from the input image.
d) It may pad a different number of zeros toward each edge.
Answer: c

11) Which of the following components makes a neural network non-linear in nature?
a) A non-sequential model
b) Activation functions
c) Weights and biases
d) Hidden layers
Answer: b

12) What is the reason behind choosing "embedding" over "one-hot encoding" when preparing the IMDb reviews for training? Choose all that apply.
a) Embedding needs to be trained, so one-hot encoding should be used.
b) We have more than 10,000 unique words, so one-hot encoding would create one big array to hold each word, while embedding can represent each word in a more compact space.
c) One-hot encoding tells you nothing about the meaning of each word, while embedding will learn to group related words together; therefore, it makes more sense to use embedding for sentiment analysis.
d) One-hot encoding makes each word an independent array, thus making the training process more stable.
Answer: b and c

13) Batch normalization is helpful because:
a) It returns the normalized mean and standard deviation of the weights.
b) It normalizes (changes) all the inputs before sending them to the next layer.
c) It is a very efficient backpropagation technique.
d) None of these.
Answer: b
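A minimal sketch of option b of question 13 in practice: `BatchNormalization` layers in a Keras model normalize each layer's inputs before passing them on. The architecture and layer sizes below are illustrative.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.BatchNormalization(),           # normalize the raw inputs
    tf.keras.layers.Dense(300, activation="relu"),
    tf.keras.layers.BatchNormalization(),           # normalize between layers
    tf.keras.layers.Dense(10, activation="softmax"),
])
```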
14) Which of the following is FALSE about kernels in a CNN?
a) Kernels can be used in convolutional as well as in pooling layers.
b) Kernels extract simple features in the initial layers and complex features in the deeper layers.
c) Kernels lead to dimensionality reduction.
d) There is only one kernel in one convolutional layer.
Answer: d

15) What makes a neural network model a deep learning model?
a) Adding more hidden layers, increasing the depth of the neural network.
b) Higher dimensionality of the data.
c) The output layer having more than one neuron.
d) Having a convolutional layer.
Answer: a

16) Which of the following is FALSE about batch size in a neural network?
a) Batch size is limited by the memory size of the computer.
b) It is a hyperparameter.
c) Batch size doesn't have to divide the size of the training set evenly.
d) Batch size is the number of times a sample passes through the network.
Answer: d
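A small worked example of options a-c of question 16: batch size is a hyperparameter bounded by memory, and it need not divide the training-set size evenly; the last batch is simply smaller. The numbers are illustrative.

```python
import math

n_train = 60_000
batch_size = 128          # hyperparameter, limited by available memory

steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)    # 469: the final batch holds only 96 samples
```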
17) Which of the following transformations does a convolutional neural network perform? 1. Rotation, 2. Scaling, 3. Convolving, 4. Pooling.
a) 3 and 4
b) All of them
c) 3
d) 2 and 4
Answer: a

18) In an MLP model, how many input nodes are required to process a 28x28 grayscale image?
a) 28x28
b) 56x1
c) 56x56
d) 28x1
Answer: a

19) True or False: Each gate in an LSTM is trained to output either 0 or 1 to indicate two opposite states: 0 for closed and 1 for open.
Answer: False

20) To improve the training efficiency of sentiment analysis on the IMDb reviews, which of the following techniques is FALSE?
a) We can ignore the other, uncommon words.
b) We can use an embedding encoding instead of one-hot encoding, since there are too many unique words.
c) We can truncate each review to keep the first 300 characters.
d) We can limit the vocabulary size to keep only the 10,000 most common words.
Answer: a
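A minimal sketch of the preprocessing tricks referenced in question 20, using the Keras IMDb helpers: cap the vocabulary at the 10,000 most common words, truncate or pad each review to a fixed length, and map word IDs to compact embedding vectors instead of sparse one-hot arrays. The 300-token cap and the embedding size are illustrative.

```python
import tensorflow as tf

# Keep only the 10,000 most common words; rarer words get an <unk>-style ID.
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.imdb.load_data(
    num_words=10_000)

# Truncate long reviews and pad short ones to a fixed length of 300 tokens.
X_train = tf.keras.preprocessing.sequence.pad_sequences(X_train, maxlen=300)

# An Embedding layer maps each word ID to a dense 32-dimensional vector,
# far more compact than a 10,000-wide one-hot encoding.
embedding = tf.keras.layers.Embedding(input_dim=10_000, output_dim=32)
```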
21) Which of the following is FALSE about dropout?
a) Dropout increases the accuracy and performance of the model.
b) Dropout is implemented per layer in a model.
c) Dropout can make the training process more noisy.
d) The dropout rate is a trainable parameter of the model.
Answer: d

22) Which of the following is NOT included in TensorFlow's SavedModel format?
a) Parameter values
b) Computation graph
c) Preprocessing layers
d) HDF5 model
Answer: d
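A minimal sketch of what the SavedModel format in question 22 contains, assuming TensorFlow 2.x Keras; the tiny model and the directory name are illustrative.

```python
import tensorflow as tf

# A tiny model just to demonstrate the format.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

model.save("my_saved_model")  # writes a SavedModel directory:
# my_saved_model/
#   saved_model.pb   <- the computation graph
#   variables/       <- the parameter values
#   assets/          <- extra files (e.g., vocabularies for preprocessing layers)
#
# No HDF5 file is produced; HDF5 is a separate, older Keras save format.

restored = tf.keras.models.load_model("my_saved_model")
```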